Character Encoding

To ensure that all filtered text is output in the same character encoding, KeyView performs character encoding conversion. In most cases, if your license includes advanced character set detection, KeyView can detect the character encoding used in a source file and automatically output filtered text in the encoding you choose. OpenText recommends that you specify your preferred target encoding. In the rare cases where KeyView cannot detect the character encoding used in a source file, you can also specify the source encoding.

Specify a Target Character Encoding

OpenText recommends that you specify a target character encoding when you initialize KeyView. UTF-8 or UTF-16 is recommended because these encodings are widely supported and can encode a diverse range of characters. To see which encodings you can use as the target encoding, see Coded Character Sets.

To specify a target character encoding

  • In the .NET API, call the method OutputCharset on your session configuration. For example:

    session.Config().OutputCharset(CharSet.KVCS_UTF8);

Performance Considerations

When a file format does not specify a character encoding, KeyView attempts to detect the encoding automatically. Some character encodings, including UTF-8 and UTF-16, can be detected by core KeyView functionality, but others can be detected only if your license includes advanced character set detection. Advanced character set detection is enabled by default (if it is included in your license), but it can increase the time required to filter some documents.

You can disable advanced character set detection on a file-by-file basis. Before doing this, be aware that KeyView cannot output filtered text in your chosen encoding unless it detects the encoding of the source file, or you specify the source encoding yourself through the API.

To disable advanced character set detection

  • In the .NET API, call the method CharacterSetDetection on your session configuration, passing false. For example:

    session.Config().CharacterSetDetection(false);

Specify a Source Character Encoding

In most cases, KeyView can automatically detect the character encoding of an input file, so you do not need to specify a source encoding. You might need to specify the source character encoding if you have disabled advanced character set detection.

To specify the source character encoding

  • In the .NET API, call the method SourceCharset on your session configuration. For example:

    session.Config().SourceCharset(CharSet.KVCS_8859_1);
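
The following sketch illustrates why the source encoding matters for conversion in general. It does not use the KeyView API. It relies only on the standard .NET System.Text classes, and the class name EncodingMismatchDemo and the helper ToUtf8 are hypothetical names chosen for this illustration. It takes the ISO-8859-1 bytes for "café" and shows that conversion to UTF-8 succeeds when the source encoding is known, whereas reading the same bytes as if they were already UTF-8 loses the accented character.

```csharp
using System;
using System.Text;

// Illustration only: standard .NET classes, not the KeyView API.
public static class EncodingMismatchDemo
{
    // Re-encodes bytes from a known source encoding into UTF-8.
    public static byte[] ToUtf8(byte[] input, Encoding source)
    {
        return Encoding.Convert(source, Encoding.UTF8, input);
    }

    public static void Main()
    {
        // "café" encoded as ISO-8859-1: the final byte 0xE9 is "é".
        byte[] latin1Bytes = { 0x63, 0x61, 0x66, 0xE9 };
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // Source encoding known: 0xE9 becomes the two-byte UTF-8 sequence C3 A9.
        byte[] converted = ToUtf8(latin1Bytes, latin1);
        Console.WriteLine(BitConverter.ToString(converted)); // 63-61-66-C3-A9

        // Source encoding wrongly assumed to be UTF-8: 0xE9 is not a valid
        // UTF-8 sequence on its own, so the decoder substitutes U+FFFD.
        string misread = Encoding.UTF8.GetString(latin1Bytes);
        Console.WriteLine(misread.IndexOf('\uFFFD') >= 0); // True
    }
}
```

The same principle applies to filtering: a converter can only produce correct output in the target encoding if the source encoding is either detected or supplied.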