Character Encoding

To ensure that all filtered text is output in the same character encoding, KeyView performs character encoding conversion. In most cases, if your license includes advanced character set detection, KeyView can detect the character encoding used in a source file and automatically output filtered text in the encoding you choose. OpenText recommends that you specify your preferred target encoding. In the rare cases where KeyView cannot detect the character encoding used in a source file, you can also specify the source encoding.
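
As a general illustration of what character encoding conversion involves, the following sketch uses the standard .NET System.Text.Encoding class rather than the KeyView API. It is not how KeyView performs conversion internally. It only shows that bytes must be decoded with a known source encoding before they can be re-encoded in the target encoding. The sample bytes and variable names are illustrative.

```csharp
using System;
using System.Text;

// Bytes for "café" as they would be stored in an ISO-8859-1 (Latin-1) file.
byte[] sourceBytes = { 0x63, 0x61, 0x66, 0xE9 };

// The source encoding must be known (detected or specified) before conversion.
Encoding sourceEncoding = Encoding.GetEncoding("iso-8859-1");

// The target encoding is the single encoding that all output should share.
Encoding targetEncoding = Encoding.UTF8;

// Encoding.Convert decodes with the source encoding and re-encodes with the target.
byte[] targetBytes = Encoding.Convert(sourceEncoding, targetEncoding, sourceBytes);

Console.WriteLine(BitConverter.ToString(sourceBytes)); // prints 63-61-66-E9
Console.WriteLine(BitConverter.ToString(targetBytes)); // prints 63-61-66-C3-A9
```

The single byte E9 becomes the two-byte sequence C3 A9 in UTF-8, which is why the conversion cannot be carried out correctly unless the source encoding is known.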

Specify a Target Character Encoding

OpenText recommends that you specify a target character encoding when you initialize KeyView. UTF-8 and UTF-16 are recommended because they are widely supported and can encode a diverse range of characters. To see which encodings you can use as the target encoding, see Coded Character Sets.

To specify a target character encoding

  • In the .NET API, specify the target encoding when you instantiate the Filter object. For example:

    Filter myFilter = new Filter("YOUR_LICENSE", Charset.KVCS_UTF8, 0);

    After filtering, you can verify the output encoding by checking the value of the TargetCharSetEN property on the Filter object. If the value is KVCS_UNKNOWN, KeyView was unable to determine the source character encoding and therefore no conversion occurred. If you know the character encoding used in the source file, you can specify it through the API (see Specify a Source Character Encoding).
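
The verification step can be sketched as follows. This is a fragment rather than a complete program. It assumes that the KeyView .NET assembly is referenced and that the unknown value is exposed as Charset.KVCS_UNKNOWN (this section names only KVCS_UNKNOWN). The filtering call itself is left as a placeholder comment.

```csharp
// Sketch only: requires the KeyView .NET API.
Filter myFilter = new Filter("YOUR_LICENSE", Charset.KVCS_UTF8, 0);

// ... filter a document with myFilter here (call omitted) ...

// Assumption: KVCS_UNKNOWN is a member of the Charset enumeration.
if (myFilter.TargetCharSetEN == Charset.KVCS_UNKNOWN)
{
    // The source encoding was not determined, so no conversion occurred.
    // If you know the source encoding, specify it as described in
    // Specify a Source Character Encoding.
}
```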

Performance Considerations

When a file format does not specify a character encoding, KeyView attempts to detect the encoding automatically. Some character encodings, including UTF-8 and UTF-16, can be detected by core KeyView functionality, but others can be detected only if your license includes advanced character set detection. Advanced character set detection is enabled by default (if it is included in your license), but it can increase the time required to filter some documents.

You can disable advanced character set detection on a file-by-file basis. Before doing this, be aware that KeyView cannot output filtered text in your chosen encoding unless it detects the encoding of the source file, or you specify the source encoding yourself through the API.

To disable advanced character set detection

  • In the .NET API, set the property CharSetDetection on the Filter object:

    myFilter.CharSetDetection = false;

Specify a Source Character Encoding

In most cases, KeyView can automatically detect the character encoding of an input file, so you do not need to specify a source encoding. You might need to specify the source character encoding if you have disabled advanced character set detection.

To specify the source character encoding

  • In the .NET API, use the SourceCharSetEN property. For example:

    myFilter.SourceCharSetEN = Charset.KVCS_8859_1;
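
Taken together, the settings described in this section might be combined as in the following sketch. It is a fragment that requires the KeyView .NET API and uses only the constructor and properties named in this section. The order of the two property assignments and the choice of KVCS_8859_1 are illustrative assumptions.

```csharp
// Sketch only: requires the KeyView .NET API.
// Request UTF-8 output when creating the Filter object.
Filter myFilter = new Filter("YOUR_LICENSE", Charset.KVCS_UTF8, 0);

// Turn off advanced character set detection to reduce filtering time.
myFilter.CharSetDetection = false;

// Supply the source encoding yourself, because KeyView needs it in order
// to convert to the target encoding when detection is disabled.
// KVCS_8859_1 is an example; use the encoding your file actually uses.
myFilter.SourceCharSetEN = Charset.KVCS_8859_1;
```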

Disable Character Encoding Conversion

You can disable character encoding conversion completely, so that KeyView retains the original character encoding of the document.

To disable character encoding conversion

  • In the .NET API, set the flag FILTERFLAG_NODEFAULTCHARSETCONVERT when you instantiate the Filter object. For example:

    Filter myFilter = new Filter("YOUR_LICENSE", Charset.KVCS_UTF8,
        FilterConstant.FilterFlagsConstant.FILTERFLAG_NODEFAULTCHARSETCONVERT);