Enable Generic Transliteration
The default IDOL Content component configuration file uses generic transliteration. OpenText recommends that you use generic transliteration because it is the best way to ensure that cross-lingual search works.
- In the [LanguageTypes] configuration section, set the GenericTransliteration parameter to True.
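As a minimal sketch, the setting might look like the following in the Content component configuration file. Only the section name and the parameter come from the step above; any other parameters that your [LanguageTypes] section already contains are omitted here.

```ini
[LanguageTypes]
// Other existing parameters in this section are left out of this sketch.
GenericTransliteration=True
```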
Generic transliteration performs transliteration as described in the following table.
Language or character type | Transliteration |
---|---|
Symbols | All dashes and hyphens to a hyphen character. |
Latin | Accented characters to non-accented characters |
Spanish | Accented vowels áéíóúü to non-accented vowels |
Portuguese | Accented vowels àáâãçéêíòóôõúü to non-accented vowels |
Greek | Accented Greek characters to non-accented characters |
Cyrillic (including Serbian extensions) | All characters mapped to A–Z |
Arabic | Arabic character normalization |
Japanese | Half width katakana to full width katakana; full width 0–9, A–Z, a–z to single byte 0–9, A–Z, a–z |
Chinese | Full width 0–9, A–Z, a–z to single byte 0–9, A–Z, a–z |
For all other languages, transliteration does not apply, except for hyphen normalization.
NOTE: Languages with a sentence-breaking library might be transliterated as part of the sentence-breaking process.
When you set GenericTransliteration to True, it applies to all languages, unless you specifically disable transliteration for a language.
You can disable transliteration for an individual language by setting the Transliteration parameter to False in the individual language configuration section. This option completely disables transliteration for that language.
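As an illustration, the following fragment keeps generic transliteration enabled globally and turns it off for one language. The [japanese] section name is an assumption made for this example; use the name of the language configuration section as it appears in your own configuration file.

```ini
[LanguageTypes]
// Applies to every language that does not opt out.
GenericTransliteration=True

// Hypothetical language section name, used here for illustration only.
[japanese]
// Completely disables transliteration for this language.
Transliteration=False
```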