To run the IDOL Content component in multiple languages, specify the language types that you want Content to process. A language type is a combination of a language and an encoding.
NOTE: You must specify languages and language types before you index data into Content.
Open the IDOL Content component configuration file in a text editor.
Find the [LanguageTypes] section and list the languages that you want Content to process. You must use UTF-8 characters when specifying a language.
For example:
[LanguageTypes]
0=English
1=Afrikaans
2=Albanian
3=Arabic
4=Armenian
5=Azeri
In the [LanguageTypes] section, set any configuration parameters that you want to apply to all languages. For details of the configuration parameters you can use, refer to the IDOL Server Reference.
NOTE: As well as the general language configuration parameters, you can set any of the individual language configuration parameters in the [LanguageTypes] section. The value in this section sets the default value for all languages, which you can override in the individual language configuration sections.
For example:
[LanguageTypes]
DefaultLanguageType=englishUTF8
DefaultEncoding=UTF8
LanguageDirectory=C:\IDOLserver\IDOL\langfiles
GenericTransliteration=True
StopWordIndex=1
ProperNames=3
TangibleCharacters=!?
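The default-and-override behavior described in the note can be sketched in a few lines of Python. This is only an illustration of the lookup order, not how Content itself reads its configuration: it uses Python's `configparser` as a stand-in parser, assumes that section names are matched without regard to case, and the helper name `effective_setting` and the sample values are hypothetical.

```python
import configparser

# Illustrative sample only; the values are placeholders, not recommendations.
SAMPLE = """\
[LanguageTypes]
0=English
1=Afrikaans
IndexNumbers=1

[english]
Encodings=UTF8:englishUTF8
IndexNumbers=0

[afrikaans]
Encodings=UTF8:afrikaansUTF8
"""


def load_config(text):
    """Parse configuration text, treating only '=' as the key/value delimiter."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    return parser


def effective_setting(parser, language, name):
    """Return the language section's value if set, else the [LanguageTypes] default."""
    # Assumption: section names are compared case-insensitively.
    sections = {s.lower(): s for s in parser.sections()}
    lang_section = sections.get(language.lower())
    if lang_section is not None and parser.has_option(lang_section, name):
        return parser.get(lang_section, name)
    defaults = sections.get("languagetypes")
    if defaults is not None and parser.has_option(defaults, name):
        return parser.get(defaults, name)
    return None


config = load_config(SAMPLE)
print(effective_setting(config, "English", "IndexNumbers"))    # 0 (overridden in [english])
print(effective_setting(config, "Afrikaans", "IndexNumbers"))  # 1 (default from [LanguageTypes])
```

In this sketch a parameter that appears in neither section yields `None`, which stands for whatever built-in default the product applies.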
For each language that you use, create a section using the name of the language.
In this section, specify appropriate settings that determine how Content handles this language. For details on the configuration parameters you can use, refer to the IDOL Server Reference.
For each section, add the Encodings parameter and define the encodings and corresponding language types used by the language.
For example:
[english]
Encodings=UTF8:englishUTF8
Stoplist=english.dat
IndexNumbers=1

[afrikaans]
Encodings=UTF8:afrikaansUTF8
IndexNumbers=1

[albanian]
Encodings=UTF8:albanianUTF8
IndexNumbers=1

[arabic]
Encodings=ARABIC_ISO:arabicARABIC_ISO,ARABIC:arabicARABIC,UTF8:arabicUTF8
IndexNumbers=1

[armenian]
Encodings=UTF8:armenianUTF8
IndexNumbers=1

[azeri]
Encodings=UTF8:azeriUTF8
IndexNumbers=1

[general]
Encodings=UTF8:generalUTF8,CYRILLIC:generalCYRILLIC
IndexNumbers=1
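To make the shape of an Encodings value concrete, the following Python sketch splits one into its encoding and language type pairs. It assumes the comma-separated `encoding:languagetype` layout seen in the example above; the function name `parse_encodings` is hypothetical and is not part of any IDOL tooling.

```python
def parse_encodings(value):
    """Split an Encodings value into a dict of encoding -> language type."""
    mapping = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        encoding, sep, language_type = pair.partition(":")
        encoding = encoding.strip()
        language_type = language_type.strip()
        if not sep or not encoding or not language_type:
            raise ValueError(f"expected encoding:languagetype, got {pair!r}")
        mapping[encoding] = language_type
    return mapping


arabic = parse_encodings(
    "ARABIC_ISO:arabicARABIC_ISO,ARABIC:arabicARABIC,UTF8:arabicUTF8"
)
print(arabic["UTF8"])  # arabicUTF8
print(sorted(arabic))  # ['ARABIC', 'ARABIC_ISO', 'UTF8']
```

Read this way, the [arabic] section defines three language types, one for each encoding that the language can arrive in.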
Save the configuration file.
You can now configure Content to associate the language types that you defined with documents.
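Before you index any data, it can be worth checking that every language listed in [LanguageTypes] has a matching section with an Encodings parameter. The Python sketch below shows one way to do that outside the product. It is an assumption-laden illustration: it uses Python's `configparser` rather than IDOL's own parser, treats the numbered entries in [LanguageTypes] as the language list, matches section names without regard to case, and the function name `missing_language_sections` is hypothetical.

```python
import configparser


def missing_language_sections(text):
    """Return a list of problems found for languages listed in [LanguageTypes]."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    # Assumption: section names are compared case-insensitively.
    sections = {name.lower(): name for name in parser.sections()}
    language_types = sections.get("languagetypes")
    if language_types is None:
        raise ValueError("no [LanguageTypes] section found")

    problems = []
    for key, value in parser.items(language_types):
        if not key.isdigit():
            continue  # skip parameters such as DefaultEncoding
        language = value.strip()
        section = sections.get(language.lower())
        if section is None:
            problems.append(f"{language}: no matching section")
        elif not parser.has_option(section, "Encodings"):
            problems.append(f"{language}: section has no Encodings parameter")
    return problems


# Illustrative sample with two deliberate mistakes.
SAMPLE = """\
[LanguageTypes]
0=English
1=Afrikaans
2=Albanian
DefaultEncoding=UTF8

[english]
Encodings=UTF8:englishUTF8

[afrikaans]
IndexNumbers=1
"""

for problem in missing_language_sections(SAMPLE):
    print(problem)
# Afrikaans: section has no Encodings parameter
# Albanian: no matching section
```

To check a real file, read its text and pass it to the function. An empty list means that every listed language has a section that defines Encodings.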