IDOL server can process documents in multiple languages and encodings. For each language that you want to use, you must define the language types in the IDOL server configuration file. You must also configure IDOL server to classify documents, either by automatically detecting the language and encoding, or by reading the language type from a field.
To run HPE IDOL in multiple languages, specify the language types you want HPE IDOL to process. A language type is a combination of the language and encoding.
You must specify languages and language types before you index data into HPE IDOL.
Open the HPE IDOL configuration file in a text editor.
Find the [LanguageTypes] section. List the languages that you want HPE IDOL to process. You must use ASCII characters to specify the language names.
For example:
[LanguageTypes]
0=English
1=Afrikaans
2=General
For each language, create a configuration section that matches the name you defined in the [LanguageTypes] section.
In this section, specify appropriate settings that determine how HPE IDOL handles this language. For details on the configuration parameters you can use, refer to the HPE IDOL Server Reference.
For each section, set the Encodings parameter to a list of the encodings and corresponding language types used by the language. List each encoding and language type in the format encoding:languagetype. Separate multiple language types with commas.
For example:
[english]
Encodings=ASCII:englishASCII,UTF8:englishUTF8
Stoplist=english.dat
IndexNumbers=1

[afrikaans]
Encodings=ASCII:afrikaansASCII,UTF8:afrikaansUTF8
IndexNumbers=1

[general]
Encodings=UTF8:generalUTF8,ASCII:generalASCII,CYRILLIC:generalCYRILLIC
IndexNumbers=1
Save and close the configuration file.
Restart IDOL server for your changes to take effect.
You can now configure HPE IDOL to associate the language types you defined with documents.
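As a rough sanity check of the procedure above, the following Python sketch parses a configuration like the example and confirms that every language listed in [LanguageTypes] has its own section with a parsable Encodings value. It is not part of HPE IDOL. The function name language_types and the SAMPLE text are hypothetical, and the case-insensitive comparison of section names is an assumption drawn from the example (which lists English but names the section [english]).

```python
import configparser

# Sample text copied from the example above; a real file may hold more sections.
SAMPLE = """
[LanguageTypes]
0=English
1=Afrikaans
2=General

[english]
Encodings=ASCII:englishASCII,UTF8:englishUTF8
Stoplist=english.dat
IndexNumbers=1

[afrikaans]
Encodings=ASCII:afrikaansASCII,UTF8:afrikaansUTF8
IndexNumbers=1

[general]
Encodings=UTF8:generalUTF8,ASCII:generalASCII,CYRILLIC:generalCYRILLIC
IndexNumbers=1
"""


def language_types(config_text):
    """Return {language: {encoding: languagetype}} for each listed language."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption: section names are compared case-insensitively.
    sections = {name.lower(): name for name in parser.sections()}
    result = {}
    for _, language in parser.items("LanguageTypes"):
        section = sections.get(language.lower())
        if section is None:
            raise ValueError(f"no configuration section for language {language!r}")
        # Each entry has the form encoding:languagetype, separated by commas.
        pairs = parser.get(section, "Encodings").split(",")
        result[language] = dict(pair.strip().split(":", 1) for pair in pairs)
    return result


types = language_types(SAMPLE)
print(types["English"])
```

A language that is listed in [LanguageTypes] but has no matching section raises a ValueError, which is the kind of mismatch worth catching before you index data.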
After you define all the language types you want HPE IDOL to process, set up a field process that allows HPE IDOL to associate these language types with documents.
The way you configure the field process depends on the documents that you want to index:
If all the documents contain a field that exactly specifies the language type, configure a field process to define this field as a LanguageType field. The language types that appear in this field must exactly match the language types that you define in the [LanguageTypes] configuration section. See Configure Fields.
If the documents contain a field that specifies the language, but does not exactly specify the language type, you can configure field processes to detect the language from this field data.
Open the IDOL server configuration file in a text editor.
In the [FieldProcessing] section, define a field process for each language that you want to detect.
For example:
[FieldProcessing]
0=DetectArabic
1=DetectEnglish
2=DetectFrench
Create a configuration section with the same name as each of the field processes you defined in the [FieldProcessing] section.
In this section:
Set Property to the name of the property for the specified language type.
Set PropertyFieldCSVs to a comma-separated list of fields that can contain the language data.
Set PropertyMatch to a comma-separated list of values that this field might contain to identify the specified language type.
For example:
[DetectArabic]
Property=SetArabicProperty
PropertyFieldCSVs=*/DRELANGUAGETYPE,*/LANG
PropertyMatch=arabic

[DetectEnglish]
Property=SetEnglishProperty
PropertyFieldCSVs=*/DRELANGUAGETYPE,*/LANG
PropertyMatch=*eng*,uk,*british

[DetectFrench]
Property=SetFrenchProperty
PropertyFieldCSVs=*/DRELANGUAGETYPE,*/LANG
PropertyMatch=*fre*,fran*
Create a configuration section for each property that you define in the field processing sections.
In the property configuration section, set the LanguageType parameter to the language type to assign to documents that match this property (that is, documents that contain a field with a matching value for the field process). This language type must match one of the language types you configure in the [LanguageTypes] configuration section.
For example:
[SetArabicProperty]
LanguageType=Arabic

[SetEnglishProperty]
LanguageType=English

[SetFrenchProperty]
LanguageType=French
Save and close the configuration file.
Restart IDOL server for your changes to take effect.
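To illustrate how the PropertyMatch values in the example above could map a language field to a language type, here is a small Python sketch. It only approximates the idea and is not how HPE IDOL implements field processing. The RULES table and the detect_language_type function are hypothetical, and the sketch assumes that * matches any run of characters, that matching ignores case, and that the first matching field process wins.

```python
from fnmatch import fnmatchcase

# (PropertyMatch patterns, LanguageType) pairs taken from the example above.
RULES = [
    (["arabic"], "Arabic"),
    (["*eng*", "uk", "*british"], "English"),
    (["*fre*", "fran*"], "French"),
]


def detect_language_type(field_value):
    """Return the language type for a field value, or None if nothing matches."""
    # Assumption: comparison is case-insensitive, so lowercase the value first.
    value = field_value.strip().lower()
    for patterns, language_type in RULES:
        # Assumption: the first rule with a matching pattern wins.
        if any(fnmatchcase(value, pattern) for pattern in patterns):
            return language_type
    return None


for sample in ["English", "British", "francais", "German"]:
    print(sample, "->", detect_language_type(sample))
```

Running the patterns against sample field values in this way can help you spot values that no field process would catch, such as German in this sketch.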