Decompose Compound Words
You can configure the IDOL Content component to automatically separate a compound word into its root words at both index time and query time. For example, the German word for bicycle pump is the single word Fahrradpumpe, which can be divided into Fahrrad and Pumpe.
- Create a text file. Use the following format:
    [UTF8]
    rollercoaster roller coaster
    hemidemisemiquaver hemi demi semi quaver
  Each line defines the decomposition for one word. The first word on a line is broken into the remaining words on the line.
- Store the text file in the langfiles directory of your IDOL Content component installation.
- Open the IDOL Content component configuration file in a text editor.
- For each language that the decomposition file applies to, specify the file name in the DecompositionFile configuration parameter. For example:
    [German]
    DecompositionFile=german_decomp.txt
NOTE: Each of the terms in the output from the decomposition is also stemmed.
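
The file format described in the procedure can be illustrated with a short Python sketch. It is not part of IDOL and does not reflect how the Content component implements decomposition; the function names parse_decomposition and decompose are hypothetical. The sketch assumes that a bracketed first line such as [UTF8] is an encoding header rather than an entry, and that lookups ignore case, which the product may or may not do. It does not perform the stemming mentioned in the note.

```python
def parse_decomposition(text):
    """Parse decomposition-file text into a {compound: [parts]} mapping."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: a bracketed line such as [UTF8] is a header, not an entry.
        if not line or (line.startswith("[") and line.endswith("]")):
            continue
        # The first word on a line is broken into the remaining words.
        compound, *parts = line.split()
        if parts:
            # Assumption: keys are lowercased so that lookups ignore case.
            table[compound.lower()] = parts
    return table


def decompose(tokens, table):
    """Replace each token that has an entry with its parts; keep the rest."""
    result = []
    for token in tokens:
        result.extend(table.get(token.lower(), [token]))
    return result


SAMPLE = """[UTF8]
rollercoaster roller coaster
hemidemisemiquaver hemi demi semi quaver
"""

table = parse_decomposition(SAMPLE)
print(decompose(["a", "rollercoaster", "ride"], table))
# prints ['a', 'roller', 'coaster', 'ride']
```

A script along these lines can also be used to check a decomposition file for lines that contain only one word before you copy the file to the langfiles directory, because such lines define no decomposition.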