NGramMultiByteOnly
Whether to tokenize all strings into N-grams, or only multi-byte strings. Use this parameter in combination with the NGram parameter, which determines the size of character N-grams.
For example, if you set NGramMultiByteOnly to True and a document contains both English and Asian text, IDOL Content Component tokenizes the Asian text into N-grams according to the NGram setting. It does not tokenize the English text into N-grams.
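The behavior described above can be illustrated with a small Python sketch. This is a conceptual model only, not the IDOL Content Component implementation: the function names are hypothetical, "multi-byte" is assumed to mean a character whose UTF-8 encoding is longer than one byte, and whitespace is assumed to be the only word separator.

```python
from itertools import groupby


def is_multibyte(ch: str) -> bool:
    # Assumption: a character is "multi-byte" if its UTF-8 form needs more than one byte.
    return len(ch.encode("utf-8")) > 1


def char_ngrams(run: str, n: int) -> list[str]:
    # Overlapping character N-grams; a run no longer than n is kept whole.
    if len(run) <= n:
        return [run]
    return [run[i:i + n] for i in range(len(run) - n + 1)]


def tokenize(text: str, ngram: int = 2, multibyte_only: bool = True) -> list[str]:
    # Hypothetical model of the NGram / NGramMultiByteOnly interaction.
    terms = []
    for word in text.split():
        # Split each word into runs of multi-byte and single-byte characters.
        for multibyte, group in groupby(word, key=is_multibyte):
            run = "".join(group)
            if multibyte or not multibyte_only:
                terms.extend(char_ngrams(run, ngram))
            else:
                # Single-byte text is left as a whole term.
                terms.append(run)
    return terms


print(tokenize("search 東京都庁", ngram=2, multibyte_only=True))
# ['search', '東京', '京都', '都庁']
print(tokenize("search 東京都庁", ngram=2, multibyte_only=False))
# ['se', 'ea', 'ar', 'rc', 'ch', '東京', '京都', '都庁']
```

With `multibyte_only=True`, the English word stays intact and only the Japanese text is broken into bigrams. With `multibyte_only=False`, every string is broken into bigrams.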
Type: | Boolean |
Default: | False |
Required: | No |
Configuration Section: | LanguageTypes or MyLanguage |
Example: | NGram=2 |
See Also: | NGram |
NOTE: If you change this setting after you have indexed content into IDOL Server, the new setting applies only to new content, and the server logs a warning. To clear the warning and ensure that your change applies to all your content, you must initialize your index and reindex the content.