The size of the character N-grams to use to tokenize Asian text.
You must not use NGram with the SentenceBreaking configuration parameter. If you set NGram for Japanese, you can use SentenceBreakingOptions for normalization.
Type: Long
Default: 0 (off)
Required: No
Configuration Section: LanguageTypes or MyLanguage
Example: Encodings=UTF8:JapaneseUTF8
NGram=2
In this example, all text is indexed as N-grams of two characters.
See Also: NGramMultiByteOnly, NGramOrientalOnly, SentenceBreaking, SentenceBreakingOptions
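For context, here is a rough sketch of where NGram might sit in a configuration file, assuming a Japanese language type defined in a section named [japanese] and referenced from the [LanguageTypes] section (the section layout and the DefaultLanguageType setting are illustrative assumptions, not part of this entry):

[LanguageTypes]
DefaultLanguageType=japanese

[japanese]
Encodings=UTF8:JapaneseUTF8
NGram=2

With these settings, Japanese text is indexed as N-grams of two characters. Because NGram is set, SentenceBreaking must not be used for this language type, although SentenceBreakingOptions can still be set for normalization.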