CJKNormalization
Specifies how to normalize Chinese, Japanese, and Korean (CJK) text before extraction.
You can set the following values:
Kana. Normalize half width kana to full width kana.
OldNew. Normalize old kanji to new kanji.
Number. Normalize Chinese or kanji number characters to ASCII number characters.
HWNum. Normalize full width number characters to ASCII number characters.
HWAlpha. Normalize full width alphabet characters to ASCII alphabet characters.
SimpChi. Normalize traditional Chinese to simplified Chinese.
FWJamo. Normalize half width jamo to full width jamo.
Separate multiple options with a comma.
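As a rough illustration only, the following Python sketch shows how a comma-separated value might be split into option names, and what the HWNum and HWAlpha options describe. It is not Eduction's implementation. The helper names (parse_cjk_normalization, normalize) are hypothetical, and the exact-case matching and whitespace trimming are assumptions rather than documented behavior.

```python
# Illustrative sketch only; not how Eduction implements this parameter.

# Option names as listed above. Exact-case matching is an assumption.
VALID_OPTIONS = {"Kana", "OldNew", "Number", "HWNum", "HWAlpha", "SimpChi", "FWJamo"}


def parse_cjk_normalization(value):
    """Split a comma-separated option string and reject unknown names."""
    options = [part.strip() for part in value.split(",") if part.strip()]
    unknown = [opt for opt in options if opt not in VALID_OPTIONS]
    if unknown:
        raise ValueError(f"Unknown CJKNormalization option(s): {unknown}")
    return options


# In Unicode, full width digits and Latin letters sit exactly 0xFEE0 above
# their ASCII counterparts (for example U+FF10 -> U+0030).
_OFFSET = 0xFEE0
_FW_DIGITS = {cp: cp - _OFFSET for cp in range(0xFF10, 0xFF1A)}
_FW_ALPHA = {
    cp: cp - _OFFSET
    for cp in list(range(0xFF21, 0xFF3B)) + list(range(0xFF41, 0xFF5B))
}


def normalize(text, options):
    """Apply only the HWNum and HWAlpha mappings; other options are ignored here."""
    table = {}
    if "HWNum" in options:
        table.update(_FW_DIGITS)
    if "HWAlpha" in options:
        table.update(_FW_ALPHA)
    return text.translate(table)


if __name__ == "__main__":
    print(parse_cjk_normalization("SimpChi,Kana"))  # ['SimpChi', 'Kana']
    # "\uff21\uff22\uff23\uff11\uff12\uff13" is full width "ABC123".
    print(normalize("\uff21\uff22\uff23\uff11\uff12\uff13", ["HWNum", "HWAlpha"]))  # ABC123
```

The remaining options (Kana, OldNew, Number, SimpChi, FWJamo) need character mapping tables and are deliberately left out of this sketch.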
Type: String
Default: None
Required: No
Configuration Section: Eduction
Example: CJKNormalization=SimpChi,Kana