Specifies how to normalize Chinese, Japanese, and Korean (CJK) data before extraction, in all Eduction Server components.
You can specify the value of CJKNormalization as follows:

- Kana. Half width kana to full width kana.
- OldNew. Old kanji to new kanji.
- Number. Chinese or kanji number characters to ASCII number characters.
- HWNum. Full width number characters to ASCII number characters.
- HWAlpha. Full width alphabet characters to ASCII alphabet characters.
- SimpChi. Traditional Chinese to simplified Chinese.
- FWJamo. Half width jamo to full width jamo.

Separate multiple options with a comma.
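
As an illustration of combining several options, the following is a minimal Python sketch that checks option names against the list above and joins them with commas. The helper names `cjk_normalization_value` and `build_action_query` are hypothetical and not part of the product. Passing the input as a `Text` parameter is also an assumption, so check the action reference before relying on it.

```python
from urllib.parse import urlencode

# Option names exactly as listed in this section.
CJK_OPTIONS = ("Kana", "OldNew", "Number", "HWNum", "HWAlpha", "SimpChi", "FWJamo")


def cjk_normalization_value(options):
    """Join documented option names into a comma-separated parameter value."""
    unknown = [opt for opt in options if opt not in CJK_OPTIONS]
    if unknown:
        raise ValueError(f"Unknown CJKNormalization option(s): {unknown}")
    return ",".join(options)


def build_action_query(action, text, options):
    """Build a query string for an action request (hypothetical helper)."""
    params = {
        "action": action,
        "Text": text,  # assumption: the action accepts its input as Text
        "CJKNormalization": cjk_normalization_value(options),
    }
    # safe="," keeps the comma separator literal instead of percent-encoding it.
    return urlencode(params, safe=",")


query = build_action_query("EduceFromText", "sample input", ["Kana", "HWNum"])
print(query)
```

The sketch rejects unknown option names on the client side so that a typing mistake surfaces before the request is sent. Append the resulting query string to the address of your own server.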
Note: If you specify CJKNormalization as an action parameter, it overrides any value that you set with the CJKNormalization configuration parameter.
Action: | EduceFromFile, EduceFromText, RedactFromFile, RedactFromText |
---|---|
Type: | String |
Default: | None |
Example: | |
See Also: | |