General Language
You can specify a General language, which the Content component uses for documents whose language is not configured but whose encoding is identified. Content classifies a document as the General language when:
- Content cannot read the language type or the encoding of the document from a specified field.
- Automatic Language Detection is enabled.
If Content detects an unconfigured language type, it indexes the document to the equivalent General language type for that encoding, if one exists. It also logs a warning message in the index log so that you can add an appropriate language type to the configuration file. Content also indexes documents in unknown languages to the General language type for the encoding, if one exists. If the encoding is unknown, Content indexes the document to the default language.
You can configure Content to discard documents that have unconfigured languages, even if a General language exists for that encoding. See DiscardUnconfiguredLanguagesAtIndex.
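The fallback rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not how Content is implemented: the names `LanguageSettings` and `choose_language_type` are hypothetical, and so are the language type names in the usage note. It also rests on two assumptions that this section does not state: that a document with an unconfigured language falls back to the default language when no General language type exists for its encoding, and that the discard option applies only to detected but unconfigured languages, not to unknown ones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LanguageSettings:
    """Hypothetical model of the language settings described in this section."""
    # (language, encoding) -> name of the configured language type.
    configured: Dict[Tuple[str, str], str]
    # encoding -> name of the General language type for that encoding.
    general_by_encoding: Dict[str, str]
    default_language_type: str
    # Models the effect of DiscardUnconfiguredLanguagesAtIndex.
    discard_unconfigured: bool = False
    # Stand-in for warning messages written to the index log.
    warnings: List[str] = field(default_factory=list)


def choose_language_type(
    language: Optional[str],
    encoding: Optional[str],
    settings: LanguageSettings,
) -> Optional[str]:
    """Return the language type to index to, or None if the document is discarded.

    `language` is None when the language is unknown; `encoding` is None
    when the encoding is unknown.
    """
    # Unknown encoding: index to the default language.
    if encoding is None:
        return settings.default_language_type

    # Configured language and encoding: use the configured language type.
    if language is not None and (language, encoding) in settings.configured:
        return settings.configured[(language, encoding)]

    general = settings.general_by_encoding.get(encoding)

    if language is not None:
        # Detected but unconfigured language.
        if settings.discard_unconfigured:
            # Discard even if a General language exists for the encoding.
            return None
        if general is not None:
            settings.warnings.append(
                f"Unconfigured language '{language}' ({encoding}) "
                f"indexed as '{general}'"
            )
            return general
        # Assumption: no General language type for this encoding.
        return settings.default_language_type

    # Unknown language, known encoding: use General if it exists.
    # Assumption: otherwise fall back to the default language.
    return general if general is not None else settings.default_language_type
```

For example, with `configured={("english", "UTF8"): "englishUTF8"}` and `general_by_encoding={"UTF8": "generalUTF8"}`, a document detected as an unconfigured language in UTF-8 maps to `generalUTF8` and records a warning. With `discard_unconfigured=True`, the same call returns `None`.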