General Language

You can specify a General language, which the Content component uses for documents whose encoding is identified but whose language is not configured.

Content classifies a document as the General language in two cases: when it detects a language type that is not configured, and when it cannot identify the language at all. In both cases, Content indexes the document to the General language type for the document's encoding, if one exists. When the detected language type is unconfigured, Content also logs a warning message in the index log so that you can add an appropriate language type to the configuration file. If the encoding is unknown, Content indexes the document to the default language.

You can configure Content to discard documents that have unconfigured languages, even if a General language exists for that encoding. See DiscardUnconfiguredLanguagesAtIndex.
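
The routing rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not Content's implementation: the function name, its parameters, and the language type names used with it (such as `generalUTF8`) are hypothetical. It also makes two assumptions that the text above does not state: a document falls back to the default language when no General language type exists for its encoding, and the discard option applies only to documents whose language was detected but is not configured.

```python
import logging
from typing import Dict, Optional, Tuple

log = logging.getLogger("index")


def choose_language_type(
    language: Optional[str],
    encoding: Optional[str],
    configured: Dict[Tuple[str, str], str],
    general_types: Dict[str, str],
    default_type: str,
    discard_unconfigured: bool = False,
) -> Optional[str]:
    """Return the language type to index a document to, or None to discard it.

    All names here are hypothetical; this only models the rules in the text.
    """
    # A configured language and encoding pair is indexed to its own type.
    if language is not None and encoding is not None:
        if (language, encoding) in configured:
            return configured[(language, encoding)]

    # The language was detected but is not configured.
    if language is not None:
        log.warning("Unconfigured language type: %s / %s", language, encoding)
        # Assumption: the discard option covers only this case.
        if discard_unconfigured:
            return None

    # Unknown encoding: use the default language.
    if encoding is None:
        return default_type

    # Known encoding: use the General language type for it if one exists.
    # Assumption: otherwise fall back to the default language.
    return general_types.get(encoding, default_type)
```

For example, with one configured English UTF-8 language type and a General language type for UTF-8, a UTF-8 document in an unconfigured language is routed to the General type, and the same document is discarded when the discard option is enabled.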