Source Code Identification

When KeyView auto-detects a file that contains source code, it can attempt to identify the programming language in which the code is written.

When you do not enable source code identification, files that contain source code might be identified as ASCII text files, and the application then treats them in the same way as ordinary text. However, in many instances it is useful to route these files elsewhere or to filter them out. For example, indexing source code into an IDOL index has minimal value and could bloat the engine with terms that are of no use in retrieval. With source code identification enabled, a file written in a particular programming language is identified as a more specific format instead of as generic text.
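As a minimal sketch of the routing idea described above, the following Java example sends a file to a different destination depending on its detected format label. The class name, the Destination values, and the format labels are hypothetical and are not part of the KeyView API; in a real application the label would come from the KeyView auto-detection result.

```java
import java.util.Set;

public class SourceCodeRouter {

    // Hypothetical destinations for a processed file.
    enum Destination { INDEX, SOURCE_ARCHIVE }

    // Hypothetical format labels, used only for illustration.
    // The real format identifiers are listed in Supported Formats.
    static final Set<String> SOURCE_CODE_FORMATS =
            Set.of("CPP_SOURCE", "JAVA_SOURCE", "PYTHON_SOURCE");

    // Keep source code out of the index; everything else is indexed.
    static Destination route(String detectedFormat) {
        return SOURCE_CODE_FORMATS.contains(detectedFormat)
                ? Destination.SOURCE_ARCHIVE
                : Destination.INDEX;
    }

    public static void main(String[] args) {
        System.out.println(route("ASCII_TEXT"));
        System.out.println(route("CPP_SOURCE"));
    }
}
```

Without source code identification, a C++ file would carry the generic text label and this kind of routing rule could not tell it apart from ordinary text.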

You can set source code identification to one of the following levels.

Option                   Description
KVSOURCECODE_OFF         Do not enable source code identification.
KVSOURCECODE_ENABLED     Enable source code identification for the most common source code formats.
KVSOURCECODE_EXTENDED    Enable source code identification for all supported source code formats. This option might lead to false positives in some cases (for example, a C++ file might be identified as a rarer format).

For the complete list of source code formats supported at the KVSOURCECODE_ENABLED and KVSOURCECODE_EXTENDED levels, see Supported Formats.

To configure source code identification

  • In the Java API, call the setSourceCodeDetection method on the filter object, for example:

    filter.setSourceCodeDetection(Filter.SourceCodeDetection.ENABLED);
  • In formats.ini, set the following parameter to the appropriate level. This is an alternative approach; you do not need to set it if you have already configured this feature through the API.

    [Options]
    SourceCodeDetection=KVSOURCECODE_ENABLED
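
To check which level a formats.ini file requests, a small helper along the following lines can read the SourceCodeDetection parameter from the [Options] section. This is a sketch that uses only the Java standard library, not a KeyView API; the class and method names are hypothetical, and the case-insensitive matching of section and parameter names is an assumption of the sketch.

```java
import java.util.List;
import java.util.Optional;

public class SourceCodeDetectionSetting {

    // Returns the value of SourceCodeDetection in the [Options] section, if present.
    // Assumption: section and parameter names are matched case-insensitively.
    static Optional<String> read(List<String> iniLines) {
        boolean inOptions = false;
        for (String raw : iniLines) {
            String line = raw.trim();
            if (line.startsWith("[") && line.endsWith("]")) {
                // A new section header starts; track whether it is [Options].
                inOptions = line.equalsIgnoreCase("[Options]");
            } else if (inOptions) {
                int eq = line.indexOf('=');
                if (eq > 0 && line.substring(0, eq).trim()
                        .equalsIgnoreCase("SourceCodeDetection")) {
                    return Optional.of(line.substring(eq + 1).trim());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "[Options]",
                "SourceCodeDetection=KVSOURCECODE_ENABLED");
        System.out.println(read(lines).orElse("not set"));
    }
}
```

In practice the lines could be loaded with java.nio.file.Files.readAllLines before being passed to the helper.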