filter

The filter sample program demonstrates the advanced functionality of the Filter API. It is composed of the following files:

  • filter.c—command line interface
  • filtersupport.c—contains core functionality, such as file filtering, stream filtering, metadata extraction, and format detection.
  • filtersupport.h—structure and variable definitions

To run filter, type the following at the command line:

filter [options] input_file output_file

where:

options is one or more of the options listed in Options for the Filter Sample Program.

input_file is the full path and file name of the source document.

output_file is the full path and file name of the output file.

Options for the Filter Sample Program

Option Description
-i Extract metadata. See Use the Metadata API.
-c Run Filter in the same process as the calling application (in process). See Run Filter In Process.
-e Run Filter in stream mode.
-h Extract headers and footers, as well as the body text.
-d Extract the file format information using the fpGetDocInfoFile() function.
-L Enable error logging. See Enable or Disable Out-of-Process Error Logging. Error logs are not generated when in-process filtering is enabled.
-LN Disable error logging. See Enable or Disable Out-of-Process Error Logging. Error logs are not generated when in-process filtering is enabled.
-AF Include the input file name in an error log. See Report the File Name in Stream Mode.
-r Filter a container file and the subfiles in the container file to a single output file. This option uses the Container API, which is obsolete.
-rm Extract text that was deleted from a document with revision tracking enabled, and include it in the filtered output. See Filter Deleted Text.
-x xmlconfigfile Filter an XML file by using customized extraction settings defined in the kvxconfig.ini file. If you do not enter the full path to the INI file, the program looks for the file in the current working directory. See Filter XML Files for more information.

-z tempdirectory Specify a temporary directory where temporary files generated by the filtering process are stored. The default is the current working directory.
-ps password Specify a password to open a password-protected PST file. This option uses the Container API, which is obsolete.
-pdfauto Specify that PDF files are output in a logical reading order. The PDF filter determines the paragraph direction (left-to-right or right-to-left) for each PDF page, and then sets the direction accordingly. See Filter PDF Files.
-pdfltr Specify that PDF files are output in a logical reading order, and that the paragraph direction is left to right. See Filter PDF Files.
-pdfrtl Specify that PDF files are output in a logical reading order, and that the paragraph direction is right to left. See Filter PDF Files.
-pdfraw Specify that PDF files are output in an unstructured paragraph flow. This is the default option. If logical reading order is enabled and you want to return to an unstructured paragraph flow, set this flag. See Filter PDF Files.
-xmp Parse and return XMP metadata as path and value pairs, and include the original XMP packet. See fpGetXmpInfoFile() and fpGetXmpInfo().
-xmpr Return XMP metadata as a raw XMP packet. See fpGetXmpInfoFile() and fpGetXmpInfo().
-embeddedfont Skip text that uses embedded fonts when filtering PDF documents, so that this text is not included in the output. See fpSetConfig().
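
The following is a minimal sketch of how a command line interface might map a subset of the options above onto a settings structure. It is not the code in filter.c. The FilterArgs structure, the PdfOrder enumeration, and the set_flag() and parse_filter_args() functions are hypothetical names introduced for illustration, and no Filter API functions are called. The sketch assumes that the last two arguments are always input_file and output_file, and that when several PDF ordering options are given, the last one takes effect.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names throughout; not taken from filter.c or filtersupport.h. */
typedef enum { PDF_RAW = 0, PDF_AUTO, PDF_LTR, PDF_RTL } PdfOrder;

typedef struct {
    bool metadata;           /* -i  */
    bool in_process;         /* -c  */
    bool stream_mode;        /* -e  */
    bool headers_footers;    /* -h  */
    bool doc_info;           /* -d  */
    bool deleted_text;       /* -rm */
    PdfOrder pdf_order;      /* -pdfraw (default), -pdfauto, -pdfltr, -pdfrtl */
    const char *xml_config;  /* -x xmlconfigfile */
    const char *temp_dir;    /* -z tempdirectory */
    const char *input_file;
    const char *output_file;
} FilterArgs;

/* Handle options that take no value. Returns true if arg was recognized. */
static bool set_flag(const char *arg, FilterArgs *out)
{
    if      (strcmp(arg, "-i") == 0)       out->metadata = true;
    else if (strcmp(arg, "-c") == 0)       out->in_process = true;
    else if (strcmp(arg, "-e") == 0)       out->stream_mode = true;
    else if (strcmp(arg, "-h") == 0)       out->headers_footers = true;
    else if (strcmp(arg, "-d") == 0)       out->doc_info = true;
    else if (strcmp(arg, "-rm") == 0)      out->deleted_text = true;
    else if (strcmp(arg, "-pdfraw") == 0)  out->pdf_order = PDF_RAW;
    else if (strcmp(arg, "-pdfauto") == 0) out->pdf_order = PDF_AUTO;
    else if (strcmp(arg, "-pdfltr") == 0)  out->pdf_order = PDF_LTR;
    else if (strcmp(arg, "-pdfrtl") == 0)  out->pdf_order = PDF_RTL;
    else return false;
    return true;
}

/* Parse "filter [options] input_file output_file".
 * Returns 0 on success, -1 on a usage error. */
int parse_filter_args(int argc, char **argv, FilterArgs *out)
{
    int last_option = argc - 2;  /* index of input_file */
    int i;

    *out = (FilterArgs){0};      /* all flags off, pdf_order = PDF_RAW */
    if (argc < 3)
        return -1;               /* input_file and output_file are required */

    for (i = 1; i < last_option; i++) {
        const char *arg = argv[i];
        bool is_x = strcmp(arg, "-x") == 0;
        bool is_z = strcmp(arg, "-z") == 0;

        if (is_x || is_z) {
            /* These options consume the next argument as their value. */
            if (i + 1 >= last_option)
                return -1;       /* value is missing */
            if (is_x)
                out->xml_config = argv[++i];
            else
                out->temp_dir = argv[++i];
        } else if (!set_flag(arg, out)) {
            return -1;           /* option not covered by this sketch */
        }
    }

    out->input_file = argv[argc - 2];
    out->output_file = argv[argc - 1];
    return 0;
}
```

A program's main() function could pass its argc and argv directly to parse_filter_args(). For example, for the command line filter -h -pdfauto -z /tmp in.pdf out.txt (the directory and file names are placeholders), the sketch sets headers_footers, selects PDF_AUTO, and records /tmp as the temporary directory. Options that are not listed in set_flag(), such as -L or -xmp, are rejected by this sketch only because it covers a subset of the table.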