Package Contents

To get started with the Filter SDK, unzip the package to a directory on your machine.

TIP: The Filter SDK packages for non-Windows platforms store the correct file permissions for the files in the SDK (for example, some files must be executable). OpenText recommends that you unzip the SDK on the platform where you intend to use it. If you unzip the SDK on Windows and then copy the files to a non-Windows machine, the file permissions might be lost.
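If you script the extraction rather than using a command-line unzip tool, be aware that some libraries do not reapply stored permissions. Python's standard zipfile module, for example, extracts files without restoring their Unix mode bits. The following is a minimal sketch, not an OpenText tool, that reapplies them after extraction. It assumes that the archive stores Unix permissions in each entry's external attributes, and the function name is hypothetical.

```python
import os
import stat
import zipfile


def extract_with_permissions(archive_path, dest_dir):
    """Extract a zip archive and reapply any Unix mode bits it stores.

    Returns the list of extracted files whose permissions were reapplied.
    """
    restored = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            target = archive.extract(info, dest_dir)
            # Assumption: the upper 16 bits of external_attr hold the Unix
            # mode. A value of 0 means that no permissions were stored.
            mode = stat.S_IMODE(info.external_attr >> 16)
            if mode and not info.is_dir():
                os.chmod(target, mode)
                restored.append(target)
    return restored
```

Run the sketch on the non-Windows platform where you intend to use the SDK. On Windows, os.chmod can set only the read-only flag, so the executable bits cannot be reapplied there.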

The Filter SDK contains:

  • All the libraries and executables necessary for extracting text from a wide variety of formats.
  • The include files that define the C API. These files can be found in the include directory.
  • The Java API, implemented in the package com.verity.api.filter, which is contained in the file KeyView.jar.
  • The .NET API, implemented in the namespace KeyView.Filter in the library filter_dotnet.dll.
  • The C++ API, which can be found in the cppapi folder.
  • The Python API, which can be found in the pythonapi folder.
  • Sample programs that demonstrate File Extraction and Filter functionality using the APIs.
  • The files necessary to create a custom document reader, and the source for a sample document reader for UTF-8. For more information, refer to the section "Develop a Custom Reader" in the Filter SDK C Programming Guide.

    DEPRECATED: The ability to use custom readers is deprecated in File Content Extraction 25.1 and later. This feature is still available for existing implementations, but it might be incompatible with new functionality and might be removed in a future release. This means that in a future version, File Content Extraction will no longer be able to call custom readers, whether they are new or have already been developed.

    If you need to process file formats that are not supported out-of-the-box, OpenText recommends that you contact OpenText support.

Directory Structure

The following table describes the contents of the Filter SDK.

Directory                Description
PLATFORM\bin             Contains File Content Extraction libraries, supporting files, readers, and the formats.ini configuration file.
dotnetapi                Contains the source files for the .NET API.
dotnetapi\API_reference  Contains the API reference for the .NET API.
dotnetapi\sample         Contains the sample programs for the .NET API.
cppapi                   Contains the source files for the C++ API.
cppapi\sample            Contains the sample programs for the C++ API.
Guides                   Contains the Filter SDK programming guides.
include                  Contains the header files required for Filter.
javaapi\javadoc          Contains the Javadoc for the Java API.
javaapi\sample           Contains the source files and sample programs for the Java API.
pythonapi                Contains the Python API.
Release Notes            Contains the Release Notes.
samples\filter           Contains the source code for the filter sample program demonstrating the Filter interface for the C API.
samples\pdfini           Contains the initialization file used to extract custom metadata from PDF documents.
samples\tstxtract        Contains a C sample program demonstrating the File Extraction interface.
samples\utf8sr           Contains the source for the sample document reader for UTF-8 files. You can use this to create your own custom document readers.
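
After you unzip the package, you might want to confirm that the directories listed in the table are present. The following is a minimal sketch that uses only the Python standard library. The function name and the list of paths are illustrative assumptions based on the table. PLATFORM\bin is omitted because the platform directory name depends on the package, Release Notes is omitted for simplicity, and a given platform package might not include every directory listed.

```python
from pathlib import Path

# Relative directories taken from the table above (an assumption: your
# package might not ship all of them). PLATFORM\bin is left out because
# the platform directory name varies by package.
EXPECTED_DIRS = [
    "dotnetapi",
    "dotnetapi/API_reference",
    "dotnetapi/sample",
    "cppapi",
    "cppapi/sample",
    "Guides",
    "include",
    "javaapi/javadoc",
    "javaapi/sample",
    "pythonapi",
    "samples/filter",
    "samples/pdfini",
    "samples/tstxtract",
    "samples/utf8sr",
]


def missing_sdk_dirs(sdk_root):
    """Return the expected directories that do not exist under sdk_root."""
    root = Path(sdk_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

For example, calling missing_sdk_dirs with the directory where you unzipped the SDK returns an empty list when every listed directory exists. Any names returned indicate directories to check against your package.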