ExtractExport

This program demonstrates the File Extraction interface and basic functionality of the Export interface. The HtmlTest sample program demonstrates more advanced functionality of the Export interface. See HtmlTest.

The ExtractExport program demonstrates the following functionality:

  • opens a document

  • extracts subfiles from a document

  • repeats subfile extraction until all subfiles are extracted

  • sets conversion options through a template file

  • converts the subfile (or subfiles) and main file to HTML or XML

  • enables you to specify the command-line options listed in Options for the ExtractExport Sample Program

NOTE: This sample program demonstrates how to export from a java.io.InputStream object. However, OpenText recommends that you implement a com.verity.api.SeekableInputStream and pass it to KeyView instead, because a seekable stream allows KeyView to seek within the file and read only the parts that it needs. For more information, see Input/Output Operations.
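The benefit of a seekable stream can be illustrated with plain java.io classes. The following sketch is not the KeyView API: SeekableSource is a hypothetical stand-in interface whose methods are assumptions, and the real com.verity.api.SeekableInputStream interface might declare different methods. The sketch only shows how a RandomAccessFile allows a reader to jump to an offset and read a few bytes instead of consuming the whole stream.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class SeekableSketch {

    // Hypothetical stand-in for illustration only.
    // This is NOT the real com.verity.api.SeekableInputStream interface.
    public interface SeekableSource extends AutoCloseable {
        void seek(long position) throws IOException;
        int read(byte[] buffer, int offset, int length) throws IOException;
        long length() throws IOException;
        @Override
        void close() throws IOException;
    }

    // File-backed implementation that delegates to RandomAccessFile.
    public static final class FileSeekableSource implements SeekableSource {
        private final RandomAccessFile file;

        public FileSeekableSource(File path) throws IOException {
            this.file = new RandomAccessFile(path, "r");
        }

        @Override
        public void seek(long position) throws IOException {
            file.seek(position);
        }

        @Override
        public int read(byte[] buffer, int offset, int length) throws IOException {
            return file.read(buffer, offset, length);
        }

        @Override
        public long length() throws IOException {
            return file.length();
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }

    // Reads up to `count` bytes starting at `position`, without reading
    // any of the bytes that come before that position.
    public static String readSlice(File path, long position, int count) throws IOException {
        try (SeekableSource source = new FileSeekableSource(path)) {
            source.seek(position);
            byte[] buffer = new byte[count];
            int total = 0;
            while (total < count) {
                int n = source.read(buffer, total, count - total);
                if (n < 0) {
                    break; // end of file reached before `count` bytes
                }
                total += n;
            }
            return new String(buffer, 0, total, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("seekable", ".bin");
        temp.deleteOnExit();
        Files.write(temp.toPath(), "HEADER-payload-TRAILER".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readSlice(temp, 7, 7)); // prints "payload"
    }
}
```

A plain java.io.InputStream would have to read and discard the first seven bytes to reach the same data, which is the cost that the recommendation above avoids.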

To run ExtractExport

  1. Add the location of the javaapi\KeyView.jar file, the javaapi\sample directory, and the Export bin directory to the CLASSPATH environment variable.

  2. Run the program as follows:

    java -Djava.library.path=bin_directory ExtractExport [options] bin_directory inifile input_file output_file

    where:

    • bin_directory is the path to the Export bin directory.
    • options is one or more of the options listed in Options for the ExtractExport Sample Program.
    • inifile is the path and file name of a template file.
    • input_file is the path and file name of the source file.
    • output_file is the path and file name of the output file if the source file is not a container file.
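
As an illustration of the syntax above, the following sketch assembles the argument list in Java and prints the resulting command. All paths and file names are placeholders invented for the example, and the class name ExtractExportCommand is hypothetical. Only the argument order and the -Djava.library.path setting are taken from the syntax. The sketch assumes that the CLASSPATH environment variable is already set as described in step 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractExportCommand {

    // Builds the command in the documented order:
    // java -Djava.library.path=bin_directory ExtractExport [options]
    //      bin_directory inifile input_file output_file
    public static List<String> build(String binDir, List<String> options,
                                     String iniFile, String inputFile, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-Djava.library.path=" + binDir);
        command.add("ExtractExport");
        command.addAll(options);
        command.add(binDir);
        command.add(iniFile);
        command.add(inputFile);
        command.add(outputFile);
        return command;
    }

    public static void main(String[] args) {
        // Placeholder paths; substitute the locations from your own installation.
        List<String> command = build(
                "C:\\KeyView\\bin",
                Arrays.asList("-xml", "-extdir", "C:\\temp\\subfiles"),
                "C:\\KeyView\\templates\\example.ini",
                "C:\\docs\\report.doc",
                "C:\\out\\report.xml");
        System.out.println(String.join(" ", command));
        // To launch the command instead of printing it, the list could be
        // passed to java.lang.ProcessBuilder:
        //   new ProcessBuilder(command).inheritIO().start();
    }
}
```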

Options for the ExtractExport Sample Program

Option Description

-extonly This option extracts the subfiles from a source file, but does not convert the files after extraction.
-extdir directory This option sets the suggested directory to which the subfiles are extracted.
-ext-fbody This option extracts the formatted version of the message body (HTML or RTF) from mail files when possible.
-xml This option converts the files to XML. The default is HTML. To use this option, XML Export must be installed.

-source-cs charset This option sets the character set of the source file. charset is a character set defined as a constant in the Export class. See Coded Character Sets.

-target-cs charset This option sets the character set of the output file. charset is a character set defined as a constant in the Export class. See Coded Character Sets.

-little-end This option sets the byte order for Unicode text to little endian.
-is This option sets the input as a stream. The default is file.
-os This option sets the output as a stream. The default is file.
-open-user username This option specifies the user name used to open a protected PST or NSF file.
-open-pass password This option specifies the password used to open a protected PST or NSF file.
-open-idfile idfile This option specifies the user ID file used to open a protected PST or NSF file.
-open-createroot This option creates a root directory on which a hierarchy can be based. See Create a Root Node.
-ext-nodir This option specifies that the subfile directory structure is not created.
-ext-noheader This option excludes mail header information from the extracted message body text file. See Mail Metadata.
-meta outfile This option extracts default mail metadata and writes it to a file.
-oop This option converts the files in a separate process. See Convert Files Out-of-Process.
-ip This option runs file extraction in the same process as the calling application (in process). See Convert Files Out-of-Process.
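
The table distinguishes options that stand alone from options that are followed by a value (directory, charset, username, password, idfile, or outfile). The following sketch shows one way that a command line in this form could be split into options and the four positional arguments. It is not the source code of the ExtractExport sample: the class and method names are hypothetical, and the two option sets are read directly from the table. It also assumes that bin_directory does not begin with a hyphen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OptionSplitSketch {

    // Options that the table lists with a value after the option name.
    private static final Set<String> TAKES_VALUE = new HashSet<>(Arrays.asList(
            "-extdir", "-source-cs", "-target-cs",
            "-open-user", "-open-pass", "-open-idfile", "-meta"));

    // Options that the table lists without a value.
    private static final Set<String> FLAGS = new HashSet<>(Arrays.asList(
            "-extonly", "-ext-fbody", "-xml", "-little-end", "-is", "-os",
            "-open-createroot", "-ext-nodir", "-ext-noheader", "-oop", "-ip"));

    public static final class Parsed {
        // Option name mapped to its value; flags map to an empty string.
        public final Map<String, String> options = new LinkedHashMap<>();
        // bin_directory, inifile, input_file, output_file, in that order.
        public final List<String> positional = new ArrayList<>();
    }

    public static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        int i = 0;
        // Options come first, before the positional arguments.
        while (i < args.length && args[i].startsWith("-")) {
            String name = args[i++];
            if (TAKES_VALUE.contains(name)) {
                if (i >= args.length) {
                    throw new IllegalArgumentException("Missing value for " + name);
                }
                parsed.options.put(name, args[i++]);
            } else if (FLAGS.contains(name)) {
                parsed.options.put(name, "");
            } else {
                throw new IllegalArgumentException("Unknown option " + name);
            }
        }
        // Everything that remains is positional.
        while (i < args.length) {
            parsed.positional.add(args[i++]);
        }
        if (parsed.positional.size() != 4) {
            throw new IllegalArgumentException(
                    "Expected: bin_directory inifile input_file output_file");
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Placeholder file names, for illustration only.
        Parsed parsed = parse(new String[] {
                "-xml", "-extdir", "subfiles", "bin", "example.ini", "mail.msg", "out.xml"});
        System.out.println(parsed.options);
        System.out.println(parsed.positional);
    }
}
```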