Input/Output Operations

Methods in the Filter Java API have signatures that support several kinds of input source and output destination. The input source can usually be a physical file accessed through a file path, a com.verity.api.SeekableInputStream, or a standard java.io.InputStream. You can send the output to a file or a java.io.OutputStream, or return it one chunk at a time in a byte array.

You can set the input source by calling the setInputSource method. Alternatively, you can supply it as a parameter when you use the doFilter, canFilter, canFilterEx, getDocFormatInfo, or getSummaryInfo methods.

File Content Extraction needs to access different parts of a file while it is filtering. When the input source is a stream, OpenText recommends passing a SeekableInputStream into File Content Extraction, because this allows File Content Extraction to read only the parts of the stream that it needs. If you use a Java InputStream, File Content Extraction must store the stream as it is received, writing to a temporary file if the stream is large.
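The difference between the two stream types can be illustrated with standard Java classes only. The following is a minimal sketch, not the File Content Extraction implementation: the class SpoolSketch and its methods spoolToTempFile and readAt are hypothetical names introduced for this example. It shows why a forward-only InputStream has to be stored in full before a reader can jump to an arbitrary offset, whereas a source that supports seeking can be read at that offset directly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; none of these names belong to the Filter Java API.
public class SpoolSketch {

    /** Copies the whole forward-only stream to a temporary file so any offset can be read later. */
    static Path spoolToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("spool-", ".bin");
        // createTempFile already created the file, so it must be replaced.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    /** Reads 'length' bytes starting at 'offset' without reading the rest of the file. */
    static byte[] readAt(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[length];
            raf.seek(offset);
            raf.readFully(buf);
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "HEADER....BODY....TRAILER".getBytes(StandardCharsets.US_ASCII);

        // A plain InputStream cannot seek, so it is stored in full first.
        Path tmp = spoolToTempFile(new ByteArrayInputStream(data));
        try {
            // Once the data is seekable, only the last 7 bytes are read.
            byte[] tail = readAt(tmp, data.length - 7, 7);
            System.out.println(new String(tail, StandardCharsets.US_ASCII));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In this sketch the copy to a temporary file is the extra cost that a seekable source avoids, because a seekable source can go straight to the equivalent of readAt.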

If you use a Java InputStream as the source, two method signatures are available, and one of them allows you to pass in the stream size. If you do not supply the stream size, File Content Extraction reads the entire stream before processing starts. If you can provide the stream size, File Content Extraction might not need to read the whole stream.
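The effect of supplying the stream size can be sketched with standard Java classes. This example is an assumption-laden illustration rather than a description of how File Content Extraction works internally: StreamSizeSketch, CountingInputStream, prefixUnknownSize, and prefixKnownSize are hypothetical names, and the scenario (a consumer that needs only the first few bytes) is chosen for simplicity. It requires Java 11 or later for InputStream.readNBytes(int).

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical illustration; none of these names belong to the Filter Java API.
public class StreamSizeSketch {

    /** Wraps a stream and counts how many bytes have been pulled from it. */
    static final class CountingInputStream extends FilterInputStream {
        long consumed = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) consumed++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) consumed += n;
            return n;
        }
    }

    /** Size unknown: drain the stream to learn its length, then return the prefix. */
    static byte[] prefixUnknownSize(InputStream in, int prefixLen) throws IOException {
        byte[] all = in.readAllBytes();
        return Arrays.copyOf(all, Math.min(prefixLen, all.length));
    }

    /** Size known: read only as many bytes as are needed. */
    static byte[] prefixKnownSize(InputStream in, long size, int prefixLen) throws IOException {
        int want = (int) Math.min(prefixLen, size);
        return in.readNBytes(want);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1000];

        CountingInputStream unknown = new CountingInputStream(new ByteArrayInputStream(data));
        prefixUnknownSize(unknown, 8);
        System.out.println("size unknown, bytes consumed: " + unknown.consumed);

        CountingInputStream known = new CountingInputStream(new ByteArrayInputStream(data));
        prefixKnownSize(known, data.length, 8);
        System.out.println("size known, bytes consumed: " + known.consumed);
    }
}
```

When the caller states the size up front, the consumer in this sketch can stop after the bytes it needs instead of draining the stream to discover where it ends.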