SplitDocument

Splits a document into its constituent parts (text content, XML metadata, and associated binary files) and outputs the parts to separate relationships. You can then process the parts separately as required. If you want to recombine the parts into a Knowledge Discovery document, you can route them to a MergeDocument processor.

If you split multiple documents and later route the parts to a MergeDocument processor, the correct parts are automatically merged to reproduce the original document FlowFiles. (There is no need to keep track of which parts originated from each document.)
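The split-and-merge behavior can be pictured with a small conceptual model. The following Python sketch is not the processor's implementation and does not use any NiFi API. It assumes that each part carries a shared correlation key (the hypothetical `doc_id` field); how the real processors associate parts is not described here. The names `Part`, `split_document`, and `merge_parts` are invented for illustration.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Part:
    relationship: str  # "content", "metadata", or "file"
    doc_id: str        # hypothetical correlation key (an assumption, not a documented attribute)
    payload: bytes


def split_document(text, metadata_xml, files):
    """Split one document into parts that all share a generated correlation key."""
    doc_id = str(uuid.uuid4())
    parts = [
        Part("content", doc_id, text.encode("utf-8")),
        Part("metadata", doc_id, metadata_xml.encode("utf-8")),
    ]
    parts += [Part("file", doc_id, blob) for blob in files]
    return parts


def merge_parts(parts):
    """Group parts by correlation key, regardless of the order in which they arrive."""
    grouped = defaultdict(lambda: {"content": None, "metadata": None, "files": []})
    for part in parts:
        doc = grouped[part.doc_id]
        if part.relationship == "file":
            doc["files"].append(part.payload)
        else:
            doc[part.relationship] = part.payload
    return dict(grouped)
```

In this model, parts from several documents can be interleaved and processed independently. Because each part carries its own key, the merge step can regroup them without any external bookkeeping.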

NOTE: The FlowFiles output by this processor are not in Knowledge Discovery document format.

Properties

None

Relationships

Name          Description
content       FlowFiles that contain text from a document.
externalfile  Reserved for custom external file implementations.
failure       FlowFiles that were not successfully processed.
file          FlowFiles that contain binary file content from a document.
filename      FlowFiles that contain a file path from a document.
metadata      FlowFiles that contain XML metadata from a document.