The TextToDocs task splits a file into multiple documents (a main document, and one or more child documents). This task generates a number of metadata and DRECONTENT documents. The original document is discarded and is not filtered using KeyView. You can specify how a file is divided using regular expressions.
The TextToDocs task is always configured as a Pre task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:
[ImportTasks]
Pre0=TextToDocs:TextToDocsSettings

[TextToDocsSettings]
//Parameters to configure how to process the documents.
The TextToDocs task expects documents to use UTF-8 character encoding. If your documents are not encoded in UTF-8, you can use the configuration parameter SourceEncoding to specify the character set encoding of the source documents, so that they can be converted to UTF-8. If conversion fails, the original encoding is used and CFS adds an error message to the ImportErrorCode and ImportErrorDescription document fields.
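As a minimal sketch of the encoding setting described above, the following fragment adds SourceEncoding to the same named section used in the earlier example. The value ISO-8859-1 is an illustrative assumption only; check the Connector Framework Server Reference for the encoding names that the parameter actually accepts.

```
[ImportTasks]
Pre0=TextToDocs:TextToDocsSettings

[TextToDocsSettings]
//Assumed example value: source files are not UTF-8, so name their encoding here.
SourceEncoding=ISO-8859-1
```

With a setting like this, the task would attempt to convert each source document to UTF-8 before splitting it. If the conversion fails, the ImportErrorCode and ImportErrorDescription fields on the resulting documents are the place to look.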
For information about the parameters that you can use to configure this task, refer to the Connector Framework Server Reference.