Reject Documents with Binary Content

The BinaryFileFilter task rejects any documents that have been filtered as binary. A document can be filtered as binary when KeyView filtering fails, for example because the file is corrupt.

When CFS detects a non-UTF-8 character, it replaces the character with a hexadecimal character code. The BinaryFileFilter task detects these character codes and rejects any document in which the proportion of character codes exceeds the limit set by the ThresholdPercent parameter.
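
The threshold check can be illustrated with a minimal Python sketch. This is not the CFS implementation. It assumes that a hexadecimal character code appears in the text as "\xNN", and that the proportion is the number of character codes divided by the number of characters in the document, with each code counted as one character. The function names hex_code_percent and should_reject are hypothetical.

```python
import re

# ASSUMPTION: a replaced character appears as "\xNN" (backslash, x, two hex digits).
# The format that CFS actually writes may differ.
HEX_CODE = re.compile(r"\\x[0-9A-Fa-f]{2}")


def hex_code_percent(text: str) -> float:
    """Return the percentage of characters that are hexadecimal character codes.

    Each "\\xNN" code is counted as one character, because it stands in
    for a single character that could not be read as UTF-8.
    """
    if not text:
        return 0.0
    codes = len(HEX_CODE.findall(text))
    # Each code occupies four characters in the text; count it once.
    total = len(text) - 3 * codes
    return 100.0 * codes / total


def should_reject(text: str, threshold_percent: float = 10.0) -> bool:
    """Reject when the proportion of character codes exceeds the threshold."""
    return hex_code_percent(text) > threshold_percent
```

Under these assumptions, a document that consists of four readable characters and one character code has a proportion of 20 percent, so it is rejected when the threshold is 10.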

The BinaryFileFilter task can be configured as a Post task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:

[ImportTasks]
Post0=BinaryFileFilter:BinaryFileFilterSettings

[BinaryFileFilterSettings]
ThresholdPercent=10
OnErrorIndexerSections=IdolErrorServer
IndexDatabase=IdolErrorReview

For information about the parameters that you can use to configure this task, refer to the Connector Framework Server Reference.