Reject Documents by Word Length
The WordLengthFilter task calculates the average length of words in a document. If the average length of words in the document content (DRECONTENT) falls outside the limits specified by the MinimumAverage or MaximumAverage parameters, the document is rejected.
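The check described above can be illustrated with a minimal Python sketch. This is not the Connector Framework Server implementation. The function names are hypothetical, and the sketch assumes that words are separated by whitespace, that a value exactly equal to a limit is accepted, and that a document with no words is not rejected. The actual task may tokenize content and handle these edge cases differently.

```python
from typing import Optional


def average_word_length(content: str) -> Optional[float]:
    """Return the mean word length, or None if the content has no words.

    Assumption: words are whitespace-separated tokens.
    """
    words = content.split()
    if not words:
        return None
    return sum(len(word) for word in words) / len(words)


def is_rejected(content: str,
                minimum_average: float = 3.0,
                maximum_average: float = 10.0) -> bool:
    """Return True if the average word length falls outside the limits.

    Assumptions: the limits are inclusive, and empty content is not rejected.
    """
    average = average_word_length(content)
    if average is None:
        return False
    return average < minimum_average or average > maximum_average
```

With limits of 3.0 and 10.0, ordinary prose usually passes, while content made up of single characters or of very long unbroken strings (for example, encoded binary data) would be rejected.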
The WordLengthFilter task can be configured as a Post task. The parameters that are passed to the task are specified in a named section of the configuration file. For example:
[ImportTasks]
Post0=WordLengthFilter:WordLengthFilterSettings

[WordLengthFilterSettings]
MinimumAverage=3.0
MaximumAverage=10.0
OnErrorIndexerSections=IdolErrorServer
IndexDatabase=IdolErrorReview
For information about the parameters that you can use to configure this task, refer to the Connector Framework Server Reference.