Configure a Pre-Filter Task
For each pre-filter task that you want to configure, you set:
- a regular expression that specifies how to find potential matches, or a resource file that provides a dictionary of terms to use for fast matching.
- the amount of text Eduction must use on either side of the potential match to find the more detailed match.
NOTE: Eduction runs all your configured pre-filtering tasks for all input text, so ensure that your pre-filter task applies to all your configured grammars and entities. Use a different configuration for any entities that you do not want to pre-filter.
To configure a pre-filter task
- In the [Eduction] section, add a PreFilterTaskN parameter, where N is a number starting from 0 for the first task. Set this parameter to the name of a configuration section where you define your pre-filter task.
- Create the new configuration section.
- Set one of the following parameters:
  - Regex to a regular expression value that finds potential matches in your text.
  - ResourceFile to the name of a DPF or JSON file that contains the dictionary of terms to use for pre-matching.
- Set WindowCharsBeforeMatch and WindowCharsAfterMatch to the number of characters before and after the potential match segment to use as the match window.
- Optionally set other parameters to exclude non-valid values or end processing early in certain conditions, such as Exclusion, InvalidRegexAfterMatch, InvalidRegexBeforeMatch, and PrefilterMaxReturnedBytes. For more information, see Eduction Parameter Reference.
- Save and close your configuration file.
For example:
[Eduction]
PrefilterTask0=AddressPrefilter

[AddressPrefilter]
Regex=\d{1,7}
WindowCharsBeforeMatch=100
WindowCharsAfterMatch=100
For more details about these parameters, see Eduction Parameter Reference.
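The example above uses Regex to find potential matches. As a rough sketch of the dictionary-based alternative, the following fragment sets ResourceFile instead. The section name StreetTermPrefilter and the file name street_terms.json are hypothetical and chosen only for illustration, and the window sizes are arbitrary. The sketch also assumes that lines starting with // are treated as comments. Check the Eduction Parameter Reference for the expected contents and location of the resource file.

```
[Eduction]
// Hypothetical section name, used only for illustration.
PrefilterTask0=StreetTermPrefilter

[StreetTermPrefilter]
// Hypothetical JSON dictionary of terms; a DPF file could be named here instead.
ResourceFile=street_terms.json
// Arbitrary window sizes, in characters, on either side of each potential match.
WindowCharsBeforeMatch=100
WindowCharsAfterMatch=100
```

Because the procedure says to set one of Regex or ResourceFile, this section sets only ResourceFile.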
TIP: To use pre-filtering tasks through the C and Java Eduction APIs, you must create your Eduction engine from a configuration file. See Standalone API Usage (C) or Standalone API Usage (Java).