Introduction

Pre-filtering reduces the amount of input text that Eduction processes for a particular set of entities. With pre-filtering, Eduction first runs a quick matching step to find the sections of text that are likely to contain matches, and then runs the full match only on those sections rather than on the whole input.

Pre-filtering can improve performance for entities where a broad, inexpensive check can find potential matches without either selecting too much of the input text or excluding valid matches.

The quick matching step can match text by using either a regular expression (regex) that you configure or a dictionary of terms.
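
The following Python sketch illustrates the general idea of a regex pre-filter followed by a full match. It is not the Eduction API or configuration syntax: the names PREFILTER, FULL_MATCH, WINDOW_BEFORE, WINDOW_AFTER, candidate_windows, and extract are hypothetical, and the email pattern is a simplified stand-in for a real grammar. The sketch assumes that the quick step selects a fixed-size window of characters around each quick match.

```python
import re

# Hypothetical illustration only; none of these names come from Eduction.
# Cheap check: every email address contains "@".
PREFILTER = re.compile(r"@")
# Stand-in for the expensive full match (a simplified email pattern).
FULL_MATCH = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
# Assumed window sizes (in characters) kept around each quick match.
WINDOW_BEFORE = 64
WINDOW_AFTER = 64


def candidate_windows(text, before=WINDOW_BEFORE, after=WINDOW_AFTER):
    """Return merged (start, end) spans of text around each quick match."""
    windows = []
    for m in PREFILTER.finditer(text):
        start = max(0, m.start() - before)
        end = min(len(text), m.end() + after)
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window, so extend it instead.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def extract(text):
    """Run the full match only inside the pre-filtered windows."""
    results = []
    for start, end in candidate_windows(text):
        # pos/endpos restrict the search to the window without copying text.
        for m in FULL_MATCH.finditer(text, start, end):
            results.append((m.start(), m.group()))
    return results


if __name__ == "__main__":
    sample = "x" * 1000 + " contact alice@example.com or bob@example.org. " + "y" * 1000
    print(candidate_windows(sample))
    print([value for _, value in extract(sample)])
```

In this sketch the full pattern runs over one short window instead of the whole input. The trade-off is the one described above: if the window is too small (for example, an address longer than the window), a valid match can be cut off, and if the quick pattern is too broad, the windows cover most of the text and little work is saved.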

For example:

Pre-filtering is less useful for entities that match a long list of possible words, where no simple regular expression or dictionary of terms covers all of the possible matches. For example, the Eduction grammars for English names attempt to match plausible names as well as recognized ones, so there is no way to pre-filter without eliminating potential matches.