Post-processing runs additional steps on the matches that Eduction finds.
A common reason for post-processing is to validate matches. Some entities, such as credit card numbers, can be validated by calculating a checksum. A match with an invalid checksum can be discarded because, even though it has the correct format, it cannot be genuine. If a match has a valid checksum, you might increase its score, because it is more likely to be genuine.
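The validation logic described above can be sketched as follows. This is a minimal illustration in Python, not the Eduction Lua API: it assumes the checksum is the Luhn algorithm, which is the scheme commonly used for payment card numbers, and the names `luhn_valid`, `adjust_score`, and the `boost` parameter are hypothetical. In a real post-processing task the same logic would live in the Lua function that receives the match.

```python
from typing import Optional


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore spaces, hyphens, and any other non-digit separators.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


def adjust_score(text: str, score: float, boost: float = 1.5) -> Optional[float]:
    """Hypothetical helper: return None to discard the match, else a raised score."""
    if not luhn_valid(text):
        return None  # right format, wrong checksum: cannot be genuine
    return score * boost  # valid checksum: more likely to be genuine
```

Returning `None` here stands in for discarding the match. The size of the boost is an arbitrary choice for illustration.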
Another reason for post-processing is to normalize the output from Eduction. For example, if you are extracting monetary values, Eduction might find matches such as "£5.3 million" or "£25". You can use post-processing to normalize these values to a single format, "£5,300,000" and "£25", so that IDOL Content or another application can compare and sort the values correctly.
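As a rough sketch of this kind of normalization, the following Python function expands a magnitude word into a plain figure with thousands separators. It is an illustration of the logic only, not the Eduction Lua API: the function name `normalize_money`, the supported magnitude words, and the choice to leave unrecognized text unchanged are all assumptions.

```python
import re
from decimal import Decimal

# Assumed set of magnitude words; extend as needed.
MULTIPLIERS = {"thousand": 10**3, "million": 10**6, "billion": 10**9}

# Currency symbol, a number with optional commas and decimals, optional magnitude word.
PATTERN = re.compile(
    r"^\s*(£)\s*([\d,]+(?:\.\d+)?)\s*(thousand|million|billion)?\s*$",
    re.IGNORECASE,
)


def normalize_money(text: str) -> str:
    """Rewrite a value such as '£5.3 million' as '£5,300,000'."""
    m = PATTERN.match(text)
    if m is None:
        return text  # not a recognized form: leave the match unchanged
    symbol, number, word = m.groups()
    # Decimal avoids floating-point rounding on values such as 5.3.
    value = Decimal(number.replace(",", ""))
    if word:
        value *= MULTIPLIERS[word.lower()]
    if value == value.to_integral_value():
        return f"{symbol}{int(value):,}"
    return f"{symbol}{value:,.2f}"
```

With every value in the same format, a downstream application can strip the symbol and separators and compare the amounts numerically.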
A post-processing task passes the matches found by Eduction into a Lua function, either one at a time or en masse (for more information about processing matches en masse, see Write a Lua Script for Post-Processing).