Entity Extraction (Eduction)

IDOL Eduction allows you to extract entities from text. You can use Eduction as part of your ingest process to enrich the data before you index it into IDOL, or you can use it separately.

Eduction is available in several different formats and packages, depending on how you want to use it in your IDOL architecture:

  • Eduction in ingest (CFS task or NiFi processor). This component is part of the IDOL Ingest process. It extracts entities from documents and adds the results to document fields before you index, which can make it easier to search for particular information afterward.

  • Eduction SDK. The SDK is available in C, Java, and .NET implementations, which allow you to program your own applications that use Eduction for entity extraction.

  • Eduction Server. An ACI server, which you can use to perform Eduction with standard ACI requests.
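As a rough illustration of the Eduction Server option, an ACI request is an HTTP request whose query string carries an action parameter plus any action parameters. The following minimal sketch only builds such a URL. The host, port 13000, the action name EduceFromText, and the Text parameter are assumptions made for illustration, and the helper build_aci_request is hypothetical; check the Eduction Server reference for the actions and parameters that your version supports.

```python
from urllib.parse import urlencode


def build_aci_request(host, port, action, **params):
    """Build an ACI-style request URL (hypothetical helper, not part of any IDOL SDK)."""
    # ACI requests pass the action name and its parameters in the query string.
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/?{query}"


# Assumed values: adjust the host, port, action, and parameters to match your server.
url = build_aci_request("localhost", 13000, "EduceFromText", Text="Call 555-0100")
print(url)
```

You can send the resulting URL with any HTTP client. Building the URL separately from sending it keeps the sketch free of network dependencies and makes the request easy to inspect.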

The particular setup you use depends on your usage and your wider IDOL architecture. For more information about which package to use, refer to the Eduction User and Programming Guide.

Eduction Server and the Eduction SDK are each available as a ZIP package, which you can download and install. See IDOL Installation and Setup. The Eduction NiFi processor is a standard part of the NiFi Ingest package, and Eduction in CFS is available as part of a standard CFS installation. For more information about installing these packages, see IDOL Ingest.

In addition to the Eduction package, you must install the Eduction grammars that you want to use. The grammars define the entities that you want to find. These are all available as a single Eduction grammars ZIP package. Your license determines which of the grammars you can use.

For more information about Eduction, refer to the Eduction User and Programming Guide. This guide includes information about the standard grammars. There are also separate guides for the premium grammars: the IDOL PII Package Technical Note, the IDOL PCI Package Technical Note, the IDOL PHI Package Technical Note, and the IDOL Government Eduction Package Technical Note.