Configure Tangible Characters

TangibleCharacters is a configuration parameter that you can set when using the Eduction SDK, the Eduction Server, or the Eduction command-line utility (edktool). It specifies a list of characters to treat as part of a word, rather than as word boundaries.

Some entities in the IDOL Government Eduction Package Eduction Grammars work correctly only when the appropriate tangible characters are set (see the descriptions of the entities in Eduction Grammar Reference).

When you use Eduction to search for matches, the TangibleCharacters setting applies to all of your chosen entities. If you use multiple entities that have different recommended tangible character sets, you might need to process your input separately for each set. For example:

  • If you are using the Eduction SDK, create a separate EDK engine for each distinct set of tangible characters, and configure the tangible characters for the engine using the appropriate API call:

    C:    EdkSetTangibleCharacters
    Java: EDKEngine.setTangibleCharacters

    After you configure an engine with the correct tangible characters, add the relevant entities to it. You must then create a session from each engine to process your input text.

  • If you are using an Eduction Server, send a separate action (EduceFromText or EduceFromFile) for each distinct set of tangible characters. In each action, set the TangibleCharacters and Entities action parameters to specify which set of tangible characters and which entities to use.
  • If you are using the command-line utility (edktool), create a separate configuration file for each distinct set of tangible characters and its associated entities, and process your input text once with each configuration file.
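
All three approaches depend on the same first step: grouping your entities by their recommended tangible character set, so that each distinct set is processed exactly once. The following Python code is a minimal sketch of that grouping, applied to the Eduction Server case. It is an illustration only, not part of the product. The entity names, the tangible characters, the server URL, the Text parameter name, and the comma-separated format of the Entities value are all assumptions; take the real values from the Eduction Grammar Reference and the Eduction User Guide.

```python
from collections import defaultdict
from urllib.parse import urlencode


def group_by_tangible_characters(recommended):
    """Map each distinct tangible character string to the entities that use it.

    `recommended` maps an entity name to its recommended tangible characters.
    Strings are compared exactly, so "-/" and "/-" form two separate groups.
    """
    groups = defaultdict(list)
    for entity, chars in recommended.items():
        groups[chars].append(entity)
    return {chars: sorted(entities) for chars, entities in groups.items()}


def build_actions(base_url, text, recommended):
    """Build one EduceFromText request URL per distinct tangible character set."""
    urls = []
    for chars, entities in sorted(group_by_tangible_characters(recommended).items()):
        params = {
            "action": "EduceFromText",
            "Text": text,  # assumed parameter name for the input text
            "Entities": ",".join(entities),  # assumed comma-separated list
        }
        if chars:
            # Only send the parameter when the group has tangible characters.
            params["TangibleCharacters"] = chars
        urls.append(base_url + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # Hypothetical entity names and character sets, for illustration only.
    recommended = {
        "example/entity_a": "-",
        "example/entity_b": "-",
        "example/entity_c": "/.",
    }
    for url in build_actions("http://localhost:13000/", "sample text", recommended):
        print(url)
```

With the hypothetical input shown, the sketch produces two request URLs: one for the two entities that share a tangible character set, and one for the third entity. The same grouping tells you how many EDK engines to create with the SDK, or how many configuration files to write for edktool.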

For more information about the TangibleCharacters configuration parameter, refer to the Eduction User Guide.