Standalone API Usage

The Eduction Software Development Kit (SDK) C API allows C developers to interact directly with the Eduction engine.

The recommended way to create an engine is by using an engine factory. Your application's first call to the Eduction API should be to create a factory with EdkFactoryCreateWithLicenseKey, which requires you to supply a valid license key.

There are several ways to create an engine. You can create an empty engine by calling EdkFactoryMakeEngine. Alternatively, you can create a pre-configured engine by calling EdkFactoryMakeEngineFromConfigFile (to use a configuration file on disk) or EdkFactoryMakeEngineFromConfigBuffer (to use a configuration that exists in memory).

If you create an engine from a configuration file, no further configuration is necessary, because the engine reflects the settings given in that file. Otherwise, you must configure the engine to set its matching behavior: load one or more grammar files into the engine, and specify at least one entity exposed by the loaded grammars, to tell the engine what patterns to search for in the input text.

When no more engines are required, the factory can be disposed of using EdkFactoryDestroy. This releases the memory used internally by the factory. Engines remain valid even after the factory that created them has been destroyed.

The input data is processed in an Eduction session. Multiple sessions can be created for an Eduction engine. All sessions associated with an engine process data using the configuration for that engine, including the selected grammars and entities. You cannot change the engine settings after creating a session. Each session maintains its own state, so the sessions can be run concurrently in a multi-threaded application.

Once created, a session can process multiple documents. Data must be UTF-8 encoded. It can either be pulled (streamed) or pushed (added). To get the next available match, call EdkGetNextMatch; you can call this function repeatedly to cycle through all the matches. You can retrieve the text and properties associated with each match by using accessor functions such as EdkGetMatchText.
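Because a session expects UTF-8 encoded data, an application that receives text from sources it does not control may want to check the encoding before handing the text over. The following is a minimal sketch of such a check. The helper is_valid_utf8 is hypothetical and is not part of the Eduction API; the SDK may also perform its own validation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Eduction API).
   Returns true if the first len bytes of data are well-formed UTF-8. */
bool is_valid_utf8(const unsigned char *data, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char lead = data[i];
        size_t extra;       /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (lead < 0x80)                { extra = 0; cp = lead;        min = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        /* Reject a sequence that runs past the end of the buffer. */
        if (extra > len - i - 1) return false;

        for (size_t k = 1; k <= extra; ++k) {
            unsigned char c = data[i + k];
            if ((c & 0xC0) != 0x80) return false;  /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min) return false;                      /* overlong encoding */
        if (cp > 0x10FFFF) return false;                 /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* UTF-16 surrogate */

        i += extra + 1;
    }
    return true;
}
```

An application could call is_valid_utf8 on a buffer after reading it from disk, and skip or convert the document if the check fails, before passing the buffer to the session.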

A session can continue for as long as necessary. It must, however, be destroyed before the engine it is associated with is destroyed. The call to destroy the engine should be your application's last call to the Eduction API.

This section describes the basic structure of a standalone application that uses the API. For an example of this process, see the source code in the example files (see Example Programs).

Typically, your application takes the following actions:

  1. Include edk.h.

    #include "edk.h"
  2. Create an engine factory.

    // Embed the license key into the application
    const char *licensekey = "..."; 
    EdkFactoryHandle factory = NULL;
    EdkError error = EdkFactoryCreateWithLicenseKey(&factory, licensekey);

    // Check the return value of any API call for success or error
    if (EdkSuccess != error)
    {
        printf("Error while creating: %d.\n", error);
        return -1;
    }
  3. Instantiate the engine and obtain an engine handle. You can call EdkFactoryMakeEngineFromConfigFile to create an engine from an appropriate configuration file.

    EdkEngineHandle engine = NULL;
    error = EdkFactoryMakeEngineFromConfigFile(factory, &engine, "engine.cfg");

    Alternatively, you can create the engine without a configuration, by calling EdkFactoryMakeEngine. In this case, you must configure the engine. For example:

    • Load the grammar files to use for matching (by calling EdkLoadResourceFile one or more times).
    • Choose the entities to use for matching (by calling EdkAddTargetEntity).
    • Set optional parameters.

    // Create engine without configuration
    EdkEngineHandle engine = NULL;
    error = EdkFactoryMakeEngine(factory, &engine);

    // Load grammar file
    error = EdkLoadResourceFile(engine, "test.ecr");

    // Choose entities to use
    error = EdkAddTargetEntity(engine, "myentity");

    // Set optional parameters
    error = EdkSetTokenWithPunctuation(engine, true);
    error = EdkSetMaxMatchLength(engine, 12);
  4. Create a session associated with the engine, and obtain a session handle. You can create and run concurrent sessions in a multi-threaded application. Each session uses the same grammars, but maintains its own state.

    EdkSessionHandle session = NULL;
    error = EdkSessionCreate(engine, &session);
  5. Send UTF-8 encoded text to the session.

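    // Read the whole file into memory (needs <stdio.h>, <stdlib.h>, <sys/stat.h>)
    struct stat infoText;
    FILE* fText = fopen("test.txt", "rb");
    if (fText == NULL || stat("test.txt", &infoText) != 0)
    {
        printf("Unable to read test.txt.\n");
        return -1;
    }
    size_t lenText = (size_t)infoText.st_size;
    char* buffer = (char*)malloc(lenText + 1);
    size_t sizeText = fread(buffer, 1, lenText, fText);
    fclose(fText);
    // Pass only the bytes that were actually read
    error = EdkAddInputText(session, buffer, sizeText, true);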
  6. Call EdkGetNextMatch to obtain an entity match. You can call this function repeatedly to obtain all matches.

    while (EdkSuccess == EdkGetNextMatch(session))
    {
        // For each match found, do this ...
        const char* szMatch = NULL;
        EdkGetMatchText(session, &szMatch);
        printf("Match found: %s\n", szMatch);
    }

    NOTE: If you create your engine from a configuration file that includes post-processing tasks, the post-processing tasks automatically run as part of EdkGetNextMatch and you do not need to run them separately.

  7. To process multiple documents, repeat Step 4 to Step 6.

  8. Release resources when done. You must destroy all session handles before you destroy the engine handle.
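
The ordering constraint in Step 8 can be sketched as follows. This is an illustration under stated assumptions, not the SDK's own cleanup code. The names EdkSessionDestroy and EdkEngineDestroy are assumed by analogy with EdkFactoryDestroy, so confirm the exact names, signatures, and return types in edk.h. The handle types and functions in the first section are hypothetical stand-ins that only record the order of calls, so that the sketch compiles without the SDK. The helper release_all is also hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* ---- Hypothetical stand-ins so the sketch compiles without the SDK. ----
   In a real application, include "edk.h" and delete this section.
   The real functions may take different arguments or return an EdkError. */
typedef void *EdkFactoryHandle;
typedef void *EdkEngineHandle;
typedef void *EdkSessionHandle;

char teardown_log[32];  /* records the order of destroy calls: S, F, E */

static void log_call(char tag)
{
    size_t n = strlen(teardown_log);
    if (n + 1 < sizeof teardown_log) {
        teardown_log[n] = tag;
        teardown_log[n + 1] = '\0';
    }
}

void EdkSessionDestroy(EdkSessionHandle session) { (void)session; log_call('S'); }
void EdkFactoryDestroy(EdkFactoryHandle factory) { (void)factory; log_call('F'); }
void EdkEngineDestroy(EdkEngineHandle engine)    { (void)engine;  log_call('E'); }
/* ---- End of stand-ins. ---- */

/* Hypothetical helper: releases everything in an order that satisfies the
   constraints described in this section. */
void release_all(EdkSessionHandle *sessions, size_t count,
                 EdkFactoryHandle factory, EdkEngineHandle engine)
{
    /* 1. Destroy every session before the engine it belongs to. */
    for (size_t i = 0; i < count; ++i) {
        if (sessions[i] != NULL) {
            EdkSessionDestroy(sessions[i]);
            sessions[i] = NULL;  /* guard against accidental reuse */
        }
    }

    /* 2. The factory can go once no more engines are needed. */
    if (factory != NULL) {
        EdkFactoryDestroy(factory);
    }

    /* 3. Destroying the engine is the last call to the API. */
    if (engine != NULL) {
        EdkEngineDestroy(engine);
    }
}
```

Keeping the teardown in one function makes it harder to destroy the engine while a session is still alive, for example on an early error return.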