Standalone API Usage

This section describes the basic structure of a standalone application using the API. See the source referenced in Example Programs.

Typically, your application takes the following actions:

  1. Load the EDK SDK library.

    import edk.sdk

    NOTE: The package relies on the C EDK shared library (edk.dll or libedk.so). This library must be in one of the following locations:

    • a directory on the system library path

    • a directory that you pass to the Python API by setting the EDKLIBPATH environment variable before you import the package from your Python code.
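The second option can be sketched as follows. This is a minimal illustration, not a required setup: the directory `/opt/edk/lib` and the helper name `configure_edk_library_path` are hypothetical, and the import is wrapped in `try`/`except` only so that the snippet also runs on a machine where the package is not installed.

```python
import os

# Hypothetical install location (an assumption for illustration); replace it
# with the directory that actually contains edk.dll or libedk.so.
EDK_LIB_DIR = "/opt/edk/lib"


def configure_edk_library_path(lib_dir: str) -> str:
    """Set EDKLIBPATH unless the environment already defines it.

    Returns the value that is in effect after the call.
    """
    return os.environ.setdefault("EDKLIBPATH", lib_dir)


configured_dir = configure_edk_library_path(EDK_LIB_DIR)

# The variable must be set before this import statement runs.
try:
    import edk.sdk
except ImportError:
    edk = None  # the package is not installed in this environment
```

Using `setdefault` keeps any value that was already exported in the shell, so a deployment can override the fallback directory without changing the code.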

  2. Create an EdkFactory instance, supplying a valid license key.

    with edk.sdk.EdkFactory.create_with_license_key("My license key here...") as factory:
        # edk.sdk classes act as context managers; use them in a 'with' statement
        # to clean up resources automatically when the block finishes or if an
        # exception occurs
        # the rest of your processing code goes here
        ...

  3. Use the factory to create an EdkEngine instance.

    # You can supply configuration as a file path, a buffer, or a configparser.ConfigParser instance
    with factory.engine(configpath="eduction.cfg") as engine:
        # the rest of your processing code goes here
        ...

    A factory can create multiple engines, with different configurations.

    You can specify the options, grammar files, and entities to use in the configuration file.

    Alternatively, the SDK provides functions that let you set options programmatically.

  4. Use an engine to create an EdkSession instance, which maintains the state of the matching process.

    If your application is multi-threaded, each thread should use its own session.

    with engine.session() as session:
        # the rest of your processing code goes here
        ...

  5. You can push data to the session as it becomes available, or read it from a stream.

    The following example demonstrates how to read from a stream.

    with open("input.txt", encoding="utf-8") as in_file:
        session.input_stream = in_file
        # the rest of your processing code goes here
        ...

  6. Begin matching. Iterating over the session yields matches until the input is exhausted:

    # By default, returned matches are proxy objects that
    # fetch values from EDK on demand, so they must not
    # be used beyond the next advance of the iterator.
    # Set session.persistent_matches = True to copy all
    # values to a persistent match object.
    for match in session:
        print(match.text)
        # You can iterate over or index match components
        if match.components:
            for c in match.components:
                print(c.name, c.text)

  7. Release resources when you have finished. If you use Eduction classes in a with block, their resources are released automatically when the block exits. Alternatively, you can collect class instances in a contextlib.ExitStack and call its close() method when you have finished using Eduction.
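The ExitStack approach in step 7 might look like the sketch below. So that it runs without the SDK installed, it uses a hypothetical `StandInResource` class in place of real SDK objects. In an actual application you would presumably pass the objects returned by `EdkFactory.create_with_license_key(...)`, `factory.engine(...)`, and `engine.session()` to `enter_context` instead.

```python
from contextlib import ExitStack


class StandInResource:
    """Hypothetical stand-in for an SDK object (factory, engine, or session)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"open {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"close {self.name}")
        return False  # do not suppress exceptions


events = []
stack = ExitStack()
try:
    # enter_context() calls __enter__ and registers __exit__ for later.
    factory = stack.enter_context(StandInResource("factory", events))
    engine = stack.enter_context(StandInResource("engine", events))
    session = stack.enter_context(StandInResource("session", events))
    # ... run matching with the session here ...
finally:
    # close() exits the registered contexts in reverse order of entry.
    stack.close()
```

Because ExitStack unwinds in reverse order, the session is closed before the engine that created it, and the engine before the factory, which matches the order that nested with blocks would produce.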