Eduction Grammar Structure

An Eduction grammar defines patterns for matching text in a document. A pattern is a combination of characters and operators. An operator is a sequence of special characters that matches text according to the rules associated with that operator.

Pattern      Description                                                     Matches
Smith|John   Match either Smith or John                                      Smith, John
[0-9]{3}     Match a sequence of three characters in the range 0 through 9   123, 456

In the second pattern, the square bracket operators [] match any one of the characters 0 through 9, and the curly braces {} repeat the preceding pattern three times.
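The two patterns above can be tried out quickly with a short sketch. It uses Python's re module as a stand-in, on the assumption that the |, [] and {} operators behave the same way there for these simple cases. It does not call Eduction, and the helper names matches and find_all are hypothetical.

```python
import re

# The two patterns from the text above.
# Assumption: Python's regex semantics approximate these simple operators;
# this is not the Eduction matching engine.
PATTERNS = {
    "name": r"Smith|John",   # alternation: either Smith or John
    "digits": r"[0-9]{3}",   # any character 0 through 9, repeated three times
}


def matches(pattern_name, text):
    """Return True if the whole of `text` matches the named pattern."""
    return re.fullmatch(PATTERNS[pattern_name], text) is not None


def find_all(pattern_name, text):
    """Return every non-overlapping match of the named pattern in `text`."""
    return re.findall(PATTERNS[pattern_name], text)


if __name__ == "__main__":
    print(matches("name", "Smith"))
    print(matches("digits", "12"))
    print(find_all("digits", "call 123 or 456"))
```

Matching the whole string with re.fullmatch keeps the check strict. Matching inside a document, as find_all does, is closer to how a pattern would be used for extraction.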

Grammars are described using XML. The file edk.dtd defines the XML structure that Eduction understands. When writing grammars for Eduction, OpenText recommends that you reference edk.dtd at the start of the XML grammar file using a DOCTYPE declaration, and that you use a DTD-compatible XML authoring tool to eliminate syntax errors and save time.

Here is an example of a simple Eduction grammar:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
   <grammar name="mygrammar">
      <entity name="name" type="public">
         <pattern>Smith|John</pattern>
      </entity>
      <entity name="digits" type="public">
         <pattern>[0-9]{3}</pattern>
      </entity>
   </grammar>
</grammars>

This grammar defines two entities: mygrammar/name and mygrammar/digits.

NOTE: The full name of an entity is the name attribute of the parent grammar, followed by a slash, and then the entity name attribute. For example, mygrammar/name.
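As an illustration of the naming rule, the following sketch derives the full entity names from the example grammar using Python's standard xml.etree.ElementTree module. The function name full_entity_names is hypothetical, and the parser only reads the XML structure. It does not load or validate against edk.dtd.

```python
import xml.etree.ElementTree as ET

# The example grammar from this section, embedded as a string.
GRAMMAR_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
   <grammar name="mygrammar">
      <entity name="name" type="public">
         <pattern>Smith|John</pattern>
      </entity>
      <entity name="digits" type="public">
         <pattern>[0-9]{3}</pattern>
      </entity>
   </grammar>
</grammars>
"""


def full_entity_names(xml_text):
    """Return '<grammar name>/<entity name>' for every entity in the text."""
    # Parse from bytes so the encoding declaration in the XML is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    names = []
    for grammar in root.iter("grammar"):
        for entity in grammar.findall("entity"):
            names.append(f"{grammar.get('name')}/{entity.get('name')}")
    return names


if __name__ == "__main__":
    for name in full_entity_names(GRAMMAR_XML):
        print(name)
```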

In general, the full name of an entity must be unique, even for entities in different grammar files. This restriction includes private entities that you might reference internally in a grammar. In some cases, non-unique entity names can cause unexpected results when you use multiple grammar files in the same Eduction session.

You can ensure that entity names are unique by making sure that the <grammar> tag in every source file has a unique value for the name attribute.

An exception to this restriction is when you deliberately extend or replace an existing public entity.
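One way to catch accidental name clashes before loading several grammar files is to compare their full entity names. The sketch below is a hypothetical helper and not part of Eduction. The file labels and the three small grammars are made-up sample data, and the grammars use only the elements shown in the example above. A reported duplicate is not necessarily an error, because it may be a deliberate extension or replacement of a public entity.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: a.xml and b.xml reuse the same grammar name.
FILE_A = ('<grammars><grammar name="mygrammar">'
          '<entity name="name" type="public"><pattern>Smith|John</pattern></entity>'
          '</grammar></grammars>')
FILE_B = ('<grammars><grammar name="mygrammar">'
          '<entity name="name" type="public"><pattern>Jones</pattern></entity>'
          '</grammar></grammars>')
FILE_C = ('<grammars><grammar name="othergrammar">'
          '<entity name="name" type="public"><pattern>Brown</pattern></entity>'
          '</grammar></grammars>')


def find_duplicate_entities(sources):
    """Map each repeated full entity name to the labels that define it.

    `sources` maps a label (for example a file name) to grammar XML text.
    """
    seen = defaultdict(list)
    for label, xml_text in sources.items():
        root = ET.fromstring(xml_text.encode("utf-8"))
        for grammar in root.iter("grammar"):
            for entity in grammar.findall("entity"):
                full_name = f"{grammar.get('name')}/{entity.get('name')}"
                seen[full_name].append(label)
    # Keep only the names that were defined more than once.
    return {name: labels for name, labels in seen.items() if len(labels) > 1}


if __name__ == "__main__":
    sources = {"a.xml": FILE_A, "b.xml": FILE_B, "c.xml": FILE_C}
    print(find_duplicate_entities(sources))
```

In this sample, othergrammar/name does not clash with mygrammar/name, because the grammar name differs even though the entity name is the same.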

For full details of the Eduction grammar XML syntax, and of edk.dtd, see Grammar Format Reference.

For a more extensive set of example Eduction grammar files, see Example Grammar Files.