Eduction Grammar Structure

An Eduction grammar defines patterns for matching text in a document. A pattern is a combination of characters and operators. An operator is a sequence of special characters that matches text according to the rules associated with that operator.

Pattern       Description                                                      Matches

Smith|John    Match either Smith or John                                       Smith, John

[0-9]{3}      Match a sequence of three characters in the range 0 through 9    123, 456

In the pattern [0-9]{3}, the square bracket operator [] matches any one of the characters 0 through 9, and the curly brace operator {} repeats the preceding pattern a fixed number of times (here, three).
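The behavior of the two patterns above can be tried out with a short Python sketch. This is an illustration only: it uses Python's standard re module, not the Eduction engine, and assumes that for these two simple patterns the syntax means the same thing in both. The function name find_matches and the sample sentence are hypothetical.

```python
import re

# The two example patterns, written as Python regular expressions.
# Assumption: for these simple cases Python's `re` syntax coincides with the
# Eduction pattern syntax. This is NOT the Eduction engine.
NAME_PATTERN = re.compile(r"Smith|John")   # alternation: Smith or John
DIGITS_PATTERN = re.compile(r"[0-9]{3}")   # exactly three characters in 0-9


def find_matches(pattern, text):
    """Return every non-overlapping match of `pattern` in `text`, left to right."""
    return pattern.findall(text)


# Hypothetical sample text, chosen to contain the matches listed above.
sample = "John Smith called from extension 123, then 456."

names = find_matches(NAME_PATTERN, sample)
digits = find_matches(DIGITS_PATTERN, sample)

print(names)
print(digits)
```

Note that [0-9]{3} carries no boundary condition, so in a longer run of digits it would match the first three digits and then continue from the fourth.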

Grammars are described using XML. The file edk.dtd is the document type definition (DTD) that describes the XML that Eduction understands. When writing grammars for Eduction, Micro Focus recommends that you reference edk.dtd at the start of the XML grammar file using a DOCTYPE declaration, and that you use a DTD-compatible XML authoring tool to eliminate syntax errors and save time.

Here is an example of a simple Eduction grammar:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
   <grammar name="mygrammar">
      <entity name="name" type="public">
         <pattern>Smith|John</pattern>
      </entity>
      <entity name="digits" type="public">
         <pattern>[0-9]{3}</pattern>
      </entity>
   </grammar>
</grammars>

This grammar defines two entities: mygrammar/name and mygrammar/digits.
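To see how the grammar/entity names arise from the XML structure, the following sketch reads the example grammar with Python's standard xml.etree.ElementTree module and builds each qualified name from the name attributes. It is not part of Eduction. The function name list_entities is hypothetical, and the parser used here does not load or validate against edk.dtd.

```python
import xml.etree.ElementTree as ET

# The example grammar from this section, embedded as bytes so that the
# XML encoding declaration is honored by the parser.
GRAMMAR_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
   <grammar name="mygrammar">
      <entity name="name" type="public">
         <pattern>Smith|John</pattern>
      </entity>
      <entity name="digits" type="public">
         <pattern>[0-9]{3}</pattern>
      </entity>
   </grammar>
</grammars>
"""


def list_entities(xml_bytes):
    """Map each qualified entity name (grammar/entity) to its pattern text.

    Hypothetical helper for illustration. ElementTree does not fetch the
    external DTD, so no validation against edk.dtd takes place.
    """
    root = ET.fromstring(xml_bytes)
    entities = {}
    for grammar in root.findall("grammar"):
        for entity in grammar.findall("entity"):
            qualified = f"{grammar.get('name')}/{entity.get('name')}"
            entities[qualified] = entity.findtext("pattern")
    return entities


entities = list_entities(GRAMMAR_XML)
for qualified_name, pattern in entities.items():
    print(qualified_name, "->", pattern)
```

Because this sketch only checks that the XML is well formed, a DTD-aware authoring tool remains the better way to catch grammar syntax errors.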

For full details of the Eduction grammar XML syntax and of edk.dtd, see Grammar Format Reference.

For a more extensive set of example Eduction grammar files, see Example Grammar Files.