Eduction Grammar Structure
An Eduction grammar defines patterns for matching text in a document. A pattern is a combination of characters and operators. An operator is a sequence of special characters that match text by following the rules associated with the operator.
Pattern | Description | Matches |
---|---|---|
Smith\|John | Match either Smith or John | Smith John |
[0-9]{3} | Match a sequence of three characters in the range 0 through 9 | 123 456 |
In the second example, the square bracket operators [] match any one of the characters 0 through 9, and the curly braces {} require the preceding pattern to match exactly three times in a row.
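The two patterns in the table can be tried out with a short script. This is only a rough sketch: it uses Python's `re` module as a stand-in, on the assumption that alternation, character ranges, and counted repetition behave the same way for these two simple patterns. It is not the Eduction engine, and the helper name `find_matches` is hypothetical.

```python
import re

# Assumption: Python's re module stands in for Eduction here, for illustration only.
# The two patterns are taken from the table above.
PATTERNS = {
    "name": r"Smith|John",    # alternation: either Smith or John
    "digits": r"[0-9]{3}",    # exactly three characters in the range 0 through 9
}


def find_matches(pattern: str, text: str) -> list[str]:
    """Return every non-overlapping match of pattern in text, left to right."""
    return [m.group(0) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    print(find_matches(PATTERNS["name"], "Smith John"))
    print(find_matches(PATTERNS["digits"], "123 456"))
```

Run against the sample text in the Matches column, each pattern finds two separate matches rather than one match spanning the whole string.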
Grammars are described using XML. The file edk.dtd contains the template that defines the XML that Eduction understands. When writing grammars for Eduction, OpenText recommends that you reference edk.dtd at the start of the XML grammar file using the include statement, and that you use a DTD-compatible XML authoring tool to eliminate syntax errors and save time.
Here is an example of a simple Eduction grammar:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE grammars SYSTEM "edk.dtd">
    <grammars>
      <grammar name="mygrammar">
        <entity name="name" type="public">
          <pattern>Smith|John</pattern>
        </entity>
        <entity name="digits" type="public">
          <pattern>[0-9]{3}</pattern>
        </entity>
      </grammar>
    </grammars>
This grammar defines two entities: mygrammar/name and mygrammar/digits.
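To see how the entity names are derived from the grammar file, the following sketch reads the example grammar with Python's standard `xml.etree.ElementTree` parser and joins each grammar name to its entity names. It is an illustration only and not part of Eduction: the function name `list_entities` is hypothetical, and the parser checks that the XML is well formed but does not validate it against edk.dtd.

```python
import xml.etree.ElementTree as ET

# The example grammar from this section, embedded as bytes so that the
# XML declaration's encoding attribute is honored by the parser.
GRAMMAR_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE grammars SYSTEM "edk.dtd">
<grammars>
  <grammar name="mygrammar">
    <entity name="name" type="public">
      <pattern>Smith|John</pattern>
    </entity>
    <entity name="digits" type="public">
      <pattern>[0-9]{3}</pattern>
    </entity>
  </grammar>
</grammars>
"""


def list_entities(xml_bytes: bytes) -> dict[str, str]:
    """Map each 'grammar/entity' name to the text of its <pattern> element."""
    root = ET.fromstring(xml_bytes)  # well-formedness check only, no DTD validation
    entities: dict[str, str] = {}
    for grammar in root.iter("grammar"):
        for entity in grammar.iter("entity"):
            qualified = f'{grammar.get("name")}/{entity.get("name")}'
            entities[qualified] = entity.findtext("pattern", default="")
    return entities


if __name__ == "__main__":
    for name, pattern in list_entities(GRAMMAR_XML).items():
        print(name, "->", pattern)
```

A quick check like this can catch a misspelled entity name or an unclosed tag before the grammar is handed to Eduction, although a DTD-aware authoring tool remains the more thorough option.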
For full details of the Eduction grammar XML syntax, and the edk.dtd, see Grammar Format Reference.
For a more extensive set of example Eduction grammar files, see Example Grammar Files.