Use Structured Classification

The following sections describe how to create and train structured categories, add them to a structured classifier, and query the classifier with new documents.

For more information about the classification actions, see Structured Classifier Actions.

Create a Structured Category

You create a structured category with a unique name and a set of training documents from your IDOL index.

To create a structured category

  1. Use an IDOL Query action with the StoreState parameter to generate a state token that identifies a list of training documents for the structured category.

    For example, if you want to create a structured category to identify films, query for films, and generate a state token for the query results.

  2. Send a StructuredCategoryCreate action to the IDOL Category component, with the Name parameter set to the name to use for the new category, and the StateID parameter set to the state token for your training documents.

    For example:

    action=StructuredCategoryCreate&Name=Film&StateID=B8UGIK95FKJG-23

    This action creates a Film structured category, which is trained by using the documents in the B8UGIK95FKJG-23 state token.
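The two steps above can be sketched in Python as plain URL construction. This is a minimal sketch, not a definitive client: the host names, the port numbers, the Text=film query, and the action_url helper are all assumptions for illustration, and the state token is the one from the example rather than a live value.

```python
from urllib.parse import urlencode

# Hypothetical endpoints; replace with the host and ACI port of your own components.
CONTENT_BASE = "http://localhost:9100"
CATEGORY_BASE = "http://localhost:9020"


def action_url(base, action, **params):
    """Build a URL of the form <base>/action=<Action>&<Key>=<Value>..."""
    query = urlencode({"action": action, **params})
    return f"{base}/{query}"


# Step 1: a query that stores its results so that the response includes a state token.
# Text=film is only an illustrative query for the film example.
query_url = action_url(CONTENT_BASE, "Query", Text="film", StoreState="true")

# Step 2: create the category from the state token returned by step 1.
# The token below is copied from the example above, not generated here.
create_url = action_url(
    CATEGORY_BASE, "StructuredCategoryCreate", Name="Film", StateID="B8UGIK95FKJG-23"
)

print(query_url)
print(create_url)
```

In practice you would send the first URL, read the state token from the response, and pass that token as the StateID value in the second URL.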

Create a Structured Classifier

You create a structured classifier with a unique name by providing a list of structured categories.

TIP: You can get a list of available structured categories by using the StructuredCategoryList action.

To create a structured classifier

  • Send a StructuredClassifierCreate action to the IDOL Category component, with the Name parameter set to the name to use for the new classifier, and the StructuredCategories parameter set to the list of structured categories to use.

    For example:

    action=StructuredClassifierCreate&Name=TypeClassifier&StructuredCategories=Film,Business,Human,City

    This action creates a TypeClassifier classifier, which classifies documents into the Film, Business, Human, and City categories.
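If the category names are held in a list, the StructuredCategories value can be built by joining them with commas. The following is a minimal sketch under stated assumptions: the base URL and the classifier_create_url helper are hypothetical, and the commas are deliberately left unencoded to match the example above.

```python
from urllib.parse import urlencode

# Hypothetical host and port for the IDOL Category component.
CATEGORY_BASE = "http://localhost:9020"


def classifier_create_url(name, categories):
    """Build a StructuredClassifierCreate URL from a list of category names."""
    if not categories:
        raise ValueError("at least one structured category is required")
    params = {
        "action": "StructuredClassifierCreate",
        "Name": name,
        # Comma-separated list, as in the example above.
        "StructuredCategories": ",".join(categories),
    }
    # safe="," keeps the list separators readable instead of encoding them as %2C.
    return f"{CATEGORY_BASE}/{urlencode(params, safe=',')}"


url = classifier_create_url("TypeClassifier", ["Film", "Business", "Human", "City"])
print(url)
```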

Classify Documents

You can use a structured classifier to classify documents by using the StructuredClassifierQuery action.

You can classify one or more documents that exist in the IDOL Content component index, by using the DocID, DocRef, or StateID parameter.

Alternatively, you can classify a document by providing a percent-encoded IDX or XML document in the QueryText parameter.

In both cases, the IDOL Category component identifies the parametric fields in the document and compares them to the trained categories in your structured classifier. It then determines which category the document matches.

To classify a document

  • Send a StructuredClassifierQuery action to the IDOL Category component, with the Name parameter set to the name of the structured classifier to use. Set one of the following parameters to identify the document to classify:

    • DocID. The document ID of a document in the IDOL Content component index.

    • DocRef. The document reference of a document in the IDOL Content component index.

    • StateID. A state token, generated by a query, that identifies the documents to classify.

    • QueryText. A percent-encoded IDX or XML document.

    For example:

    action=StructuredClassifierQuery&Name=TypeClassifier&DocRef=http://www.example.com/documents/casablanca

    This action uses the TypeClassifier structured classifier to classify the document with the reference http://www.example.com/documents/casablanca.
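The choice between the four document parameters can be sketched as a small helper that accepts one of them and percent-encodes its value. This is an illustrative sketch only: the base URL and the classifier_query_url helper are hypothetical, and rejecting more than one document parameter is an assumption based on the instruction to set one of them. The sketch percent-encodes every value, which is what QueryText requires; the DocRef example above is shown unencoded for readability.

```python
from urllib.parse import quote

# Hypothetical host and port for the IDOL Category component.
CATEGORY_BASE = "http://localhost:9020"

# The document parameters listed above.
DOCUMENT_PARAMS = ("DocID", "DocRef", "StateID", "QueryText")


def classifier_query_url(classifier, **doc):
    """Build a StructuredClassifierQuery URL for one document parameter."""
    if len(doc) != 1 or next(iter(doc)) not in DOCUMENT_PARAMS:
        raise ValueError(f"set one of: {', '.join(DOCUMENT_PARAMS)}")
    (param, value), = doc.items()
    # quote(..., safe="") percent-encodes every reserved character,
    # including ':' and '/', so a URL or a whole document is passed intact.
    return (
        f"{CATEGORY_BASE}/action=StructuredClassifierQuery"
        f"&Name={quote(classifier, safe='')}"
        f"&{param}={quote(str(value), safe='')}"
    )


url = classifier_query_url(
    "TypeClassifier", DocRef="http://www.example.com/documents/casablanca"
)
print(url)
```

The same helper accepts QueryText with the full text of an IDX or XML document as the value.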