Categorize Documents

Categorization analyzes the concepts in a document and, if those concepts match categories in IDOL Server, adds category information to the document. Categorizing documents is useful because you can alert IDOL users to new content that matches their interests, help them find information through taxonomies, and help them identify similar documents.

Before you can use categorization, you must create and train categories in IDOL Server. For each document, CFS sends the CategorySuggestFromText action to IDOL Server, and IDOL Server returns information about any categories that match. If a document does not match any of the categories in IDOL Server, the document is not categorized. For information about how to create and train categories, refer to the IDOL Server Administration Guide.
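
The request and response flow described above can be sketched in plain Lua. This is an illustration only, not the supplied script: stub_send_aci_action is a hypothetical stand-in for the CFS function send_aci_action so that the sketch runs outside CFS, and the QueryText parameter name and the shape of the response are assumptions that you should check against the IDOL Server reference.

```lua
local idolCategorizeHost = "10.0.0.1"
local idolCategorizePort = 9000

-- Hypothetical stand-in for the CFS function send_aci_action.
-- It ignores its arguments and returns a canned response whose
-- structure is an assumption made for this sketch.
local function stub_send_aci_action(host, port, action, parameters)
    return "<autnresponse><responsedata>"
        .. "<autn:hit><autn:title>Finance</autn:title></autn:hit>"
        .. "<autn:hit><autn:title>Energy</autn:title></autn:hit>"
        .. "</responsedata></autnresponse>"
end

-- Ask IDOL Server for categories that match the text.
-- Returns a list of category names, which is empty if nothing matches.
local function suggest_categories(text, send)
    local response = send(idolCategorizeHost, idolCategorizePort,
        "CategorySuggestFromText", { QueryText = text })
    local names = {}
    if response == nil then
        return names   -- no response, so the document is not categorized
    end
    -- Collect the text of every <autn:title> element in the response.
    for title in response:gmatch("<autn:title>(.-)</autn:title>") do
        names[#names + 1] = title
    end
    return names
end

local categories = suggest_categories("Oil prices rose sharply.", stub_send_aci_action)
print(table.concat(categories, ", "))   -- prints "Finance, Energy"
```

Passing the sender in as an argument keeps the sketch testable without a running IDOL Server. Inside CFS, the supplied script calls send_aci_action directly.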

To categorize documents

  1. Stop CFS.
  2. Open the CFS configuration file.
  3. Create an import task to run the CategorySuggestFromText Lua script that is supplied with CFS. For example:

    [ImportTasks]
    Post0=Lua:./scripts/CategorySuggestFromText.lua
  4. Open the script in a text editor.
  5. Modify the variables in the script so that the script sends actions to your IDOL Server:

    Line      Variable name         Value
    178       idolCategorizeHost    The host name or IP address of your IDOL Server.
    179       idolCategorizePort    The ACI port of your IDOL Server. The port argument of the send_aci_action function expects a number, so do not enclose the port number in quotation marks.
    184       timeoutMilliseconds   The amount of time, in milliseconds, that CFS waits for a response from your IDOL Server. If CFS does not receive a response within this time and has used all of its retries, the document is not categorized. You should not need to modify the default value, which is 60000 milliseconds (60 seconds).
    185       retries               The number of times that CFS retries a request to your IDOL Server if the first attempt is not successful.
    186-192   sslParameters         A table of SSL parameters for connecting to your IDOL Server. For more information about the SSL parameters that you can set, refer to the Connector Framework Server Reference.

    For example:

    local idolCategorizeHost = "10.0.0.1"
    local idolCategorizePort = 9000
    
    ...
    
    local timeoutMilliseconds = 30000
    local retries = 3
    local sslParameters =
    {
        SSLMethod = "SSLV23",
        --SSLCertificate = "host1.crt",
        --SSLPrivateKey = "host1.key",
        --SSLCACertificate = "trusted.crt"
    }
  6. Save and close the script.
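
Before you save the script, it can help to confirm that the values have the expected types, in particular that the port is a number and not a quoted string. The following is a minimal sketch in plain Lua. The check_settings helper is hypothetical and is not part of the supplied script or of CFS.

```lua
-- Hypothetical helper: returns a list of problems found in the settings.
-- An empty list means that all of the checks passed.
local function check_settings(host, port, timeoutMilliseconds, retries)
    local problems = {}
    if type(host) ~= "string" or host == "" then
        problems[#problems + 1] = "idolCategorizeHost must be a non-empty string"
    end
    if type(port) ~= "number" then
        problems[#problems + 1] = "idolCategorizePort must be a number, not a " .. type(port)
    end
    if type(timeoutMilliseconds) ~= "number" or timeoutMilliseconds <= 0 then
        problems[#problems + 1] = "timeoutMilliseconds must be a positive number"
    end
    if type(retries) ~= "number" or retries < 0 then
        problems[#problems + 1] = "retries must be zero or greater"
    end
    return problems
end

-- The values from the example above pass every check.
local ok = check_settings("10.0.0.1", 9000, 30000, 3)
print(#ok)         -- prints 0

-- A port number in quotation marks is a string, so it is reported.
local quoted = check_settings("10.0.0.1", "9000", 30000, 3)
print(quoted[1])   -- prints "idolCategorizePort must be a number, not a string"
```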