Distributed Setup

A distributed IDOL setup involves using a Distributed Index Handler (DIH) and Distributed Action Handler (DAH) to route actions to multiple instances of an IDOL component.

A distributed setup is effective for load-balancing among components, as well as for expanding indexes that no longer fit on one machine. DIH and DAH balance indexing and action requests among the appropriate components. You can set up the distributed system in either mirror mode or non-mirror mode.

  • In mirror mode, each instance of the component that you distribute to is identical. You can use this option for load-balancing or failure tolerance.
  • In non-mirror mode, each instance of the component is different. This option usually applies only to the IDOL Content component, to allow you to expand the size of your IDOL index.

The DIH distributes index actions (for example, to add or remove documents from your IDOL index). DIH can distribute only to components that have an index port.

The DAH distributes ACI actions. In mirror mode, it can distribute any action to any component. In non-mirror mode, it can distribute most IDOL Content component actions and combine the results of queries from multiple components.

The following diagram shows a simple setup with a DAH and DIH distributing to five child Content components.

A distributed system with stand-alone components uses any combination of IDOL components with one or more DIH and DAH components. You configure each component separately with individual component configuration files.

This setup is highly flexible. You can distribute as many or as few of the IDOL components as you need to, and you can scale each component with additional instances as required by your usage. For example, if your system has a high load for categorization, you can increase the number of Category components without adding any other components.

The following sections provide more information about mirror mode and non-mirror mode for the DIH and DAH when distributing to multiple instances of the IDOL Content component.

For more information, refer to the Distributed Index Handler Help and the Distributed Action Handler Help.

Mirror Mode

In a mirrored setup, you store the same set of data in each instance of the IDOL Content component. Each Content is identical to the others, and you must configure them in the same way.

In mirror mode:

  • DIH sends all index actions to all connected IDOL Content components. If one Content becomes unavailable, DIH continues to index data into its identical copies, which are still available for queries. DIH queues actions to the inoperable Content, and sends them when the Content becomes available again, so that the servers do not become inconsistent.

  • DAH has two options to determine how to distribute ACI actions:

    • Load Balancing. The DAH assigns each incoming action to just one of the connected IDOL Content components (using a cumulative predictive algorithm that spreads the action load efficiently).

    • Failover. The DAH forwards incoming actions to the first Content that you list in the DAH configuration file. If this IDOL Content component stops responding for any reason, the DAH marks it as down and switches to the next Content.
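
The mirror-mode behavior described above can be sketched as a small simulation. This is a conceptual model only, not the DIH or DAH implementation or any IDOL API: the class and function names (MirrorContent, MirrorIndexDistributor, failover_target, RoundRobinBalancer) are hypothetical. The load-balancing part uses plain round-robin as a stand-in, because the cumulative predictive algorithm is not described here. The failover helper always prefers the first available component in configured order, which is also an assumption.

```python
from collections import deque


class MirrorContent:
    """Hypothetical stand-in for one mirrored Content instance."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.documents = []


class MirrorIndexDistributor:
    """Toy model of mirror-mode indexing: every child gets every document."""

    def __init__(self, children):
        self.children = children
        # One queue per child, so an unavailable child can catch up later.
        self.pending = {child.name: deque() for child in children}

    def index(self, document):
        for child in self.children:
            self.pending[child.name].append(document)
            self.flush(child)

    def flush(self, child):
        # Deliver queued documents in order, but only while the child is up.
        queue = self.pending[child.name]
        while child.available and queue:
            child.documents.append(queue.popleft())


def failover_target(children):
    """Return the first available child in configured order, or None."""
    for child in children:
        if child.available:
            return child
    return None


class RoundRobinBalancer:
    """Simplified load balancing: rotate through the available children."""

    def __init__(self, children):
        self.children = children
        self.next_index = 0

    def pick(self):
        # Try each child at most once, starting from the rotation point.
        for _ in range(len(self.children)):
            child = self.children[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.children)
            if child.available:
                return child
        return None
```

In this model, a document indexed while one child is unavailable stays in that child's queue. Calling flush after the child recovers brings it back in line with its mirrors, which is the consistency property that the DIH queue provides.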

Non-Mirror Mode

In a non-mirrored system, you distribute the data among the Content components.

Use non-mirror mode if the amount of data to index is too large for a single Content. You can separate the resources for each part of the index by setting the Content components up on different machines. This approach can improve the indexing time.

In non-mirror mode: 

  • DIH splits and distributes the index data to all the connected Content components. It can distribute the content randomly, or according to one of several rules that you can configure.

  • DAH sends ACI actions to all connected Content components. You can configure the DAH to combine the results in different ways when it returns them.
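
As a rough illustration of the two points above, the following sketch splits documents among several components and then merges query results from all of them. It is a conceptual model under stated assumptions, not IDOL code: the Shard class, the hash-of-reference distribution rule, the term-count score, and the "highest score first" merge are all hypothetical choices made for the example. The real DIH distribution rules and DAH combination options are described in their respective Help.

```python
import zlib


class Shard:
    """Hypothetical stand-in for one Content component holding part of the index."""

    def __init__(self, name):
        self.name = name
        self.documents = {}  # reference -> text

    def query(self, term):
        # Naive relevance: how many times the term appears in the text.
        hits = []
        for reference, text in self.documents.items():
            score = text.lower().split().count(term.lower())
            if score:
                hits.append((score, reference))
        return hits


def pick_shard(reference, shards):
    # Assumed rule: a stable hash of the reference selects exactly one shard.
    return shards[zlib.crc32(reference.encode("utf-8")) % len(shards)]


def index_document(reference, text, shards):
    # Each document is stored on one shard only (no mirroring).
    pick_shard(reference, shards).documents[reference] = text


def distributed_query(term, shards, max_results=10):
    # Send the query to every shard, then combine the partial results.
    all_hits = []
    for shard in shards:
        all_hits.extend(shard.query(term))
    # Highest score first; ties are broken by reference for a stable order.
    return sorted(all_hits, key=lambda hit: (-hit[0], hit[1]))[:max_results]
```

Because each document lives on exactly one shard, a query must reach every shard to be complete. The merge step is where a distributor has to decide how to order and trim the combined results.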

Chain Distribution Servers

You can set up multiple DIH and DAH instances in a chained configuration. For example, a parent DIH or DAH distributes actions to child DIH or DAH servers, which in turn distribute to child Content components.

The following diagram shows an example of a chained system, where a parent DIH distributes index actions to two child DIHs, each of which distributes to three IDOL Content components.

In this configuration, the parent DIH or DAH distributes actions to the child DIH or DAH servers in the same way as it distributes to child Content components. Each child DIH or DAH accepts all Content actions and forwards them.

NOTE: Some actions might have a different effect when you send them to a child DIH or DAH server rather than to a Content component, because each action goes to multiple Content components.
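
The chained layout can be modeled as a tree in which distributors and Content components expose the same interface, so a parent does not need to know which kind of child it has. The sketch below is a conceptual model only. ContentLeaf, Distributor, and the "add-document" action string are hypothetical names, and the distributor forwards every action to every child (mirror-style) for simplicity.

```python
class ContentLeaf:
    """Hypothetical stand-in for a Content component at the end of the chain."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def handle(self, action):
        self.received.append(action)
        return 1  # one Content component handled the action


class Distributor:
    """Hypothetical DIH-like or DAH-like node that forwards to all children."""

    def __init__(self, name, children):
        self.name = name
        self.children = children

    def handle(self, action):
        # Children can be leaves or other distributors; both expose handle().
        return sum(child.handle(action) for child in self.children)


# Layout from the example: one parent, two child distributors, three leaves each.
child_a = Distributor("child-a", [ContentLeaf(f"content-a{i}") for i in range(3)])
child_b = Distributor("child-b", [ContentLeaf(f"content-b{i}") for i in range(3)])
parent = Distributor("parent", [child_a, child_b])

# The return value counts how many Content components the action reached.
reached = parent.handle("add-document")
print(reached)  # prints 6
```

The count also illustrates the note above: an action sent to a child distributor reaches three Content components in this layout, whereas the same action sent directly to a Content component reaches one.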

Chaining provides an extra level of redundancy at both the DIH or DAH level and the Content level. It also distributes network traffic and system load over a larger number of computers. A chained configuration provides a pool of Content components that are both fault-tolerant for maximum availability and distributed for the best performance.

For more information about chaining distribution servers, and the architectural considerations, refer to IDOL Expert.

Distributed Components Example

The following diagram shows one example scenario for distribution in an IDOL Text system. In this case, there are two Content components, two Community components, and two Category components.

In this example:

  • You send index actions to the DIH, which distributes them between the two Content components.

  • You route ACI actions between the three DAHs (for example, by using a front-end application, or an IDOL Proxy).

    • You send actions for Content (such as Query) to Content DAH, which distributes actions between the two Content components.

    • You send actions for Community (such as UserRead) to Community DAH, which distributes actions between the two Community components.

    • You send actions for Category (such as CategoryQuery) to Category DAH, which distributes actions between the two Category components.

    NOTE: DAH cannot distribute all ACI actions in non-mirror mode, so the Community and Category components in this example must be mirrored.
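
The routing step in this example (performed by a front-end application or an IDOL Proxy) amounts to a lookup from action name to the DAH that serves it. The following is a minimal sketch of that idea only. The three action names come from the example above, while the target labels ("content-dah" and so on), the route_action function, and the choice to lowercase action names before the lookup are assumptions of the sketch.

```python
# Hypothetical routing table: action name (lowercased) -> DAH label.
ROUTES = {
    "query": "content-dah",
    "userread": "community-dah",
    "categoryquery": "category-dah",
}


def route_action(action_name):
    """Return the label of the DAH that should receive the action."""
    try:
        return ROUTES[action_name.lower()]
    except KeyError:
        # Fail loudly rather than guess which component owns an unknown action.
        raise ValueError(f"No DAH configured for action {action_name!r}") from None
```

A real front end would map each label to a host and port and forward the full request. Rejecting unknown actions keeps a misrouted request from silently reaching the wrong component.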

The Agentstore component is not shown in this example. You might install it in several ways:

  • Use a single Agentstore for each of the Category and Community components.

  • Have Category and Community connect to an Agentstore DIH and Agentstore DAH.

    Agentstore indexes are not usually large, so you might not need to split the index into multiple Agentstore components. However, you can create mirrored Agentstores for redundancy.

If you require only basic IDOL indexing and retrieval, or you do not need to distribute the other components, you might distribute only the Content index. In this case, you can set up a number of Content servers, with DIH and DAH servers distributing index and ACI actions between them.

Depending on the size or requirements of the system, you can either use mirror mode for fault tolerance, or non-mirror mode for performance. You can also use chained DIH and DAH servers to allow for both.