System Architecture

The DIH receives index actions (data indexing requests or administrative actions) and distributes them to the connected Content components.

You can run the DIH in either of two modes: mirror mode or non-mirror mode.

You determine the way that the DIH distributes index data by using the MirrorMode parameter in the DIH configuration file. See Set the Distribution Mode.
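As a rough illustration, the following Python sketch reads a MirrorMode setting from an INI-style configuration fragment. The [Server] section name, the sample contents, and the read_mirror_mode helper are assumptions made for this example and are not taken from the DIH reference. Check the configuration reference for the actual location of the parameter.

```python
import configparser

# Hypothetical configuration fragment. The [Server] section name is an
# assumption; only the MirrorMode parameter name comes from the text above.
SAMPLE_CONFIG = """
[Server]
MirrorMode=TRUE
"""


def read_mirror_mode(config_text: str) -> bool:
    """Return True if the fragment enables mirror mode, False otherwise."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # getboolean accepts true/false, yes/no, on/off, and 1/0 in any letter case.
    return parser.getboolean("Server", "MirrorMode")


mirror_enabled = read_mirror_mode(SAMPLE_CONFIG)
```

The helper raises an error if the parameter is missing rather than guessing a default, because the default value is not stated here.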

Mirror Mode

In mirror mode, the DIH indexes all data to all the connected Content components. The Content components are identical to each other.

The following diagram shows how the DIH in mirror mode integrates into a Knowledge Discovery installation.

DIH system architecture (mirror mode)

The DIH sends all the index data that it receives (represented by gray arrows in the diagram) to all the connected Content components. The Content components are exact copies of each other, and must all have the same configuration.
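The mirror-mode behavior can be modeled with a short Python sketch. This is a simplified illustration and not the DIH implementation. The distribute_mirror function and the component names are hypothetical.

```python
from typing import Dict, List


def distribute_mirror(documents: List[str], components: List[str]) -> Dict[str, List[str]]:
    """Model mirror mode: every component receives a copy of every document."""
    return {name: list(documents) for name in components}


# Hypothetical component names, used only for this example.
indexes = distribute_mirror(["doc1", "doc2", "doc3"], ["content_a", "content_b"])
```

In this model, every entry in indexes holds the same list of documents, which mirrors the requirement that the Content components are exact copies of each other.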

You can run the DIH in mirror mode to ensure uninterrupted service if one of the Content components fails. While one Content is inoperable, its identical copies continue to index data, and are still available to return data for queries.

The DIH periodically checks whether all connected Content components are operating. If a Content component is unavailable, the DIH queues the data that this Content normally receives. When the Content is available again, the DIH indexes the queued data into it.
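The queueing behavior described above can be sketched in Python as follows. This is a minimal model under stated assumptions: the ContentStub and QueueingSender classes are hypothetical, and the actual DIH queue and its availability checks are not documented here.

```python
from collections import deque
from typing import Deque, List


class ContentStub:
    """Hypothetical stand-in for a Content component."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True
        self.indexed: List[str] = []


class QueueingSender:
    """Holds index data for one component while that component is unavailable."""

    def __init__(self, target: ContentStub) -> None:
        self.target = target
        self.pending: Deque[str] = deque()

    def send(self, document: str) -> None:
        # Always enqueue first so that documents are delivered in arrival order.
        self.pending.append(document)
        self.flush()

    def flush(self) -> None:
        # In this model, flush also stands in for the periodic availability check.
        while self.target.available and self.pending:
            self.target.indexed.append(self.pending.popleft())


content = ContentStub("content_a")
sender = QueueingSender(content)
sender.send("doc1")          # delivered immediately
content.available = False    # the component goes down
sender.send("doc2")          # queued
sender.send("doc3")          # queued
content.available = True     # the component comes back
sender.flush()               # queued data is indexed in order
```

Enqueuing before delivery keeps the documents in arrival order, so data that arrives during an outage is not overtaken by newer data.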

The DIH sends administrative index actions (represented by black arrows in the diagram) to all connected Content components.

Non-Mirror Mode

In non-mirror mode, the DIH divides the index data among the connected Content components. Each Content receives approximately the same amount of data.

The following diagram shows how the DIH in non-mirror mode integrates into a Knowledge Discovery installation.

DIH system architecture (non-mirror mode)

The DIH distributes the index data that it receives (represented by gray arrows in the diagram above) evenly across the connected Content components. For example, if the DIH connects to four Content components, it indexes approximately one quarter of the data into each Content. It does not split up sections of individual documents.
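The even split can be illustrated with a short Python sketch. The round-robin assignment used here is an assumption chosen for simplicity. The text above does not specify how the DIH selects a Content component for each document. The distribute_non_mirror function and the component names are hypothetical.

```python
from typing import Dict, List


def distribute_non_mirror(documents: List[str], components: List[str]) -> Dict[str, List[str]]:
    """Model non-mirror mode: each whole document goes to exactly one component."""
    shares: Dict[str, List[str]] = {name: [] for name in components}
    for position, document in enumerate(documents):
        # Round-robin selection (an assumption); documents are never split.
        target = components[position % len(components)]
        shares[target].append(document)
    return shares


documents = [f"doc{i}" for i in range(1, 9)]  # eight sample documents
components = ["content_1", "content_2", "content_3", "content_4"]
shares = distribute_non_mirror(documents, components)
```

With eight documents and four components, each component in this model receives two documents, that is, one quarter of the data.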

Run the DIH in non-mirror mode if the amount of data that you want to index is too large for a single Content. If the Content components that the DIH indexes into are on different machines, the indexing process requires less time.

The DIH periodically checks whether all the connected Content components are available. If a Content is unavailable, the DIH queues the data that this Content normally receives. When the Content component is available again, the DIH indexes the queued data into it.

NOTE: Mirror mode and non-mirror mode refer only to how the DIH distributes index data. The DIH always distributes administrative commands to all connected Content components. This behavior is shown by the black arrows in the diagram.
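The distinction in this note can be summarized in a small Python sketch. The targets_for function, the action kind labels, and the next_index argument are hypothetical names introduced for this example. The selection of a single component in non-mirror mode is simplified to an index lookup.

```python
from typing import List


def targets_for(action_kind: str, components: List[str],
                mirror_mode: bool, next_index: int = 0) -> List[str]:
    """Return the components that receive an action in this simplified model."""
    if action_kind == "admin" or mirror_mode:
        # Administrative commands always go to every component.
        # In mirror mode, index data also goes to every component.
        return list(components)
    # Non-mirror mode index data goes to exactly one component.
    return [components[next_index % len(components)]]


components = ["content_1", "content_2", "content_3"]
admin_targets = targets_for("admin", components, mirror_mode=False)
data_targets = targets_for("data", components, mirror_mode=False, next_index=1)
```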