Expand IDOL Installations
The default IDOL Server unified installation includes all the main IDOL components, and allows you to set up a small index and query your documents. However, most real-world uses of IDOL outgrow this scenario fairly rapidly. Keeping all documents in a single instance of IDOL Server becomes unsustainable as the system grows, for the following reasons.
- As the amount of data increases, so does the size of the IDOL Server index on disk. If you keep indexing data, the index might use all the available disk space on the machine. In practice, IDOL Server requires a certain amount of free space to continue normal operations, so it automatically stops indexing more documents before it reaches this point.
- In normal operation, IDOL Server holds speed-critical components of the index in memory. As the index size increases, so does the amount of physical memory required. When the process needs more memory than is physically available, the operating system swaps out part of its virtual memory (for example, to a page file), which degrades performance. As the process memory usage increases, the amount of free memory available to the operating system to cache frequently accessed data from disk also decreases, with a further impact on performance.
- Query time tends to increase as the number of documents and the number of term occurrences (the volume of data to search) increase. When you add documents, IDOL Server must merge new data with the existing index, so indexing performance can also decrease as the IDOL Server index grows.
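The disk-space limit described above can be approximated from outside IDOL with a simple headroom check. The following is a minimal sketch, not IDOL's own mechanism: the 1 GiB threshold, the function names, and the idea of checking before sending documents are all assumptions made for illustration.

```python
import shutil

# Hypothetical threshold: the real amount of free space IDOL Server needs
# depends on your configuration and is not stated here.
MIN_FREE_BYTES = 1024 * 1024 * 1024  # 1 GiB


def has_headroom(free_bytes, min_free_bytes=MIN_FREE_BYTES):
    """Return True if the free space is at or above the chosen threshold."""
    return free_bytes >= min_free_bytes


def can_index(index_path="."):
    """Check the volume that holds index_path before sending more documents."""
    return has_headroom(shutil.disk_usage(index_path).free)
```

A monitoring script might call `can_index()` with the path of the index directory and raise an alert well before the server itself stops accepting documents.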
Distributed IDOL
You can distribute an IDOL index by running multiple instances of the Content component, each of which indexes a different subset of documents. Each instance is a self-contained index, containing a fraction of the complete body of documents. At query time, the results from each instance are collated into a single result set that covers the complete set of indexed documents.
The individual indexes are generally spread across multiple physical servers, allowing the disk, memory, and processor footprint of the combined index to expand beyond the limitations of a single machine. Even on a single machine, it is common practice to spread documents across several smaller instances of the Content component, rather than a single large instance. This approach can still improve performance, because the individual instances can work in parallel on the sub-indexes, bringing back results more quickly than a single large IDOL Server instance.
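The idea of splitting documents across self-contained sub-indexes and collating results at query time can be illustrated with a toy model. This is a sketch under stated assumptions, not the IDOL API: `ContentShard`, `shard_for`, the hash-based placement, and the occurrence-count score are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import heapq
import zlib
from concurrent.futures import ThreadPoolExecutor


class ContentShard:
    """Toy stand-in for one Content instance: a self-contained sub-index."""

    def __init__(self):
        self.docs = {}  # reference -> text

    def index(self, reference, text):
        self.docs[reference] = text

    def query(self, term, max_results):
        # Illustrative score: number of occurrences of the term.
        term = term.lower()
        hits = []
        for reference, text in self.docs.items():
            score = text.lower().split().count(term)
            if score > 0:
                hits.append((score, reference))
        return heapq.nlargest(max_results, hits)


def shard_for(reference, shard_count):
    # Stable hash, so a given reference always lands on the same shard.
    return zlib.crc32(reference.encode("utf-8")) % shard_count


def distributed_index(shards, reference, text):
    shards[shard_for(reference, len(shards))].index(reference, text)


def distributed_query(shards, term, max_results=3):
    # Query every shard in parallel, then collate the partial result lists.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: s.query(term, max_results), shards))
    merged = [hit for partial in partials for hit in partial]
    return heapq.nlargest(max_results, merged)


shards = [ContentShard() for _ in range(3)]
documents = {
    "doc1": "index index index",
    "doc2": "index and query",
    "doc3": "memory and disk",
    "doc4": "index the index",
}
for reference, text in documents.items():
    distributed_index(shards, reference, text)

results = distributed_query(shards, "index", max_results=2)
print(results)  # [(3, 'doc1'), (2, 'doc4')]
```

Each shard returns only its own best hits, so the collation step merges a few short lists rather than scanning every document, which is what lets the sub-indexes work in parallel.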
In addition to splitting data between servers, a distributed index can hold multiple copies of a document on different servers, for purposes of load balancing, failover, or disaster recovery.
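Holding the same documents on more than one server allows queries to be rotated between copies and redirected when one copy is down. The sketch below models that behavior in a generic way; `Replica`, `MirroredIndex`, and the round-robin policy are hypothetical names and choices for illustration, not IDOL components or settings.

```python
import itertools


class Replica:
    """Toy stand-in for one server that holds a full copy of the documents."""

    def __init__(self, name, docs, available=True):
        self.name = name
        self.docs = docs
        self.available = available

    def query(self, term):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return sorted(ref for ref, text in self.docs.items() if term in text.split())


class MirroredIndex:
    """Rotate queries across replicas and skip any replica that fails."""

    def __init__(self, replicas):
        self.replicas = replicas
        self._next = itertools.cycle(range(len(replicas)))

    def query(self, term):
        start = next(self._next)  # round-robin starting point (load balancing)
        for offset in range(len(self.replicas)):
            replica = self.replicas[(start + offset) % len(self.replicas)]
            try:
                return replica.name, replica.query(term)
            except ConnectionError:
                continue  # failover: try the next replica
        raise RuntimeError("no replica available")


docs = {"doc1": "distributed index", "doc2": "single server"}
replica_a = Replica("a", docs)
replica_b = Replica("b", dict(docs))
mirror = MirroredIndex([replica_a, replica_b])

first = mirror.query("index")    # served by replica "a"
second = mirror.query("index")   # served by replica "b"
replica_a.available = False      # simulate an outage
third = mirror.query("index")    # "a" fails, so "b" answers
```

Because every replica holds the same documents, the answer is identical whichever copy serves the query; only the serving replica changes.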