Monitor the Status of IDOL Components
You can submit standard HTTP requests to the IDOL ACI and Service ports to get information about the status of your IDOL components, whether individual components are running, and so on. You can use this information for manual analysis, or to configure automated monitoring tools such as Nagios, which can send and check HTTP responses using the check_http plugin (for more information, refer to the Nagios documentation).
In addition to submitting actions, you can also use the Overview tab on the Status page of IDOL Admin to view information about the status and availability of your components.
You can also use statistics to monitor your components.
Availability Checks
You can use the following methods to check the availability of your components.
Check That a Component is Running
To check that a component is running, submit an /action=GetStatus request to the service port of the component. If the component is running, the response body includes <RESPONSE>Running</RESPONSE>.
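The check above can be scripted. The following is a minimal sketch in Python, not a definitive implementation: the function names, the host and port arguments, and the five-second timeout are illustrative assumptions, and only the /action=GetStatus request and the <RESPONSE>Running</RESPONSE> marker come from the text above.

```python
import http.client
import urllib.request


def is_running(body: str) -> bool:
    """Return True if a service-port GetStatus response body reports Running."""
    return "<RESPONSE>Running</RESPONSE>" in body


def check_component(host: str, service_port: int, timeout: float = 5.0) -> bool:
    """Send /action=GetStatus to the service port (hypothetical helper).

    Any connection failure, timeout, or HTTP error counts as "not running".
    """
    url = f"http://{host}:{service_port}/action=GetStatus"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return False
    return is_running(body)
```

Keeping the body check separate from the network call means you can test the matching logic against saved responses without a running component.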
Check That an ACI Port is Responding
To check that an IDOL component’s ACI port is responding, and that the component has threads available to handle incoming ACI requests, send an /action=GetPID request. If the ACI port is responding normally, the response body includes <response>SUCCESS</response>.
NOTE: This action is available on all ACI servers, but is particularly relevant for components with ACI threads that are processing long-running requests (for example, Content or DAH).
The advantage of this method is that it has very little impact on processing time, performance, and network overhead.
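If you prefer a custom check to the check_http plugin, the GetPID test can be wrapped in a script that follows the standard Nagios plugin exit-code convention (0 for OK, 2 for CRITICAL). The sketch below assumes hypothetical function names and a five-second timeout. Only the /action=GetPID request and the <response>SUCCESS</response> marker are taken from the text above.

```python
import http.client
import urllib.request

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def classify_getpid(body: str) -> tuple[int, str]:
    """Map a GetPID response body to a (exit_code, message) pair."""
    if "<response>SUCCESS</response>" in body:
        return OK, "OK - ACI port is responding"
    return CRITICAL, "CRITICAL - unexpected GetPID response"


def check_aci_port(host: str, aci_port: int, timeout: float = 5.0) -> tuple[int, str]:
    """Send /action=GetPID to the ACI port (hypothetical helper).

    A timeout is reported as CRITICAL, because it may mean that no ACI
    thread was free to accept the request.
    """
    url = f"http://{host}:{aci_port}/action=GetPID"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException) as exc:
        return CRITICAL, f"CRITICAL - {exc}"
    return classify_getpid(body)
```

A wrapper script would print the message and call sys.exit with the returned code.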
Monitor ACI Thread Status From the Service Port
To check the status of ACI threads, use the ACIThreadStatus service port action. The action returns information on the current state of all ACI threads, even if the ACI port is unavailable (for example, if all threads are servicing long-running actions).
TIP: You can use this action to detect and flag up long-running queries in real time.
The action returns the following information:
- Whether a thread is in progress (that is, currently servicing a request).
- Whether a thread is idle.
- The number of accepted connections (ACI requests which have been accepted, but cannot yet be processed because no ACI thread is available to handle them).
- In the case of active threads, the action that is being processed, and the currently elapsed time.
Check That the Index Port is Responding
This method is relevant to components with an index port (for example, Content and DIH).
To check that the index port can be contacted and is responding to incoming requests, send the /PING? request to the index port. If the index port is running normally, this returns the string PING SUCCESS.
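As a rough illustration, the sketch below sends the /PING? request with Python's http.client, which passes the request path through verbatim, including the trailing question mark. The helper names and the timeout are assumptions. The sketch also assumes that the index port answers with an HTTP response, as the introduction to this section suggests.

```python
import http.client


def is_ping_success(body: str) -> bool:
    """Return True if an index-port reply contains the PING SUCCESS string."""
    return "PING SUCCESS" in body


def ping_index_port(host: str, index_port: int, timeout: float = 5.0) -> bool:
    """Send /PING? to the index port (hypothetical helper)."""
    conn = http.client.HTTPConnection(host, index_port, timeout=timeout)
    try:
        conn.request("GET", "/PING?")
        body = conn.getresponse().read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
    return is_ping_success(body)
```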
Check That Children are Responding
This method is relevant to components with children (for example, IDOL Proxy, DIH, and DAH).
To check that child components are running normally, submit the ACI GetStatus action. The response contains the following information on the current state of the child components:
- IDOL Proxy: The component/componentname/status response is RUNNING if the child component is running normally.
  TIP: Because the Proxy component allocates port numbers to its children, it might be easier to monitor their state by using the Proxy, rather than by contacting the children directly.
- DIH: The engines/engine/status response is UP if the engine is running normally. The response reports the last-known state of the engine (that is, it might be cached, with a maximum age dependent on the configured ping time).
- DAH: The engines/engine/status response is ONLINE if the engine is running and queryable. DAH collects current information from its children when it receives a GetStatus request, so child statuses are current; however, if one or more children are down, this might incur a timeout penalty when attempting to connect.
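One way to automate the DIH and DAH checks is to extract every engines/engine/status value from the GetStatus response and compare it with the expected value (UP for DIH, ONLINE for DAH). The sketch below is illustrative only. The function names are hypothetical, the surrounding XML layout is assumed rather than taken from the product reference, and tags are matched by local name in case the real response uses XML namespaces.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def engine_statuses(xml_text: str) -> list:
    """Return the status text of each engines/engine element, in document order.

    An engine without a status child is reported as None.
    """
    statuses = []
    for engines in ET.fromstring(xml_text).iter():
        if _local(engines.tag) != "engines":
            continue
        for engine in engines:
            if _local(engine.tag) != "engine":
                continue
            status = None
            for child in engine:
                if _local(child.tag) == "status":
                    status = (child.text or "").strip()
            statuses.append(status)
    return statuses


def unhealthy_engines(xml_text: str, healthy: str) -> list:
    """Return (index, status) pairs for engines whose status is not `healthy`."""
    return [
        (i, s)
        for i, s in enumerate(engine_statuses(xml_text))
        if s is None or s.upper() != healthy.upper()
    ]
```

For a DIH response you would call unhealthy_engines(body, "UP"), and for a DAH response unhealthy_engines(body, "ONLINE"). An empty list means that every listed engine reported the expected status.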
Check the Status of the Content Component
The IDOL Content component’s GetStatus action returns details that can indicate the status of the component. For example:
- If the number of documents is much smaller than the number of committed_documents, you might need to compact the engine.
- If the number of total_terms increases suddenly, or is significantly higher than in comparable engines, this might indicate the presence of documents containing many “junk” terms. This might be because the documents have been incorrectly extracted, or because the OCR process has been unsuccessful. (You can use the TermGetAll action to investigate terms with unusually low or high document occurrences.)
- If the number of fieldcodes increases suddenly, or is significantly higher than in comparable engines, this might indicate the presence of documents containing many “junk” metadata fields.
- Unexpected values for mindatestring or maxdatestring might indicate that the engine’s DateFormatCSVs parameter is misconfigured, or that incoming documents lack date information, or contain dates and times in an unexpected format.
- You can check the timestamp and outcome of the last validation run on various indexes, as well as the timestamp for the last run-time error (that is, any error returned from that index during the course of a query or index command).
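The first two heuristics in this list can be expressed as simple threshold checks on values that you have already extracted from successive GetStatus responses. The sketch below is an assumption-laden illustration. The function name, the dictionary shape, and both thresholds (a live-document ratio below 0.5, and a term count growing by more than 1.5 times between samples) are hypothetical choices that you would tune for your own engines. Only the documents, committed_documents, and total_terms names come from the text above.

```python
def content_warnings(current, previous=None, min_live_ratio=0.5, max_growth=1.5):
    """Return warning strings for one sample of Content GetStatus counters.

    `current` and `previous` are dicts of integer counters keyed by name.
    Both thresholds are illustrative defaults, not product recommendations.
    """
    warnings = []

    # documents much smaller than committed_documents: compaction may help.
    committed = current.get("committed_documents", 0)
    documents = current.get("documents", 0)
    if committed > 0 and documents / committed < min_live_ratio:
        warnings.append("documents is much smaller than committed_documents")

    # Sudden jump in total_terms since the previous sample: possible junk terms.
    if previous is not None:
        before = previous.get("total_terms", 0)
        after = current.get("total_terms", 0)
        if before > 0 and after / before > max_growth:
            warnings.append("total_terms increased suddenly")

    return warnings
```

For example, content_warnings({"documents": 100, "committed_documents": 1000}) returns a single compaction warning, because the ratio 0.1 is below the assumed 0.5 threshold.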