Troubleshoot the Connector
This section describes how to troubleshoot common problems that might occur when you set up the HTTP Connector.
If the connector cannot connect to the Web site that you want to index, check whether the connector machine is behind a proxy server. If it is, use the configuration parameters ProxyHost and ProxyPort (or ProxyFromLua) to specify the host name or IP address, and port, of the proxy server.
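To check whether the proxy is the obstacle, it can help to request the site through the proxy from the connector machine, independently of the connector. The following is a minimal Python sketch and is not part of the connector. The function names are illustrative, and the proxy host, port, and URL mentioned afterwards are hypothetical placeholders for your own values.

```python
import urllib.request


def build_proxy_url(host: str, port: int) -> str:
    """Combine a proxy host name (or IP address) and port into a proxy URL."""
    return f"http://{host}:{port}"


def fetch_status_via_proxy(url: str, host: str, port: int, timeout: float = 10.0) -> int:
    """Request `url` through the given HTTP proxy and return the HTTP status code.

    Raises urllib.error.URLError if the proxy or the site cannot be reached.
    """
    proxy_url = build_proxy_url(host, port)
    # Route both plain and TLS requests through the same proxy.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.status
```

For example, calling `fetch_status_via_proxy("http://www.example.com/", "proxy.example.com", 8080)` with your own values should return 200 if the proxy accepts the request. If the call succeeds with the same host and port that you set in ProxyHost and ProxyPort, the proxy settings are probably not the cause of the problem.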
If pages are not indexed, set the configuration parameter LogVerbose=true. You can then view the synchronize log file to see the links that are extracted from pages. Check your configuration to ensure that it does not exclude the pages that you want to index. The connector cannot parse JavaScript, so it does not find links that are contained in JavaScript, and the pages that those links point to are not indexed.
Some Web sites require visitors, and therefore the connector, to log on before they can retrieve content. You must set the LoginMethod configuration parameter and provide credentials in the connector’s configuration file.
To determine the correct method to use to log in to a Web site, you can:
- View the page source. If the Web site presents an HTML form, check whether the form uses the POST or GET method to submit the form data to the Web server.
- Use a packet analyzer to monitor the data sent from the Web browser to the Web server. Compare the data sent by the Web browser, when you log in manually, to the data that is sent by the connector.
If you configure the connector to log on to a Web site by submitting a form, ensure that the connector submits all of the required fields.
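As an aid to reading the page source by hand, the following Python sketch lists the submission method, action, and named fields of each form in a saved copy of the login page. It is an illustration only and is not part of the connector. The class and function names are hypothetical, and it assumes that the login form is present in the static HTML rather than generated by script.

```python
from html.parser import HTMLParser


class FormInspector(HTMLParser):
    """Collect the method, action, and named fields of each <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dictionary per form found
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # HTML forms default to GET when no method attribute is present.
            self._current = {
                "method": (attrs.get("method") or "get").upper(),
                "action": attrs.get("action") or "",
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            # Only named controls are sent to the server when the form is submitted.
            name = attrs.get("name")
            if name:
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def inspect_forms(html: str) -> list:
    """Return a list describing every form found in the HTML text."""
    parser = FormInspector()
    parser.feed(html)
    parser.close()
    return parser.forms
```

The list of fields includes hidden inputs, which are easy to overlook in a browser but are often among the values that the server expects to receive.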
You might see this error if the system has allocated all available TCP ports.
Operating systems typically allocate a limited number of ports to applications that need to make outbound connections. Also, when a connection is closed, the operating system waits before a port is released and can be reused.
If you encounter this issue on a Windows system, you can set the Windows registry parameter MaxUserPort so that more ports are available for use. You can also shorten the time that the operating system waits before releasing a port by setting the registry parameter TcpTimedWaitDelay. Both parameters are set in:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
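Before you change these values, it may be worth confirming that ports really are being held after connections close. The following Python sketch counts TCP connections by state from the text output of `netstat -an`; a large number of connections in the TIME_WAIT state suggests that ports are waiting to be released. This is an illustration only, the function names are hypothetical, and it assumes that the connection state is the last column of each TCP row, which is the usual layout of `netstat -an` output.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections by state in the text output of `netstat -an`."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # TCP rows begin with "TCP" (Windows) or "tcp"/"tcp6" (Linux).
        # Assumption: the connection state is the last column of the row.
        if len(parts) >= 4 and parts[0].lower().startswith("tcp"):
            states[parts[-1]] += 1
    return states


def current_tcp_states() -> Counter:
    """Run `netstat -an` on this machine and count its TCP connection states."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)
```

Calling `current_tcp_states()["TIME_WAIT"]` on the connector machine while the connector is running gives a rough count of ports that have been closed but not yet released.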