Maximize the Performance of a Connector

The maximum performance that can be obtained from a connector depends on the performance of the repository. For example, a File System Connector has to aggregate a list of files and is therefore limited by the speed of the file system.

CAUTION: When configuring a connector, monitor the load that it places on the repository. Connectors usually run as a user of the repository, but the requests they make are not typical of a user. Make sure that the repository remains usable and does not allocate all of its resources to the connector.

Optimize the Number of Processing Threads

To maximize the performance of a connector, adjust the SynchronizeThreads parameter. This parameter specifies the number of threads that the connector can use for each synchronize task. Set SynchronizeThreads according to the load that the repository can sustain: increase the value gradually while you monitor the repository, and stop when throughput no longer improves or the repository becomes slow for other users.
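As a minimal sketch, the setting might look like the following fragment of the connector configuration file. The section name and the value 4 are assumptions for illustration only. Check the reference for your connector to confirm where SynchronizeThreads belongs and what range it accepts.

```ini
// Assumption: task-level parameters are read from [FetchTasks].
[FetchTasks]
// Illustrative value; tune it against the load the repository can sustain.
SynchronizeThreads=4
```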

Some connectors do not support multi-threading for synchronize tasks. For these connectors, you can set the TaskThreads parameter, which allows the connector to run more than one task at a time. When you increase the value of the TaskThreads parameter, the connector can make more than one connection to a repository.

NOTE: Some repositories limit the total number of simultaneous connections that are permitted from a client.

To increase the speed of a single task using the TaskThreads parameter, divide the task into multiple fetch tasks. You must set configuration parameters for each fetch task so that each one retrieves a subset of the required data.
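The following sketch shows one way that a single crawl could be divided into two fetch tasks that run at the same time. The section names, the task names, the Number and numbered task entries, and the DirectoryPathCSVs parameter are assumptions based on a File System Connector and are not described in this section. Substitute the parameters that your connector uses to select data.

```ini
[Connector]
// Allow the connector to run two tasks at the same time.
TaskThreads=2

// Assumption: fetch tasks are listed in a [FetchTasks] section.
[FetchTasks]
Number=2
0=FinanceShare
1=EngineeringShare

// Hypothetical task: retrieves one subset of the data.
[FinanceShare]
DirectoryPathCSVs=\\fileserver\finance

// Hypothetical task: retrieves the remaining subset.
[EngineeringShare]
DirectoryPathCSVs=\\fileserver\engineering
```

Because each task opens its own connection, keep the number of concurrent tasks within any connection limit that the repository enforces.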

Consider the Configuration Parameters That You Use to Retrieve Data

You can improve the performance of many connectors by carefully choosing the configuration parameters that you use to retrieve data. Performance can be reduced significantly if the connector must crawl through a large amount of data that is then ignored.

If you want a File System Connector to ignore large folder structures, use the configuration parameter PathNoCrawlRegex. With this parameter, you can ignore a folder and everything it contains, so the connector never descends into it. If you use PathCantHaveRegex instead, the connector must still crawl the folder and check the path of every file individually.
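For illustration, a task that skips every folder named archive might be configured as in the sketch below. The task section name and the pattern are hypothetical. How the connector matches the expression (against the whole path or part of it, and with or without case sensitivity) is an assumption that you should verify in the connector reference.

```ini
// Hypothetical task section.
[MyTask]
// Skip any folder named "archive" and everything below it.
PathNoCrawlRegex=.*\\archive
// Slower alternative: the folder is still crawled and each file path is tested.
// PathCantHaveRegex=.*\\archive\\.*
```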

If you want the SharePoint Connector to ignore a certain list, use the configuration parameter ListCantHaveRegex. You could use the CantHaveRegex parameter instead, but this would result in additional processing. The CantHaveRegex configuration parameter operates at the list item or document level, so the connector would have to expand and search through the entire list before discarding it.
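A sketch of the faster option is shown below. The task section name and the list name are hypothetical, and the exact value that the expression is matched against (for example, the list title or its URL) is an assumption to confirm in the SharePoint Connector reference.

```ini
// Hypothetical task section.
[MySharePointTask]
// Discard the whole list without expanding its items.
ListCantHaveRegex=.*Archived Announcements.*
// Slower alternative: every item in the list is retrieved and then tested.
// CantHaveRegex=.*Archived Announcements.*
```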

Send Documents to CFS in Large Batches

Connectors send documents to CFS in batches. A connector sends a separate request for each batch, so larger batch sizes are more efficient. The default batch size of 100 documents should provide good performance, but you can modify the batch size using the configuration parameter IngestBatchSize.
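If you decide to change the batch size, the setting might look like the following sketch. The section name and the value 250 are assumptions for illustration. The default of 100 is usually a reasonable starting point, so change it only if measurements show a benefit.

```ini
// Assumption: ingestion settings are read from [Ingestion].
[Ingestion]
// Illustrative value; the default is 100 documents per batch.
IngestBatchSize=250
```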

Configure Logging

The amount of logging that occurs can have a significant impact on performance. In normal operation, set the LogLevel parameter to Normal. Setting LogLevel to Full will decrease performance.
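For example, a configuration for normal operation might contain the following fragment. The section name is an assumption. Some connectors also allow LogLevel to be set for individual log streams.

```ini
// Assumption: logging settings are read from [Logging].
[Logging]
// Use Full only while diagnosing a problem, then return to Normal.
LogLevel=Normal
```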