Maximize the Performance of a Connector
The maximum performance that can be obtained from a connector depends on the performance of the repository. For example, a File System Connector has to aggregate a list of files and is therefore limited by the speed of the file system.
CAUTION: When configuring a connector, be careful to monitor the load placed on the repository. Connectors usually run as a user of the repository but the requests they make are not typical. Make sure that the repository remains usable and does not allocate all of its resources to the connector.
Optimize the Number of Processing Threads
To maximize the performance of a connector, adjust the SynchronizeThreads parameter. This parameter specifies the number of threads that the connector can use for each synchronize task. You should set SynchronizeThreads relative to the performance that the repository can provide.
Some connectors do not support multi-threading for synchronize tasks. For these connectors, you can set the TaskThreads parameter, which allows the connector to run more than one task at a time. When you increase the value of the TaskThreads parameter, the connector can make more than one connection to a repository.
NOTE: Some repositories limit the total number of simultaneous connections that are permitted from a client.
To increase the speed of a single task using the TaskThreads parameter, divide the task into multiple fetch tasks. You must set configuration parameters for each fetch task so that each one retrieves a subset of the required data.
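The idea of dividing one task into several fetch tasks can be sketched in Python. This is only an illustration of the partitioning step, not connector code: the function name partition and the folder names are hypothetical, and a real deployment would express each subset through the configuration parameters of its own fetch task.

```python
def partition(items, task_count):
    """Split items into task_count subsets, assigning them round-robin.

    Each subset would be retrieved by its own (hypothetical) fetch task,
    so that every item is covered by exactly one task.
    """
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    return [items[i::task_count] for i in range(task_count)]


# Hypothetical top-level folders that a single task would otherwise crawl alone.
folders = ["finance", "hr", "legal", "sales", "support"]
subsets = partition(folders, 2)
```

With two fetch tasks, the first subset contains finance, legal, and support, and the second contains hr and sales. Whether the subsets are balanced in practice depends on how much data each folder holds, not only on how many folders there are.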
Consider the Configuration Parameters that you use to Retrieve Data
You can improve the performance of many connectors by carefully choosing the configuration parameters that you use to retrieve data. Performance can be reduced significantly if the connector must crawl through a large amount of data that is then ignored.
If you want a File System Connector to ignore large folder structures, use the configuration parameter PathNoCrawlRegex. With this parameter, you can ignore a folder and everything it contains. If you use PathCantHaveRegex, the connector must check the path of every file individually.
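The difference between pruning a folder and filtering every file can be illustrated with a small Python model. This is a sketch under stated assumptions, not the connector's implementation: the folder tree, the crawl function, and its no_crawl and cant_have arguments are hypothetical stand-ins for the two parameters, and the sketch assumes that the expressions are matched with a search rather than a full match.

```python
import re

# Hypothetical in-memory folder tree: folder path -> (subfolders, files).
TREE = {
    "/data": (["/data/docs", "/data/archive"], []),
    "/data/docs": ([], ["/data/docs/a.txt", "/data/docs/b.txt"]),
    "/data/archive": (["/data/archive/2001"], ["/data/archive/old1.txt"]),
    "/data/archive/2001": (
        [],
        ["/data/archive/2001/old2.txt", "/data/archive/2001/old3.txt"],
    ),
}


def crawl(root, no_crawl=None, cant_have=None):
    """Return (kept_files, paths_examined) for a crawl starting at root.

    no_crawl models folder pruning: a matching folder is never opened.
    cant_have models per-file filtering: every file path is still visited
    and then discarded if it matches.
    """
    kept, examined = [], 0
    stack = [root]
    while stack:
        folder = stack.pop()
        examined += 1
        if no_crawl and re.search(no_crawl, folder):
            continue  # prune: skip the folder and everything below it
        subfolders, files = TREE[folder]
        stack.extend(subfolders)
        for path in files:
            examined += 1
            if cant_have and re.search(cant_have, path):
                continue  # visited, then ignored
            kept.append(path)
    return sorted(kept), examined


pruned_files, pruned_cost = crawl("/data", no_crawl=r"/archive$")
filtered_files, filtered_cost = crawl("/data", cant_have=r"/archive/")
```

Both calls keep the same two files under /data/docs, but the pruning crawl examines fewer paths because it never opens /data/archive. The gap grows with the size of the ignored folder structure.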
If you want the SharePoint Connector to ignore a certain list, use the configuration parameter ListCantHaveRegex. You could use the CantHaveRegex parameter, but this would result in additional processing. The CantHaveRegex configuration parameter operates at the list item or document level so the connector would have to expand and search through the entire list before discarding it.
Send Documents to CFS in Large Batches
Connectors send documents to CFS in batches. A connector sends a separate request for each batch, so larger batch sizes are more efficient. The default batch size of 100 documents should provide good performance, but you can modify the batch size using the configuration parameter IngestBatchSize.
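Because each batch is a separate request, the number of requests follows directly from the batch size. The Python sketch below shows the arithmetic; the function names and the document counts are illustrative assumptions, and the code does not call CFS.

```python
import math


def requests_needed(document_count, batch_size):
    """Number of batch requests needed to send document_count documents."""
    return math.ceil(document_count / batch_size)


def batches(documents, batch_size):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Hypothetical workload of 10,000 documents.
default_requests = requests_needed(10_000, 100)  # default batch size
larger_requests = requests_needed(10_000, 500)   # larger batch size
```

For 10,000 documents, a batch size of 100 results in 100 requests, and a batch size of 500 results in 20. The final batch can be smaller than the configured size when the document count is not an exact multiple of it.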
Configure Logging
The amount of logging that occurs can have a significant impact on performance. In normal operation, set the LogLevel parameter to Normal. Setting LogLevel to Full will decrease performance.