Full and Incremental Synchronize

The synchronize action demonstrated in the sample code behaves the same way every time it runs: the connector ingests every document in the repository. This is called a full synchronize, and a connector typically does this the first time the synchronize action runs.

When the synchronize action runs again, it is more efficient to send commands only for what has changed: ingest-add commands for new documents, ingest-replace commands for modified documents, ingest-update commands for documents where only the metadata has changed, and ingest-delete commands for deleted documents. This is called an incremental synchronize.
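
The decision described above can be sketched as a comparison between the state recorded during the previous synchronize and the state found in the repository now. The following is a minimal, self-contained illustration and not part of ConnectorLib C++: the names DocState, IngestCommand, PlannedCommand, and planIncrementalSync are hypothetical, and the sketch assumes the repository can report separate modification times for a document's content and its metadata. It only decides which command each document needs; sending the commands is left to the connector.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical per-document state (an assumption for this sketch).
struct DocState {
    long long contentModified;   // time the document content last changed
    long long metadataModified;  // time the metadata last changed
};

// The four kinds of ingest command named in the text.
enum class IngestCommand { Add, Replace, Update, Delete };

struct PlannedCommand {
    std::string reference;   // document reference
    IngestCommand command;   // command the connector should send
};

// Maps a document reference to its recorded state.
using StateMap = std::map<std::string, DocState>;

// Compares the previous state with the current repository listing and
// returns the commands an incremental synchronize would need to send.
std::vector<PlannedCommand> planIncrementalSync(const StateMap& previous,
                                                const StateMap& current)
{
    std::vector<PlannedCommand> plan;

    for (const auto& entry : current) {
        const auto old = previous.find(entry.first);
        if (old == previous.end()) {
            // Not seen before: new document.
            plan.push_back({entry.first, IngestCommand::Add});
        } else if (entry.second.contentModified != old->second.contentModified) {
            // Content changed: replace the whole document.
            plan.push_back({entry.first, IngestCommand::Replace});
        } else if (entry.second.metadataModified != old->second.metadataModified) {
            // Only metadata changed: update metadata.
            plan.push_back({entry.first, IngestCommand::Update});
        }
        // Otherwise the document is unchanged and needs no command.
    }

    for (const auto& entry : previous) {
        if (current.find(entry.first) == current.end()) {
            // Seen last time but no longer in the repository.
            plan.push_back({entry.first, IngestCommand::Delete});
        }
    }
    return plan;
}
```

With an empty previous state, every document is planned as an add, which matches the behavior of a full synchronize.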

To implement an incremental synchronize, it is often useful to store state information, for example a list of the files retrieved and the time when each was last modified. On the next synchronize, this information can be compared to the files currently in the repository. For this purpose, every FetchTask (specifically SynchronizeTask) provides a datastore file name when you call task.datastoreFilename(). The file name is consistent between actions, and can therefore be used to store persistent state data. You can store the information in a format of your choice.
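
Because the format is left to you, one simple option is a text file with one line per document. The sketch below is an illustration under stated assumptions rather than a ConnectorLib C++ API: saveState, loadState, and SyncState are hypothetical names, the file name is passed in as an ordinary string (in a connector it would be the value returned by task.datastoreFilename()), and document references are assumed not to contain newline characters.

```cpp
#include <cstddef>
#include <exception>
#include <fstream>
#include <map>
#include <string>

// Maps a document reference to the time it was last modified.
using SyncState = std::map<std::string, long long>;

// Writes one "<timestamp><TAB><reference>" line per document.
// The timestamp comes first so that a reference may contain spaces or tabs.
bool saveState(const std::string& filename, const SyncState& state)
{
    std::ofstream out(filename, std::ios::trunc);
    if (!out) {
        return false;
    }
    for (const auto& entry : state) {
        out << entry.second << '\t' << entry.first << '\n';
    }
    out.flush();
    return out.good();
}

// Reads the state back. A missing or unreadable file yields an empty
// state, which a connector could treat as "perform a full synchronize".
SyncState loadState(const std::string& filename)
{
    SyncState state;
    std::ifstream in(filename);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t tab = line.find('\t');
        if (tab == std::string::npos) {
            continue;  // skip malformed lines
        }
        try {
            state[line.substr(tab + 1)] = std::stoll(line.substr(0, tab));
        } catch (const std::exception&) {
            continue;  // skip lines whose timestamp cannot be parsed
        }
    }
    return state;
}
```

A synchronize action might call loadState at the start, compare the result with the repository, and call saveState once the commands have been sent, so that the next run starts from the updated state.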

OpenText recommends that you use the file name returned by task.datastoreFilename() to store all state information for the task. The advantages of using this file name include:

  • ConnectorLib C++ chooses a unique file name that does not conflict with the names used for other tasks. The file name also respects configuration settings regarding the datastore location.
  • The same file name is provided to other actions called for the same task (collect, view, insert, update, and so on).
  • The file is backed up and restored automatically if the backup and restore actions are used.