Synchronize with a Repository

The primary purpose of a connector is to retrieve data from a repository so that it can be processed and indexed into an IDOL Content component. The connector retrieves new documents, updated documents, and a list of documents that have been deleted, so that the IDOL index is kept up to date with the data source. IDOL NiFi Ingest processors that synchronize with a repository have names that begin with "Get", such as GetFileSystem, GetWeb, or GetSharePoint.

Some IDOL NiFi Ingest connectors also support a feature called synchronize from identifiers. If a connector supports this feature, you can route documents that were not processed successfully back to the connector, so that the connector retrieves each of the failed documents again.

The behavior of the "Get" processor changes depending on whether you configure an incoming connection.

The following procedure demonstrates how to perform both types of synchronize task using a single processor.

To perform synchronize and synchronize from identifiers with a single processor

  1. Add and configure the "Get" processor for the relevant repository.

    TIP: The same processor performs both synchronize and synchronize from identifiers. For example, if you want to perform a synchronize task every 24 hours but process failed documents every 30 minutes, schedule the processor to run every 30 minutes (the highest common factor of your two chosen schedules).

  2. Route failed documents back to the "Get" processor.

    At this point the processor performs only the "synchronize from identifiers" task, and synchronizes only the documents that are added to its input queue.

  3. Add a GenerateFlowFile processor to the dataflow, and configure it to create a FlowFile that will start a synchronize task.

    1. Add a processor, by dragging the processor icon from the components toolbar to the canvas.

      The Add Processor dialog box opens.

    2. In the Source list, click all groups.

    3. Search for and select the GenerateFlowFile processor and click ADD.

      The processor is added to the canvas.

    4. Create a connection between the GenerateFlowFile processor and the "Get" processor. Hover the mouse over the GenerateFlowFile processor until you see the connection icon, and then drag the icon to the "Get" processor.

      The Create Connection dialog box opens.

    5. In the For Relationships area, select the success check box and click ADD.

      The connection appears on the canvas. In the following image the GenerateFlowFile processor has been named "StartSynchronize".

    6. Right-click the GenerateFlowFile processor and click Configure.

      The Configure Processor dialog box opens.

    7. Click the SCHEDULING tab.
    8. Configure how often to synchronize with the repository. For example, in the Run Schedule box, type 24 hours.
    9. Click the PROPERTIES tab.
    10. Click Add.

      The Add Property dialog box opens.

    11. Type idol.get.action and click OK.

      Another box opens so that you can specify the value.

    12. Type synchronize and click OK.
    13. Click APPLY to close the Configure Processor dialog box.
  4. You can now start both the connector processor and the GenerateFlowFile processor.
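
The settings chosen in this procedure can be summarized in a short Python sketch. It is illustrative only and does not contact NiFi. The function names `common_run_schedule` and `start_synchronize_component` are hypothetical. The dictionary layout (`component`, `config`, `schedulingPeriod`, `properties`) is assumed to match the shape that the NiFi REST API uses for processor configuration, so verify it against your NiFi version before you rely on it.

```python
import json
from math import gcd


def common_run_schedule(sync_minutes: int, retry_minutes: int) -> int:
    """Return the highest common factor of two schedules, in minutes.

    This is the run schedule suggested by the TIP in step 1 when one
    processor must serve both the synchronize schedule and the
    retry (synchronize from identifiers) schedule.
    """
    return gcd(sync_minutes, retry_minutes)


def start_synchronize_component(name: str = "StartSynchronize") -> dict:
    """Describe the GenerateFlowFile processor configured in step 3.

    Assumption: the layout mirrors a NiFi REST API processor entity.
    The property name and value come from the procedure above.
    """
    return {
        "component": {
            "name": name,
            "type": "org.apache.nifi.processors.standard.GenerateFlowFile",
            "config": {
                "schedulingStrategy": "TIMER_DRIVEN",
                # Step 8: how often to synchronize with the repository.
                "schedulingPeriod": "24 hours",
                # Steps 10 to 12: the property added on the PROPERTIES tab.
                "properties": {"idol.get.action": "synchronize"},
            },
        }
    }


if __name__ == "__main__":
    # 24 hours and 30 minutes, both expressed in minutes.
    print(common_run_schedule(24 * 60, 30))
    print(json.dumps(start_synchronize_component(), indent=2))
```

With a 24-hour synchronize schedule and a 30-minute retry schedule, `common_run_schedule` returns 30, which matches the example in the TIP. The dictionary records the same two settings that you enter in the Configure Processor dialog box: the run schedule and the `idol.get.action` property.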
