Introduction

This guide has demonstrated how to construct a NiFi dataflow that retrieves information from a repository and indexes it into an IDOL index (see Build a Basic Ingestion Pipeline and Improve the Ingestion Pipeline). Keeping IDOL up to date with the information in your data repositories (the synchronize action) is just one of the tasks that IDOL Connectors can perform.

You can also use your IDOL Connectors to retrieve documents from a repository (the collect action) or delete documents in a repository (the delete action). A full list of supported actions is included below. IDOL NiFi Ingest Connectors provide a separate NiFi processor for each action, but most repositories do not support every action. For example, you can retrieve pages from the Web but not delete them, so installing the IDOL Web Connector provides a FetchWeb processor but not a DeleteWeb processor.

The synchronize action is usually scheduled to run at regular intervals, but the other actions usually run on demand in response to a specific request. For example, when a user of a front-end application performs a search, they are presented with query results. After selecting one of the results, the user might want to view the original file, so the application might send a collect or view action to the relevant connector.

| Task | Connector Action | NiFi Processor Name Prefix | NiFi Processor Name Example |
|------|------------------|----------------------------|-----------------------------|
| Keep an index, such as IDOL Content, up-to-date with items in a repository. | Synchronize | Get* | GetFileSystem |
| List the items that exist in a repository. | Identifiers | List* | ListFileSystem |
| Retrieve one or more items from a repository. | Collect | Fetch* | FetchWeb |
| Retrieve an item from a repository in order to view the original file. | View | View* | ViewFileSystem |
| Delete one or more items from a repository. | Delete | Delete* | DeleteFileSystem |
| Place a legal hold on one or more items in a repository, so that they cannot be deleted, or release an existing hold. | Hold/ReleaseHold | Hold* | HoldSharePointOData |
| Insert items into a repository. | Insert | Put* | PutFileSystem |
| Update the metadata of one or more items in a repository. | Update | Patch* | PatchFileSystem |
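
The naming convention in the table above can be sketched as a small lookup. This is an illustrative Python sketch, not part of IDOL or NiFi: the helper names `processor_name` and `is_available` are hypothetical, the rule "prefix plus repository name" is inferred from the examples in the table, and mapping both Hold and ReleaseHold to the `Hold` prefix is an assumption based on the two actions sharing one table row.

```python
# Prefixes copied from the table above. The helpers below are hypothetical
# and only illustrate the naming pattern; they do not call IDOL or NiFi.
ACTION_PREFIX = {
    "Synchronize": "Get",
    "Identifiers": "List",
    "Collect": "Fetch",
    "View": "View",
    "Delete": "Delete",
    "Hold": "Hold",
    "ReleaseHold": "Hold",  # assumption: the table lists both under Hold*
    "Insert": "Put",
    "Update": "Patch",
}


def processor_name(action: str, repository: str) -> str:
    """Compose a processor name such as 'FetchWeb' from an action and a repository name."""
    try:
        prefix = ACTION_PREFIX[action]
    except KeyError:
        raise ValueError(f"Unknown connector action: {action!r}") from None
    return f"{prefix}{repository}"


def is_available(action: str, repository: str, installed: set) -> bool:
    """Check whether the processor for this action is among the installed processors."""
    return processor_name(action, repository) in installed


# Illustrative set only: the text states that the Web Connector provides
# FetchWeb and does not provide DeleteWeb. This is not a complete list.
installed_processors = {"FetchWeb"}

print(processor_name("Synchronize", "FileSystem"))
print(is_available("Collect", "Web", installed_processors))
print(is_available("Delete", "Web", installed_processors))
```

A check like `is_available` reflects the point made earlier that most repositories do not support every action, so a dataflow that handles requests on demand should not assume that a processor exists for every action and connector combination.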

This section describes how to create a dataflow in Apache NiFi that handles requests and performs connector actions on demand, including actions other than synchronize.