After making changes to your NiFi Ingest dataflow, you might want to re-ingest the documents in your IDOL index. For example, if you add a processing step to perform a new type of media analysis, you might want to re-ingest image files.
One approach is to clear the index and the state information stored by your connectors (see Connector Datastores) so that all of the items in your data repositories are ingested again. However, re-ingesting all of your documents can be time-consuming. Instead, you can send a query to your IDOL Content component and re-ingest only the documents that match the query.
NOTE: To do this, the connectors you are using must support the synchronize from identifiers feature.
To re-ingest documents that match a query
Add a QueryIDOL processor to the data flow. This processor queries your IDOL Content component and returns documents that match the query.
Drag the processor icon from the components toolbar to the canvas.
The Add Processor dialog box opens.
In the Source list, click idol.nifi.
Select the QueryIDOL processor and click ADD.
The processor is added to the canvas.
Right-click the QueryIDOL processor and click Configure.
The Configure Processor dialog box opens.
Click the Properties tab and set the following properties:
IDOL Host | The host name or IP address of your IDOL Content component. |
IDOL ACI Port | The ACI port of your IDOL Content component. |
Text, Field Text, Database Match | These properties set the values of the corresponding parameters of the query. For example, Field Text might be MATCH{230}:DOCUMENT_KEYVIEW_TYPE_NUMBER. |
Add a RouteOnAttribute processor to the data flow. This is necessary if your query returns documents that were originally retrieved by different connectors, because each document must be routed back to the correct connector. The following steps demonstrate how to do this for a File System Connector. To route documents to multiple connectors, repeat steps 3F to 3H (adding a dynamic property, naming it, and setting its value) to create additional output relationships.
Drag the processor icon from the components toolbar to the canvas.
The Add Processor dialog box opens.
In the Source list, click org.apache.nifi.
Select the RouteOnAttribute processor and click ADD.
The processor is added to the canvas.
Right-click the RouteOnAttribute processor and click Configure.
The Configure Processor dialog box opens.
Click the Properties tab.
Click the add (+) icon to add a new dynamic property. This creates a new output relationship.
The Add Property dialog box opens.
Type a name for the property, for example FileSystem.
Set the value of the property. For example, to select documents that were originally retrieved by a NiFi processor named "MyFileSystemConnector":
${idol.doc.source:getDelimitedField(2,':'):equals('MyFileSystemConnector')}
This is NiFi expression language that reads the second part of the idol.doc.source FlowFile attribute and checks to see whether the value equals MyFileSystemConnector. For more information about this attribute, see Introduction to FlowFiles and Documents.
Start the QueryIDOL and RouteOnAttribute processors.
The QueryIDOL processor sends the query to the IDOL Content component. You should be able to see the query in the Content component request log, which is available through action=GRL. Any result documents that were originally retrieved by the File System Connector are routed to the connector for re-ingestion.
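The procedure above can be sketched in Python to show what the query and the routing test amount to. This is a minimal illustration under stated assumptions, not the processors' actual implementation. The host, the port, and the sample idol.doc.source values are hypothetical, and the exact request that QueryIDOL sends is not documented here. The helper delimited_field is a simplified stand-in for the NiFi getDelimitedField function: it uses a 1-based index but ignores quoting and escaping.

```python
from urllib.parse import urlencode


def build_query_url(host, port, text="*", field_text=None, database_match=None):
    """Build an ACI Query URL from the Text, Field Text, and Database Match values.

    Assumption: the request has the usual ACI shape (action=Query plus
    parameters); the real QueryIDOL processor may send additional parameters.
    """
    params = {"Text": text}
    if field_text:
        params["FieldText"] = field_text
    if database_match:
        params["DatabaseMatch"] = database_match
    # urlencode percent-encodes characters such as '{', '}', and ':'.
    return f"http://{host}:{port}/action=Query&{urlencode(params)}"


def delimited_field(value, index, delimiter):
    """Return the index-th (1-based) field of value, or None if out of range.

    Simplified stand-in for getDelimitedField: no quote or escape handling.
    """
    parts = value.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else None


def routes_to(source_attribute, processor_name):
    """Mirror the routing test: second ':'-delimited field equals the name."""
    return delimited_field(source_attribute, 2, ":") == processor_name


if __name__ == "__main__":
    # Hypothetical host and port for the Content component.
    print(build_query_url("localhost", 9100,
                          field_text="MATCH{230}:DOCUMENT_KEYVIEW_TYPE_NUMBER"))
    # Hypothetical attribute value; only the second field matters here.
    print(routes_to("example:MyFileSystemConnector", "MyFileSystemConnector"))
```

Comparing the URL printed by build_query_url with the entry in the Content request log is one way to check that the Text, Field Text, and Database Match properties were set as intended.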