Create the Data Flow
This section describes how to create a dataflow to migrate information from one repository to another. In the following example, a GetWeb processor downloads web pages and a PutFileSystem processor inserts them into a folder in the file system.
To create a dataflow to migrate data between repositories
- Add and configure a connector to retrieve information from the source repository. The connector must generate the AUTN_MIGRATION_URI document field (see Supported Connectors).
- Add a processor to copy the value of the AUTN_MIGRATION_URI metadata field to a FlowFile attribute named idol.put.migrationuri.
  - Add a processor by dragging the processor icon from the components toolbar to the canvas.
    The Add Processor dialog box opens.
  - In the Source list, click idol.nifi.
    The list of processors is filtered to show only IDOL NiFi Ingest processors.
  - Select the UpdateAttributeFromMetadata processor and click ADD.
    The processor is added to the canvas.
  - Right-click the processor and click Configure.
    The Configure Processor dialog box opens.
  - Click the Properties tab.
  - Click the add property icon and add a dynamic property:
    Property Name (the FlowFile attribute to add or update):
      idol.put.migrationuri
    Property Value (an XPath expression that selects the document metadata field, or metadata field attribute, whose value sets the FlowFile attribute):
      //AUTN_MIGRATION_URI
- Add and configure a connector to insert the information into the destination repository. For information about the connectors that you can use, see Supported Connectors. This example uses a File System Connector, so add a PutFileSystem processor and set the following dynamic property:
  Property Name:
    migration:rootDirectory
  Example Property Value:
    Windows: D:\path\to\migrated_files\
    Linux: /path/to/migrated_files/
- Connect the processors:
  - Connect the success relationship of the Get* processor to the UpdateAttributeFromMetadata processor.
  - Connect the success relationship of the UpdateAttributeFromMetadata processor to the Put* processor.
- Start all of the processors in the dataflow (go to the Operate palette and click Start).
The web pages are downloaded and inserted into the file system.
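
To illustrate what the dynamic property on the UpdateAttributeFromMetadata processor does, the following Python sketch applies the XPath expression //AUTN_MIGRATION_URI to a piece of document metadata and stores the result under the attribute name idol.put.migrationuri. It is only an illustration of the mapping, not the processor's implementation. The XML layout, the sample values, and the function name update_attributes_from_metadata are assumptions made for this example; the metadata that a real connector produces may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document metadata. The element layout and the values are
# placeholders; only the AUTN_MIGRATION_URI field name comes from this section.
SAMPLE_METADATA = """
<DOCUMENT>
  <DREREFERENCE>http://www.example.com/index.html</DREREFERENCE>
  <AUTN_MIGRATION_URI>http://www.example.com/index.html</AUTN_MIGRATION_URI>
</DOCUMENT>
""".strip()


def update_attributes_from_metadata(metadata_xml, attributes, mappings):
    """Return a copy of `attributes` updated from XPath matches in the metadata.

    `mappings` maps an attribute name to an XPath expression, mirroring the
    Property Name / Property Value pair of the dynamic property.
    """
    root = ET.fromstring(metadata_xml)
    updated = dict(attributes)
    for name, xpath in mappings.items():
        # ElementTree accepts only relative paths on an element, so an
        # expression of the form "//FIELD" is rewritten as ".//FIELD".
        relative = "." + xpath if xpath.startswith("//") else xpath
        match = root.find(relative)
        # Attributes are left unchanged when the field is missing or empty.
        if match is not None and match.text is not None:
            updated[name] = match.text.strip()
    return updated


attributes = update_attributes_from_metadata(
    SAMPLE_METADATA,
    {"filename": "index.html"},
    {"idol.put.migrationuri": "//AUTN_MIGRATION_URI"},
)
print(attributes["idol.put.migrationuri"])
```

In this sketch, a document without an AUTN_MIGRATION_URI field leaves the attributes untouched, which is why the source connector must generate that field before the UpdateAttributeFromMetadata step.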