Create the Data Flow

This section describes how to create a dataflow to migrate information from one repository to another. In the following example, a GetWeb processor downloads web pages and a PutFileSystem processor inserts them into a folder in the file system.

To create a dataflow to migrate data between repositories

  1. Add and configure a connector to retrieve information from the source repository. The connector must generate the AUTN_MIGRATION_URI document field (see Supported Connectors).

  2. Add a processor to copy the value of the AUTN_MIGRATION_URI metadata field to a FlowFile attribute named idol.put.migrationuri.

    1. Add a processor by dragging the processor icon from the components toolbar to the canvas.

      The Add Processor dialog box opens.

    2. In the Source list, click idol.nifi.

      The list of processors is filtered to show only IDOL NiFi Ingest processors.

    3. Select the UpdateAttributeFromMetadata processor and click ADD.

      The processor is added to the canvas.

    4. Right-click the processor and click Configure.

      The Configure Processor dialog box opens.

    5. Click the Properties tab.
    6. Add a dynamic property with the following name and value:

      Property Name: idol.put.migrationuri
      The FlowFile attribute to add or update.

      Property Value: //AUTN_MIGRATION_URI
      An XPath expression that selects the document metadata field (or document metadata field attribute) used to set the value of the FlowFile attribute.

  3. Add and configure a connector to insert the information into the destination repository. For information about the connectors that you can use, see Supported Connectors. This example uses a File System Connector, so add a PutFileSystem processor and set the following dynamic property:

    Property Name: migration:rootDirectory

    Example Property Value:
      Windows: D:\path\to\migrated_files\
      Linux: /path/to/migrated_files/

  4. Connect the processors:

    • Connect the success relationship of the Get* processor to the UpdateAttributeFromMetadata processor.
    • Connect the success relationship of the UpdateAttributeFromMetadata processor to the Put* processor.
  5. Start all of the processors in the dataflow: go to the Operate palette and click Start.

    The web pages are downloaded and inserted into the file system.
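
To make step 2 more concrete, the following Python sketch imitates what the dynamic property does conceptually: it evaluates an XPath expression against a document's metadata and copies the result into a FlowFile attribute. It is an illustration only, not the UpdateAttributeFromMetadata implementation. The XML layout of the sample metadata, the placeholder field values, and the function name update_attributes_from_metadata are all assumptions made for this example; the metadata that a real connector produces may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document metadata. The element layout and the values are
# placeholders for illustration, not real connector output.
SAMPLE_METADATA = """
<DOCUMENT>
  <DREREFERENCE>https://www.example.com/index.html</DREREFERENCE>
  <AUTN_MIGRATION_URI>placeholder-migration-uri</AUTN_MIGRATION_URI>
</DOCUMENT>
"""


def update_attributes_from_metadata(metadata_xml, dynamic_properties, attributes=None):
    """Return a copy of `attributes` with one entry set per dynamic property.

    `dynamic_properties` maps an attribute name to an XPath expression,
    mirroring the Property Name / Property Value pair in step 2.
    """
    root = ET.fromstring(metadata_xml)
    updated = dict(attributes or {})
    for name, xpath in dynamic_properties.items():
        # ElementTree accepts only relative paths, so "//FIELD" is
        # rewritten as ".//FIELD" (any descendant of the root element).
        relative = "." + xpath if xpath.startswith("//") else xpath
        match = root.find(relative)
        if match is not None and match.text is not None:
            updated[name] = match.text.strip()
    return updated


attributes = update_attributes_from_metadata(
    SAMPLE_METADATA,
    {"idol.put.migrationuri": "//AUTN_MIGRATION_URI"},
    {"filename": "index.html"},
)
print(attributes["idol.put.migrationuri"])
```

In this sketch, a document that lacks the AUTN_MIGRATION_URI field simply keeps its existing attributes. How the real processor treats such a document is not covered here, so check the behavior in your own dataflow before relying on it.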