Ingest JSON
Many systems export data in JSON format. This section describes how to ingest JSON into IDOL using NiFi Ingest.
The steps in this section assume that:
- You have one or more files, each of which contains JSON that should be parsed into one or more IDOL documents.
- The data is not necessarily in IDOL document format.
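As a rough illustration of the first assumption, the following Python sketch shows how one JSON file can yield one or more documents. It is not how ConvertJSONToDocuments works internally. The function `split_into_documents`, the `reference` and `fields` keys, and the sample data are all hypothetical, chosen only to show the idea of one record becoming one document.

```python
import json


def split_into_documents(json_text, reference_prefix="doc"):
    """Return one dictionary per record found in the JSON text."""
    data = json.loads(json_text)
    # A top-level array yields one document per element;
    # any other top-level value yields a single document.
    records = data if isinstance(data, list) else [data]
    documents = []
    for index, record in enumerate(records):
        documents.append({
            "reference": f"{reference_prefix}-{index}",  # hypothetical identifier
            "fields": record,                            # the record, unchanged
        })
    return documents


# Hypothetical sample: one file that contains two records.
sample = '[{"title": "First", "body": "alpha"}, {"title": "Second", "body": "beta"}]'
docs = split_into_documents(sample)
for doc in docs:
    print(doc["reference"], doc["fields"]["title"])
```

In NiFi Ingest, the mapping from JSON to documents is controlled by the JSON Parsing Config property described in the procedure below, not by code like this.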
To ingest data in JSON format
- Add a GetFileSystem processor to your data flow to retrieve the JSON file(s).
  - Configure the location of your JSON files by setting the property "Directory Paths". If the folder contains other files, you could also set "File name pattern" to *.json.
  - If you are running a NiFi cluster, set the dynamic property adv:FlowFileEmbedFiles to TRUE. For more information about this property, see Advanced Connector Properties.
- Add a ConvertJSONToDocuments processor to the data flow.
- Connect the "success" relationship of the GetFileSystem processor to the ConvertJSONToDocuments processor.
- Configure the ConvertJSONToDocuments processor.
  - Right-click the processor and click Configure. The Configure Processor dialog box opens.
  - Click the Properties tab.
  - Set the JSON Parsing Config property. This accepts a collection of configuration parameters that specify how to construct IDOL documents from the JSON data. The parameters that you can use are the same as those in the [JSONParsing] section of the CFS configuration file. (When configuring the NiFi processor, do not include the [JSONParsing] section header.) For more information about these parameters, refer to the Connector Framework Server Reference.
  - Click APPLY.
- Connect the "extracted" relationship of the ConvertJSONToDocuments processor to your ingestion pipeline.
TIP: After they are processed, the original FlowFiles that were routed to the ConvertJSONToDocuments processor are routed to the "processed" relationship.
To avoid indexing documents representing the original JSON files, you could auto-terminate this relationship. However, if you are using a document registry service to ensure that documents are indexed in the correct sequence, route the "processed" relationship to an UnregisterDocument processor. For more information about the document registry service, see Index Documents in the Correct Sequence.
- Start the GetFileSystem and ConvertJSONToDocuments processors.
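Before starting the processors, it can help to check which files in a folder a wildcard such as *.json would select. The following Python sketch is not part of NiFi Ingest. It assumes that the pattern behaves like a simple shell-style wildcard, which might differ from how GetFileSystem interprets "File name pattern". The function `matching_files` and the sample file names are hypothetical.

```python
import fnmatch
import tempfile
from pathlib import Path


def matching_files(directory, pattern="*.json"):
    """List the names of files in `directory` that match a wildcard pattern."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and fnmatch.fnmatch(entry.name, pattern)
    )


# Demonstration in a temporary folder with hypothetical file names.
with tempfile.TemporaryDirectory() as folder:
    for name in ("orders.json", "notes.txt", "users.json"):
        Path(folder, name).write_text("{}")
    selected = matching_files(folder)
    print(selected)
```

To check a real folder, call `matching_files` with the same path that you set in "Directory Paths".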