Index Overview
You can index only files in XML or IDX format into the Content component. If the data that you want to index is in XML format, you can index it directly, without having to first import it (convert its content and metadata to IDX).
Knowledge Discovery connectors use the DREADD and DREADDDATA index actions to index data. You can also use these actions to index data directly into the Content component.
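As a rough illustration of the DREADDDATA action, the following Python sketch builds (but does not send) a POST request that carries the data in the request body. The host, the index port value, the `DREDbName` database parameter, and the `#DREENDDATANOOP` terminator are assumptions drawn from typical Content setups, and `build_dreadddata_request` is a hypothetical helper name; check them against the index action reference for your version before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder values (assumptions): replace with the host and index port
# from your own Content configuration.
INDEX_HOST = "localhost"
INDEX_PORT = 9101


def build_dreadddata_request(data_text: str, database: str) -> Request:
    """Build, without sending, a DREADDDATA request with IDX or XML data in the body."""
    params = urlencode({"DREDbName": database})
    url = f"http://{INDEX_HOST}:{INDEX_PORT}/DREADDDATA?{params}"
    # Assumption: the posted data ends with #DREENDDATANOOP and a blank line.
    body = data_text.rstrip("\n") + "\n#DREENDDATANOOP\n\n"
    return Request(url, data=body.encode("utf-8"), method="POST")
```

To submit the request you could pass the returned object to `urllib.request.urlopen`. The sketch stops short of that so that it can be inspected without a running Content component.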
NOTE: Before you index data, review the setup instructions described in Configure Content Storage.
If your data is not in XML or IDX format, you must first import it. You can import data by using one of the following three methods.
- Import with a connector. The connectors (for example, File System Connector, Web Connector, Oracle Connector, and so on) retrieve documents from different repositories and use NiFi Ingest or the Connector Framework Server (CFS) to import them into IDX or XML file format. For further information about how to import documents, refer to the NiFi Ingest documentation, the CFS help, or the appropriate connector guide.
- Import manually. You can create a text file in either XML or IDX format, which contains the information that you want to index into Content in specific fields.
- Import with Knowledge Discovery Admin. You can use the wizard on the Index tab on the Console page in the Control section of Knowledge Discovery Admin to submit data for Content to index. For more information, refer to the Knowledge Discovery Admin User Guide.
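For the manual import option, a minimal Python sketch of assembling IDX text might look like the following. The helper names (`make_idx_document`, `make_idx_file_text`) are hypothetical, and the IDX markers used here (`#DREREFERENCE`, `#DREFIELD`, `#DREDBNAME`, `#DRECONTENT`, `#DREENDDOC`, `#DREENDDATAREFERENCE`) are an assumption about the file layout; verify them against the IDX format reference for your version.

```python
def make_idx_document(reference, fields, database, content):
    """Return the IDX text for one document (field layout is an assumption)."""
    lines = [f"#DREREFERENCE {reference}"]
    for name, value in fields.items():
        # One metadata field per line, written as name="value".
        lines.append(f'#DREFIELD {name}="{value}"')
    lines.append(f"#DREDBNAME {database}")
    lines.append("#DRECONTENT")
    lines.append(content)
    lines.append("#DREENDDOC")
    return "\n".join(lines) + "\n"


def make_idx_file_text(documents):
    """Join several IDX documents and append the end-of-file marker."""
    return "".join(documents) + "#DREENDDATAREFERENCE\n"
```

You could then write the result to disk with, for example, `pathlib.Path("docs.idx").write_text(text, encoding="utf-8")`, where `docs.idx` is only an example file name.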
After the documents are in XML or IDX file format, you can index them into Content using one of two methods.
- Index with a connector. CFS indexes the IDX files that it creates into Content. For information about how to index documents with a connector, refer to the NiFi Ingest documentation or the appropriate connector guide.
- Index directly. You can index XML and IDX files into Content by issuing an HTTP request from your Web browser.
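The direct indexing request can be sketched in Python as a URL that you could paste into a browser. This assumes that the DREADD action takes the file path as the first item after the `?`, followed by optional parameters such as `DREDbName`, and that the path is resolved on the machine where Content runs. The host, port, path, and database name below are placeholders, and `build_dreadd_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def build_dreadd_url(host, index_port, file_path, database):
    """Build a DREADD URL for a file that the Content host can read."""
    # The file path comes first and is not a key=value pair (assumed syntax).
    encoded_path = quote(file_path, safe="/:")
    params = urlencode({"DREDbName": database})
    return f"http://{host}:{index_port}/DREADD?{encoded_path}&{params}"


if __name__ == "__main__":
    # Example values only; substitute your own host, index port, and file.
    print(build_dreadd_url("localhost", 9101, "/data/my docs.idx", "Archive"))
```

Percent-encoding the path keeps spaces and other reserved characters from breaking the request, while leaving `/` and `:` readable.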
TIP: If you index into Content from another host, you must configure the Content component to accept connections from that host by applying the appropriate authorization roles. See Configure Client Authorization.
If you use CFS, you must assign CFS to the Admin, Index, and Query standard roles, or equivalents.
Depending on where the data to index is located, the indexing steps for each document take place in the following order.
| Local document (accessed through the file system) | Remote document (accessed over the indexing port) |
|---|---|
| | |