The Import Process
The following steps summarize the import process.
- CFS takes a document from the import queue.
- CFS performs the pre-import tasks that are configured in its configuration file. Pre-import tasks occur before files are processed by KeyView. You can use pre-import tasks to manipulate and enrich documents (see Manipulate and Enrich Documents). Sometimes it is important to run tasks before KeyView processing. For example, if you send an audio file to Media Server for analysis, you might not want to process it with KeyView.
TIP: Both pre-import and post-import tasks can reject a document, so that it is discarded and not indexed. For example, you might configure CFS to reject a document if the associated file does not contain useful content. A document is not rejected when an import task fails; in that case, CFS continues processing the document.
- Unless the document contains the metadata field AUTN_NO_EXTRACT, CFS uses KeyView to extract sub-files. Examples of files that have sub-files include e-mail messages (which have attachments) and zip files (which contain other files). CFS creates a new document for each sub-file and adds the new documents to the import queue to be processed separately.
- Unless the document contains the metadata field AUTN_NO_FILTER, CFS uses KeyView to filter the associated source file. Filtering extracts the text from a file. An office document is likely to contain useful text, while an archive file (for example a zip file) or a media file is unlikely to have textual content.
TIP: Although media files (images, audio, and video) do not contain text, you can extract useful information by sending the files to an IDOL Media Server.
- CFS performs the post-import tasks that are configured in its configuration file.
- Processing is complete and the document is ready to be indexed.
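
The order of these steps can be easier to follow as code. The following Python sketch is only a conceptual model of the flow described above, not CFS code or a CFS API. The names Document, Reject, run_tasks, and import_all are hypothetical, and the sub_files and source_text attributes are stand-ins for the results that KeyView extraction and filtering would produce.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Hypothetical stand-in for a document in the import queue."""
    reference: str
    fields: dict = field(default_factory=dict)      # metadata fields
    sub_files: list = field(default_factory=list)   # what extraction would find
    source_text: str = ""                           # what filtering would return
    content: str = ""                               # text set by the filter step


class Reject(Exception):
    """Raised by a task (in this sketch) to discard the document."""


Task = Callable[[Document], None]


def run_tasks(doc: Document, tasks: list) -> bool:
    """Run tasks in order. Return False only if a task rejected the document."""
    for task in tasks:
        try:
            task(doc)
        except Reject:
            return False
        except Exception:
            # A failed task does not reject the document; processing continues.
            continue
    return True


def import_all(queue: deque, pre_tasks: list, post_tasks: list) -> list:
    """Process every queued document and return those ready to be indexed."""
    ready = []
    while queue:
        doc = queue.popleft()                       # take a document from the queue
        if not run_tasks(doc, pre_tasks):           # pre-import tasks
            continue
        if "AUTN_NO_EXTRACT" not in doc.fields:     # extract sub-files
            queue.extend(doc.sub_files)             # each is processed separately
        if "AUTN_NO_FILTER" not in doc.fields:      # filter: extract the text
            doc.content = doc.source_text
        if not run_tasks(doc, post_tasks):          # post-import tasks
            continue
        ready.append(doc)                           # ready to be indexed
    return ready
```

In this model, a pre-import task that adds AUTN_NO_EXTRACT and AUTN_NO_FILTER to an audio document causes both KeyView steps to be skipped, and a post-import task that raises Reject for documents with empty content discards them before indexing. Sub-files are appended to the same queue, so they pass through the pre-import and post-import tasks like any other document.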