Choose When to Run a Task

Import Tasks run when CFS processes new documents, before the documents are indexed. You can run Import Tasks before KeyView filtering, after KeyView filtering, or both.

  • Pre-import tasks run before KeyView filtering. At this point the document contains only the metadata that the connector extracted from the repository.
  • Post-import tasks run after KeyView filtering. At this point the document also contains any content and metadata that were extracted from the file associated with the document.
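
As a minimal sketch of the two stages above, assuming the usual [ImportTasks] section of the CFS configuration file, the following fragment runs one Lua task before KeyView filtering and one after it. The script file names are hypothetical placeholders, not files supplied with CFS.

```ini
[ImportTasks]
// Runs before KeyView filtering: only connector-supplied metadata is available.
// pre_import.lua is a hypothetical script name.
Pre0=Lua:scripts/pre_import.lua
// Runs after KeyView filtering: extracted content and file metadata are available.
// post_import.lua is a hypothetical script name.
Post0=Lua:scripts/post_import.lua
```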

Index Tasks run when a document’s metadata (but not its content) is updated, or when a document is deleted. When a connector detects that document metadata has been updated or that a document has been deleted from a repository, it sends this information to CFS so that the document can be updated or removed from indexes such as IDOL Server.

  • Update index tasks run when a document’s metadata (but not its content) is updated.
  • Delete index tasks run when a document is deleted from a repository.
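
Index tasks can be sketched in the same way, assuming the [IndexTasks] section of the CFS configuration file. In this fragment one Lua task runs for metadata updates and another runs for deletions. Both script names are hypothetical.

```ini
[IndexTasks]
// Runs when a connector reports that a document's metadata has changed.
// on_update.lua is a hypothetical script name.
Update0=Lua:scripts/on_update.lua
// Runs when a connector reports that a document was deleted from the repository.
// on_delete.lua is a hypothetical script name.
Delete0=Lua:scripts/on_delete.lua
```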

You can run some tasks, such as the Lua task, at any point during the import or indexing process.

You can run other tasks only at specific points within the import or indexing process. For example, to validate the content of documents you must use a post-import task. You cannot use a pre-import task, because pre-import tasks run before KeyView filtering, when documents do not yet contain any content.

The following table shows when you can run each type of task.

Columns: Task | Import Tasks (Pre, Post) | Index Tasks (Update, Delete)

Run a Lua script
  • Lua

Write documents to disk
  • CsvWriter
  • IdxWriter
  • JsonWriter
  • SqlWriter
  • XmlWriter

Manipulate and enrich documents
  • Eduction
  • EmailAddressNormalisation
  • ExtractMetadata
  • HtmlExtraction
  • ImportFile
  • Sectioner
  • Standardizer
  • TextToDocs

Validate and reject documents
  • BadFilesFilter
  • BinaryFileFilter
  • ImportErrorFilter
  • SymbolicContentFilter
  • WordLengthFilter

Media analysis
  • MediaServerAnalysis

You can also call many of the tasks from a Lua script, which allows more advanced processing. For example, you might want to run a task only on selected documents. For information about the Lua functions that are provided by CFS, refer to the Connector Framework Server Reference.

Related Topics