CollectMhtFilterLua
The path of a Lua script to use to determine whether documents are collected as MHT files.
By default, the collect action returns an IDX file that contains the content of the page in its fields. The content is present, but it might be harder to read than the original page in HTML format. When you set this parameter, the connector uses the specified script to determine whether the document is also collected in MHT format.
The following is a sample script:
function handler(document)
    -- Read the page URL from the "src" field.
    local url = document:getFieldValue("src");
    -- Capture the text after the last "." in the URL.
    local extension = string.match(url, "^.*%.([^%.]+)$");
    if (extension == nil) then return false end;
    if (extension == "html" or extension == "htm") then return true end;
    return false
end
When you run the collect action for a document, the connector calls the specified Lua script and passes the document to the handler function. If the function returns true, the connector generates an MHT file of the content of the Web page, and the collect action returns it along with the IDX. If the function returns false, only the IDX is returned.
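The sample script compares the raw end of the URL, so a URL that carries a query string or an uppercase extension does not match. The following is a minimal sketch of a stricter variant, not the connector's own script. It assumes that getFieldValue returns nil for a missing field, and mock_document is a hypothetical stand-in that exists only so the handler can be tried in a standalone Lua interpreter.

```lua
-- Hypothetical stand-in for the connector's document object. It implements
-- only getFieldValue, the one method that the handler calls.
local function mock_document(src)
  local doc = { fields = { src = src } }
  function doc:getFieldValue(name)
    return self.fields[name]
  end
  return doc
end

-- Variant of the sample handler: ignores any query string or fragment and
-- compares the extension without regard to case.
function handler(document)
  local url = document:getFieldValue("src")
  -- Assumption: a missing field yields nil.
  if url == nil then return false end
  -- Keep everything before the first "?" or "#".
  local path = string.match(url, "^[^?#]*")
  -- Capture the text after the last "." provided that no "/" follows it.
  local extension = string.match(path, "%.([^%./]+)$")
  if extension == nil then return false end
  extension = string.lower(extension)
  return extension == "html" or extension == "htm"
end

print(handler(mock_document("http://example.com/index.HTML?page=2")))  -- true
print(handler(mock_document("http://example.com/report.pdf")))         -- false
print(handler(mock_document("http://example.com/")))                   -- false
```

Only the handler function belongs in the script that the parameter points to. The mock object and the print calls are for local testing.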
Type: | String |
Default: | |
Required: | No |
Configuration Section: | TaskName or FetchTasks or Default |
Example: | CollectMhtFilterLua=MhtFilter.lua |
See Also: |
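As an illustration of where the parameter can sit, the following configuration fragment sets it in a single task section. This is a sketch under assumptions: the task name MyTask0 and the Url value are hypothetical, and only CollectMhtFilterLua is taken from this page.

```
[FetchTasks]
Number=1
0=MyTask0

[MyTask0]
// Hypothetical task section; the name and the Url value are placeholders.
Url=http://www.example.com
CollectMhtFilterLua=MhtFilter.lua
```

Setting the parameter in the FetchTasks or Default section instead would apply the same script to every task that does not override it.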