To automatically retrieve content from a repository, create a new fetch task by following these steps. The connector runs each fetch task automatically, based on the schedule that is configured in the configuration file.
To create a new Fetch Task
In the [FetchTasks] section of the configuration file, specify the number of fetch tasks using the Number parameter. If you are configuring the first fetch task, type Number=1. If one or more fetch tasks have already been configured, increase the value of the Number parameter by one (1). Below the Number parameter, specify the names of the fetch tasks, starting from zero (0). For example:
[FetchTasks]
Number=1
0=MyTask
Below the [FetchTasks] section, create a new TaskName section. The name of the section must match the name of the new fetch task. For example:
[FetchTasks]
Number=1
0=MyTask

[MyTask]
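The two steps above can also be scripted. The following Python sketch is only an illustration: the function name add_fetch_task is hypothetical, and it assumes that the configuration file is simple enough for Python's configparser module to read (no duplicate parameters, and comments are not preserved when the text is rewritten). It increases Number by one, registers the new task name under the next free index, and creates the matching section.

```python
import configparser
import io


def add_fetch_task(config_text: str, task_name: str) -> str:
    """Return config_text with one more fetch task registered (hypothetical helper)."""
    # Only "=" separates names and values, so URLs that contain ":" are safe.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep parameter names such as "Number" as written
    parser.read_string(config_text)

    if not parser.has_section("FetchTasks"):
        parser.add_section("FetchTasks")

    # Number is the count of existing tasks, and task indexes start at zero,
    # so the current count is also the index of the new task.
    count = parser.getint("FetchTasks", "Number", fallback=0)
    parser.set("FetchTasks", "Number", str(count + 1))
    parser.set("FetchTasks", str(count), task_name)

    # The section name must match the task name.
    if not parser.has_section(task_name):
        parser.add_section(task_name)

    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    existing = "[FetchTasks]\nNumber=1\n0=MyTask\n\n[MyTask]\n"
    print(add_fetch_task(existing, "MyOtherTask"))
```

Review the generated text before you replace a live configuration file with it, because this sketch has not been checked against every parameter format that the connector accepts.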
In the new section, set one of the following parameters to specify the sites that you want to index.
URLN | Specify the URLs where you want to start indexing.
URLFile | Specify the full path to a file that contains a list of URLs.
For example:
[MyTask]
URL0=http://www.autonomy.com
URL1=http://www.another-website.com
or
[MyTask]
URLFile=C:\autonomy\urls.txt
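If you have many start URLs, you might generate the URLN lines rather than type them. The following Python sketch assumes, based on the URL0 and URL1 example above, that the parameters are numbered consecutively from zero. The function name url_parameters is hypothetical.

```python
def url_parameters(urls):
    """Build URLN lines (URL0, URL1, ...) for a task section (hypothetical helper)."""
    # Drop blank entries first so that the numbering has no gaps.
    cleaned = [url.strip() for url in urls if url.strip()]
    return [f"URL{index}={url}" for index, url in enumerate(cleaned)]


if __name__ == "__main__":
    start_urls = ["http://www.autonomy.com", "http://www.another-website.com"]
    print("[MyTask]")
    print("\n".join(url_parameters(start_urls)))
```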
In the [TaskName] section, use further parameters to configure the task. For information about the parameters that you can use, refer to the HTTP Connector (CFS) Reference. For example, you can specify how links are followed or the maximum number of pages that are retrieved.
Save and close the configuration file. You can now start the connector.
Note: The connector saves a record of the data that it has retrieved for each fetch task. If you change the configuration and want to reset the connector so that it retrieves all of your data again, delete the data files (connector_[fetchtask_name]_datastore.db) in the connector's installation folder.
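The reset described in the note can be sketched in Python as follows. This is an illustration under stated assumptions: the function name reset_fetch_task is hypothetical, the [fetchtask_name] placeholder is assumed to be replaced by the task name exactly as it appears in the configuration file, and the connector is assumed to be stopped while the file is deleted. By default the function only reports what it would delete.

```python
from pathlib import Path


def reset_fetch_task(install_dir, task_name, dry_run=True):
    """Delete the datastore file for one fetch task (hypothetical helper).

    Returns the list of files that were found. Nothing is deleted unless
    dry_run is False.
    """
    # Assumed naming: connector_<task name>_datastore.db in the installation folder.
    datastore = Path(install_dir) / f"connector_{task_name}_datastore.db"
    if not datastore.is_file():
        return []
    if not dry_run:
        datastore.unlink()
    return [datastore]


if __name__ == "__main__":
    # The installation folder below is a placeholder, not a documented default.
    for path in reset_fetch_task(r"C:\autonomy\HTTPConnector", "MyTask"):
        print(f"Would delete: {path}")
```

Run it once with the default dry_run=True to confirm that the expected file is found, then run it again with dry_run=False.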