FollowRobotProtocol

A Boolean value that specifies whether the connector respects instructions provided by Web sites to Web crawlers. By default, the connector respects the following instructions:

  • The file robots.txt, if present in the root directory of a Web site.
  • NOINDEX and NOFOLLOW instructions in the robots meta element of a Web page.

To ignore these instructions, set FollowRobotProtocol=False.

Type: Boolean
Default: True
Required: No
Configuration Section: TaskName or FetchTasks or Default
Example: FollowRobotProtocol=False
See Also: IgnoreRobotProtocolErrors
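
The fragment below is a minimal sketch of how the parameter might be placed in a connector configuration file, assuming the usual layout in which the FetchTasks section lists task names and each task has its own section. The task name MyTask, the Url parameter, and the address are hypothetical and included only for illustration. Only FollowRobotProtocol comes from this entry.

```ini
[FetchTasks]
Number=1
0=MyTask

[MyTask]
// Hypothetical start URL for this task (assumption, not taken from this entry)
Url=http://www.example.com/
// Ignore robots.txt and robots meta instructions for this task only
FollowRobotProtocol=False
```

Setting the parameter in the task section, as in this sketch, limits the change to one task. Setting it in FetchTasks or Default would apply it more broadly, so the narrower scope is usually the safer choice when only one site needs the exception.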