MetadataSelector

A list of CSS2 selectors that identify elements in HTML documents. The content of the elements is extracted and added to document metadata fields.

Specify the selectors in a comma-separated list or by using numbered parameters.

To specify the name of the document field(s) to contain the extracted information, set the configuration parameter MetadataFieldName. Both parameters should have the same number of values.

Type: String
Default:  
Required: No
Configuration Section: TaskName or FetchTasks or Default
Example:
MetadataSelector0=h1,h2,h3
MetadataFieldName0=heading
MetadataSelector1=p.important
MetadataFieldName1=important_paragraph

With this example, the connector might extract the following from the HTML document:

<h1>This is a title</h1>
<h2>This is a sub-title</h2>
<p class="important">This is <strong>important</strong> text</p>

...and add the information to the following fields:

#DREFIELD heading="This is a title"
#DREFIELD heading="This is a sub-title"
#DREFIELD important_paragraph="This is <strong>important</strong> text"
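
The mapping from selectors to fields can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the connector's implementation. It assumes that only simple "tag" and "tag.class" selectors are needed, and the names CONFIG, SAMPLE_HTML, FieldExtractor and extract_fields are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the MetadataSelectorN / MetadataFieldNameN pairs above.
CONFIG = [
    ("h1,h2,h3", "heading"),
    ("p.important", "important_paragraph"),
]

SAMPLE_HTML = """<h1>This is a title</h1>
<h2>This is a sub-title</h2>
<p class="important">This is <strong>important</strong> text</p>
"""


def matches(selector_group, tag, classes):
    """Return True if the element matches any 'tag' or 'tag.class' selector."""
    for selector in selector_group.split(","):
        name, _, cls = selector.strip().partition(".")
        if name == tag and (not cls or cls in classes):
            return True
    return False


class FieldExtractor(HTMLParser):
    """Collects the inner HTML of matching elements, in document order."""

    def __init__(self, config):
        super().__init__()
        self.config = config
        self.fields = []     # list of (field name, inner HTML)
        self.current = None  # [field name, tag, nesting depth, fragments]

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            # Inside a matched element: keep nested markup verbatim.
            if tag == self.current[1]:
                self.current[2] += 1
            self.current[3].append(self.get_starttag_text())
            return
        classes = (dict(attrs).get("class") or "").split()
        for selector_group, field in self.config:
            if matches(selector_group, tag, classes):
                self.current = [field, tag, 1, []]
                return

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self.current is not None:
            self.current[3].append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.current is None:
            return
        if tag == self.current[1]:
            self.current[2] -= 1
            if self.current[2] == 0:
                # The matched element has closed: store its content.
                self.fields.append((self.current[0], "".join(self.current[3])))
                self.current = None
                return
        self.current[3].append(f"</{tag}>")

    def handle_data(self, data):
        if self.current is not None:
            self.current[3].append(data)


def extract_fields(html, config):
    """Return (field name, content) pairs for every matching element."""
    parser = FieldExtractor(config)
    parser.feed(html)
    parser.close()
    return parser.fields


for name, value in extract_fields(SAMPLE_HTML, CONFIG):
    print(f'#DREFIELD {name}="{value}"')
```

Each selector group is paired with one field name, so every element matched by h1, h2 or h3 produces its own heading value. The sketch does not cover descendant, attribute or ID selectors.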

See Also: MetadataFieldName
