MetadataSelector
A list of CSS selectors that identify the HTML elements to extract metadata from. The content of each matching element is extracted and added to the document metadata. If you combine this parameter with MetadataAttribute, the value of the specified attribute is extracted and added to the document metadata instead of the element content.
Specify the selectors in a comma-separated list or by using numbered parameters.
To specify the name of the document field(s) to contain the extracted information, set the configuration parameter MetadataFieldName. MetadataSelector and MetadataFieldName must have the same number of values.
Type: | String |
Default: | |
Required: | No |
Configuration Section: | TaskName or FetchTasks or Default |
Example:

MetadataSelector0=h1,h2,h3
MetadataFieldName0=heading
MetadataSelector1=p.important
MetadataFieldName1=important_paragraph

With this example, the connector might extract the following from the HTML document:

<h1>This is a title</h1>
<h2>This is a sub-title</h2>
<p class="important">This is <strong>important</strong> text</p>

...and add the information to the following fields:

#DREFIELD heading="This is a title"
#DREFIELD heading="This is a sub-title"
#DREFIELD important_paragraph="This is <strong>important</strong> text"
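
To make the selector-to-field mapping in this example easier to follow, the following Python sketch imitates it outside the connector. It is not the connector's implementation. The names RULES, matches, MetadataSketch, and extract_metadata are hypothetical, and the sketch assumes only two selector forms (tag and tag.class) rather than full CSS. It uses the standard library HTML parser to collect the inner content of each matching element under the paired field name.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the numbered parameters in the example:
# each entry pairs a field name with the selectors from one MetadataSelectorN.
RULES = [
    ("heading", ["h1", "h2", "h3"]),
    ("important_paragraph", ["p.important"]),
]


def matches(selector, tag, attrs):
    """Match only 'tag' or 'tag.class' selectors (a small subset of CSS)."""
    name, _, cls = selector.partition(".")
    if name != tag:
        return False
    if not cls:
        return True
    return cls in (dict(attrs).get("class") or "").split()


class MetadataSketch(HTMLParser):
    """Collect (field_name, inner_content) pairs in document order."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.fields = []
        self._field = None   # field name of the element being captured
        self._tag = None     # tag name of that element
        self._depth = 0      # nesting depth of same-named tags
        self._parts = []     # pieces of the captured inner content

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            # Inside a captured element: keep nested markup verbatim.
            self._parts.append(self.get_starttag_text())
            if tag == self._tag:
                self._depth += 1
            return
        for field, selectors in self.rules:
            if any(matches(s, tag, attrs) for s in selectors):
                self._field, self._tag = field, tag
                self._depth, self._parts = 1, []
                return

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are kept as written.
        if self._field is not None:
            self._parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._field is None:
            return
        if tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.fields.append((self._field, "".join(self._parts)))
                self._field = None
                return
        self._parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._field is not None:
            self._parts.append(data)


def extract_metadata(html, rules=RULES):
    parser = MetadataSketch(rules)
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    page = (
        "<h1>This is a title</h1> <h2>This is a sub-title</h2> "
        '<p class="important">This is <strong>important</strong> text</p>'
    )
    for name, value in extract_metadata(page):
        print(f'#DREFIELD {name}="{value}"')
```

Because each rule pairs one field name with one set of selectors, the sketch also shows why MetadataSelector and MetadataFieldName need the same number of values.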