MetadataSelector

A list of CSS2 selectors that identify the elements in the HTML to extract metadata from. The content of each matching element is extracted and added to the document metadata. If you combine this parameter with MetadataAttribute, the value of the specified attribute is extracted and added to the document metadata instead of the element content.

Specify the selectors in a comma-separated list or by using numbered parameters.

To specify the name of the document field(s) to contain the extracted information, set the configuration parameter MetadataFieldName. MetadataSelector and MetadataFieldName must have the same number of values.

Type: String
Default:  
Required: No
Configuration Section: Any section that you have defined for WkoopHtmlExtraction settings.

Example:
MetadataSelector0=h1,h2,h3
MetadataFieldName0=heading
MetadataSelector1=p.important
MetadataFieldName1=important_paragraph

With this example, CFS might extract the following from the HTML document:

<h1>This is a title</h1>
<h2>This is a sub-title</h2>
<p class="important">This is <strong>important</strong> text</p>

...and add the information to the following fields:

#DREFIELD heading="This is a title"
#DREFIELD heading="This is a sub-title"
#DREFIELD important_paragraph="This is <strong>important</strong> text"
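
The mapping from selectors to fields in this example can be approximated with a short Python sketch. This is not how CFS implements the extraction; it is an illustration built on the Python standard library under stated assumptions: it understands only simple selectors of the form tag or tag.class, it treats the comma-separated value of each numbered parameter as one group of selectors that share a field name (as the example configuration does), and the names RULES, MetadataSketch, and extract are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the example configuration: each pair mirrors
# MetadataSelectorN / MetadataFieldNameN.
RULES = [
    ("h1,h2,h3", "heading"),
    ("p.important", "important_paragraph"),
]

HTML = (
    "<h1>This is a title</h1>\n"
    "<h2>This is a sub-title</h2>\n"
    '<p class="important">This is <strong>important</strong> text</p>\n'
)


def matches(selector_group, tag, classes):
    """Return True if any simple selector ('tag' or 'tag.class') in the group matches."""
    for selector in selector_group.split(","):
        name, _, cls = selector.strip().partition(".")
        if name == tag and (not cls or cls in classes):
            return True
    return False


class MetadataSketch(HTMLParser):
    """Collects the inner HTML of every element that matches a rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.fields = []    # (field name, inner HTML), in the order elements close
        self.captures = []  # elements currently being captured

    def handle_starttag(self, tag, attrs):
        # A start tag is part of the content of every element already being captured.
        for cap in self.captures:
            cap["parts"].append(self.get_starttag_text())
            if cap["tag"] == tag:
                cap["depth"] += 1
        classes = (dict(attrs).get("class") or "").split()
        for selector_group, field in self.rules:
            if matches(selector_group, tag, classes):
                self.captures.append(
                    {"tag": tag, "depth": 1, "field": field, "parts": []}
                )

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied into open captures unchanged.
        for cap in self.captures:
            cap["parts"].append(self.get_starttag_text())

    def handle_data(self, data):
        for cap in self.captures:
            cap["parts"].append(data)

    def handle_endtag(self, tag):
        still_open = []
        for cap in self.captures:
            if cap["tag"] == tag:
                cap["depth"] -= 1
            if cap["depth"] == 0:
                # The matching end tag closes the capture; emit the field.
                self.fields.append((cap["field"], "".join(cap["parts"])))
            else:
                cap["parts"].append(f"</{tag}>")
                still_open.append(cap)
        self.captures = still_open


def extract(html, rules):
    parser = MetadataSketch(rules)
    parser.feed(html)
    parser.close()
    return parser.fields


for field, value in extract(HTML, RULES):
    print(f'#DREFIELD {field}="{value}"')
```

Because the h1 and h2 elements both match the first rule, the sketch produces two values for the heading field, and the important_paragraph value keeps its inline strong markup, which corresponds to the fields listed above.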

See Also:

MetadataFieldName

MetadataAttribute

MetadataSelectorExtractPlainText