ChildMetadataFieldSections
A list of sections in the configuration file that contain settings for extracting metadata into structured metadata fields.
When you set this parameter, the field specified by MetadataFieldName does not have a value. Instead, it has sub-fields, and the connector extracts metadata into those sub-fields. You configure the sub-field names and values in the configuration sections that you name with this parameter.
Imagine a web page that contains the following HTML.
<div class="comments">
    <div class="comment">
        <div class="id">Comment1</div>
        <div class="title">Title</div>
        <div class="text">The comment text</div>
    </div>
    <div class="comment">
        <div class="id">Comment2</div>
        <div class="title">Title</div>
        <div class="text">The comment text</div>
    </div>
    <div class="comment">
        <div class="id">Comment3</div>
        <div class="title">Title</div>
        <div class="text">The comment text</div>
    </div>
</div>
To extract the comment IDs, titles, and text into structured metadata fields, you could use a configuration similar to the following.
[MyTask]
...
MetadataFieldSections=Comments

[Comments]
MetadataFieldName=comment
MetadataSelector=div.comments div.comment
ChildMetadataFieldSections0=ExtractCommentID
ChildMetadataFieldSections1=ExtractCommentTitle
ChildMetadataFieldSections2=ExtractCommentText

[ExtractCommentID]
MetadataFieldName=id
MetadataSelector=:scope > div.id

[ExtractCommentTitle]
MetadataFieldName=title
MetadataSelector=:scope > div.title

[ExtractCommentText]
MetadataFieldName=text
MetadataSelector=:scope > div.text
This would result in the following structured IDOL document metadata.
<xmlmetadata>
    <comment>
        <id>Comment1</id>
        <title>Title</title>
        <text>The comment text</text>
    </comment>
    <comment>
        <id>Comment2</id>
        <title>Title</title>
        <text>The comment text</text>
    </comment>
    <comment>
        <id>Comment3</id>
        <title>Title</title>
        <text>The comment text</text>
    </comment>
</xmlmetadata>
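The relationship between a parent section and its child sections can be illustrated with a short Python sketch. This is not the connector's implementation; it only mimics the behavior described above using the standard library. The CONFIG dictionary, the extract and build_metadata functions, and the ElementTree paths are all assumptions made for illustration. The paths stand in for the CSS selectors in the example configuration (./div[@class='id'] plays the role of :scope > div.id).

```python
import xml.etree.ElementTree as ET

# The example page content (well-formed, so an XML parser can read it).
HTML = """
<div class="comments">
  <div class="comment">
    <div class="id">Comment1</div>
    <div class="title">Title</div>
    <div class="text">The comment text</div>
  </div>
  <div class="comment">
    <div class="id">Comment2</div>
    <div class="title">Title</div>
    <div class="text">The comment text</div>
  </div>
  <div class="comment">
    <div class="id">Comment3</div>
    <div class="title">Title</div>
    <div class="text">The comment text</div>
  </div>
</div>
"""

# Hypothetical in-memory stand-in for the [Comments] section and its
# three child sections. "path" replaces MetadataSelector, because
# ElementTree understands a small XPath subset rather than CSS selectors.
CONFIG = {
    "field": "comment",
    "path": ".//div[@class='comment']",
    "children": [
        {"field": "id", "path": "./div[@class='id']"},
        {"field": "title", "path": "./div[@class='title']"},
        {"field": "text", "path": "./div[@class='text']"},
    ],
}


def extract(context, section):
    """Return one metadata element per node matched relative to context."""
    fields = []
    for match in context.findall(section["path"]):
        field = ET.Element(section["field"])
        children = section.get("children")
        if children:
            # With child sections, the field has no value of its own;
            # each child section is evaluated relative to this match.
            for child in children:
                field.extend(extract(match, child))
        else:
            # Without child sections, the field takes the matched text.
            field.text = "".join(match.itertext()).strip()
        fields.append(field)
    return fields


def build_metadata(html):
    """Build an <xmlmetadata> element from the page and CONFIG."""
    root = ET.fromstring(html)
    metadata = ET.Element("xmlmetadata")
    metadata.extend(extract(root, CONFIG))
    return metadata


metadata = build_metadata(HTML)
print(ET.tostring(metadata, encoding="unicode"))
```

In this sketch, each comment element carries only sub-fields and no text of its own, which mirrors the statement that the field named by MetadataFieldName does not have a value when ChildMetadataFieldSections is set.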
Type: | String |
Default: | |
Required: | No |
Configuration Section: | Any section specified by MetadataFieldSections, ChildDocumentMetadataFieldSections, or ChildMetadataFieldSections |
Example: | ChildMetadataFieldSections0=ExtractCommentID ChildMetadataFieldSections1=ExtractCommentTitle ChildMetadataFieldSections2=ExtractCommentText |
See Also: |