The documents in the IDOL Content component store data in fields. These fields can contain many different types of information. For example, one field might contain the bulk of the document content, and another field might contain only the name of the author.
When you index data into the IDOL Content component, it is important to set up field processes so that the IDOL Content component treats each field correctly.
Before you set up fields in IDOL, you must consider the fields in the documents that you want to index.
You can apply many different field properties to IDOL fields. Fields can contain information about the document, or values that you want to retrieve in queries.
For a complete list of the types of fields that you can store in the IDOL Content component, refer to the IDOL Server Administration Guide and IDOL Server Reference.
Fields that contain information about the document | |
---|---|
ReferenceType | Reference fields contain a unique document reference, which you can use to remove duplicate documents, and to retrieve a specific document. Each document must contain at least one reference field. The IDOL Content component also uses the reference field to populate the autn:reference metadata field. |
DateType | Date fields contain the document date, which the IDOL Content component uses to populate the autn:date metadata field. If a document does not have a date field, Content uses the date that the document was indexed. |
TitleType | Title fields contain the document title, which the IDOL Content component uses to populate the autn:title metadata field. |
DatabaseType | Database fields contain the IDOL database that the IDOL Content component must index the document into. If the document does not contain a database field, you must specify the database in the index action. |
LanguageType | Language fields contain the language type of the document, which the IDOL Content component uses to find the appropriate language configuration. The IDOL Content component also uses this field to populate the autn:languagetype metadata field. |
SecurityType | Security fields contain the security type of the document, which the IDOL Content component uses to index the document according to the specified security configuration. |
ACLType | ACL fields contain document access control lists (ACLs), which determine the access restrictions for that document. |
ExpireDateType | Expire date fields contain the date that the document expires. On this date, IDOL processes the document according to the expiry rules for the database, for example to delete the document or move it to an archive database. The IDOL Content component also uses this field to populate the autn:expiredate metadata field. |
SectionBreakType | Section break fields contain the section number for a document section, which the IDOL Content component uses to populate the autn:section metadata field. |
ParametricType | Parametric fields contain values that you want to use to restrict queries. In a parametric query, you can return all values that occur in a certain parametric field in all documents. |
ParametricRangeType | Parametric range fields contain values that you want the IDOL Content component to dynamically analyze to generate numeric ranges in parametric queries. |
Fields that contain document content | |
Index | Index fields contain document content. The IDOL Content component processes these fields linguistically and stores terms to allow fast data retrieval. Typically, you store the document content and title field as index fields. Do not store data as index fields if: |
SourceType | Source fields contain document content that the IDOL Content component uses to create document summaries and suggest conceptually similar documents. If you do not configure source fields, the IDOL Content component uses the configured Index fields. |
LangDetectType | Language detection fields contain document content that the IDOL Content component uses to automatically detect the document language. If you do not configure a language detection field, the IDOL Content component uses the configured source fields. |
HighlightType | Highlight fields contain document content that IDOL Server can highlight when you send the Highlight action (or a query action with the Highlight parameter set to true). |
Field properties that optimize FieldText query operators | |
MatchType | See Match Fields |
NumericType | See Numeric Fields |
NumericDateType | See Numeric Date Fields |
GeospatialType | See Geospatial Fields |
Field properties that determine how to display fields | |
HiddenType | Hidden fields do not appear |
PrintType | Print fields are displayed in a Query action when the Print parameter is set to Fields (the default value). |
To identify properties for different fields, you must define field processes. In a field process, you define:
the set of fields that the field process applies to
the property that applies to this process
You can have multiple field processes that share the same property. You must then create a configuration section for each property that you use, and define the field properties.
NOTE: Use the following formats to identify fields:
/FieldName to match root-level fields
*/FieldName to match all fields except root-level
/Path/FieldName to match fields that the specified path points to.
Field names must not contain spaces or accents, and they must not start with a number. For IDX documents, the IDOL Content component converts these text elements to underscores (_) when it indexes the fields. You must also change any queries that reference these field names to use the modified field names.
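The naming rules in this note can be checked before you index anything. The following Python sketch is an illustration only and is not part of IDOL: the function names field_name_problems and has_accent are hypothetical, and it assumes that "accents" means characters that decompose into a base letter plus a combining mark. It reports which rules a name breaks. It does not try to reproduce the underscore conversion that the IDOL Content component applies to IDX documents.

```python
import unicodedata


def has_accent(text: str) -> bool:
    """Return True if any character carries a combining mark after NFD decomposition.

    Assumption: this is how the sketch interprets "accents"; IDOL may differ.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


def field_name_problems(name: str) -> list[str]:
    """List the naming rules from the note above that a field name breaks."""
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("contains a space")
    if has_accent(name):
        problems.append("contains an accented character")
    if name[:1].isdigit():
        problems.append("starts with a number")
    return problems


if __name__ == "__main__":
    # Example field names; only harvest_time appears in this section.
    for candidate in ["harvest_time", "1st author", "résumé"]:
        print(candidate, field_name_problems(candidate))
```

An empty list means the name passes all three checks in this sketch.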
To apply processes to fields or documents that contain specific fields
Open the IDOL Content component configuration file in a text editor.
In the [FieldProcessing] section, list the processes to apply to fields. For example:
[FieldProcessing]
0=IndexFields
1=DateFields
2=DatabaseFields
3=SetReferenceFields
Create a configuration section for each process listed. In each section:
Set PropertyFieldCSVs to a comma-separated list of fields that this process applies to.
Set Property to the name of the property configuration section.
NOTE: Each property must have a unique configuration section name.
(Optional) Set PropertyMatch to a comma-separated list of values that the field must have to be processed. For example, you can use this parameter to set up a process that identifies security or language fields.
(Optional) Set PropertyNegativeFieldCSVs to a comma-separated list of fields that this process does not apply to. For example, if you use a wildcard value in PropertyFieldCSVs, you can define any exceptions in PropertyNegativeFieldCSVs.
For example:
[IndexFields]
Property=Index
PropertyFieldCSVs=*/DRECONTENT,*/DRETITLE

[DateFields]
Property=Date
PropertyFieldCSVs=*/DREDATE,*/harvest_time

[DatabaseFields]
Property=Database
PropertyFieldCSVs=*/DREDBNAME,*/Database*
PropertyNegativeFieldCSVs=*/DatabaseNumber

[SetReferenceFields]
Property=Reference
PropertyFieldCSVs=*/DREREFERENCE,*/DRETITLE
Create a section for each property. Specify appropriate configuration parameters for each property. These configuration parameters define the processes that apply to all the fields (or all documents that contain the fields) that you previously associated with the processes.
For example:
[Index]
Index=TRUE

[Date]
DateType=TRUE

[Database]
DatabaseType=TRUE

[Reference]
ReferenceType=TRUE
TrimSpaces=TRUE
NOTE: For some properties, you must reindex your content if you want to change the property value after you have indexed content into the IDOL Content component.
For details, refer to the IDOL Server Reference.
Save and close the configuration file.
Restart the IDOL Content component for your changes to take effect.
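Before you restart, it can help to confirm that the three layers of the procedure above line up: every process listed in [FieldProcessing] has its own section, every process section sets Property and PropertyFieldCSVs, and every Property value names an existing section. The following Python sketch is one possible way to check this and is not an IDOL tool. The function name check_field_processing is hypothetical. The sketch assumes that the relevant part of the configuration file can be read by Python's configparser module, and it compares section names case-sensitively, which might be stricter than IDOL itself.

```python
import configparser

# Sample configuration assembled from the examples in this section.
SAMPLE = """\
[FieldProcessing]
0=IndexFields
1=DateFields
2=DatabaseFields
3=SetReferenceFields

[IndexFields]
Property=Index
PropertyFieldCSVs=*/DRECONTENT,*/DRETITLE

[DateFields]
Property=Date
PropertyFieldCSVs=*/DREDATE,*/harvest_time

[DatabaseFields]
Property=Database
PropertyFieldCSVs=*/DREDBNAME,*/Database*
PropertyNegativeFieldCSVs=*/DatabaseNumber

[SetReferenceFields]
Property=Reference
PropertyFieldCSVs=*/DREREFERENCE,*/DRETITLE

[Index]
Index=TRUE

[Date]
DateType=TRUE

[Database]
DatabaseType=TRUE

[Reference]
ReferenceType=TRUE
TrimSpaces=TRUE
"""


def check_field_processing(config_text: str) -> list[str]:
    """Return a list of consistency problems; an empty list means none were found."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep parameter names exactly as written
    parser.read_string(config_text)

    if not parser.has_section("FieldProcessing"):
        return ["missing [FieldProcessing] section"]

    problems = []
    for number, process in parser.items("FieldProcessing"):
        if not number.isdigit():
            continue  # only numbered entries are treated as process names
        if not parser.has_section(process):
            problems.append(f"process {number}={process} has no [{process}] section")
            continue
        section = parser[process]
        if "PropertyFieldCSVs" not in section:
            problems.append(f"[{process}] does not set PropertyFieldCSVs")
        prop = section.get("Property")
        if prop is None:
            problems.append(f"[{process}] does not set Property")
        elif not parser.has_section(prop):
            problems.append(
                f"[{process}] names property {prop}, but there is no [{prop}] section"
            )
    return problems


if __name__ == "__main__":
    for problem in check_field_processing(SAMPLE):
        print(problem)
```

The sketch checks only the structure of the configuration. It does not verify that a property section contains valid parameters for that property; for those details, refer to the IDOL Server Reference.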