ProperNames
The method to use to handle proper names when indexing.
In IDOL, proper name terms are pairs of adjacent words that both begin with a capital letter followed by lower case letters, such as Alex Smith or Dragon Restaurant. Depending on the setting that you use, you can also include adjacent pairs of terms that include stop words, such as The Who, or The Queen.
By default, IDOL Server stems individual terms and discards stop word terms, regardless of capitalization. Use the ProperNames parameter if you want to store additional terms where capitalized terms occur together. In these cases, IDOL Server compounds the two adjacent terms and indexes the combined unit as a term. For example, Alex Smith is compounded to ALEXSMITH. For longer proper name strings, each pair of terms is compounded separately.
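The pairwise compounding described above can be illustrated with a short Python sketch. This is not IDOL Server code: the function name compound_proper_names and the regular expressions are assumptions made for illustration, and the sketch ignores stemming, stop words, and sentence boundaries.

```python
import re

# Assumption: a proper-name word is one capital letter followed by
# one or more lowercase letters, as in the definition above.
PROPER_WORD = re.compile(r"[A-Z][a-z]+")


def compound_proper_names(text):
    """Return the upper-cased compound of each adjacent pair of proper-name words."""
    words = re.findall(r"[A-Za-z]+", text)
    compounds = []
    # Walk over every adjacent pair, so a three-word name yields two compounds.
    for left, right in zip(words, words[1:]):
        if PROPER_WORD.fullmatch(left) and PROPER_WORD.fullmatch(right):
            compounds.append((left + right).upper())
    return compounds


print(compound_proper_names("Alex Smith"))          # ['ALEXSMITH']
print(compound_proper_names("George Walker Bush"))  # ['GEORGEWALKER', 'WALKERBUSH']
```

The second call shows a longer name producing one compound per adjacent pair, rather than a single three-word compound.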
This option can improve the relevance of results when you search for proper name terms. For example, a search for George Washington might return documents that contain the phrase George Bush in Washington, D.C. Indexing proper name terms might improve relevance for documents that include the exact name George Washington. For more information about when to use the ProperNames parameter, refer to IDOL Expert.
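To make the George Washington example concrete, the following Python sketch compares the terms that a query shares with two documents. It is a simplified model, not IDOL Server's matching logic: the index_terms function and the sample documents are hypothetical, and stemming, stop word removal, and relevance scoring are all omitted.

```python
import re

PROPER_WORD = re.compile(r"[A-Z][a-z]+")


def index_terms(text):
    """Return the individual terms plus compounds of adjacent capitalized words."""
    words = re.findall(r"[A-Za-z]+", text)
    terms = {word.upper() for word in words}
    for left, right in zip(words, words[1:]):
        if PROPER_WORD.fullmatch(left) and PROPER_WORD.fullmatch(right):
            terms.add((left + right).upper())
    return terms


# Hypothetical sample documents.
documents = {
    "exact name": "George Washington was the first president.",
    "scattered words": "George Bush in Washington, D.C.",
}

query = index_terms("George Washington")
for label, text in documents.items():
    print(label, sorted(query & index_terms(text)))
# exact name ['GEORGE', 'GEORGEWASHINGTON', 'WASHINGTON']
# scattered words ['GEORGE', 'WASHINGTON']
```

Both documents match the individual terms, but only the document that contains the exact name also matches the compound term, which gives the engine an extra signal to rank it higher.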
This parameter accepts the following values:
0 | Do not store proper name terms. |
1 | Adjacent capitalized terms are compounded, then stemmed and indexed as a unit. For example, Sam James is indexed as SAMJAM. |
2 | Adjacent terms are compounded (regardless of capitalization), then stemmed and indexed as a unit. For example, bottlenose dolphins is indexed as BOTTLENOSEDOLPHIN. NOTE: This setting considerably increases the number of terms in the IDOL Server index, which can slow down its performance. |
The following ProperNames options allow you to query for proper name terms that include stop words (for example, The Who, or The Queen).
NOTE: The following settings affect only stop words that start with a capital letter. In all cases, the indexing of individual stop word terms is controlled by the StopWordIndex configuration parameter.
3 | Adjacent capitalized stop words are compounded, then stemmed and indexed as a unit. For example, And His is indexed as ANDHI. Adjacent capitalized terms are compounded, then stemmed and indexed as a unit. For example, Sam James is indexed as SAMJAM. Capitalized stop words adjacent to capitalized terms are treated as individual terms. For example, The Queen is treated as THE and QUEEN, according to your stop word rules. |
4 | Capitalized stop words are compounded with adjacent capitalized terms, then stemmed and indexed as a unit. For example, The Bells is indexed as THEBEL, and Calling Will is indexed as CALLINGWIL. Adjacent capitalized stop words are compounded, then stemmed and indexed as a unit. Adjacent capitalized terms are compounded, then stemmed and indexed as a unit. |
5 | Adjacent capitalized stop words are compounded and indexed unstemmed as a unit. For example, And His is indexed as ANDHIS. Adjacent capitalized terms are compounded and indexed unstemmed as a unit. For example, Sam James is indexed as SAMJAMES. Capitalized stop words adjacent to capitalized terms are treated as individual terms. |
6 | Capitalized stop words are compounded with adjacent capitalized terms, and indexed unstemmed as a unit. For example, The Bells is indexed as THEBELLS, and Calling Will is indexed as CALLINGWILL. Adjacent capitalized stop words are compounded and indexed unstemmed as a unit. Adjacent capitalized terms are compounded and indexed unstemmed as a unit. |
7 | Capitalized stop words are compounded with adjacent capitalized terms, and indexed unstemmed as a unit. Adjacent capitalized stop words are compounded and indexed unstemmed as a unit. Adjacent capitalized terms are treated as individual terms. For example, Sam James is indexed as SAM and JAME. |
The following table shows the terms that IDOL Server stores for each ProperNames setting for the sentence Tom Jones And His greatest hits:
Original | Tom | Jones | And | His | greatest | hits |
0 | TOM | JONE | GREAT | HIT |
1 | TOM | TOMJON | JONE | GREAT | HIT |
2 | TOM | TOMJON | JONE | GREAT | GREATESTHIT | HIT |
3 | TOM | TOMJON | JONE | ANDHI | GREAT | HIT |
4 | TOM | TOMJON | JONE | JONESAND | ANDHI | GREAT | HIT |
5 | TOM | TOMJONES | JONE | ANDHIS | GREAT | HIT |
6 | TOM | TOMJONES | JONE | JONESAND | ANDHIS | GREAT | HIT |
7 | TOM | JONE | JONESAND | ANDHIS | GREAT | HIT |
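The compound terms in the table above can be reproduced with a small Python model of the setting descriptions. This is a sketch under stated assumptions, not IDOL Server's implementation: the stop word list, the PAIR_KINDS table, and the compounds function are hypothetical names introduced here, and the model does not stem anything, so for settings 1 to 4 it prints the compound before stemming (for example TOMJONES rather than TOMJON).

```python
import re

CAPITALIZED = re.compile(r"[A-Z][a-z]+")
STOP_WORDS = {"and", "his"}  # hypothetical stop word list for this sentence

# Which kinds of adjacent pair each ProperNames setting compounds:
# "TT" = term + term, "SS" = stop word + stop word, "ST" = one of each.
PAIR_KINDS = {
    0: set(),
    1: {"TT"},
    2: {"TT"},
    3: {"TT", "SS"},
    4: {"TT", "SS", "ST"},
    5: {"TT", "SS"},
    6: {"TT", "SS", "ST"},
    7: {"SS", "ST"},
}


def compounds(sentence, mode):
    """Return the unstemmed compound terms that the given setting would add."""
    words = sentence.split()
    result = []
    for left, right in zip(words, words[1:]):
        stops = [word.lower() in STOP_WORDS for word in (left, right)]
        kind = "SS" if all(stops) else "ST" if any(stops) else "TT"
        capitalized = bool(CAPITALIZED.fullmatch(left) and CAPITALIZED.fullmatch(right))
        # Setting 2 ignores capitalization; every other setting requires it.
        if kind in PAIR_KINDS[mode] and (capitalized or mode == 2):
            result.append((left + right).upper())
    return result


for mode in range(8):
    print(mode, compounds("Tom Jones And His greatest hits", mode))
# 0 []
# 1 ['TOMJONES']
# 2 ['TOMJONES', 'GREATESTHITS']
# 3 ['TOMJONES', 'ANDHIS']
# 4 ['TOMJONES', 'JONESAND', 'ANDHIS']
# 5 ['TOMJONES', 'ANDHIS']
# 6 ['TOMJONES', 'JONESAND', 'ANDHIS']
# 7 ['JONESAND', 'ANDHIS']
```

Keeping the rules in a lookup table makes the differences between settings easy to compare: settings 3 and 5 compound the same pairs and differ only in whether the result is stemmed, as do settings 4 and 6.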
Type: | Long |
Default: | 0 |
Required: | No |
Configuration Section: | LanguageTypes or MyLanguage |
Example: | ProperNames=1 |
NOTE: If you change this setting after you have indexed content into IDOL Server, the new setting applies only to new content, and the server logs a warning. To clear the warning and ensure that your change applies to all your content, you must initialize your index and reindex the content.