In query result clustering, IDOL Server selects the most relevant document in the query result set as the basis for the first cluster. It then compares the remaining results to this document, and adds them to the cluster if their relevance to the first document exceeds the configured ClusterThreshold. IDOL Server then applies this process to the remaining unclustered documents, and continues until all results are assigned to a cluster.
For example, suppose that ClusterThreshold is set to 50 and a query returns 10 results, numbered 0 to 9 in order of decreasing relevance. Document 0 is the basis of the first cluster. When compared to this basis document, documents 1 and 2 have a relevance score higher than 50, so they are added to the cluster. The first cluster therefore contains documents 0, 1, and 2.
This process continues for the remaining documents. Document 3 is the basis for the second cluster, and documents 4, 5, 6, and 7 have high enough relevance scores and are added to this cluster. Document 8 is the basis of the third cluster, and document 9 is similar to it. No results remain, so the process ends.
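The process described above can be sketched in a few lines of Python. This is only an illustration of the greedy grouping logic, not IDOL Server's internal implementation. The function name cluster_results, the relevance callback, and the toy scores (80 for similar documents, 20 otherwise) are assumptions made for the example. The threshold of 50 is taken from the scenario above.

```python
from typing import Callable, Dict, List


def cluster_results(
    results: List[int],
    relevance: Callable[[int, int], float],
    cluster_threshold: float,
) -> List[List[int]]:
    """Greedily group results that are ordered from most to least relevant.

    Hypothetical helper: `relevance(basis, doc)` stands in for whatever
    score the server computes between a basis document and another result.
    """
    remaining = list(results)
    clusters: List[List[int]] = []
    while remaining:
        # The most relevant unclustered document becomes the basis.
        basis, rest = remaining[0], remaining[1:]
        cluster = [basis]
        unclustered = []
        for doc in rest:
            # A document joins the cluster only if its score exceeds the threshold.
            if relevance(basis, doc) > cluster_threshold:
                cluster.append(doc)
            else:
                unclustered.append(doc)
        clusters.append(cluster)
        # Repeat the process on whatever has not been clustered yet.
        remaining = unclustered
    return clusters


# Toy data (assumed): documents in the same group score 80, others score 20.
GROUP: Dict[int, str] = {
    0: "a", 1: "a", 2: "a",
    3: "b", 4: "b", 5: "b", 6: "b", 7: "b",
    8: "c", 9: "c",
}


def toy_relevance(basis: int, doc: int) -> float:
    return 80.0 if GROUP[basis] == GROUP[doc] else 20.0


clusters = cluster_results(list(range(10)), toy_relevance, 50)
print(clusters)  # [[0, 1, 2], [3, 4, 5, 6, 7], [8, 9]]
```

In this sketch, each pass compares documents only against the current basis document, so a result that does not match the basis stays unclustered until a later pass, even if it resembles another member of the cluster. Raising the threshold tends to produce more, smaller clusters.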
IDOL Server creates a title for each cluster according to the best terms and phrases contained in the cluster documents.