Glossary

  • A
  • See ACL.
  • A technology layer that automates operations on unstructured information for cross-enterprise applications. ACI enables an automated and compatible business-to-business, peer-to-peer infrastructure. The ACI allows enterprise applications to understand and process content that exists in unstructured formats, such as email, Web pages, Microsoft Office documents, and IBM Notes.
  • A server component that runs on the Autonomy Content Infrastructure (ACI).
  • An ACL is metadata associated with a document that defines which users and groups are permitted to access the document.
  • A request sent to an ACI server.
  • A domain controller for the Microsoft Windows operating system, which uses LDAP to authenticate users and computers on a network.
  • See APCM.
  • A process that searches for information about a specific topic. An administrator can create agents for users or allow users to create their own agents. See Also: explicit agent, implicit agent
  • An index in IDOL Server that stores agents and profiles.
  • An IDOL Server field that stores Boolean agents (Boolean or proximity expressions that legacy technologies use to categorize documents). You can then query IDOL Server with text and an AgentBoolean field to return categories whose Boolean agent matches this text.
  • Automatic Language Detection. The process of automatically detecting the language of a particular document, and indexing it into IDOL Server according to the rules for the detected language.
  • An automatic process for alerting users, by email, text, or message, when new content is added to IDOL Server that matches their agents or profiles. See Also: mailing.
  • Adaptive Probabilistic Concept Modelling. A technique whereby terms are given a weight according to their statistical importance in IDOL Server. Terms can have a weight between 0 and 255.
  • Automatic Query Guidance. A set of operations that use the results from query summaries. AQG includes dynamic thesaurus generation, automatic query disambiguation, query refinement, and rapid clustering of a results set. See Also: query summary.
  • The process of checking user credentials (user names, passwords, and PIN codes) against an IDOL Server or external security repository. The authentication process identifies a user, and allows IDOL Server to confirm their access permissions for different documents.
  • An internal IDOL Server document rank, which determines the order in which two or more documents return in a results list when the relevance or other sort option is equal.
  • See ALD.
  • See AQG.
  • See ACI.
  • C
  • A set of criteria that define a particular topic, which you can use to categorize documents that contain content relevant to the topic.
  • The IDOL Server component that manages categorization and clustering.
  • An IDOL Server index that stores categories.
  • Text or documents that define a topic or subject for a particular category. When IDOL Server categorizes documents, it matches document content to similar category training.
  • A hierarchically agglomerated collection of data that has been extracted from snapshots. Each cluster represents a concept area that contains a set of items, which share common properties. Clustering data allows you to make trends and developments in data visible.
  • A query operation that combines two or more query results into a specified smaller number of results. The most usual case is to combine two or more sections of the same document as a single query result. It can also combine results by a reference or metadata field value.
  • All the people in a user network neighborhood. It allows users to find other people in the community who have been looking at similar documents, or have agents that are similar to their agents.
  • The IDOL Server component that manages users and communities.
  • A brief summary of each result document that returns for a query. The concept summary displays a few sentences that are typical of the result content (these sentences can be from different parts of the result document).
  • A type of query that allows you to search for documents that match the concept that your query text defines, rather than matching the particular keywords in your text. See Also: query.
  • An IDOL component (for example File System Connector) that retrieves information from a local or remote repository (for example, a file system, database, or Web site).
  • Connector Framework Server processes the information that is retrieved by connectors. Connector Framework Server uses KeyView to extract document content and metadata from over 1,000 different file types. When the information has been processed, it is sent to an IDOL Server or Distributed Index Handler (DIH).
  • The IDOL Server component that manages the data index and performs most of the search and retrieval operations from the index.
  • A conceptual summary of the result document that is biased by the terms of the query. A context summary includes sentences that are particularly relevant to the terms in the query (these sentences can be from different parts of the result document).
  • The process that Connectors and Web crawlers use to retrieve content from Web resources, by recursively following hyperlinks from an initial page. See Also: spidering
  • D
  • DAH distributes actions to multiple copies of IDOL Server or a component. It allows you to use failover, load balancing, or distributed content.
  • An IDOL Server index that stores content data. You can customize how to store data in the data index by configuring appropriate settings in the IDOL Server configuration file.
  • An IDOL Server data pool that stores indexed information. The administrator can set up one or more databases, and specify how data is fed to the databases. By default, IDOL Server contains the databases Profile, Agent, Activated, Deactivated, News, and Archive.
  • The default user role in IDOL Server. A default user has only the privileges that have been allocated to this default role.
  • DIH allows you to efficiently split and index extremely large quantities of data into multiple copies of IDOL Server or the Content component. DIH allows you to create a scalable solution that delivers high performance and high availability. It provides a flexible way to batch, route, and categorize the indexing of internal and external content into IDOL Server.
  • See DAH.
  • See DIH.
  • E
  • The process of extracting entities (patterns of text) from documents.
  • In Eduction, an entity is a word, phrase, or block of information that the Eduction component can match and extract from documents. An entity can be a specific text string, such as a name, or it can be a pattern of text such as an address or phone number. You define the pattern in a grammar, which Eduction uses to find the entities in documents.
  • An IDOL Server operation to find groups of users with a particular set of expertise or interests.
  • An IDOL agent that users explicitly create for themselves. See Also: agent, implicit agent
  • See XML.
  • The process of extracting text, metadata, and subfiles from documents. IDOL KeyView performs this extraction process in IDOL.
  • F
  • See parametric search.
  • The process of downloading documents from the repository in which they are stored (such as a local folder, Web site, Lotus Domino server, and so on), importing them to IDX format, and indexing them into IDOL Server.
  • A group of settings that instruct a Connector how to retrieve data from a repository. Connectors can run fetch tasks automatically, or in response to an action.
  • Fields define different parts of content in IDOL documents, such as the title, content, and metadata information.
  • A syntax string that defines a matching criterion in FieldText.
  • A type of query that searches for particular content in a particular document field. See Also: query, field
  • G
  • In Eduction, a grammar is a pattern that defines an entity. See Also: Eduction, entity
  • H
  • The ability for IDOL Server to connect related documents to results, by using suggestions. See Also: suggest
  • I
  • An encoded value that identifies the source of a document in IDOL Server. Connectors and CFS add identifiers to every document that they create for indexing into IDOL Server. They store this value in the AUTN_IDENTIFIER field.
  • The Intelligent Data Operating Layer (IDOL) Server, which integrates unstructured, semi-structured and structured information from multiple repositories through an understanding of the content. It delivers a real-time environment in which operations across applications and content are automated.
  • An IDOL Server component that accepts incoming actions and distributes them to the appropriate subcomponent. IDOL Proxy also performs some maintenance operations to make sure that the subcomponents are running, and to start and stop them when necessary.
  • An IDOL application that allows you to manage IDOL Server content.
  • A structured file format that can be indexed into IDOL Server. You can use a connector to import files into this format, or you can manually create IDX files.
  • An IDOL agent that is created as part of a user profile. When you profile a user, IDOL Server creates these agents for a user, according to the documents and search results that the user views. See Also: agent, explicit agent
  • After a document has been downloaded from the repository in which it is stored, it is imported to an IDX or XML file format. This process is called importing.
  • The IDOL Server data index contains document content and field information for analysis and retrieval.
  • An IDOL Server command to index data, or to maintain and manipulate the data index.
  • Fields that IDOL Server processes linguistically when it stores them. Store fields that contain text that you want to query frequently as Index fields. IDOL Server applies stemming and stop word lists to text in Index fields before it stores them, which allows IDOL Server to process queries for these fields more quickly. Typically, DRETITLE and DRECONTENT are set up as Index fields.
  • The process of storing data in IDOL Server. IDOL Server stores data in different field types (such as index, numeric, and ordinary fields). It is important to store data in appropriate field types to ensure optimized performance.
  • An integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system that contains the result data. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents that the user is allowed to see, from repositories that the user has permission to access. For more information, refer to the IDOL Document Security Administration Guide.
  • See IQL.
  • Intelligent Query Logic. Functionality that allows you to set up rules to return a particular set of documents, or to run a secondary query in response to an initial keyword or conceptual query.
  • K
  • The IDOL component that extracts data, including text, metadata, and subfiles from over 1,000 different file types. KeyView can also convert documents to HTML format for viewing in a Web browser.
  • L
  • Lightweight Directory Access Protocol. Applications can use LDAP to retrieve information from a server. LDAP is used for directory services (such as corporate email and telephone directories) and user authentication. See Also: active directory, primary domain controller.
  • License Server enables you to license and run multiple IDOL solutions. You must have a License Server on a machine with a known, static IP address.
  • Also referred to as "links". Terms in query text that are also contained in the result documents that IDOL Server returns for this query.
  • An embedded scripting language that you can use to write custom scripts to expand certain IDOL functionality.
  • M
  • An automatic process for sending an email to users when new content is added to IDOL Server that matches their agents or profiles. See Also: alerting
  • A security setup where IDOL Connectors index documents into IDOL Server with an encrypted access control list, which IDOL Server uses to match user permissions for the document. With this method, IDOL Server does not need to check the original data repository to check the security information every time a user attempts to access the document. Compare With: unmapped security. See Also: access control list.
  • The ability for computers to act on the meaning of content. This includes conceptual searching, and also workflows that automatically process documents according to their content.
  • O
  • A server that manages access permissions for your users. It communicates with your repositories and IDOL Server to apply access permissions to documents.
  • P
  • A type of query that returns a list of all possible values of a specified field for documents that match a particular standard query. You can use the values to find matching documents with a particular property. This process is also known as faceted search. Compare With: FieldText
  • Personal Identification Number. A security feature that is used in addition to a user ID and password.
  • A server computer in a Microsoft Windows domain that controls various computer resources. See Also: active directory, LDAP.
  • Role-based capabilities that determine, for example, whether a user is allowed to access specific data.
  • Information about a user that is based on the concepts in documents that the user reads. Every time a user opens a document, IDOL Server updates their profile. This process allows the administrator to bring new documents that match the interests in a user profile to the attention of the users.
  • Targeted content that you want to display to users but that is not included in the search results, such as advertisements.
  • Q
  • A string that you submit to IDOL Server, which analyzes the concept of the query and returns documents that are conceptually similar to it. You can submit queries to IDOL Server to perform several kinds of search, such as natural language, Boolean, bracketed Boolean, and keyword.
  • A query operation that determines the important topics and phrases in a set of documents. Query summaries are used in Automatic Query Guidance (AQG). See Also: AQG.
  • A brief summary of each result document that returns for a query. The quick summary displays the first few sentences of the result document.
  • R
  • A string that is used to identify a document. This might be a title or a URL, and allows IDOL to identify documents for retrieval, indexing, and deduplication.
  • Fields used to identify documents. At index time IDOL Server can use ReferenceType fields to eliminate duplicate copies of documents. It uses them at query time to filter results.
  • The similarity that a particular query result has to the initial query. IDOL Server assigns each result a percentage relevance score according to how closely it matches the query criteria.
  • The process used to increase the accuracy of agents by indicating which of the results that return to you are most relevant to your query. The retrained agent then returns more relevant results.
  • A set of privileges that an administrator can allocate to an IDOL Server user.
  • S
  • The process of separating a document into sections for indexing. The number of sections that a document is split into increases proportionally with the size of the document. This process ensures that when you, for example, query for text that is relevant to a specific part of a book, IDOL Server can find the appropriate section and return it. If the book were not indexed in sections, IDOL Server might not find the text that you search for, because that text might not be conceptually relevant to the book as a whole.
  • Security includes anything that makes sure that only authorized users can access or perform actions on data. It includes making sure that only permitted users can view and retrieve documents, user authentication, and secure communications.
  • In clustering with snapshots, a seed is a potential cluster. It contains a document, and suggested conceptually similar documents from the IDOL Server index.
  • A form of Eduction that identifies positive and negative sentiment in text.
  • Internal raw data from which you can extract clusters. You can thus generate cluster information and spectrographs.
  • An IDOL Server query type that allows you to search for a term by using a phonetic spelling.
  • A graphical representation of the results of clustering. The spectrograph displays clusters of documents, and the similarities between different clusters.
  • The process that Connectors use to retrieve content from Web resources, by recursively following hyperlinks from an initial page.
  • The process of extracting the morphological root of a word. In languages, some words have a common morphological root. IDOL includes stemming algorithms that reduce words to this form. This process allows IDOL Server to match concepts regardless of the grammatical use of words. In English, for example, the words 'help', 'helpful', 'helping', and 'helped' all reduce to their stem 'help' without significant loss of meaning. IDOL includes as standard a set of stemming algorithms for the most commonly used languages. IDOL Server applies stemming after stop words have been discarded, both at index time (when content is stored in IDOL Server), and at query time (query text is stopped and stemmed before it is matched).
  • A very common word that occurs too frequently to be useful for searching. Stop words include articles (for example, the) and prepositions (for example, to or from). Stop words are language-specific. You can use a stop word list in IDOL Server to allow it to discard these words at index and query time to save index space and improve retrieval performance.
  • Also called stop list. A list (located in the IDOL Server langfiles directory) that contains common words (stop words) that IDOL Server does not store. Words such as the, and, or a occur too frequently to carry any significance, and IDOL Server does not require them to understand the concept of the text.
  • The process of removing the words listed in the stop word list from documents before they are stored in IDOL Server, and from query text before it is matched against IDOL Server content.
  • A set of query results, which is stored in IDOL Server for you to re-use later in other operations. When you store a state, IDOL Server provides a state token, which you can use to retrieve the stored state.
  • A file that has been extracted from a container (such as a .ZIP archive).
  • A type of query that returns documents that contain similar concepts to a particular document, rather than matching a particular query string. See Also: query
  • A few sentences or paragraphs that describe what a document is about. IDOL can automatically create summaries from document content.
  • A file that allows IDOL Server to handle synonym queries. A synonym query returns results which are conceptually similar to the query terms, and conceptually similar to the synonyms that are available for the query terms. A synonym file contains comma-separated lists of synonym strings for words. You can specify lists for each language type that you have set up in IDOL Server in this file.
  • A type of query that returns documents that contain synonyms for a particular search term, as well as documents that contain the term. See Also: query, synonym file
  • T
  • The process of adding extra information to documents. The tag might be a category, or entities returned from Eduction. Tagging usually adds a field to a document, which you can use to search by the name of a tag.
  • An automatically created hierarchical structure of clusters or other information. A taxonomy provides you with an overview of the information landscape, and an insight into specific areas of the information.
  • The basic entity that IDOL Server indexes (for example, a word in a document after it has been stemmed).
  • See TNW.
  • Terms and Weights. These values are used in categorization to define the most important terms that define a category topic.
  • Text, documents, and query syntax used to define the topic that an agent or category must match.
  • U
  • See UQL.
  • A security setup where IDOL Server checks the security entitlement of a user against the original data repositories in real time when the user attempts to access a document. With this method, IDOL Server always has the current security information, but the response can be slow because of the additional connection to the repository. Compare With: mapped security
  • Universal Query Language. A name for the IDOL Server query syntax, which you can use for keyword, conceptual, Boolean, and Wildcard searches.
  • V
  • An IDOL component that converts files in a repository to HTML format for viewing in a Web browser.
  • W
  • A character that stands in for any character or group of characters in a query.
  • X
  • Extensible Markup Language. XML is a language that defines the different attributes of document content in a format that can be read by humans and machines. In IDOL Server, you can index documents in XML format. IDOL Server also returns action responses in XML format.
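
The stemming, stop word, and stopping entries above describe one pipeline: stop words are discarded first, the remaining words are stemmed, and the same steps run at index time and at query time so that the resulting terms can match. The following is a minimal sketch of that order of operations only. The stop list, the suffix rules, and the names `STOP_WORDS`, `SUFFIXES`, `stem`, and `to_terms` are invented for this illustration; they are not the IDOL Server stop word list, its stemming algorithms, or any IDOL API.

```python
# Toy illustration only: the stop list and suffix rules are invented for this
# sketch and are not IDOL's stop word list or stemming algorithms.
STOP_WORDS = {"the", "a", "and", "or", "to", "from", "with"}
SUFFIXES = ("ful", "ing", "ed")  # checked in this order


def stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_terms(text):
    """Lowercase the text, discard stop words, then stem what remains."""
    words = [w.strip(".,;:!?'\"").lower() for w in text.split()]
    kept = [w for w in words if w and w not in STOP_WORDS]
    return [stem(w) for w in kept]


# The same function is applied to stored content and to query text.
index_terms = to_terms("The helpful guide helped the team")
query_terms = to_terms("helping a guide")

print(index_terms)
print(query_terms)
print(sorted(set(index_terms) & set(query_terms)))  # terms shared by both
```

In this sketch 'helpful', 'helped', and 'helping' all reduce to 'help', so the query shares terms with the stored text even though none of the surface forms are identical.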