Glossary

  • A
  • See ACL.
  • Autonomy Content Infrastructure. A technology layer that automates operations on unstructured information for cross-enterprise applications. ACI enables an automated and compatible business-to-business, peer-to-peer infrastructure.
  • A server component that runs on the Autonomy Content Infrastructure (ACI). ACI servers use the ACI API to accept queries and actions, and return XML responses.
  • Access Control List. A metadata string associated with a document that defines which users and groups are permitted to access the document.
  • A request sent to an ACI server.
  • A domain controller for the Microsoft Windows operating system, which uses LDAP to authenticate users and computers on a network.
  • A process that searches for information about a specific topic. An administrator can create agents for users or allow users to create their own agents.
  • The process of checking user credentials (user names, passwords, and PIN codes) against a Community component or external security repository. The authentication process identifies a user, and allows Knowledge Discovery to confirm their access permissions for different documents.
  • C
  • A component that manages categorization and clustering.
  • Connector Framework Server. A component that processes the information that is retrieved by connectors. CFS uses File Content Extraction to extract document content and metadata from over 1,000 different file types. When the information has been processed, it is sent to a Content component index or Distributed Index Handler (DIH).
  • A DAH virtual database that combines results from several non-identical Content component databases. See also: virtual database, distributor database.
  • A component that manages users and communities.
  • A component (for example, File System Connector) that retrieves information from a local or remote repository (for example, a file system, database, or Web site).
  • See CFS.
  • A component that manages the data index and performs most of the search and retrieval operations from the index.
  • D
  • Distributed Action Handler. A component that distributes actions to multiple copies of a Knowledge Discovery component. It allows you to use failover, load balancing, or distributed content.
  • A Content component data pool that stores indexed information. The administrator can set up one or more databases, and specify how to feed data to the databases.
  • Distributed Index Handler. A component that allows you to efficiently split and index extremely large quantities of data into multiple copies of the Content component. DIH allows you to create a scalable solution that delivers high performance and high availability. It provides a flexible way to batch, route, and categorize the indexing of internal and external content into the Content component.
  • See DAH.
  • See DIH.
  • A DAH virtual database that retrieves results from several identical Content component databases. For each query, it retrieves results from only one of the identical copies. See also: virtual database, combinator database.
  • An integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system that contains the result data. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents that the user is allowed to see, from repositories that the user has permission to access. For more information, refer to the Document Security Administration Guide.
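The two DAH virtual database types described in this section (one combines results from non-identical databases, the other reads from a single copy among identical databases) can be contrasted with a small Python sketch. This is a toy model, not DAH code: the function names, the in-memory lists standing in for Content component databases, and the round-robin choice of copy are all assumptions made for illustration.

```python
from itertools import count

# Toy stand-ins for Content component databases: each is a list of document
# references. Nothing here calls a real Knowledge Discovery API.
news = ["news/1", "news/2"]
archive = ["archive/7"]
copy_a = ["doc/1", "doc/2"]
copy_b = ["doc/1", "doc/2"]  # identical copy of copy_a


def combinator_query(databases):
    """Combine results from several non-identical databases."""
    results = []
    for db in databases:
        results.extend(db)
    return results


_next_copy = count()  # hypothetical round-robin counter (an assumption)


def distributor_query(copies):
    """Return results from only one of several identical copies."""
    chosen = copies[next(_next_copy) % len(copies)]
    return list(chosen)


combined = combinator_query([news, archive])
distributed = distributor_query([copy_a, copy_b])
```

In this sketch the combined query sees documents from both databases, while the distributed query touches a single copy, which is what makes the second arrangement suitable for load balancing and failover.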
  • F
  • The process of downloading documents from the repository in which they are stored (such as a local folder, Web site, database, or Lotus Domino server), importing them to IDX format, and indexing them into a Content component.
  • A group of settings that instruct a connector how to retrieve data from a repository. Connectors can run fetch tasks automatically, or in response to an action.
  • Fields define different parts of content in Content component documents, such as the title, content, and metadata information.
  • The component that extracts data, including text, metadata, and subfiles from over 1,000 different file types. File Content Extraction can also convert documents to HTML format for viewing in a Web browser.
  • I
  • A structured file format that can be indexed into the Content component. You can use a connector to import files into this format, or you can manually create IDX files.
  • After a document has been downloaded from the repository in which it is stored, it is imported to an IDX or XML file format. This process is called “importing”.
  • The Content component data index contains document content and field information for analysis and retrieval.
  • A Content component command to index data, or to maintain or manipulate the data index.
  • The process of storing data in the Content component. You can store data in different field types (index, numeric, and ordinary fields) or prevent Content from storing it. It is important to store data in appropriate field types to ensure optimized performance. Content can return any fields it stores for queries. However, you can query only for terms in Index fields.
  • K
  • A family of products that allow you to collect, ingest, index, and process unstructured, semi-structured, and structured information from multiple sources and repositories.
  • L
  • Lightweight Directory Access Protocol. Applications can use LDAP to retrieve information from a server. LDAP is used for directory services (such as corporate email and telephone directories) and user authentication. See also: active directory, primary domain controller.
  • License Server enables you to license and run multiple Knowledge Discovery solutions. You must have a License Server on a machine with a known, static IP address.
  • M
  • A distribution mode in which DIH distributes to several identical copies of the Content component, for failover or load-balancing.
  • N
  • A distribution mode in which DIH distributes the index between several Content components, which all contain a different segment of the index.
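The two DIH distribution modes described above (identical copies versus separate index segments) can be contrasted with a short Python sketch. It is an illustration under stated assumptions, not DIH behavior: the function names are hypothetical, plain lists stand in for Content components, and routing by a CRC32 hash of the document reference is an assumed rule chosen only because it is deterministic.

```python
import zlib


def distribute_to_copies(doc_ref, engines):
    """Identical-copy mode: every engine receives every document."""
    for engine in engines:
        engine.append(doc_ref)


def distribute_to_segments(doc_ref, engines):
    """Segmented mode: each document goes to exactly one engine."""
    # Assumed routing rule: hash the reference and pick an engine by modulo.
    index = zlib.crc32(doc_ref.encode("utf-8")) % len(engines)
    engines[index].append(doc_ref)


docs = ["doc/1", "doc/2", "doc/3", "doc/4"]

copies = [[], []]
for ref in docs:
    distribute_to_copies(ref, copies)

segments = [[], []]
for ref in docs:
    distribute_to_segments(ref, segments)
```

After the loops, both lists in `copies` hold all four documents, while the lists in `segments` hold disjoint subsets that together cover the full set.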
  • O
  • A component that manages access permissions for your users. It communicates with your repositories and components to apply access permissions to documents.
  • P
  • Personal Identification Number. A security feature that is used in addition to a user ID and password.
  • A server computer in a Microsoft Windows domain that controls various computer resources. See also: active directory, LDAP.
  • Role-based capabilities that determine, for example, whether a user is allowed to access specific data.
  • Information about a user that is based on the concepts in documents that the user reads. Every time a user opens a document, the Community component updates their profile. This process allows the administrator to alert users to new content that matches the interests in their profiles.
  • Targeted content, such as advertisements, that you want to display to users but that is not included in the search results.
  • A component that accepts incoming actions and distributes them to the appropriate subcomponent. Proxy also performs some maintenance operations to make sure that the subcomponents are running, and to start and stop them when necessary.
  • Q
  • A document stored in the Promotion Agentstore that defines how QMS manages a query. Rules can return promotion documents, modify the original query, or modify the results of a query. See also: Query Manipulation Server (QMS).
  • A string that you submit to the Content component, which analyzes the concept of the query and returns documents that are conceptually similar to it. You can submit queries to Content to perform several kinds of search, such as natural language, Boolean, bracketed Boolean, and keyword.
  • A JavaScript application that manipulates queries and query results.
  • An ACI server that manipulates queries and results according to user-defined rules.
  • R
  • A string that is used to identify a document, such as a title or a URL. It allows the Content component to identify documents for retrieval, indexing, and deduplication.
  • Fields used to identify documents. At index time the Content component can use ReferenceType fields to eliminate duplicate copies of documents. At query time Content can use ReferenceType to filter results.
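Index-time deduplication by reference, as described above, can be illustrated with a minimal Python sketch. The field name `url`, the function name, and the rule that a later document replaces an earlier one with the same reference are assumptions for illustration; the glossary does not say which copy the Content component keeps.

```python
def index_with_dedup(documents, reference_field="url"):
    """Keep one document per reference value."""
    index = {}
    for doc in documents:
        # Assumption: a later document replaces an earlier one that has
        # the same reference value.
        index[doc[reference_field]] = doc
    return list(index.values())


incoming = [
    {"url": "http://example.com/a", "title": "Draft"},
    {"url": "http://example.com/b", "title": "Other"},
    {"url": "http://example.com/a", "title": "Final"},
]

indexed = index_with_dedup(incoming)
titles = [doc["title"] for doc in indexed]
```

Three documents arrive but only two are kept, because two of them share the same reference.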
  • A set of privileges that the administrator allocates to a Knowledge Discovery user.
  • S
  • Security includes anything that makes sure that only authorized users can access or perform actions on data. It includes making sure that only permitted users can view and retrieve documents, user authentication, and secure communications.
  • A type of query that returns documents that contain similar concepts to a particular document, rather than matching a particular query string. See also: query.
  • T
  • The basic entity that the Content component indexes (for example, a word in a document after Content applies stemming to it).
  • V
  • A component that converts files in a repository to HTML format for viewing in a Web browser.
  • In the DAH, a virtual database controls the mapping between the DAH and specific databases in the child servers. See also: combinator database, distributor database.
  • W
  • A character that stands in for any character or group of characters in a query.
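As a rough illustration of how a character can stand in for other characters, the following Python sketch expands patterns against a small list of terms. It uses the conventions of Python's standard `fnmatch` module (`*` for any group of characters, `?` for exactly one character); these are assumptions for the example, since this glossary does not state which wildcard characters a Knowledge Discovery query accepts. The term list and function name are also hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical list of indexed terms.
terms = ["index", "indexing", "indexed", "import", "query"]


def expand_wildcard(pattern, candidates):
    """Return the candidates that match a shell-style wildcard pattern."""
    return [term for term in candidates if fnmatchcase(term, pattern)]


any_suffix = expand_wildcard("index*", terms)  # "*" matches any run of characters
single_char = expand_wildcard("inde?", terms)  # "?" matches exactly one character
```

Here `any_suffix` contains "index", "indexing", and "indexed", while `single_char` contains only "index".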
  • X
  • Extensible Markup Language. XML is a language that defines the different attributes of document content in a format that can be read by humans and machines. In Knowledge Discovery, you can index documents in XML format. Knowledge Discovery can also return action responses in XML format.