Glossary

    A
  • See ACL.
  • Autonomy Content Infrastructure. A technology layer that automates operations on unstructured information for cross-enterprise applications. ACI enables an automated and compatible business-to-business, peer-to-peer infrastructure.
  • A server component that runs on the Autonomy Content Infrastructure (ACI). ACI servers use the ACI API to accept queries and actions, and return XML responses.
  • Access Control List. A metadata string associated with a document that defines which users and groups are permitted to access the document.
  • A request sent to an ACI server.
  • A domain controller for the Microsoft Windows operating system, which uses LDAP to authenticate users and computers on a network.
  • A process that searches for information about a specific topic. An administrator can create agents for users or allow users to create their own agents.
  • An index that stores agents and profiles.
  • A field that stores Boolean agents (Boolean or Proximity expressions). You can then query the Content component with text and an AgentBoolean field to return categories whose Boolean agent matches this text. QMS rules use Boolean agents for matching queries.
    B
  • A rule that allows you to specify a list of disallowed words for queries, and removes any terms that appear on this list.
  • A rule that modifies a query to include extra FieldText criteria, for example to boost the relevance of certain results.
    C
  • A QMS rule that allows you to add a specified result to a specified position in a query results list.
  • A component that manages categorization and clustering.
  • Connector Framework Server. A component that processes the information that is retrieved by connectors. CFS uses File Content Extraction to extract document content and metadata from over 1,000 different file types. When the information has been processed, it is sent to a Content component index or Distributed Index Handler (DIH).
  • A component that manages users and communities.
  • A type of query that allows you to search for documents that match the concept that your query text defines, rather than matching the particular keywords in your text. See also: query.
  • A component (for example File System Connector) that retrieves information from a local or remote repository (for example, a file system, database, or Web site).
  • See CFS.
  • A component that manages the data index and performs most of the search and retrieval operations from the index.
    D
  • Distributed Action Handler. A component that distributes actions to multiple copies of a Knowledge Discovery component. It allows you to use failover, load balancing, or distributed content.
  • An index that stores content data. You can customize how data is stored in the data index by configuring appropriate settings in the Content component configuration file.
  • A Content component data pool that stores indexed information. The administrator can set up one or more databases, and specify how data is fed to the databases. By default, Content component contains the databases News and Archive, and the Community and Category Agentstore component contains the databases Profile, Agent, Activated, and Deactivated.
  • Distributed Index Handler. A component that allows you to efficiently split and index extremely large quantities of data into multiple copies of the Content component. DIH allows you to create a scalable solution that delivers high performance and high availability. It provides a flexible way to batch, route, and categorize the indexing of internal and external content into the Content component.
  • See DAH.
  • See DIH.
  • An integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system that contains the result data. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents that the user is allowed to see, from repositories that the user has permission to access. For more information, refer to the Document Security Administration Guide.
  • A query that returns a document or set of documents that you want to promote. See also: static promotions.
    E
  • See XML.
    F
  • Fields define different parts of content in Content component documents, such as the title, content, and metadata information.
  • A syntax string that defines a matching criterion in FieldText.
  • A type of query that searches for particular content in a particular document field. See also: query, field.
  • The component that extracts data, including text, metadata, and subfiles, from over 1,000 different file types. File Content Extraction can also convert documents to HTML format for viewing in a Web browser.
    H
  • A rule that modifies query text to replace specific terms with a more general term. For example, flower is a hypernym of rose and lily.
  • A rule that modifies query text to include terms that are specific instances of the original terms. For example, poodle and labrador are hyponyms of dog.
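The two rules above can be illustrated with a minimal Python sketch. It does not show how QMS implements these rules; the term table, the function names, and the choice to place the added terms directly after the original term are assumptions made for illustration only.

```python
# Hypothetical term table: general term -> more specific terms (hyponyms).
HYPONYMS = {
    "flower": ["rose", "lily"],
    "dog": ["poodle", "labrador"],
}

# Reverse lookup built from the table: specific term -> general term (hypernym).
HYPERNYMS = {
    specific: general
    for general, specifics in HYPONYMS.items()
    for specific in specifics
}


def apply_hypernyms(query: str) -> str:
    """Replace each specific term in the query with its more general term."""
    return " ".join(HYPERNYMS.get(term, term) for term in query.lower().split())


def apply_hyponyms(query: str) -> str:
    """Keep each term and add its specific instances directly after it."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        expanded.extend(HYPONYMS.get(term, []))
    return " ".join(expanded)
```

With this table, apply_hypernyms("red rose") returns "red flower", and apply_hyponyms("dog food") returns "dog poodle labrador food".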
    I
  • A structured file format that you can index into the Content component. You can use a connector to import files into this format, or you can manually create IDX files.
  • The Content component data index contains document content and field information for analysis and retrieval.
  • A Content component command to index data, or to maintain or manipulate the data index.
  • Fields that the Content component processes linguistically when it stores them. Store fields that contain text which you want to query frequently as Index fields. Content applies stemming and stop word lists to text in Index fields before it stores them, which allows it to process queries for these fields more quickly. Typically DRETITLE and DRECONTENT are fields that are set up as Index fields.
  • The process of storing data in the Content component. Content stores data in different field types (for example, index, numeric, and ordinary fields). It is important to store data in appropriate field types to ensure optimized performance.
    K
  • A family of products that allow you to collect, ingest, index, and process unstructured, semi-structured, and structured information from multiple sources and repositories.
    L
  • Lightweight Directory Access Protocol. Applications can use LDAP to retrieve information from a server. LDAP is used for directory services (such as corporate email and telephone directories) and user authentication. See also: active directory, primary domain controller.
  • License Server enables you to license and run multiple Knowledge Discovery solutions. You must have a License Server on a machine with a known, static IP address.
  • Also referred to as “links”. Terms in query text that are also contained in the result documents that the Content component returns for this query.
    O
  • A component that manages access permissions for your users. It communicates with your repositories and components to apply access permissions to documents.
    P
  • A QMS rule that allows you to place a specified parametric field in a specified position in a list.
  • A server computer in a Microsoft Windows domain that controls various computer resources. See also: active directory, LDAP.
  • The agent index that stores QMS rules.
  • Targeted content that you want to display to users but that is not included in the search results, such as advertisements.
  • A component that accepts incoming actions and distributes them to the appropriate subcomponent. Proxy also performs some maintenance operations to make sure that the subcomponents are running, and to start and stop them when necessary.
    Q
  • An ACI server that manipulates queries and results according to user-defined rules.
  • A document stored in the Promotion Agentstore that defines how QMS manages a query. Rules can return promotion documents, modify the original query, or modify the results of a query.
  • A string that you submit to the Content component, which analyzes the concept of the query and returns documents that are conceptually similar to it. You can submit queries to Content to perform several kinds of search, such as natural language, Boolean, bracketed Boolean, and keyword.
  • See QMS.
    R
  • A string that is used to identify a document. This string might be a title or a URL, and allows the Content component to identify documents for retrieval, indexing, and deduplication.
  • Fields used to identify documents. At index time the Content component can use ReferenceType fields to eliminate duplicate copies of documents. At query time Content can use ReferenceType fields to filter results.
  • The similarity that a particular query result has to the initial query. The Content component assigns each result a percentage relevance score according to how closely it matches the query criteria.
    S
  • A rule that allows you to return results from a promotions query that are relevant to a particular role, department, or other user-defined characteristic.
  • A specific document or set of documents that you want to return as a promotion. See also: dynamic promotions.
  • A rule that modifies query text to include terms that are synonymous with the original terms.
    T
  • The process of adding extra information to documents. The tag might be a category, or entities returned from Eduction. Tagging usually adds a field to a document, which you can use to search by the name of a tag.
  • The basic entity that the Content component stores (for example, a word in a document after Content applies stemming).
    V
  • A component that converts files in a repository to HTML format for viewing in a Web browser.
    W
  • A rule that allows you to specify a list of allowed words for queries, and to remove any terms that do not appear on this list.
  • A character that stands in for any character or group of characters in a query.
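The two entries above can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the allowed-term list and the function names are hypothetical, and the wildcard characters follow the conventions of Python's fnmatch module ("?" for one character, "*" for any run of characters), which is an assumption rather than documented query syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical list of allowed query terms.
ALLOWED_TERMS = {"annual", "report", "sales"}


def apply_allowed_list(query, allowed=ALLOWED_TERMS):
    """Remove every query term that does not appear in the allowed list."""
    return " ".join(term for term in query.lower().split() if term in allowed)


def expand_wildcard(pattern, vocabulary):
    """Return the vocabulary terms that match a wildcard pattern."""
    return [term for term in vocabulary if fnmatchcase(term, pattern)]
```

For example, apply_allowed_list("Free annual sales report") keeps only "annual sales report", and expand_wildcard("rep*", ["report", "reply", "sales"]) matches "report" and "reply".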
    X
  • Extensible Markup Language. XML is a language that defines the different attributes of document content in a format that can be read by humans and machines. In Knowledge Discovery, you can index documents in XML format. Knowledge Discovery can also return action responses in XML format.
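As a rough illustration of reading an XML action response, the following Python sketch parses a small document with the standard library. The sample XML is invented for this example: its element names (response, status, hit, reference, title) are assumptions and do not reproduce the actual response schema of any Knowledge Discovery component.

```python
import xml.etree.ElementTree as ET

# Invented sample response; element names are hypothetical.
SAMPLE_RESPONSE = """\
<response>
  <status>SUCCESS</status>
  <hit><reference>doc-1</reference><title>First document</title></hit>
  <hit><reference>doc-2</reference><title>Second document</title></hit>
</response>
"""


def parse_hits(xml_text):
    """Return a (reference, title) pair for each hit element in the response."""
    root = ET.fromstring(xml_text)
    return [
        (hit.findtext("reference"), hit.findtext("title"))
        for hit in root.iter("hit")
    ]
```

Calling parse_hits(SAMPLE_RESPONSE) returns the two (reference, title) pairs in document order.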