Glossary

  • A
  • See ACL.
  • Autonomy Content Infrastructure. A technology layer that automates operations on unstructured information for cross-enterprise applications. ACI enables an automated and compatible business-to-business, peer-to-peer infrastructure.
  • A server component that runs on the Autonomy Content Infrastructure (ACI). ACI servers use the ACI API to accept queries and actions, and return XML responses.
  • Access Control List. A metadata string associated with a document that defines which users and groups are permitted to access the document.
  • A request sent to an ACI server.
  • A domain controller for the Microsoft Windows operating system, which uses LDAP to authenticate users and computers on a network.
  • A process that searches for information about a specific topic. An administrator can create agents for users or allow users to create their own agents. See Also: explicit agent, implicit agent
  • Automatic Language Detection. The process of automatically detecting the language of a particular document, and indexing it into IDOL Server according to the rules for the detected language.
  • An automatic process for alerting users, by email, text, or message, when new content is added to IDOL that matches their agents or profiles. See Also: mailing.
  • Automatic Number Plate Recognition, which reads the number/license plate of a vehicle.
  • An IDOL component that processes natural language questions, and returns direct answers.
  • Automatic Query Guidance. A set of operations that use the results from query summaries. AQG includes dynamic thesaurus generation, automatic query disambiguation, query refinement, and rapid clustering of a results set.
  • See ALD.
  • See AQG.
  • B
  • A type of query that uses Boolean operators such as AND, OR, and NOT to specify matching criteria.
  • C
  • The process of matching documents against the available IDOL categories, and optionally tagging the document with category information.
  • A set of criteria that define a particular topic, which you can use to categorize documents that contain content relevant to the topic.
  • An IDOL component that manages categorization and clustering.
  • Connector Framework Server. An IDOL component that processes the information that is retrieved by connectors. CFS uses KeyView to extract document content and metadata from over 1,000 different file types. When the information has been processed, it is sent to an IDOL index or Distributed Index Handler (DIH).
  • A set of documents that IDOL identifies as being related. Each cluster represents a concept area, which contains a set of items that share common properties. Clustering data allows you to make trends and developments in data visible.
  • The process of grouping documents into sets (clusters) that have related content. Each cluster represents a concept area, which contains a set of items that share common properties. Clustering data allows you to make trends and developments in data visible.
  • An IDOL component that manages users and communities.
  • A type of query that allows you to search for documents that match the concept that your query text defines, rather than matching the particular keywords in your text. See Also: query.
  • An IDOL component (for example, File System Connector) that retrieves information from a local or remote repository (for example, a file system, database, or Web site).
  • See CFS.
  • An IDOL component that manages the data index and performs most of the search and retrieval operations from the index.
  • D
  • Distributed Action Handler. An IDOL component that distributes actions to multiple copies of IDOL Server or a component. It allows you to use failover, load balancing, or distributed content.
  • Distributed Index Handler. An IDOL component that allows you to efficiently split and index extremely large quantities of data into multiple copies of IDOL Server or the Content component. DIH allows you to create a scalable solution that delivers high performance and high availability. It provides a flexible way to batch, route, and categorize the indexing of internal and external content into IDOL Server.
  • See DAH.
  • See DIH.
  • An integrated security solution to protect your data. At the front end, authentication checks that users are allowed to access the system that contains the result data. At the back end, entitlement checking and authentication combine to ensure that query results contain only documents that the user is allowed to see, from repositories that the user has permission to access. For more information, refer to the IDOL Document Security Administration Guide.
  • A type of automatic query guidance (AQG) that provides a list of similar terms and concepts for a particular query.
  • E
  • The process of extracting entities (patterns of text) from documents.
  • In Eduction, an entity is a word, phrase, or block of information that the Eduction component can match and extract from documents. An entity can be a specific text string, such as a name, or it can be a pattern of text such as an address or phone number. You define the pattern in a grammar, which Eduction uses to find the entities in documents.
  • See Eduction.
  • An IDOL agent that users explicitly create for themselves. See Also: agent, implicit agent
  • F
  • Fields define different parts of content in IDOL documents, such as the title, content, and metadata information.
  • A type of query that searches for particular content in a particular document field. See Also: query, field
  • See parametric search.
  • G
  • A search for a location, based on coordinates. With appropriate content, IDOL can search for a specific location, or locations in a particular area, or within a specified distance of a point.
  • In Eduction, a grammar is a file that defines the entities that you want to extract. It can be a simple list of entities, or a pattern that defines what the entity looks like. See Also: Eduction, entity
  • H
  • The ability for IDOL Server to connect related documents to results, by using suggestions. See Also: suggest
  • I
  • A family of products that allow you to collect, ingest, index, and process unstructured, semi-structured, and structured information from multiple sources and repositories.
  • An IDOL component that accepts incoming actions and distributes them to the appropriate subcomponent. IDOL Proxy also performs some maintenance operations to make sure that the subcomponents are running, and to start and stop them when necessary.
  • An IDOL agent that is created as part of a user profile. When you profile a user, IDOL creates these agents for a user, according to the documents and search results that the user views. See Also: agent, explicit agent
  • When Media Server finds the same object (for example, the same number plate) across multiple video frames, integration aggregates the results to help filter out occasional outliers (for example, if one of the characters on the number plate is read incorrectly in one of the frames).
  • K
  • The first frame following a significant scene change. Keyframes are often used as preview images for video clips.
  • The IDOL component that extracts data, including text, metadata, and subfiles from over 1,000 different file types. KeyView can also convert documents to HTML format for viewing in a Web browser.
  • L
  • The process of identifying the language or languages being spoken in audio.
  • A statistical model that captures word sequence patterns and probabilities. Sometimes you can improve the accuracy of speech-to-text by training a custom language model to supplement a language pack supplied by OpenText.
  • A data file that is required to perform speech-to-text in a single language. Language packs can contain hundreds of megabytes of data, so they are not supplied with Media Server but are available as separate downloads.
  • Lightweight Directory Access Protocol. Applications can use LDAP to retrieve information from a server. LDAP is used for directory services (such as corporate email and telephone directories) and user authentication. See Also: active directory, primary domain controller.
  • License Server enables you to license and run multiple IDOL solutions. You must have a License Server on a machine with a known, static IP address.
  • M
  • An automatic process for sending an email to users when new content is added to IDOL Server that matches their agents or profiles. See Also: alerting
  • A security setup where IDOL Connectors index documents into IDOL with an encrypted access control list (ACL), which IDOL uses to match user permissions for the document. With this method, IDOL does not need to consult the original data repository for security information every time a user attempts to access the document. See Also: ACL
  • An IDOL component that analyzes video files and streams, image files, and audio to extract information about their content. Media Server can run analysis operations such as face recognition, number plate recognition, speech-to-text, and speaker identification.
  • N
  • The process of answering a question that is asked in normal speech-style language, rather than in query language. IDOL Answer Server can process natural language questions and return answers.
  • O
  • An IDOL component that manages access permissions for your users. It communicates with your repositories and IDOL components to apply access permissions to documents.
  • P
  • A type of query that returns a list of all possible values of a specified field for documents that match a particular standard query. You can use the values to find matching documents with a particular property. This process is also known as filtering or faceted search. Compare With: FieldText
  • An array of twelve numbers that Media Server can use to convert locations in the scene image into real-world 3D coordinates.
  • A server computer in a Microsoft Windows domain that controls various computer resources. See Also: active directory, LDAP.
  • Information about a user that is based on the concepts in documents that the user reads. Every time a user opens a document, IDOL updates their profile. This process allows the administrator to bring new documents that match the interests in a user profile to the attention of the users.
  • Q
  • An IDOL component that modifies user queries and manipulates the results, for example to return promotions, remove particular query terms, or add synonyms.
  • A text string that you submit to IDOL, which analyzes the concept of the query text and returns documents that are conceptually similar to it. You can submit queries to IDOL to perform several kinds of search, such as natural language, Boolean, bracketed Boolean, and keyword.
  • See QMS.
  • A query operation that determines the important topics and phrases in a set of documents. Query summaries are used in Automatic Query Guidance (AQG). See Also: AQG.
  • R
  • A single package of metadata in a track. A record produced by an analysis task might describe a recognized face, a word spoken in the audio, or a number plate detected by ANPR. A record can contain a significant amount of information; for example, a record describing a number plate includes timestamps describing when the number plate was detected, the position of the number plate in the video frame, the characters read from the number plate, the confidence score for recognition, and so on.
  • The process of removing sensitive content from output. IDOL supports text redaction through Eduction, and face redaction in Media Server.
  • The ability to analyze audio, images, and videos for additional value and information, such as speech-to-text, optical character recognition (OCR), face recognition and identification, and object classification.
  • A fixed-size storage area on disk where you can save encoded video on a continuous basis. When the rolling buffer is full, the oldest content is discarded to make space for the latest.
  • S
  • Scene analysis recognizes suspicious activity in video and produces alarms to alert security personnel. Scene analysis can be trained to recognize many suspicious events, including vehicles driving through red lights, people entering restricted areas, and abandoned bags and vehicles.
  • The process of separating long documents into multiple sections for indexing. The number of sections increases in proportion to the size of the document. This process ensures that when you, for example, query for text that is relevant to a specific part of a book, IDOL can find the appropriate section and return it. If the book was not indexed in sections, IDOL might not find the text you searched for, because it might not be conceptually relevant to the entire book.
  • See section breaking.
  • A form of Eduction that identifies positive and negative sentiment in text.
  • A type of query that allows you to search for a term by using a phonetic spelling.
  • The process of identifying known speakers in audio.
  • A graphical representation of the results of clustering. The spectrograph displays clusters of documents, and the similarities between different clusters.
  • The process of converting spoken words, in a media file or stream, into text through speech recognition.
  • The process of extracting the morphological root of a word. In languages, some words have a common root. IDOL includes stemming algorithms that reduce words to this form. This process allows IDOL to match concepts regardless of the grammatical use of words. In English, for example, the words 'help', 'helpful', 'helping', and 'helped' all reduce to their stem 'help' without significant loss of meaning.
  • A type of query that returns documents that contain similar concepts to a particular document, rather than matching a particular query string. See Also: query
  • A few sentences or paragraphs that describe what a document is about. IDOL can automatically create summaries from document content.
  • A type of query that returns documents that contain synonyms for a particular search term, as well as documents that contain the exact term. See Also: query
  • T
  • An automatically created hierarchical structure of clusters or other information. A taxonomy provides you with an overview of the information landscape, and an insight into specific areas of the information.
  • A stream of data produced by a processing task in Media Server. For example, when you ingest video, the ingest task produces two tracks: one for video frames and the other for audio packets. Other tasks use these tracks. Analysis tasks read the data and produce tracks that contain analysis results; encoding tasks take the video and audio data to write files to disk. See Also: record
  • The process of assigning timestamps to the words in a transcript. Unlike speech-to-text, transcript alignment requires that you already have a transcript of the speech.
  • V
  • An IDOL component that converts files in a repository to HTML formats for viewing in a Web browser.
  • W
  • A character that stands in for any character or group of characters in a query.
  • X
  • Extensible Markup Language. XML is a language that defines the different attributes of document content in a format that can be read by humans and machines. In IDOL Server, you can index documents in XML format. IDOL Server also returns action responses in XML format.
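
Because action responses are returned as XML, a client application typically parses the response before using it. The following is a minimal sketch in Python, using only the standard library. The sample document, its element names (response, status, hit, reference, title), and the parse_response helper are hypothetical and chosen for illustration only; they are not the actual IDOL response schema, so consult the product documentation for the real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical response text. The element names are illustrative
# assumptions, not the real IDOL response schema.
SAMPLE_RESPONSE = """
<response>
  <status>SUCCESS</status>
  <hit>
    <reference>doc-1</reference>
    <title>First document</title>
  </hit>
  <hit>
    <reference>doc-2</reference>
    <title>Second document</title>
  </hit>
</response>
"""


def parse_response(xml_text):
    """Return (status, hits) extracted from an XML response string."""
    root = ET.fromstring(xml_text.strip())
    # findtext returns the default when the element is missing.
    status = root.findtext("status", default="")
    hits = [
        {
            "reference": hit.findtext("reference", default=""),
            "title": hit.findtext("title", default=""),
        }
        for hit in root.iter("hit")
    ]
    return status, hits


if __name__ == "__main__":
    status, hits = parse_response(SAMPLE_RESPONSE)
    print(status, [h["reference"] for h in hits])
```

The helper returns plain Python values (a string and a list of dictionaries), so the rest of an application does not need to know about the XML layout.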