Knowledge Discovery System Architecture
The Knowledge Discovery product suite contains many components, which you can use and configure in different ways to suit your own use cases. This section covers some of the basic features that many Knowledge Discovery systems have in common.
Knowledge Discovery Workflow
The following diagram describes the basic Knowledge Discovery workflow.
- Connectors retrieve your files in many different formats from various repositories.
- Ingest detects the formats of these files, and extracts the text by using File Content Extraction.
- You can configure many different kinds of data enrichment, either directly in Ingest, or separately by using other Knowledge Discovery components. For example, this stage might include:
  - media analysis, such as speech-to-text to get text from audio and video files, or optical character recognition to get written text from images.
  - entity extraction to retrieve useful values from files to store as metadata, or to perform redaction on sensitive information.
  - categorization to add metadata to the document that links it with other similar content.
- After enrichment, you index the data into the Content component, which processes and stores the content for queries and further analysis.
- You use the content. For example:
  - You can send simple queries to retrieve content related to your interest.
  - You can alert users to content that matches their user profiles and interests.
  - You can cluster the data to show links and trends in the information, and create visualizations of how the data changes over time.
- At all stages of this workflow, you can apply security to ensure that your documents are accessed only by permitted users.
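To make the order of the stages concrete, the workflow above can be sketched as a small in-memory pipeline. This is an illustrative toy only, not the Knowledge Discovery API: every name in it (`Document`, `connect`, `ingest`, `enrich`, `ContentIndex`, `run_pipeline`) is hypothetical, the "repository" is assumed to be a Python dictionary, and format detection, entity extraction, categorization, and security are each reduced to a one-line stand-in.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-in for entity extraction: find email-like strings.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Document:
    reference: str
    raw: bytes
    text: str = ""
    metadata: dict = field(default_factory=dict)
    allowed_users: frozenset = frozenset()


def connect(repository):
    """Connector stage: retrieve files (and their permitted users) from a repository."""
    for reference, (raw, users) in repository.items():
        yield Document(reference, raw, allowed_users=frozenset(users))


def ingest(doc):
    """Ingest stage: toy format detection followed by text extraction."""
    try:
        doc.text = doc.raw.decode("utf-8")
        doc.metadata["format"] = "text"
    except UnicodeDecodeError:
        doc.metadata["format"] = "binary"
    return doc


def enrich(doc):
    """Enrichment stage: store extracted entities and a category as metadata."""
    doc.metadata["emails"] = EMAIL.findall(doc.text)
    doc.metadata["category"] = (
        "finance" if "invoice" in doc.text.lower() else "general"
    )
    return doc


class ContentIndex:
    """Toy index: stores documents and answers security-filtered term queries."""

    def __init__(self):
        self._docs = []

    def add(self, doc):
        self._docs.append(doc)

    def query(self, term, user):
        # Return only documents that match the term AND that the user may see.
        term = term.lower()
        return [
            d.reference
            for d in self._docs
            if term in d.text.lower() and user in d.allowed_users
        ]


def run_pipeline(repository):
    """Run connect -> ingest -> enrich -> index for every file in the repository."""
    index = ContentIndex()
    for doc in connect(repository):
        index.add(enrich(ingest(doc)))
    return index


# Assumed sample data: file name -> (file bytes, users permitted to read it).
repository = {
    "doc1.txt": (b"Invoice 42 sent to alice@example.com", {"alice", "bob"}),
    "doc2.txt": (b"Meeting notes about the invoice backlog", {"alice"}),
}
index = run_pipeline(repository)
print(index.query("invoice", user="bob"))
```

In this sketch, the user `bob` sees only `doc1.txt`, because the security check is applied at query time alongside the term match. A real deployment separates these stages into independent components and configures them rather than coding them by hand.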
This section describes only a small portion of what you can do with Knowledge Discovery. For more examples, see Knowledge Discovery Feature Overview.