Introduction
Media Server can perform Speech-to-Text, which recognizes speech in an audio track and converts it into text. When an audio or video source contains narration or dialogue, you can run Speech-to-Text and index the resulting metadata into IDOL Server so that IDOL can:
- search all of the video that you have processed to find clips where a speaker talks about a specific topic.
- categorize video clips.
- cluster video clips that contain related concepts.