Introduction

Media Server can perform Speech-to-Text, which extracts speech from an audio or video source and converts it into text. When the source contains narration or dialogue, you can run Speech-to-Text and index the resulting metadata into IDOL Server so that IDOL can:

  • search all of the video that you have processed to find clips where a speaker talks about a specific topic.
  • categorize video clips.
  • cluster video clips that contain related concepts.