Audio Analysis
IDOL Media Server extracts information from audio, returning the results as XML documents, which you can index into IDOL Server.
Media Server can process audio and video files that are compressed in various formats. You can also stream uncompressed audio directly into Media Server.
Speech-to-Text
Media Server can transcribe spoken words into text.
Speaker Identification
After you train it with sample audio from known speakers, Media Server can identify those speakers in audio.
Spoken Language Identification
Media Server can identify the language that is being spoken.
Audio Matching
Media Server can detect particular audio sections (for example, melodies, adverts, or jingles) and return their location in the audio.
Audio Classification
Media Server can classify audio segments as music, noise, or speech, and report details of the audio quality. See Audio Classification.
Transcript Alignment
Given an audio file and a transcript (for example, an audiobook and the text of the book), Media Server can align the words in the transcript with their positions in the audio.
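To make the idea of transcript alignment concrete, the following is a minimal conceptual sketch in Python. It is not Media Server's algorithm or API. It assumes a hypothetical recognizer has already produced a list of (word, start time in seconds) pairs, and it uses the standard library's difflib.SequenceMatcher to match those words against the known transcript. The function name align_transcript and the sample data are illustrative only.

```python
from difflib import SequenceMatcher


def align_transcript(transcript_words, recognized):
    """Attach start times to transcript words.

    transcript_words: list of words from the known transcript.
    recognized: list of (word, start_seconds) pairs from a hypothetical
        speech recognizer (an assumption, not Media Server output).
    Returns a list of (transcript_word, start_seconds or None).
    """
    reference = [w.lower() for w in transcript_words]
    hypothesis = [w.lower() for w, _ in recognized]

    # Find runs of words that appear in the same order in both sequences.
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)

    # Words with no match in the recognizer output keep a time of None.
    aligned = [(word, None) for word in transcript_words]
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            i = block.a + offset
            j = block.b + offset
            aligned[i] = (transcript_words[i], recognized[j][1])
    return aligned


# Illustrative data: the recognizer misheard "some" as "sum".
transcript = "Call me Ishmael some years ago".split()
recognized = [
    ("call", 0.0), ("me", 0.4), ("ishmael", 0.7),
    ("sum", 1.5), ("years", 1.9), ("ago", 2.3),
]
result = align_transcript(transcript, recognized)
for word, start in result:
    print(word, start)
```

In this sketch, a transcript word that the recognizer misheard is left without a time. A production aligner would typically estimate such positions from neighboring words or from the audio itself.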