Audio Analysis

IDOL Media Server extracts information from audio, returning the results as XML documents, which you can index into IDOL Server.

Media Server can process audio and video files that are compressed in various formats. You can also stream uncompressed audio directly into Media Server.
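
Because results are returned as XML, a client application can read them with any standard XML parser. The following is a minimal sketch in Python. The element and attribute names (results, record, start, end, text) are assumptions made for illustration and do not reflect the actual Media Server output schema; check the output of your own analysis tasks for the real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical result document. The element and attribute names are
# assumptions for illustration, not Media Server's actual schema.
SAMPLE_XML = """
<results>
  <record start="0.00" end="0.42"><text>hello</text></record>
  <record start="0.42" end="0.95"><text>world</text></record>
</results>
"""


def extract_words(xml_text):
    """Return (start_seconds, end_seconds, text) tuples from the sample layout."""
    root = ET.fromstring(xml_text)
    words = []
    for record in root.iter("record"):
        start = float(record.get("start"))
        end = float(record.get("end"))
        text = record.findtext("text", default="").strip()
        words.append((start, end, text))
    return words


if __name__ == "__main__":
    for start, end, text in extract_words(SAMPLE_XML):
        print(f"{start:.2f}-{end:.2f}: {text}")
```

The same approach applies to any analysis result: iterate over the records and pull out the fields you want to index or display.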

Speech-to-Text

Media Server can transcribe spoken words into text.

Speaker Identification

When trained on sample audio from known speakers, Media Server can identify those speakers in audio.

Spoken Language Identification

Media Server can identify the language that is being spoken.

Audio Matching

Media Server can detect particular audio sections (for example, melodies, adverts, or jingles) and return their location in the audio.

Audio Classification

Media Server can classify audio segments as music, noise, or speech, and report details about the audio quality. See Audio Classification.

Transcript Alignment

Given an audio file and a transcript (for example, an audiobook and a copy of the book), Media Server can align the words in the transcript with their positions in the audio.
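
To illustrate the idea of transcript alignment, the following Python sketch matches the words of a known transcript against a list of recognized words that carry timestamps, and copies the timestamps onto the transcript words that match. This is a toy illustration built on the standard difflib module, not the method that Media Server uses; the function name align_transcript and the input layout are hypothetical.

```python
from difflib import SequenceMatcher


def align_transcript(transcript_words, recognized):
    """Attach start times to transcript words.

    transcript_words: list of words from the known transcript.
    recognized: list of (word, start_seconds) pairs, for example from
        a speech recognizer (hypothetical input layout).
    Returns a list of (word, start_seconds or None) pairs.
    """
    ref_words = [w.lower() for w in transcript_words]
    rec_words = [w.lower() for w, _ in recognized]
    matcher = SequenceMatcher(a=ref_words, b=rec_words, autojunk=False)

    # Words with no matching recognized word keep a start time of None.
    aligned = [(w, None) for w in transcript_words]
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            i = block.a + k
            j = block.b + k
            aligned[i] = (transcript_words[i], recognized[j][1])
    return aligned


if __name__ == "__main__":
    transcript = ["Call", "me", "Ishmael", "some", "years", "ago"]
    recognized = [("call", 0.0), ("me", 0.3), ("is", 0.5), ("male", 0.7),
                  ("some", 1.0), ("years", 1.3), ("ago", 1.7)]
    for word, start in align_transcript(transcript, recognized):
        print(word, start)
```

In this sketch, a word that the recognizer misheard (here "Ishmael") is left without a position. A real aligner would typically estimate positions for such words from their neighbors.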