Introduction to Speech Processing

IDOL Speech Server encompasses several speech processing functions in a single ACI server. IDOL Speech Server can perform:

All speech operations are asynchronous, because this approach suits speech processing, especially live speech. You send separate requests to the server to retrieve results from the processing operations. This allows IDOL Speech Server to report and flag relevant events immediately, so that you do not have to wait until the entire file is processed.
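
The asynchronous pattern described above can be sketched with a small in-memory stand-in for a server. This is an illustration only: FakeSpeechServer, add_task, get_results, and collect_results are hypothetical names invented for this sketch and are not IDOL Speech Server actions. The "processing" simply uppercases each text segment.

```python
import threading
import time
import uuid


class FakeSpeechServer:
    """In-memory stand-in for an asynchronous server (hypothetical, not the ACI API)."""

    def __init__(self):
        self._tasks = {}
        self._lock = threading.Lock()

    def add_task(self, segments):
        """Start processing in the background and return a token immediately."""
        token = uuid.uuid4().hex
        with self._lock:
            self._tasks[token] = {"results": [], "finished": False}
        worker = threading.Thread(target=self._process, args=(token, segments))
        worker.start()
        return token

    def _process(self, token, segments):
        for segment in segments:
            time.sleep(0.01)  # pretend to process one segment of audio
            with self._lock:
                self._tasks[token]["results"].append(segment.upper())
        with self._lock:
            self._tasks[token]["finished"] = True

    def get_results(self, token):
        """Return a snapshot of the results so far and whether the task has finished."""
        with self._lock:
            task = self._tasks[token]
            return list(task["results"]), task["finished"]


def collect_results(server, token, poll_interval=0.005):
    """Poll with separate requests until the task finishes, then return all results."""
    while True:
        results, finished = server.get_results(token)
        if finished:
            return results
        time.sleep(poll_interval)
```

In this sketch, add_task returns as soon as the work is queued, and get_results can be called at any point to see whatever has been produced so far, which mirrors the idea of reading results before the whole file is processed.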

IDOL Speech Server supports multiple languages. A single instance of IDOL Speech Server can process several languages simultaneously.

In addition to audio files, IDOL Speech Server can process audio as binary data sent as data blocks or streams. Sending data as a binary block is useful for processing a local file on a remote server that cannot view the local file system. Audio streaming makes real-time data processing possible, for example, converting incoming audio to text as it occurs.
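
As a rough illustration of preparing audio as binary blocks, the following sketch reads any binary stream in fixed-size pieces. The helper name iter_blocks and the 4096-byte block size are assumptions made for this example. How each block is actually submitted depends on the server's API, which is not shown here.

```python
import io


def iter_blocks(stream, block_size=4096):
    """Yield successive binary blocks from a file-like object until it is exhausted."""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


# Usage: an in-memory buffer stands in for a local audio file opened in binary mode.
audio = io.BytesIO(b"\x00" * 10000)
block_sizes = [len(block) for block in iter_blocks(audio)]
```

Reading in blocks keeps memory use bounded and lets the sender start transmitting before the whole file has been read, which is also the shape a live stream takes.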

IDOL Speech Server lets you put together combinations of speech processing functions to create custom operations, allowing you to perform several processes simultaneously on audio data.

The following sections introduce the speech processing functions.

