Use an Adapted Acoustic Model

Note: In the 10.7 release of IDOL Speech Server, you could use acoustic adaptation to adapt the Gaussian Mixture Model (GMM) acoustic models to match an audio domain. To improve speech-to-text accuracy, IDOL Speech Server now includes Deep Neural Network (DNN) acoustic modeling. DNNs are not currently adaptable, but they typically outperform even adapted GMM acoustic models. As a result, HPE does not generally recommend acoustic adaptation. However, acoustic adaptation can still be useful in certain scenarios (for example, when a language pack does not include a DNN, or when you work in a very specific domain and believe that DNN recognition could be improved upon). In such cases, use the following instructions.

Using an adapted acoustic model can substantially improve speech-to-text accuracy, provided that the model accurately represents the acoustic properties of the speech data.

For information on how to adapt an acoustic model, see Adapt Acoustic Models. After you adapt an acoustic model, you must add it to the language pack section in the tasks configuration file, using the TrainedAm parameter. If an adapted acoustic model is specified in the language pack, it overrides the standard acoustic model file for that pack.

Note: Unlike custom language models, the adapted acoustic model replaces the existing model, rather than being used in conjunction with it.
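As a sketch, the TrainedAm entry in the tasks configuration file might look like the following. The section name ENUS and the file path are illustrative assumptions; substitute the language pack section and model file you actually use:

```ini
[ENUS]
; ... standard ENUS language pack settings ...
; TrainedAm overrides the standard acoustic model for this language pack.
TrainedAm = C:/myData/myAcousticModel.am
```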

To use an adapted acoustic model for speech-to-text, set the TrainedAm parameter in the language pack section of the tasks configuration file, or supply it directly in the action that starts the speech-to-text task.

For example:

http://localhost:13000/action=AddTask&Type=WavToText&File=C:/myData/Speech.wav&Out=SpeechTranscript.ctm&Lang=ENUS&TrainedAm=myAcousticModel.am

This action instructs IDOL Speech Server, which is located on the local machine and listening on port 13000, to perform the WavToText task on the Speech.wav file, using the U.S. English (ENUS) language pack and the custom acoustic model myAcousticModel.am, and to write the results to the SpeechTranscript.ctm file.
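The same AddTask action can be built and sent programmatically. The following Python sketch assembles the example URL above with the standard library; the host, port, file paths, and model name are taken from the example and are assumptions about your deployment:

```python
from urllib.parse import urlencode

def build_wav_to_text_action(host, port, wav_file, out_file, lang, trained_am):
    """Build an IDOL Speech Server AddTask action URL for a WavToText task.

    The TrainedAm parameter points the task at an adapted acoustic model,
    which overrides the standard acoustic model of the language pack.
    """
    params = {
        "Type": "WavToText",
        "File": wav_file,
        "Out": out_file,
        "Lang": lang,
        "TrainedAm": trained_am,
    }
    # safe="/:" keeps the drive letter and slashes of the Windows-style
    # path readable, matching the example URL in the documentation.
    return (f"http://{host}:{port}/action=AddTask&"
            + urlencode(params, safe="/:"))

url = build_wav_to_text_action(
    "localhost", 13000,
    "C:/myData/Speech.wav", "SpeechTranscript.ctm",
    "ENUS", "myAcousticModel.am",
)
print(url)
```

You could then send the request with, for example, urllib.request.urlopen(url) and read the server's response.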

