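Before running speech-to-text, you can assess whether a language pack, optionally combined with a custom language model, is suitable for processing your audio. You can check whether words are present in the vocabulary, and you can measure perplexity for the language model.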
To check whether words are present in the vocabulary

Use the AssessSpeechLanguageModel action. Media Server returns statistics and information about unknown words.

Use the QuerySpeechLanguageModel action to check whether the words are present in the vocabulary.

To measure perplexity for a language model
Use the AssessSpeechLanguageModel action. Media Server returns a perplexity value. Perplexity values around or below 100 are acceptable for processing call center conversations. Perplexity values around or below 250 are acceptable for television news or broadcast audio. A lower perplexity value is generally better. If the AssessSpeechLanguageModel action returns a perplexity value that is much higher, consider training a custom language model.

For more information about the AssessSpeechLanguageModel and QuerySpeechLanguageModel actions, refer to the Media Server Reference.
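
The perplexity guidance above can be sketched as a small Python helper, together with a function that builds a request URL for either action. This is an illustrative sketch only. The URL layout (host:port/action=Name), the host localhost, the port 14000, the LanguagePack parameter, and the ENUK value are assumptions made for the example; check the Media Server Reference for the actual parameters. The 20% tolerance is also an assumption, standing in for "around" in the guideline values.

```python
from urllib.parse import urlencode

# Guideline perplexity values taken from the text above.
PERPLEXITY_GUIDELINES = {
    "call_center": 100,
    "broadcast_news": 250,
}


def build_action_url(host, port, action, **params):
    """Build an action URL of the assumed form http://host:port/action=Name&key=value.

    Any parameter names passed in are hypothetical; this function only
    formats the URL and does not contact a server.
    """
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/{query}"


def needs_custom_model(perplexity, audio_type, tolerance=0.2):
    """Return True if perplexity is well above the guideline for audio_type.

    tolerance is an assumed margin: values up to 20% above the guideline
    are still treated as "around" the guideline and therefore acceptable.
    """
    limit = PERPLEXITY_GUIDELINES[audio_type]
    return perplexity > limit * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical request; LanguagePack=ENUK is a placeholder value.
    print(build_action_url("localhost", 14000, "AssessSpeechLanguageModel",
                           LanguagePack="ENUK"))
    # A value far above the broadcast guideline suggests training a custom model.
    print(needs_custom_model(400, "broadcast_news"))
```

The helper only interprets a perplexity value that you have already obtained from the AssessSpeechLanguageModel action. It does not parse the action response, because the response format is not described here.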