AssessSpeechLanguageModel
The AssessSpeechLanguageModel action returns the perplexity and unknown word statistics for some input text. You can use this action to assess whether a language pack, combined with an optional custom language model, is suitable for processing your audio.
The text that you use to assess the language model must be different from the text you used to train the language model. Micro Focus recommends that you create a transcript for some of the speech that you intend to process and assess the language model using that text.
Type: synchronous
Parameter | Description | Required |
---|---|---|
CustomLanguageModel | The name and interpolation weight of a custom language model to use to supplement the base language pack. Separate the name and interpolation weight with a colon (:). | No |
LanguagePack | The base language pack. | Yes |
MaxUnknownWords | The maximum number of unknown words to return in the response. | No |
TextData | The text to use to assess the language model. Text files must be uploaded as multipart/form-data. For more information about sending data to Media Server, refer to the Media Server Administration Guide. | Set this or TextPath |
TextPath | The path of a text file that contains the text to use to assess the language model. The path must be absolute, or relative to the Media Server executable file. | Set this or TextData |
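The parameter rules above can be sketched as a small Python helper that assembles the form fields before a request is sent. This is a minimal sketch, not part of Media Server: the function name `build_assess_params` and its arguments are hypothetical, and it assumes that exactly one of TextData and TextPath is supplied and that a custom language model is always given together with its interpolation weight. It only builds a dictionary and sends nothing.

```python
from typing import Dict, Optional


def build_assess_params(
    language_pack: str,
    text_path: Optional[str] = None,
    text_data: Optional[str] = None,
    custom_model: Optional[str] = None,
    weight: Optional[float] = None,
    max_unknown_words: Optional[int] = None,
) -> Dict[str, str]:
    """Assemble form fields for AssessSpeechLanguageModel (hypothetical helper)."""
    if not language_pack:
        raise ValueError("LanguagePack is required")
    # Assumption: exactly one of TextData / TextPath should be supplied.
    if (text_path is None) == (text_data is None):
        raise ValueError("Set exactly one of TextData or TextPath")

    params = {"LanguagePack": language_pack}
    if custom_model is not None:
        # Assumption: the weight always accompanies the model name.
        if weight is None:
            raise ValueError("CustomLanguageModel needs an interpolation weight")
        # The name and the interpolation weight are separated by a colon.
        params["CustomLanguageModel"] = f"{custom_model}:{weight}"
    if max_unknown_words is not None:
        params["MaxUnknownWords"] = str(max_unknown_words)
    if text_path is not None:
        params["TextPath"] = text_path
    else:
        params["TextData"] = str(text_data)
    return params
```

For example, `build_assess_params("ENUS", text_path="/data/SomeText.txt", custom_model="ProductNames", weight=0.1)` produces a `CustomLanguageModel` field of `ProductNames:0.1`; the file path here is illustrative only.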
Example
curl http://localhost:14000/action=AssessSpeechLanguageModel -F LanguagePack=ENUS -F CustomLanguageModel=ProductNames:0.1 -F TextData=@SomeText.txt
Response
The following XML is an example response:
<autnresponse>
  <action>ASSESSSPEECHLANGUAGEMODEL</action>
  <response>SUCCESS</response>
  <responsedata>
    <perplexity>93.67</perplexity>
    <unknownWordRate>1.54</unknownWordRate>
    <uniqueUnknownWordRate>2.44</uniqueUnknownWordRate>
    <unknownWord>
      <word>Eduction</word>
      <count>1</count>
    </unknownWord>
    <unknownWord>
      <word>OmniGroupServer</word>
      <count>1</count>
    </unknownWord>
    ...
  </responsedata>
</autnresponse>
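A response of this shape could be read in Python with the standard library. The following is a minimal sketch under stated assumptions: the element names are taken from the example response, the response is assumed to carry no XML namespace prefixes, and the `Assessment` class and `parse_assessment` function are hypothetical names introduced for illustration. The `SAMPLE` string repeats the example response with the elided part left out.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict

SAMPLE = """<autnresponse>
  <action>ASSESSSPEECHLANGUAGEMODEL</action>
  <response>SUCCESS</response>
  <responsedata>
    <perplexity>93.67</perplexity>
    <unknownWordRate>1.54</unknownWordRate>
    <uniqueUnknownWordRate>2.44</uniqueUnknownWordRate>
    <unknownWord><word>Eduction</word><count>1</count></unknownWord>
    <unknownWord><word>OmniGroupServer</word><count>1</count></unknownWord>
  </responsedata>
</autnresponse>"""


@dataclass
class Assessment:
    """Values read from an AssessSpeechLanguageModel response (hypothetical type)."""
    perplexity: float
    unknown_word_rate: float
    unique_unknown_word_rate: float
    unknown_words: Dict[str, int] = field(default_factory=dict)


def parse_assessment(xml_text: str) -> Assessment:
    root = ET.fromstring(xml_text)
    data = root.find("responsedata")
    # Assumption: a usable response reports SUCCESS and has a responsedata element.
    if root.findtext("response") != "SUCCESS" or data is None:
        raise ValueError("the action did not return a successful response")
    # Map each unknown word to the number of times it appears in the input text.
    words = {
        w.findtext("word"): int(w.findtext("count"))
        for w in data.findall("unknownWord")
    }
    return Assessment(
        perplexity=float(data.findtext("perplexity")),
        unknown_word_rate=float(data.findtext("unknownWordRate")),
        unique_unknown_word_rate=float(data.findtext("uniqueUnknownWordRate")),
        unknown_words=words,
    )
```

Calling `parse_assessment(SAMPLE)` yields an `Assessment` whose `unknown_words` maps `Eduction` and `OmniGroupServer` to a count of 1 each.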
The response includes the following information:
- perplexity - indicates the average branching factor. Lower values are generally better. Perplexity values around or below 100 are acceptable for call center-like conversations. Perplexity values around or below 250 are acceptable for television broadcasts and news footage.
- unknownWordRate - the percentage of words in the input text that are unknown. Unknown words cannot be transcribed correctly during speech-to-text, so you might consider training a custom language model in order to reduce the speech-to-text error rate.
- uniqueUnknownWordRate - the percentage of words in the input text that are unknown (when the words in the input text are de-duplicated such that only one instance of each word is considered).
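The difference between the two unknown word rates may be easier to see with a toy calculation. The sketch below is an illustration only and does not reproduce how Media Server tokenizes text or decides which words are unknown: the function `unknown_word_rates`, the word list, and the vocabulary are all hypothetical, and the unique rate is interpreted as distinct unknown words divided by distinct words.

```python
from typing import Iterable, Set, Tuple


def unknown_word_rates(words: Iterable[str], vocabulary: Set[str]) -> Tuple[float, float]:
    """Return (unknown rate, unique unknown rate) as percentages (illustrative)."""
    tokens = list(words)
    if not tokens:
        raise ValueError("the input text contains no words")
    unknown = [w for w in tokens if w not in vocabulary]
    # Rate over every word occurrence in the text.
    rate = 100.0 * len(unknown) / len(tokens)
    # Rate after de-duplication: each distinct word is counted once.
    unique_rate = 100.0 * len(set(unknown)) / len(set(tokens))
    return rate, unique_rate


# Toy input: "eduction" is the only word missing from the toy vocabulary.
toy_words = ["the", "eduction", "server", "runs", "eduction",
             "task", "the", "eduction", "server", "eduction"]
toy_vocabulary = {"the", "server", "runs", "task"}
rate, unique_rate = unknown_word_rates(toy_words, toy_vocabulary)
print(rate, unique_rate)  # 40.0 20.0
```

In this toy input one unknown word is repeated four times, so it accounts for 4 of 10 word occurrences but only 1 of 5 distinct words. A text with many rare unknown words shows the opposite pattern, with the unique rate higher than the overall rate.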
The response also includes an unknownWord element for each unknown word (up to the limit specified by MaxUnknownWords). The count element describes how many times the word appears in the input text.
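One way to act on these values is to compare the perplexity with the guideline figures quoted above and to rank the unknown words by count when choosing material for a custom language model. The following is a minimal sketch: the names `PERPLEXITY_GUIDE`, `perplexity_ok`, and `training_candidates` and the domain labels are hypothetical, and treating "around or below" as a strict upper limit is a simplifying assumption.

```python
from typing import Dict, List, Tuple

# Guideline ceilings quoted in this section; the keys are hypothetical labels.
PERPLEXITY_GUIDE = {"call_center": 100.0, "broadcast": 250.0}


def perplexity_ok(perplexity: float, domain: str) -> bool:
    """Check a perplexity value against the guideline for a domain.

    Assumption: the guideline is applied as a hard limit, although the
    text describes it only as "around or below".
    """
    return perplexity <= PERPLEXITY_GUIDE[domain]


def training_candidates(unknown_words: Dict[str, int], limit: int = 10) -> List[Tuple[str, int]]:
    """Return unknown words ordered by descending count, then alphabetically."""
    ranked = sorted(unknown_words.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

With the example response, `perplexity_ok(93.67, "call_center")` returns `True`. The ranking is only a starting point for deciding which words to cover in training text.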