ScoreCustomSpeechLanguageModel

Runs speech-to-text with a custom language model, compares the output to a transcript, and returns statistics about the accuracy. You can use this action to determine whether a custom language model needs further training or provides the accuracy that you require.

To use this action, you must supply:

  • Audio to use to test the custom language model.
  • An accurate transcript of the speech contained in the audio. Media Server automatically normalizes the text for most languages.
  • A session configuration that contains only a speech-to-text task, including the custom language model and other parameters that you intend to use. For example:

    [SpeechToText]
    Type=SpeechToText
    LanguagePack=ENUK
    SampleFrequency=16000
    CustomLanguageModel=MedicalTerms:0.1
    SpeedBias=2

Type: asynchronous

Parameter | Description | Required
--- | --- | ---
AudioData | The audio data to use for testing the custom language model. | Set either AudioData or AudioPath
AudioPath | The path of the audio file to use for testing the custom language model. | Set either AudioData or AudioPath
Config | A session configuration that contains the settings for the speech-to-text task (base64 encoded, unless you upload it as multipart form data). | Set Config, ConfigName, or ConfigPath
ConfigName | The name of a session configuration file that contains the settings for the speech-to-text task, when the file is stored in the ConfigDirectory. | Set Config, ConfigName, or ConfigPath
ConfigPath | The path of a session configuration file that contains the settings for the speech-to-text task. | Set Config, ConfigName, or ConfigPath
TranscriptData | The text file that contains an accurate transcript of the speech. | Set TranscriptData or TranscriptPath
TranscriptPath | The path of a text file that contains an accurate transcript of the speech. | Set TranscriptData or TranscriptPath
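
When the request is not uploaded as multipart form data, the Config value must be base64 encoded. The following Python sketch shows one way to prepare such a request. It assumes that the audio and transcript already exist at paths that Media Server can read (the paths shown are hypothetical), and that standard base64 followed by URL encoding is acceptable. The helper names encode_config and build_score_query are illustrative and are not part of Media Server.

```python
import base64
from urllib.parse import urlencode

# Session configuration containing only the speech-to-text task.
CONFIG_TEXT = """[SpeechToText]
Type=SpeechToText
LanguagePack=ENUK
SampleFrequency=16000
CustomLanguageModel=MedicalTerms:0.1
SpeedBias=2
"""

def encode_config(config_text):
    """Base64 encode a session configuration for the Config parameter."""
    return base64.b64encode(config_text.encode("utf-8")).decode("ascii")

def build_score_query(config_text, audio_path, transcript_path):
    """Build the URL-encoded query for ScoreCustomSpeechLanguageModel."""
    return urlencode({
        "action": "ScoreCustomSpeechLanguageModel",
        "AudioPath": audio_path,            # hypothetical path
        "TranscriptPath": transcript_path,  # hypothetical path
        "Config": encode_config(config_text),
    })

query = build_score_query(CONFIG_TEXT, "/data/speech.wav", "/data/transcript.txt")
print("http://localhost:14000/" + query)
```

The sketch only builds the request. Because base64 output can contain the characters +, / and =, it is URL encoded before it is placed in the query.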

Example

The following example uses cURL to send the action to Media Server.

    curl http://localhost:14000/action=ScoreCustomSpeechLanguageModel -F AudioData=@speech.wav -F TranscriptData=@transcript.txt -F Config=@speechToTextTask.cfg

Response

This action is asynchronous, so Media Server always returns success accompanied by a token. You can use this token with the QueueInfo action to retrieve the status of your request.
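
As a rough illustration of the token workflow, the following Python sketch extracts a token from a response and builds a QueueInfo request for it. It rests on several assumptions that you should check against your Media Server version: that the response carries the token in an element named token, that QueueInfo accepts the QueueName, QueueAction=GetStatus, and Token parameters, and that the queue is named after the action. The sample response and token value are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

def extract_token(response_xml):
    """Return the text of the first element whose local name is 'token'."""
    root = ET.fromstring(response_xml)
    for element in root.iter():
        # Drop any '{namespace}' prefix that ElementTree adds to the tag.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name.lower() == "token" and element.text:
            return element.text.strip()
    return None

def build_queue_info_url(base_url, token):
    """Build a QueueInfo URL for the given token (parameter names assumed)."""
    query = urlencode({
        "action": "QueueInfo",
        "QueueName": "ScoreCustomSpeechLanguageModel",  # assumed queue name
        "QueueAction": "GetStatus",
        "Token": token,
    })
    return base_url.rstrip("/") + "/" + query

# Hypothetical response, for illustration only.
SAMPLE_RESPONSE = """<autnresponse>
  <action>SCORECUSTOMSPEECHLANGUAGEMODEL</action>
  <response>SUCCESS</response>
  <responsedata>
    <token>EXAMPLE-TOKEN-123</token>
  </responsedata>
</autnresponse>"""

token = extract_token(SAMPLE_RESPONSE)
print(build_queue_info_url("http://localhost:14000", token))
```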

The response from the QueueInfo action includes the elements Fmeasure, precision, and recall.
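
The following Python sketch shows one way to read these three values from a QueueInfo response. The exact layout of the response is not described here, so the sketch searches the whole document for the element names, ignoring case and any XML namespace. The sample response and its numbers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Element names taken from the text above, compared in lowercase.
SCORE_ELEMENTS = ("fmeasure", "precision", "recall")

def extract_scores(queue_info_xml):
    """Return a dict of the accuracy statistics found in the response."""
    root = ET.fromstring(queue_info_xml)
    scores = {}
    for element in root.iter():
        # Drop any '{namespace}' prefix and normalize the case.
        name = element.tag.rsplit("}", 1)[-1].lower()
        if name in SCORE_ELEMENTS and name not in scores and element.text:
            scores[name] = float(element.text)
    return scores

# Hypothetical response, for illustration only.
SAMPLE_QUEUE_INFO = """<autnresponse>
  <responsedata>
    <actions>
      <action>
        <status>Finished</status>
        <Fmeasure>0.8</Fmeasure>
        <precision>0.9</precision>
        <recall>0.72</recall>
      </action>
    </actions>
  </responsedata>
</autnresponse>"""

scores = extract_scores(SAMPLE_QUEUE_INFO)
print(scores)
```

If a value is missing from the response, for example because the request has not finished, the corresponding key is absent from the returned dictionary.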