Speech-To-Text transcribes words spoken in audio into text.

Configuration parameter | Description |
---|---|
CustomLM | The path and interpolation weight of each custom language model to use. |
CustomLMBuildLabel | The build label and interpolation weight of a custom language model to use. |
CustomLMCheckInterval | The interval between checks for updates to the language model specified by CustomLMBuildLabel. |
ErrorMessage | The message that appears in the transcript when HPE Media Server cannot connect to an IDOL Speech Server. |
FilterMusic | Specifies whether to include speech-to-text results for audio segments that Speech Server identifies as music or noise. |
Input | The audio track to process. |
Language | The language pack to use for speech-to-text processing. |
MaxConsecutiveTries | The maximum number of attempts that HPE Media Server makes to connect to the servers listed in the SpeechToTextServers parameter. |
Mode | The mode for speech-to-text analysis (you can prioritize accuracy or speed). |
ModeValue | The processing rate. The meaning of this parameter depends on the value of the Mode parameter. |
SampleFrequency | The sample frequency of the audio to send to the IDOL Speech Server. |
SpeechToTextServers | A list of IDOL Speech Servers to use for speech-to-text. |
Type | The analysis engine to use. Set this parameter to SpeechToText. |
UseFrameDuplication | Specifies whether to use frame duplication, which increases processing speed without significantly changing recognition accuracy. |
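
As an illustration of how the parameters above might be combined, the following Python sketch writes a task section in INI style. The parameter names come from the table; the section name, the track name `Default_Audio`, the language pack `ENUK`, the host and port, and the sample frequency are placeholder values assumed for the example, not values confirmed by this page.

```python
import configparser
import io


def build_speech_to_text_section(section_name, settings):
    """Render one INI-style section from a dict of parameter names and values."""
    parser = configparser.ConfigParser()
    # Keep parameter names exactly as written (configparser lowercases by default).
    parser.optionxform = str
    parser[section_name] = settings
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# All values except Type are hypothetical placeholders.
settings = {
    "Type": "SpeechToText",                    # required value, per the table
    "Input": "Default_Audio",                  # assumed audio track name
    "Language": "ENUK",                        # assumed language pack name
    "SpeechToTextServers": "localhost:13000",  # assumed host and port
    "SampleFrequency": "16000",                # assumed value, in Hz
}

config_text = build_speech_to_text_section("SpeechToText", settings)
print(config_text)
```

Check the accepted values for each parameter in its own reference entry before using a section like this in a real configuration.
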

Output track | Type | Description |
---|---|---|
Result | SpeechToTextResult | Contains a record for each word. |

Field name | Type | Description |
---|---|---|
id | UUID | A universally unique identifier to identify the section of audio described by the record. |
text | TextData | The spoken word converted to text. |
confidence | Int | The confidence score for the speech-to-text process. |
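
Because the Result track contains one record per word, a consumer typically reassembles the words into a transcript. The sketch below models a record with the three fields listed above and drops low-confidence words. The class `SpeechToTextRecord`, the helper `join_confident_words`, the sample data, and the 0 to 100 confidence scale are all assumptions made for illustration; they are not part of the product's API.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass
class SpeechToTextRecord:
    """Hypothetical in-memory model of one record in the Result track."""
    id: UUID         # identifies the section of audio described by the record
    text: str        # the spoken word converted to text
    confidence: int  # confidence score (a 0-100 scale is assumed here)


def join_confident_words(records, min_confidence):
    """Join the text of records whose confidence meets the threshold."""
    return " ".join(r.text for r in records if r.confidence >= min_confidence)


# Invented sample records, in the order the words were spoken.
records = [
    SpeechToTextRecord(uuid4(), "hello", 92),
    SpeechToTextRecord(uuid4(), "um", 31),
    SpeechToTextRecord(uuid4(), "world", 88),
]

transcript = join_confident_words(records, min_confidence=50)
print(transcript)  # hello world
```
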