Speech-to-Text Results
The following XML shows a single record produced by speech-to-text.
<output>
  <record>
    <timestamp> ... </timestamp>
    <trackname>SpeechToText.Result</trackname>
    <SpeechToTextResult>
      <id>5c6a6fe9-04aa-4ec2-9f06-9c28827a1cb6</id>
      <text>all</text>
      <confidence>80</confidence>
      <alternative>
        <id>b05a75af-8515-4ed5-845e-caf86e2b25b9</id>
        <text>fall</text>
        <score>97</score>
        <startOffset>-60</startOffset>
        <endOffset>170</endOffset>
      </alternative>
      <alternative>
        <id>98cfe8e2-a377-4719-a12c-441266cfe657</id>
        <text>call</text>
        <score>91</score>
        <startOffset>-60</startOffset>
        <endOffset>170</endOffset>
      </alternative>
      ...
      <matched>false</matched>
    </SpeechToTextResult>
  </record>
</output>
The record contains the following information:
- The id element provides a unique identifier for the result.
- The text element provides the recognized word (the "primary" word). This element can also have a value of <SIL> or <s>, which indicates a period of audio without speech, such as silence or background noise. <SIL> indicates silence that probably has no linguistic role. <s> is more likely to end a chain of words, for example when a speaker begins a new sentence.
- The confidence element provides the confidence score for the recognized word.
- One or more alternative elements might be present, but only if you set the parameter AlternativeWordsThreshold. The following elements are present for each alternative word:
  - The text element provides the alternative word.
  - The score element provides the score for the alternative word. The scores for alternative words are relative to the primary word.
  - The startOffset and endOffset elements provide offsets for the start and end times. For example, the alternative choice "fall" begins 60 milliseconds before the record start time and ends 170 milliseconds after the record start time.

  An alternative word is included in the result if it overlaps chronologically with the primary word, and has a score that exceeds the threshold specified by the AlternativeWordsThreshold parameter. This means that you might see the same alternative word repeated in several records.
- The matched element indicates whether the primary word is in the list of words specified by the MatchWords configuration parameter (or an overlapping alternative word is in the list and has a score greater than the value of the MatchWordsThreshold parameter). You might use this information to perform audio redaction on specific words.
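The fields described above can be read with any XML parser. The following is a minimal sketch in Python, using only the standard library, and is not part of the product. The names parse_results, would_match, Alternative, and record_start_ms are hypothetical and chosen for illustration. Because the timestamp content is elided in the sample, the record start time is passed in as a separate value in milliseconds, and the value 10000 used below is an arbitrary assumption. The would_match function restates the matching rule as described in the text, with exact, case-sensitive comparison assumed.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Trimmed copy of the sample record. The timestamp element is omitted because
# its content is elided in the documentation sample.
SAMPLE = """
<output>
  <record>
    <trackname>SpeechToText.Result</trackname>
    <SpeechToTextResult>
      <id>5c6a6fe9-04aa-4ec2-9f06-9c28827a1cb6</id>
      <text>all</text>
      <confidence>80</confidence>
      <alternative>
        <id>b05a75af-8515-4ed5-845e-caf86e2b25b9</id>
        <text>fall</text>
        <score>97</score>
        <startOffset>-60</startOffset>
        <endOffset>170</endOffset>
      </alternative>
      <alternative>
        <id>98cfe8e2-a377-4719-a12c-441266cfe657</id>
        <text>call</text>
        <score>91</score>
        <startOffset>-60</startOffset>
        <endOffset>170</endOffset>
      </alternative>
      <matched>false</matched>
    </SpeechToTextResult>
  </record>
</output>
"""

# Values of the text element that mark audio without speech.
NON_SPEECH = {"<SIL>", "<s>"}


@dataclass
class Alternative:
    text: str
    score: float
    start_ms: int  # absolute start time: record start + startOffset
    end_ms: int    # absolute end time: record start + endOffset


def parse_results(xml_text, record_start_ms):
    """Yield (word, confidence, matched, alternatives) for each spoken word.

    record_start_ms is a hypothetical parameter: the record start time in
    milliseconds, obtained separately from the record's timestamp.
    """
    root = ET.fromstring(xml_text)
    for result in root.iter("SpeechToTextResult"):
        word = result.findtext("text")
        if word in NON_SPEECH:
            continue  # skip silence and sentence-boundary markers
        alternatives = [
            Alternative(
                text=alt.findtext("text"),
                score=float(alt.findtext("score")),
                start_ms=record_start_ms + int(alt.findtext("startOffset")),
                end_ms=record_start_ms + int(alt.findtext("endOffset")),
            )
            for alt in result.findall("alternative")
        ]
        confidence = float(result.findtext("confidence"))
        matched = result.findtext("matched") == "true"
        yield word, confidence, matched, alternatives


def would_match(word, alternatives, match_words, threshold):
    """Restate the matched rule from the text (case-sensitive, an assumption)."""
    if word in match_words:
        return True
    return any(a.text in match_words and a.score > threshold
               for a in alternatives)


if __name__ == "__main__":
    for word, confidence, matched, alts in parse_results(SAMPLE, 10_000):
        print(word, confidence, matched)
        for a in alts:
            print("  ", a.text, a.score, a.start_ms, a.end_ms)
```

With an assumed record start time of 10000 milliseconds, the alternative "fall" spans 9940 to 10170 milliseconds, which corresponds to the offsets of -60 and 170 in the sample.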