XML Transcript Output

To view the output of a speech-to-text transcription in XML format, send a GetResults action. The contents of the .ctm output file are returned as a series of XML tags. For example:

<stt_transcript>
	<stt_record>
		<start>0.000</start>
		<end>3.390</end>
		<label>&lt;SIL&gt;</label>
		<score>0.987</score>
		<rank>0</rank>
	</stt_record>
	<stt_record>
		<start>3.390</start>
		<end>3.780</end>
		<label>hello</label>
		<score>0.765</score>
		<rank>0</rank>
	</stt_record>
	<stt_record>
		<start>3.780</start>
		<end>3.970</end>
		<label>there</label>
		<score>0.875</score>
		<rank>0</rank>
	</stt_record>
</stt_transcript>

This example shows the XML output for a transcript that contains a silent period (with the label <SIL>) followed by the words hello there.

The <stt_transcript> tag marks the start of a recognition sequence. It contains one <stt_record> node for each recognized item (a word or a silence), and each node provides the following information.

<start> The start time (in seconds) of the word.
<end> The end time (in seconds) of the word.
<label> The recognized word.
<score> The confidence value of the recognized word.
<rank> Not used for speech-to-text (this value is used in the results of other operations).
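
The record structure described above can be read with any XML parser. The following is a minimal sketch in Python, assuming the response text is already available as a string and uses no XML namespaces. The names SAMPLE, parse_transcript, spoken_words, and min_score are hypothetical helpers for illustration and are not part of the product. Note that the parser decodes the escaped label &lt;SIL&gt; to <SIL>.

```python
import xml.etree.ElementTree as ET

# Sample response body, copied from the example in this topic.
SAMPLE = """<stt_transcript>
  <stt_record>
    <start>0.000</start><end>3.390</end>
    <label>&lt;SIL&gt;</label><score>0.987</score><rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.390</start><end>3.780</end>
    <label>hello</label><score>0.765</score><rank>0</rank>
  </stt_record>
  <stt_record>
    <start>3.780</start><end>3.970</end>
    <label>there</label><score>0.875</score><rank>0</rank>
  </stt_record>
</stt_transcript>"""


def parse_transcript(xml_text):
    """Return one dict per <stt_record>, with numeric fields converted to float."""
    root = ET.fromstring(xml_text)
    records = []
    # iter() also finds records when <stt_transcript> is nested in a larger response.
    for rec in root.iter("stt_record"):
        records.append({
            "start": float(rec.findtext("start")),
            "end": float(rec.findtext("end")),
            "label": rec.findtext("label"),
            "score": float(rec.findtext("score")),
        })
    return records


def spoken_words(records, min_score=0.0):
    """Return the labels that are not silence and meet the confidence threshold."""
    return [
        r["label"]
        for r in records
        if r["label"] != "<SIL>" and r["score"] >= min_score
    ]


records = parse_transcript(SAMPLE)
print(" ".join(spoken_words(records)))  # prints "hello there"
```

Filtering on the <score> value is optional. In this sketch, passing min_score=0.8 would keep only the word there, because hello has a confidence of 0.765.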
