The TelWavToTextPunct task converts a telephony audio file into a text transcript. In addition to transcribing speech, the task recognizes and transcribes dial tones, including DTMF. Simple sentence-forming punctuation (such as full stops and initial capital letters) is included in the .ctm output.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to TelWavToTextPunct. | Yes |
Conf | Whether to generate word confidence scores. | |
Diag | Whether to generate diagnostic information. | |
DiagFile | The alignment diagnostics file to generate. | |
DnnScale | The DNN output acoustic score scaling factor. | |
DoDialTones | The type of dial tone to identify. | |
EndTime | The end of an audio section to process. | |
File | The audio file to process. | Yes |
FrameDupl | An integer value which allows for greater time efficiency with only a minimal loss of recognition accuracy. | |
Lang | The language pack to use. | Yes |
LatFile | The name of the lattice file that contains word hypotheses. | |
LatScale | The depth of the lattice. | |
LatWinSize | The size (in seconds) of the lattice output window. | |
LatWordFile | A list of words to find. | |
Mode | The algorithm mode for the speech-to-text process. | |
ModeValue | The value of the parameter associated with the speech-to-text algorithm mode. | |
NonSentFinalWords | A list of words that are unlikely to end a sentence. | |
Out | The file to write the transcription to. | Yes |
SilThresh | The threshold between what the module identifies as silence and non-silence. | |
StartTime | The beginning of an audio section to process. | |
SugdInputChannels | The channel layout of the input media file. | |
SugdInputFrequency | The sampling rate of the input media file. | |
http://localhost:13000/action=AddTask&Type=TelWavToTextPunct&File=C:/myData/tel.wav&Out=TelTranscript.ctm&Lang=ENUS
This action uses port 13000 to instruct IDOL Speech Server, which is located on the local machine, to perform the TelWavToTextPunct task on the tel.wav file and write the results to the TelTranscript.ctm file. The tel.wav file contains U.S. English speech, which the task transcribes with punctuation by using the ENUS language pack.
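The example action above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_add_task_url` and the `REQUIRED` tuple are hypothetical, and the sketch only builds the URL string in the same shape as the example (it does not send the request). The required parameters are taken from the table above.

```python
from urllib.parse import urlencode

# Parameters marked "Yes" in the Required column (besides Type).
# Hypothetical constant, introduced only for this sketch.
REQUIRED = ("File", "Out", "Lang")


def build_add_task_url(host, port, task_type, **params):
    """Build an AddTask action URL shaped like the documented example."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    query = {"Type": task_type, **params}
    # Leave ':' and '/' unescaped so paths such as C:/myData/tel.wav
    # appear exactly as they do in the documented example.
    return f"http://{host}:{port}/action=AddTask&{urlencode(query, safe=':/')}"


url = build_add_task_url(
    "localhost",
    13000,
    "TelWavToTextPunct",
    File="C:/myData/tel.wav",
    Out="TelTranscript.ctm",
    Lang="ENUS",
)
print(url)
```

Optional parameters from the table, such as StartTime or EndTime, could be passed as additional keyword arguments in the same way. Their accepted values are not covered by this sketch.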