The LangIdBndWav task reads in data from an audio file, converts it into language identification features, and determines boundaries where the language changes. The task returns the language identification results between boundaries.

Parameter | Description | Required |
---|---|---|
Type | The task name. Set to LangIdBndWav. | Yes |
AppDnnBase | The location of the appResources directory, which contains the DNN and .ian files to use. | |
Beam | The beam width of the search process. | |
ClassList | A list of language classifiers to use. | |
ClassPath | The path to the directory containing the language classifiers. | |
ClosedSet | Whether to use closed set or open set language identification. | |
DiagFile | The file to write the diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
DnnFile | The Deep Neural Network acoustic modeling file to use. | |
EndTime | The end of an audio section to process. | |
File | The audio file to process. | Yes |
FrameDupl | The balance between performance and speed for audio preprocessing DNN classification. | |
Lang | The name of the language pack to use. | Yes |
LangList | A subset of languages to use from the classifier list file. | |
MinPhoneRate | The minimum phone rate (phones per second). | |
NBest | The maximum number of language candidates to include in the output file. | |
Out | The file to write language identification results to. | Yes |
OutB | The file to write boundary point information to. | Yes |
SegSize | The maximum results segment size. | |
SegSmoothWin | The size of the smoothing window. | |
SegStep | The step size in phones of the analysis window. | |
SilThresh | The threshold between what the task identifies as silence and non-silence. | |
SpeechThresh | The threshold between speech and non-speech (music or noise). | |
StartTime | The beginning of an audio section to process. | |
SugdInputChannels | The channel layout of the input media file. | |
SugdInputFrequency | The sampling rate of the input media file. | |

The ClassList parameter is required only if you want to change the audio sample rate, or if you want to use your own custom classifiers. You might also need to specify the ClassPath parameter, depending on the location of the classifier files.
http://localhost:13000/action=AddTask&Type=LangIdBndWav&File=C:\Data\Speech.wav&ClassList=ListManager/OptClassSet&ClassPath=C:\LangID\&Out=SpeechLang1.ctm&OutB=SpeechBnd.ctm
This action uses port 13000 to instruct HPE IDOL Speech Server, which is located on the local machine, to identify languages and language boundaries in the Speech.wav file using the language classifiers specified in the OptClassSet list. The action instructs HPE IDOL Speech Server to write the identification results to the SpeechLang1.ctm file and the boundary information to the SpeechBnd.ctm file.
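
If you assemble the action from a script instead of typing it by hand, the example action above could be built as in the following minimal Python sketch. The helper name build_addtask_url is hypothetical and is not part of HPE IDOL Speech Server. The sketch also assumes that the server accepts percent-encoded parameter values (for example, backslashes in Windows paths), which is standard for HTTP query strings. It only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode


def build_addtask_url(host, port, **params):
    """Build an AddTask action URL shaped like the documented example.

    Hypothetical helper for illustration; not part of the product.
    """
    # Put 'action' first to mirror the documented URL. The remaining
    # parameters follow in the order in which they were passed.
    query = urlencode({"action": "AddTask", **params})
    return f"http://{host}:{port}/{query}"


url = build_addtask_url(
    "localhost",
    13000,
    Type="LangIdBndWav",
    File=r"C:\Data\Speech.wav",
    ClassList="ListManager/OptClassSet",
    ClassPath="C:\\LangID\\",  # a raw string cannot end in a backslash
    Out="SpeechLang1.ctm",     # language identification results
    OutB="SpeechBnd.ctm",      # boundary point information
)
print(url)
```

The four parameters marked as required in the table without a stated exception (Type, File, Out, and OutB) are passed explicitly. Lang is omitted only because the example action above omits it. ClassList and ClassPath are included to match the custom classifier case that the example describes.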