After you have trained a set of speaker templates, use the IvSpkIdEvalStream task to run iVector-based speaker identification on an audio stream and find any sections where the trained speakers are present.
To process an audio file, use the IvSpkIdEvalWav task.
Parameter | Description | Required |
---|---|---|
Type | The task name. Set to IvSpkIdEvalStream. | Yes |
AllowEmpty | Whether to produce gender labels as output if no speakers are specified. | |
DiagFile | The name of the file to write diagnostic information to. | |
DiagLevel | The level of detail to include in the diagnostic information. | |
DiscardShort | Exclude segments shorter than a specific duration from further analysis. | |
FrameDupl | The balance between performance and speed for audio preprocessing DNN classification. | |
Out | The file to write the results to. | |
Sfreq | The sample frequency of the audio stream to process. | |
 | The file extension to use for template files. | |
TemplateList | A list file that lists multiple speaker template files to use. | |
TemplatePath | The path to the directory containing the speaker templates. | |
TemplateSet | An audio template set file. | |
ThreshScale | The rate at which to scale the thresholds. | |
http://localhost:15000/action=AddTask&Type=IvSpkIdEvalStream&File=C:\Data\Speech.wav&TemplateSet=speakers.ivs&Out=results.ctm
This action uses port 15000 to instruct HPE IDOL Speech Server, which is located on the local machine, to search the audio stream for speakers based on the iVector-based template set file speakers.ivs, and to write the identification results to the results.ctm file.
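The example action can also be assembled programmatically, which helps when parameter values contain characters that need escaping in a URL (such as the backslashes and colon in a Windows path). The following is a minimal sketch using only the Python standard library. The helper name `build_add_task_url` is hypothetical and not part of the product. The sketch assumes the server accepts percent-encoded parameter values, and it only builds the URL string without sending a request.

```python
from urllib.parse import urlencode


def build_add_task_url(host, port, task_type, **params):
    """Build an AddTask action URL in the same shape as the example above.

    Hypothetical helper for illustration only. Parameter names are passed
    through unchanged; values are percent-encoded by urlencode.
    """
    # Put Type first so the URL mirrors the parameter order of the example.
    query = {"Type": task_type, **params}
    return f"http://{host}:{port}/action=AddTask&{urlencode(query)}"


url = build_add_task_url(
    "localhost",
    15000,
    "IvSpkIdEvalStream",
    File=r"C:\Data\Speech.wav",   # input taken from the example action
    TemplateSet="speakers.ivs",   # iVector-based template set file
    Out="results.ctm",            # file that receives the results
)
print(url)
```

Optional parameters from the table, such as DiagFile or ThreshScale, can be passed as additional keyword arguments in the same way.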