Speaker Section Labeling

In many cases you can simply present audio files to Speech Server and have the system process all of the audio that they contain. However, you can also restrict processing to a particular section of a file by using the Start and End action parameters.

Alternatively, you can specify a label file for each input data file that lists the different speaker sections in the audio. For template training, Speech Server uses only sections labeled as being from the relevant speaker. When calculating score statistics for threshold development, the label determines whether each segment is a true or false example for each speaker template.

This approach is particularly useful for iterative speaker training, because you might have start and end times of speaker segments derived from running speaker identification, and want to process just these segments. For example, if you have the following speaker identification results for a file named test1.wav:

1     A     0.000      0.520     NonSpeech_     UNKNOWN     0.000
1     A     0.520      10.030    Brown          MALE        3.540
1     A     10.550     0.080     NonSpeech_     UNKNOWN     0.000
1     A     10.630     9.460     Unknown_       FEMALE      0.000
1     A     20.090     6.150     Brown          MALE        6.983

After you validate that the results are correct, you have two audio sections for the speaker Brown (0.520 seconds to 10.550 seconds, and 20.090 seconds to 26.240 seconds). To add these two sections to the training set, add the test audio file to the training set and use the speaker identification results file as the input label file for the next iteration. If you decide that a section in the results is incorrect, you can simply change its label in the file (to Unknown_, for example) before you use the file as input for training. See Iterative Training for more information.
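The arithmetic behind those two sections can be sketched in a few lines of Python. This is only an illustration, not part of Speech Server: it assumes that the third and fourth columns of each result line are the start time and the duration in seconds, which is consistent with the times quoted above, and the names Segment, parse_results, and sections_for are hypothetical helpers.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    start: float  # section start time, in seconds
    end: float    # section end time, in seconds
    label: str    # speaker label, for example Brown or Unknown_


def parse_results(text: str) -> List[Segment]:
    """Parse whitespace-separated result lines into segments.

    Assumed column order (not confirmed by the documentation):
    index, channel, start, duration, label, gender, score.
    """
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete lines
        start = float(fields[2])
        duration = float(fields[3])
        # Round to milliseconds to avoid floating-point noise in the sum.
        segments.append(Segment(start, round(start + duration, 3), fields[4]))
    return segments


def sections_for(segments: List[Segment], speaker: str) -> List[Tuple[float, float]]:
    """Return (start, end) pairs for the sections that carry the given label."""
    return [(s.start, s.end) for s in segments if s.label == speaker]


RESULTS = """\
1 A 0.000 0.520 NonSpeech_ UNKNOWN 0.000
1 A 0.520 10.030 Brown MALE 3.540
1 A 10.550 0.080 NonSpeech_ UNKNOWN 0.000
1 A 10.630 9.460 Unknown_ FEMALE 0.000
1 A 20.090 6.150 Brown MALE 6.983
"""

brown_sections = sections_for(parse_results(RESULTS), "Brown")
print(brown_sections)
```

Listing the sections in this way can make it easier to check each one against the audio before you reuse the results file as a label file.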

Use the following procedure to enable the use of input label files:

  1. Set the LabType parameter to the type of labels to use (the default value is NONE).

  2. Specify the label files to use by setting the LabFile parameter.

Note: When you train speaker templates, you also need to specify the term in the label file that identifies the regions to train. For example:

http://localhost:15000/a=AddTask&Type=SpkIdTrainWav&File=Brown.wav&LabType=SID&LabFile=test1.sid&LabTerm=Brown&Out=Brown.atf

This task trains a template from a single audio file (Brown.wav), but only on the sections in the label file test1.sid that have the label Brown. The LabTerm parameter is not required for development tasks, because the label associated with each template is already known.
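If you submit many training tasks, you might prefer to assemble the action URL from a set of parameters rather than type it by hand. The following is a minimal sketch that only builds the URL string and does not contact a server. The function name build_train_url is hypothetical, and the host, port, and parameter values are copied from the example above.

```python
from urllib.parse import urlencode


def build_train_url(host: str, port: int, params: dict) -> str:
    """Build an AddTask URL in the same form as the documented example.

    The parameters follow the slash directly (a=AddTask&...), as in the
    example URL; urlencode keeps the dictionary's insertion order.
    """
    return f"http://{host}:{port}/a=AddTask&{urlencode(params)}"


params = {
    "Type": "SpkIdTrainWav",  # task type taken from the example
    "File": "Brown.wav",      # audio file to train from
    "LabType": "SID",         # label type used in the example
    "LabFile": "test1.sid",   # label file that lists the speaker sections
    "LabTerm": "Brown",       # label that marks the sections to train on
    "Out": "Brown.atf",       # output template file
}

url = build_train_url("localhost", 15000, params)
print(url)
```

Keeping the parameters in a dictionary makes it straightforward to change LabFile and LabTerm between training iterations.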

