Generate Speaker Thresholds

This section describes how to generate the speaker thresholds that are used to distinguish between known and unknown speakers.

You need to complete the steps in this section only if you want to process audio that contains unknown speakers (people whom you have not trained and who do not exist in the database). For more information about training speaker identification, see Train Speaker Identification.

To generate speaker thresholds

  1. For each speaker that you have trained, add further audio samples to use for generating speaker thresholds. To add these samples, use the AddSpeakerAudio action, but set the training parameter to false. For example:

    curl http://localhost:14000 -F action=AddSpeakerAudio \
                                -F database=news \
                                -F identifier=newsreader \
                                -F audiodata=@sample3.wav,sample4.wav \
                                -F audiolabels=sample3,sample4 \
                                -F training=FALSE
  2. Add audio samples that represent unknown speakers, using the action AddUnknownSpeakerAudio. Set the following parameters:

    database     The name of the database to add the audio samples to.
    audiodata    (Set this or audiopath, but not both) The audio data to add. Files must be uploaded as multipart/form-data. For more information about sending data to Media Server, see Send Data by Using a POST Method.
    audiopath    (Set this or audiodata, but not both) A comma-separated list of paths to the audio files to add. The paths must be absolute, or relative to the Media Server executable file.
    audiolabels  (Optional) A comma-separated list of labels to identify the audio samples that you are adding (maximum 254 bytes for each label). Every audio sample representing unknown speakers must have a unique label, so the number of labels must match the number of samples provided using either audiodata or audiopath. If you do not set this parameter, Media Server generates labels automatically.

    For example:

    curl http://localhost:14000 -F action=AddUnknownSpeakerAudio \
                                -F database=news \
                                -F audiodata=@UnknownSpeakers.wav
  3. Calculate the speaker thresholds by running the EstimateAllSpeakerThresholds action. Set the database parameter to specify the name of the database. For example:

    curl http://localhost:14000 -F action=EstimateAllSpeakerThresholds \
                                -F database=news
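
The three steps above can be chained into a single script. The following is a minimal sketch, not a supported tool. It assumes the example host, database, speaker identifier, and file names used in this section. The names run_action, generate_thresholds, MEDIA_SERVER, DATABASE, and EXECUTE are illustrative and are not part of Media Server. By default the script only prints the curl commands that it would run; set EXECUTE=1 to send them.

```shell
#!/bin/sh
# Sketch: run the three threshold-generation steps in order.
# MEDIA_SERVER, DATABASE, EXECUTE, run_action, and generate_thresholds
# are hypothetical names chosen for this example.
MEDIA_SERVER="${MEDIA_SERVER:-http://localhost:14000}"
DATABASE="${DATABASE:-news}"

# Print the curl command by default; send it only when EXECUTE=1.
run_action() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        # --fail reports HTTP-level errors through the exit status.
        # The response body is not inspected, so action-level errors
        # reported inside the response are not detected here.
        curl --fail --silent --show-error "$MEDIA_SERVER" "$@"
    else
        printf '%s\n' "curl $MEDIA_SERVER $*"
    fi
}

generate_thresholds() {
    # Step 1: add non-training samples for a trained speaker.
    # Repeat this call for each speaker that you have trained.
    run_action -F action=AddSpeakerAudio \
               -F "database=$DATABASE" \
               -F identifier=newsreader \
               -F audiodata=@sample3.wav,sample4.wav \
               -F audiolabels=sample3,sample4 \
               -F training=FALSE || return 1

    # Step 2: add samples that represent unknown speakers.
    run_action -F action=AddUnknownSpeakerAudio \
               -F "database=$DATABASE" \
               -F audiodata=@UnknownSpeakers.wav || return 1

    # Step 3: estimate thresholds for every speaker in the database.
    run_action -F action=EstimateAllSpeakerThresholds \
               -F "database=$DATABASE" || return 1
}

generate_thresholds
```

Stopping at the first failed step avoids estimating thresholds from an incomplete set of samples. To target a different server or database, set MEDIA_SERVER or DATABASE in the environment before you run the script.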