Speaker Identification Results

The following XML shows a single record produced by speaker identification.

<output>
    <record>
        <timestamp>
            ...
        </timestamp>
        <trackname>SpeakerId.Result</trackname>
        <SpeakerIdResult>
            <id>3543fda6-8fdb-4cda-b061-8b3765d24429</id>
            <identity>
                <identifier>newsreader3</identifier>
                <database>news</database>
                <confidence>35</confidence>
                <metadata>
                    <item>
                        <key>key1</key>
                        <value>value1</value>
                    </item>
                    <item>
                        <key>key2</key>
                        <value>value2</value>
                    </item>
                </metadata>
            </identity>
            <speakerName>newsreader3</speakerName>
            <gender>MALE</gender>
        </SpeakerIdResult>
    </record>
</output>

The record contains the following information:

  • The id element provides a unique identifier for the section of audio.
  • The identity element describes the speaker who was recognized. It contains the following information:

    • identifier - the identifier of the speaker who was recognized.
    • database - the name of the database that contains the speaker.
    • confidence - the confidence score (from 0 to 100).
    • metadata - any custom metadata associated with the speaker. (You can add custom metadata to speakers in your training database.)

    The identity element can be empty when a speaker is not recognized.

  • The speakerName element provides the name of the speaker. If the speaker is unknown, this element contains the name Unknown_. If the section of audio does not contain speech, this element contains NonSpeech_.
  • The gender element provides the gender of the speaker (FEMALE, MALE, or NonSpeech_).
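
As an illustration of how a client might read these fields, the following Python sketch parses a response with the standard library's xml.etree.ElementTree module. It is a minimal example, not part of the product: the function name parse_speaker_results is hypothetical, the sample omits the timestamp element, and the code assumes that an unrecognized speaker produces an identity element with no child elements and that confidence is always numeric.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample record (timestamp element omitted).
SAMPLE = """<output>
    <record>
        <trackname>SpeakerId.Result</trackname>
        <SpeakerIdResult>
            <id>3543fda6-8fdb-4cda-b061-8b3765d24429</id>
            <identity>
                <identifier>newsreader3</identifier>
                <database>news</database>
                <confidence>35</confidence>
                <metadata>
                    <item><key>key1</key><value>value1</value></item>
                    <item><key>key2</key><value>value2</value></item>
                </metadata>
            </identity>
            <speakerName>newsreader3</speakerName>
            <gender>MALE</gender>
        </SpeakerIdResult>
    </record>
</output>"""


def parse_speaker_results(xml_text):
    """Return one dict per SpeakerIdResult element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    results = []
    for res in root.iter("SpeakerIdResult"):
        entry = {
            "id": res.findtext("id"),
            "speaker_name": res.findtext("speakerName"),
            "gender": res.findtext("gender"),
            "identity": None,  # stays None when no speaker was recognized
        }
        identity = res.find("identity")
        # Assumption: an empty identity element means "not recognized".
        if identity is not None and identity.find("identifier") is not None:
            conf_text = identity.findtext("confidence")
            entry["identity"] = {
                "identifier": identity.findtext("identifier"),
                "database": identity.findtext("database"),
                # Assumption: confidence is a number from 0 to 100.
                "confidence": float(conf_text) if conf_text else None,
                "metadata": {
                    item.findtext("key"): item.findtext("value")
                    for item in identity.iterfind("metadata/item")
                },
            }
        results.append(entry)
    return results


if __name__ == "__main__":
    for result in parse_speaker_results(SAMPLE):
        print(result["speaker_name"], result["gender"], result["identity"])
```

Checking whether identity is None before reading the confidence keeps unknown speakers (Unknown_) and non-speech sections (NonSpeech_) from being treated as low-confidence matches.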