Audio Categorization Results

The following XML shows a single record produced by audio categorization.

<record>
  <timestamp>
    <startTime iso8601="1970-01-01T00:00:01Z">1000000</startTime>
    <duration iso8601="PT00H00M00.500000S">500000</duration>
    <peakTime iso8601="1970-01-01T00:00:01Z">1000000</peakTime>
    <endTime iso8601="1970-01-01T00:00:01.500000Z">1500000</endTime>
  </timestamp>
  <trackname>AudioCategorize.Result</trackname>
  <AudioCategorizeResult>
    <id>e8d84838-bdf2-4b9b-9a92-e7e42b249103</id>
    <category>Music</category>
    <confidence>80</confidence>
  </AudioCategorizeResult>
</record>

The record contains the following information:

  • The id element contains the identifier for the audio segment.

  • The category element shows how the audio segment was classified. The categories are predefined, and the value is one of the following:

    • DialTone
    • DTMF-*, DTMF-0, DTMF-1, DTMF-2, and so on. These values indicate that the audio contains a DTMF tone. For example, DTMF-2 indicates the tone for the "2" button.
    • Music
    • Noise
    • Silence
    • Speech

    NOTE: Dial tone and DTMF tone detection are enabled only when you process audio with a sample rate of 8 kHz.

  • The confidence element provides the confidence score for the classification, from 0 to 100, where 100 indicates the greatest confidence.
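
As an illustration of how a client might read these fields, the following is a minimal Python sketch that parses the example record with the standard library. The function name parse_categorization, the returned dictionary keys, and the range check on confidence are assumptions made for this example; they are not part of the product.

```python
import xml.etree.ElementTree as ET

# The example record from this page, embedded so the sketch is self-contained.
RECORD_XML = """<record>
  <timestamp>
    <startTime iso8601="1970-01-01T00:00:01Z">1000000</startTime>
    <duration iso8601="PT00H00M00.500000S">500000</duration>
    <peakTime iso8601="1970-01-01T00:00:01Z">1000000</peakTime>
    <endTime iso8601="1970-01-01T00:00:01.500000Z">1500000</endTime>
  </timestamp>
  <trackname>AudioCategorize.Result</trackname>
  <AudioCategorizeResult>
    <id>e8d84838-bdf2-4b9b-9a92-e7e42b249103</id>
    <category>Music</category>
    <confidence>80</confidence>
  </AudioCategorizeResult>
</record>"""

DTMF_PREFIX = "DTMF-"


def parse_categorization(xml_text):
    """Extract id, category, and confidence from one record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    result = root.find("AudioCategorizeResult")
    if result is None:
        raise ValueError("record has no AudioCategorizeResult element")

    category = result.findtext("category", default="").strip()
    confidence = int(result.findtext("confidence", default="0"))
    # The confidence score is documented as ranging from 0 to 100.
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence out of range: {confidence}")

    # For DTMF categories such as "DTMF-2", keep the key that follows the prefix.
    dtmf_key = None
    if category.startswith(DTMF_PREFIX):
        dtmf_key = category[len(DTMF_PREFIX):]

    return {
        "id": result.findtext("id", default="").strip(),
        "category": category,
        "confidence": confidence,
        "dtmf_key": dtmf_key,
    }


parsed = parse_categorization(RECORD_XML)
print(parsed["category"], parsed["confidence"])
```

For the example record, dtmf_key is None because the category is Music. A record with the category DTMF-2 would yield a dtmf_key of "2".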