The following schema describes how to run speaker identification given a set of templates.
[SpkIdEvalWav]
0 = a <- wav (MONO, input)
1 = w <- speakerid (GENDER_DETECT, a)
2 = f <- frontend (_,a)
3 = nf <- normalizer (SEGMENT_BLOCK, f, w)
4 = sid <- audiotemplatescore (SEGMENT, nf, w)
5 = output <- sidout (_, sid)
0 | The wav module processes the mono audio input and produces audio data (a).
1 | The speakerid module takes the audio data (a) and outputs speaker turn segments (w).
2 | The frontend module converts the audio data (a) into front-end frame data (f).
3 | The normalizer module takes the frame data from step 2 (f) and the speaker turn segments from step 1 (w), and outputs normalized frame data (nf).
4 | The audiotemplatescore module takes the normalized audio features (nf) and the speaker segment information (w), and produces a set of speaker scores (sid) for each segment.
5 | The sidout module takes the speaker ID score information (sid) and writes it to a results file.
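
The data flow between the steps above can be checked with a short script. The following is a minimal sketch in Python, not part of the product: it assumes each schema line has the form `index = output <- module (mode, inputs...)`, that the first argument in the parentheses is a mode flag, and that `input` is an externally supplied name. The names `Step`, `parse_schema`, and `undefined_inputs` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class Step(NamedTuple):
    index: int
    output: str
    module: str
    mode: str
    inputs: tuple


# Assumed line shape: "<index> = <output> <- <module> (<mode>, <inputs...>)"
STEP_RE = re.compile(r"(\d+)\s*=\s*(\w+)\s*<-\s*(\w+)\s*\(([^)]*)\)")

SCHEMA = """
0 = a <- wav (MONO, input)
1 = w <- speakerid (GENDER_DETECT, a)
2 = f <- frontend (_,a)
3 = nf <- normalizer (SEGMENT_BLOCK, f, w)
4 = sid <- audiotemplatescore (SEGMENT, nf, w)
5 = output <- sidout (_, sid)
"""


def parse_schema(text):
    """Turn schema text into a list of Step records, in order of appearance."""
    steps = []
    for match in STEP_RE.finditer(text):
        index, output, module, args = match.groups()
        # Assumption: the first argument is a mode flag, the rest are inputs.
        mode, *inputs = [arg.strip() for arg in args.split(",")]
        steps.append(Step(int(index), output, module, mode, tuple(inputs)))
    return steps


def undefined_inputs(steps, external=("input",)):
    """Return (step index, name) pairs for inputs no earlier step produced."""
    defined = set(external)
    missing = []
    for step in steps:
        for name in step.inputs:
            if name not in defined:
                missing.append((step.index, name))
        defined.add(step.output)
    return missing


if __name__ == "__main__":
    steps = parse_schema(SCHEMA)
    for step in steps:
        print(step.index, step.module, "reads", step.inputs, "writes", step.output)
    print("undefined inputs:", undefined_inputs(steps))
```

Run against the schema above, the check reports no undefined inputs, which matches the descriptions: the segments `w` from step 1 feed both the normalizer (step 3) and the template scoring (step 4), and `sid` from step 4 feeds the output writer (step 5).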