The following schema describes the preprocessing of an audio file to create a phoneme time track file required for phonetic phrase search.
[WavToFmd]
0 = a ← wav(MONO, input)
1 = f ← frontend(_, a)
2 = nf ← normalizer(_, f)
3 = output ← phraseprematch(WRITE, nf)
| 0 | The wav module processes the mono audio. |
| 1 | The frontend module converts the audio data into speech front-end frame data. |
| 2 | The normalizer module normalizes the frame data from 1 (f). |
| 3 | The normalized frame data from 2 (nf) is written to an .fmd file for phonetic phrase search. |
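The data flow in the schema above can be checked mechanically. The following Python sketch is illustrative only and is not part of the product: the helper names `parse_schema` and `check_data_flow` are hypothetical, accepting `<-` as an ASCII alternative to `←` is an assumption, and so is treating `input` as the only stream that exists before step 0. It reads each step as `index = output ← module(mode, inputs)` and confirms that every step consumes only streams produced by an earlier step.

```python
import re

# Schema text copied from the listing above.
SCHEMA = """\
[WavToFmd]
0 = a ← wav(MONO, input)
1 = f ← frontend(_, a)
2 = nf ← normalizer(_, f)
3 = output ← phraseprematch(WRITE, nf)
"""

# One step line: index = output ← module(mode, inputs...)
# Assumption: "<-" is accepted as an ASCII spelling of the arrow.
STEP = re.compile(r"^(\d+)\s*=\s*(.+?)\s*(?:←|<-)\s*(\w+)\((.*)\)$")


def parse_schema(text):
    """Return (schema_name, steps); each step is a dict of its parts."""
    name, steps = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]  # section header, e.g. [WavToFmd]
            continue
        match = STEP.match(line)
        if match is None:
            raise ValueError(f"unrecognized schema line: {line!r}")
        index, output, module, args = match.groups()
        # First argument is treated as the mode, the rest as input streams.
        mode, *inputs = [arg.strip() for arg in args.split(",")]
        steps.append({
            "index": int(index),
            "output": output,
            "module": module,
            "mode": mode,
            "inputs": inputs,
        })
    return name, steps


def check_data_flow(steps):
    """Raise ValueError if a step reads a stream that no earlier step produced."""
    available = {"input"}  # assumption: only "input" exists before step 0
    for step in steps:
        missing = [s for s in step["inputs"] if s not in available]
        if missing:
            raise ValueError(f"step {step['index']} reads unknown stream(s): {missing}")
        available.add(step["output"])
    return True


if __name__ == "__main__":
    name, steps = parse_schema(SCHEMA)
    check_data_flow(steps)
    for step in steps:
        print(name, step["index"], step["module"], step["inputs"], "->", step["output"])
```

Run against the WavToFmd schema, the check passes because each stream (`a`, `f`, `nf`) is produced by the step immediately before the one that reads it.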