The following diagram shows the modules in HPE IDOL Speech Server that enable spoken language identification in a single step.
In the diagram:
a is the audio window series.
f is the feature vector series.
nf is the normalized feature vector series.
lf is the language identification feature.
w is the output time-marked word series.
The schema that implements this feature is:
[MyLangId]
a ← wav (MONO, input)
f ← frontend (_, a)
nf ← normalizer (_, f)
lf ← lidfeature (_, nf)
lid ← langid (CUMULATIVE, lf)
output ← lidout (_, lid)
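The data flow in this schema can be checked with a short script. The following Python sketch is illustrative only and is not part of HPE IDOL Speech Server: the helper names parse_schema and check_chain are hypothetical, and the sketch assumes that each step has the form "output ← module (mode, input)" and that the stream named input is supplied from outside the schema. It parses the steps and confirms that every module reads a stream that an earlier step has already produced.

```python
import re

# The schema text as shown above, one step per line.
SCHEMA = """\
[MyLangId]
a ← wav (MONO, input)
f ← frontend (_, a)
nf ← normalizer (_, f)
lf ← lidfeature (_, nf)
lid ← langid (CUMULATIVE, lf)
output ← lidout (_, lid)
"""

# Assumed step shape: <out> ← <module> (<mode>, <in>)
STEP = re.compile(r"^(\w+)\s*←\s*(\w+)\s*\(\s*([^,]+?)\s*,\s*([^)]+?)\s*\)$")


def parse_schema(text):
    """Return (schema_name, steps) parsed from the schema text."""
    name = None
    steps = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            continue
        match = STEP.match(line)
        if match is None:
            raise ValueError(f"unrecognized schema line: {line!r}")
        out, module, mode, src = match.groups()
        steps.append({"out": out, "module": module, "mode": mode, "in": src})
    return name, steps


def check_chain(steps, external=("input",)):
    """Verify each step reads a known stream; return module names in order."""
    known = set(external)  # streams assumed to come from outside the schema
    for step in steps:
        if step["in"] not in known:
            raise ValueError(
                f"{step['module']} reads undefined stream {step['in']!r}"
            )
        known.add(step["out"])
    return [step["module"] for step in steps]


name, steps = parse_schema(SCHEMA)
modules = check_chain(steps)
print(name, " -> ".join(modules))
```

In this sketch the langid step is the only one with a mode other than MONO or the _ placeholder. Reordering or deleting a line in SCHEMA makes check_chain raise an error, which mirrors the requirement that each module consumes the series produced by the step before it.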