Speech-to-text works out of the box and does not require training, but you can often improve accuracy by creating a custom language model. A language model is just one part of the training for a language: it describes the vocabulary and contains information about how sentences are composed from individual words. This section explains when you might want to use a custom language model and how to build one.
Using a custom language model can improve accuracy when:
Building a language model requires a lot of text: millions or billions of words. The language models supplied with Media Server are trained with billions of words across a wide range of topics. Collecting that much text is a significant training burden, but you can instead build a small, focused language model and use it to supplement one of the standard models.
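To make the idea of supplementing a standard model more concrete, the following is a minimal, purely illustrative sketch. It is not Media Server's model format or training procedure. It assumes a toy bigram model (word-pair counts) and a simple weighted mix of two models; the function names, the sample sentences, and the `custom_weight` value are all hypothetical.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count how often each word follows another in the training text."""
    contexts = Counter()  # how often each word appears as a preceding word
    bigrams = Counter()   # how often each (previous, next) pair appears
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def bigram_prob(model, prev, word):
    """Estimate P(word | prev); 0.0 if the model never saw `prev`."""
    contexts, bigrams = model
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def interpolated_prob(standard, custom, prev, word, custom_weight=0.3):
    """Mix the two models; custom_weight is an arbitrary illustrative value."""
    return ((1 - custom_weight) * bigram_prob(standard, prev, word)
            + custom_weight * bigram_prob(custom, prev, word))


# Hypothetical training text: a tiny general corpus and a tiny domain corpus.
standard_model = train_bigram_model([
    "the doctor saw the patient",
    "the patient went home",
])
custom_model = train_bigram_model([
    "the patient received amoxicillin",
    "the patient received ibuprofen",
])

# The general model has never seen "received amoxicillin"; the mix has.
print(bigram_prob(standard_model, "received", "amoxicillin"))
print(interpolated_prob(standard_model, custom_model, "received", "amoxicillin"))
```

In this sketch the small domain corpus contributes vocabulary and word sequences that the general corpus lacks, while the general corpus still supplies most of the probability for everyday phrases.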
Standard language model | Custom language model |
---|---|