ModelVersion

The model to use to convert speech into text.

  • Legacy uses the IDOL 12.x speech-to-text models. When you choose this option, you can configure custom language models and custom word databases.
  • The following options use a new speech-to-text algorithm that was introduced in IDOL 23.2. These provide much better out-of-the-box accuracy than the legacy models, especially for English speech. Custom language models and custom word databases are not supported, because the vocabulary of these models is not limited by their training.

    • The micro model is the fastest of the new models. Use this model if you want to prioritize speed over accuracy, or if you need to process live streams without GPU acceleration.
    • The small model provides a good balance between accuracy and performance for English speech.
    • The medium model provides a significant increase in accuracy for non-English speech and a modest increase in accuracy for English, at the cost of greater memory requirements and longer processing times.
    • The large model provides maximum accuracy for non-English speech. This model is not available for English. If you specify English (for example, by setting LanguagePack to ENUK), Media Server uses the medium model instead.

Type: String
Default: Legacy
Required: No
Configuration Section: TaskName
Example: ModelVersion=small
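
For context, the parameter might sit in a task section along the lines of the sketch below. The section name SpeechToTextTask is hypothetical, and the Type=SpeechToText line and the comment syntax are assumptions about the surrounding configuration, not details given on this page. LanguagePack and ENUK are taken from the description of the large model above.

```
// Hypothetical task section; the name is illustrative only.
[SpeechToTextTask]
// Assumed task type for speech-to-text analysis.
Type=SpeechToText
// UK English, so the small model is a reasonable starting point.
LanguagePack=ENUK
ModelVersion=small
```

With an English language pack such as this one, setting ModelVersion=large would result in Media Server using the medium model, as described above.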