Configure OCR
OCR has many configuration options that allow you to fine-tune its operation to improve accuracy. This section describes the basic settings that you need to consider before running OCR.
You must specify all the languages that you expect the text to be in using the Languages parameter. Media Server restricts its identification attempts to characters that are used by the specified languages. You can add extra characters to this character list (for example, rarer punctuation) using the ExtraEnabledCharacters parameter. You can also further restrict the possible character choices, for example to a single case or to digits only, using the CharacterTypes parameter. In many cases, you know in advance that only a limited subset of characters will occur in the images (for clarity, many forms use only upper case, digits, and limited punctuation). In this situation, reducing the list of characters that Media Server considers improves accuracy and speed.
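As a rough illustration of the form scenario described above, a configuration fragment might look like the following. This is a sketch under assumptions: the section name, the Type setting, the language code, and the CharacterTypes value names are illustrative and are not defined in this section, so check the parameter reference for the exact values that each parameter accepts. How ExtraEnabledCharacters and CharacterTypes interact when both are set is also not covered here.

```ini
// Hypothetical OCR task section; the name [OCR] and Type=ocr are assumptions.
[OCR]
Type=ocr
// Assumed language code for English; list every language you expect.
Languages=en
// Assumed value names: restrict candidates to upper-case letters and digits.
CharacterTypes=uppercase,digit
// Example of rarer punctuation added to the character list.
ExtraEnabledCharacters=§¶
```

Start with the narrowest character set that still covers your documents, then widen it only if expected characters are being missed.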
Images
The parts of an image that are likely to be text depend on the context. To reflect this, Media Server has the following OCR modes for processing images:
- Document. Computer-generated document-style images (or scans of printed documents), which typically contain mainly blocks of text, diagrams, or separate pictures. For example: reports, newspapers, and forms.
- Scene. Photos or pictures of objects with text on them. For example: a smartphone photo of a street sign, or product packaging.
Specify the mode using the OcrMode parameter in the configuration file.
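For example, a task that processes smartphone photos of street signs might select the scene mode. The fragment below is a minimal sketch: the section name, the Type setting, the language code, and the lower-case spelling of the mode value are assumptions, so confirm the accepted values in the parameter reference.

```ini
// Hypothetical OCR task section; the name [OCR] and Type=ocr are assumptions.
[OCR]
Type=ocr
// Assumed language code for English.
Languages=en
// Assumed spelling of the mode value; use the document mode for scans and reports.
OcrMode=scene
```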
Video
Media Server supports two types of subtitles. By default, Media Server searches for single-color text against a plain, single-color background. You can also configure Media Server to search for black-bordered white letters that are superimposed directly onto the background TV image, which is a widely used style of subtitling. The Media Server configuration file refers to this type of subtitle as 'hollow text'.
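A configuration that looks for hollow text might resemble the following sketch. The parameter name HollowText and its boolean value are assumptions based on the term used above, as are the section name, the Type setting, and the language code. Verify all of them against the parameter reference before use.

```ini
// Hypothetical OCR task section; the name [OCR] and Type=ocr are assumptions.
[OCR]
Type=ocr
// Assumed language code for English.
Languages=en
// Assumed parameter name and value: search for black-bordered white subtitles
// instead of the default single-color text on a plain background.
HollowText=true
```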