Optical Character Recognition

Optical Character Recognition (OCR) recognizes text in media. This includes text that appears in images and video, and text embedded in PDF files and Office documents.

Configuration Parameter  Description
Blacklist                Characters to exclude from the character set used for recognition.
CharacterTypes           The types of characters to include in the character set used for recognition.
ContextCheck             Specifies whether to use context checking to improve OCR results.
DetectAlphabet           Specifies whether to detect the alphabet for each image or page.
FontType                 The basic character type of the text that you want to recognize.
HollowText               Specifies whether to look for outlined text.
Input                    The image track to process.
KeepOnly                 Keep only particular types of words and discard all others.
Languages                The languages to use, which affects the character set and dictionaries used.
MaxInputQueueLength      Can be used to place a limit on latency.
NumParallel              The maximum number of video frames to analyze simultaneously.
OcrMode                  The OCR mode to use when you ingest images or documents.
Orientation              The orientation of text in the ingested media.
OutputTablesByColumn     Specifies how to order the records produced when OCR encounters a table.
ProcessTextElements      Specifies whether to merge the content of text elements into the OCR results.
Region                   A region of the image or video frame to restrict processing to.
SampleInterval           The interval at which frames are selected to be analyzed.
Spacing                  Specifies whether to allow multiple spaces between words in the output from OCR.
Type                     The analysis engine to use. Set this parameter to OCR.
UserDictionary           A comma-separated list of dictionaries to use in addition to the standard dictionaries.
Whitelist                Extra characters to add to the character set.
WordRejectThreshold      The minimum confidence level required to include a word in the output.
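
As an illustration of how these parameters fit together, the following Python sketch writes an INI-style section for a single OCR task. Only Type=OCR comes from the table above. The section name OCRTask, the helper build_ocr_task_config, and the values given for Languages and WordRejectThreshold are hypothetical placeholders, and the exact value formats that Media Server accepts should be checked against the parameter reference.

```python
import configparser
import io


def build_ocr_task_config(task_name, **params):
    """Return INI-style text for one OCR task section (illustrative helper)."""
    parser = configparser.ConfigParser()
    # Keep parameter names in their original case instead of lowercasing them.
    parser.optionxform = str
    # Type=OCR is the one setting the parameter table states explicitly.
    parser[task_name] = {"Type": "OCR"}
    for name, value in params.items():
        parser[task_name][name] = str(value)
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Assumed example values; they are not taken from the product documentation.
config_text = build_ocr_task_config(
    "OCRTask",
    Languages="en",
    WordRejectThreshold=50,
)
print(config_text)
```

The helper only assembles text. It does not validate parameter names or values against the server.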

Output Tracks

Output track      Output  Description
Data              No      Contains one record, describing the analysis results, per line of text, per video frame.
DataWithSource    No      The same as the Data track, but each record also includes the source frame.
Result            Yes     Contains one record, describing the analysis results, for each line of text. When a line of text appears in many consecutive frames, Media Server produces a single result.
ResultWithSource  No      The same as the Result track, but each record also includes the best source frame.
CharResult        No      (Image/document ingest only) Contains one record, describing the analysis results, for each line of text. However, the records in this track also provide detail about individual characters.
PageResult        No      (Image/document ingest only) Contains one record for each page, describing the orientation of the page, and the alphabet(s) and OCR mode that were used.
TableResult       No      (Image/document ingest only) Contains one record for each table that is detected.
WordData          No      Contains one record for each word, describing the analysis results, per video frame. Words from text elements are not output to this track.
WordResult        No      Contains one record for each word, describing the analysis results. Words from text elements are not output to this track. This track does not support scrolling text.
Start             No      The same as the Data track, except it contains only the first record of each event.
End               No      The same as the Data track, except it contains only the last record of each event.
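
To make the track list easier to work with programmatically, the sketch below records each track from the table above together with its Output value. It rests on two assumptions that are not stated in this section: that "Yes" in the Output column means the track is written to output by default, and that a track is referenced as the task name and track name joined by a period. The names OCR_TRACKS, default_output_tracks, qualified_track, and the task name OCRTask are hypothetical.

```python
# Track names and the Output column from the table above.
# Assumption: True ("Yes") means the track is written to output by default.
OCR_TRACKS = {
    "Data": False,
    "DataWithSource": False,
    "Result": True,
    "ResultWithSource": False,
    "CharResult": False,
    "PageResult": False,
    "TableResult": False,
    "WordData": False,
    "WordResult": False,
    "Start": False,
    "End": False,
}


def default_output_tracks(tracks=OCR_TRACKS):
    """List the tracks whose Output column is Yes."""
    return [name for name, is_default in tracks.items() if is_default]


def qualified_track(task_name, track, tracks=OCR_TRACKS):
    """Join a task name and a known track name (assumed naming convention)."""
    if track not in tracks:
        raise ValueError(f"unknown OCR track: {track}")
    return f"{task_name}.{track}"


print(default_output_tracks())
print(qualified_track("OCRTask", "WordResult"))
```

Rejecting unknown track names early catches typos such as "WordResults" before they reach a configuration file.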

For more information, see OCR Results, or use the GetExampleRecord action.
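
A request for an example record might be assembled as in the following sketch. Only the action name GetExampleRecord comes from this page. The host, the port 14000, the parameter names EngineType, Type, and Track, and the helper build_action_url are assumptions made for illustration, so confirm them against the action reference before relying on them.

```python
from urllib.parse import urlencode


def build_action_url(host, port, action, **params):
    """Build a URL of the form http://host:port/action=Name&Param=Value."""
    query = urlencode({"action": action, **params})
    return f"http://{host}:{port}/{query}"


# Parameter names and values below are assumed, not taken from this page.
url = build_action_url(
    "localhost",
    14000,
    "GetExampleRecord",
    EngineType="Analysis",
    Type="OCR",
    Track="Result",
)
print(url)
```

The sketch only constructs the URL and does not send the request.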