Optical Character Recognition
Optical Character Recognition (OCR) recognizes text in media. This includes text that appears in images and video, and text embedded in PDF files and Office document formats.
Configuration Parameter | Description |
---|---|
Blacklist | Characters to exclude from the character set used for recognition. |
CharacterTypes | The types of characters to include in the character set used for recognition. |
ContextCheck | Specifies whether to use context checking to improve OCR results. |
DetectAlphabet | Specifies whether to detect the alphabet for each image or page. |
FontType | The basic character type of the text that you want to recognize. |
HollowText | Specifies whether to look for outlined text. |
Input | The image track to process. |
KeepOnly | Keep only particular types of words and discard all others. |
Languages | The languages to use, which affects the character set and dictionaries used. |
MaxInputQueueLength | Limits the length of the input queue, which can be used to place a limit on latency. |
NumParallel | The maximum number of video frames to analyze simultaneously. |
OcrMode | The OCR mode to use when you ingest images or documents. |
Orientation | The orientation of text in the ingested media. |
OutputTablesByColumn | Specifies how to order the records produced when OCR encounters a table. |
ProcessTextElements | Specifies whether to merge the content of text elements into the OCR results. |
Region | A region of the image or video frame to restrict processing to. |
SampleInterval | The interval at which frames are selected to be analyzed. |
Spacing | Specifies whether to allow multiple spaces between words in the output from OCR. |
Type | The analysis engine to use. Set this parameter to OCR. |
UserDictionary | A comma-separated list of dictionaries to use in addition to the standard dictionaries. |
Whitelist | Extra characters to add to the character set. |
WordRejectThreshold | The minimum confidence level required to include a word in the output. |
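
As a rough illustration of how the parameters above might fit together, the following Python sketch generates an INI-style task section. Only `Type=OCR` comes from the table. The section name `OCRTask` and the values for `Languages` and `SampleInterval` are hypothetical placeholders, and the sketch assumes that the engine is configured through a `Key=Value` section, so check the accepted values for each parameter before using anything like this.

```python
import configparser
import io


def build_ocr_section(section_name="OCRTask", **overrides):
    """Render a hypothetical OCR task section as INI-style text.

    Only Type=OCR is taken from the parameter table; every other
    value here is a placeholder chosen for illustration.
    """
    params = {
        "Type": "OCR",           # from the table: set this parameter to OCR
        "Languages": "en",       # placeholder value (assumption)
        "SampleInterval": "0",   # placeholder value (assumption)
    }
    # Caller-supplied parameters replace or extend the defaults.
    params.update({key: str(value) for key, value in overrides.items()})

    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str     # keep parameter names case-sensitive
    parser[section_name] = params

    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_ocr_section(NumParallel=2))
```

Generating the section from a dictionary keeps the parameter names in one place, which makes it easier to compare a configuration against the table above.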
Output Tracks
Output track | Description | Output |
---|---|---|
Data | Contains one record, describing the analysis results, per line of text, per video frame. | No |
DataWithSource | The same as the … | No |
Result | Contains one record, describing the analysis results, for each line of text. When a line of text appears in many consecutive frames, Media Server produces a single result. | Yes |
ResultWithSource | The same as the … | No |
CharResult | (Image/document ingest only) Contains one record, describing the analysis results, for each line of text. However, the records in this track also provide detail about individual characters. | No |
PageResult | (Image/document ingest only) Contains one record for each page, describing the orientation of the page, and the alphabet(s) and OCR mode that were used. | No |
TableResult | (Image/document ingest only) Contains one record for each table that is detected. | No |
WordData | Contains one record for each word, describing the analysis results, per video frame. Words from text elements are not output to this track. | No |
WordResult | Contains one record for each word, describing the analysis results. Words from text elements are not output to this track. This track does not support scrolling text. | No |
Start | The same as the … | No |
End | The same as the … | No |
For more information, see OCR Results or use the GetExampleRecord action.
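
The following Python sketch shows one way a request for the `GetExampleRecord` action might be assembled. It only builds the URL and does not send it. The host, the port `14000`, and the parameter names `EngineType` and `Track` are assumptions that the text above does not confirm, so verify them against the action reference for your Media Server version.

```python
from urllib.parse import urlencode


def example_record_url(host="localhost", port=14000,
                       engine_type="OCR", track="Result"):
    """Build a URL for the GetExampleRecord action.

    The host, port, and the EngineType/Track parameter names are
    assumptions made for illustration only.
    """
    query = urlencode({
        "action": "GetExampleRecord",
        "EngineType": engine_type,   # assumed parameter name
        "Track": track,              # assumed parameter name
    })
    # Assumed request form: http://host:port/action=...&param=value
    return f"http://{host}:{port}/{query}"


if __name__ == "__main__":
    print(example_record_url(track="WordResult"))
```

Requesting an example record for each track listed in the table is a quick way to compare their record structures.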