Media Server connection
You can use your existing OpenText IDOL Media Server implementation to perform optical character recognition (OCR) on documents managed by OpenText Core Data Discovery & Risk Insights.
OpenText Core Data Discovery & Risk Insights includes a default OCR configuration file that defines a Media Server analysis task for basic OCR and uses a custom XSL transform to reduce the Media Server response to only the data that OpenText Core Data Discovery & Risk Insights requires. This configuration does not apply orientation correction, cropping, or template or region detection. You can create custom Media Server configurations to perform OCR analysis tasks.
If you plan to connect to Media Server to facilitate optical character recognition (OCR), additional tasks are required.
Requirements
Prior to beginning the connection tasks, you must have the following in place.
- OpenText Core Data Discovery & Risk Insights processing agent fully installed and configured, following the practices and procedures defined in "Agent installation and configuration".
  NOTE: Each processing agent host can connect to a single Media Server host.
- Media Server 25.3.x fully installed and configured, following standard Media Server practices and procedures.
- OpenText Core Data Discovery & Risk Insights processing agents and any Media Server host machines they interact with must be within the same network and cannot have a proxy between them.
  - Ensure the Media Server host machine IP address or server name is resolvable from the OpenText Core Data Discovery & Risk Insights processing agent host machine it will be connected to.
- If you plan to use existing or new Media Server configurations to perform OCR analysis tasks, ensure that these configurations have been verified and tested in Media Server and conform to the necessary format for compatibility with OpenText Core Data Discovery & Risk Insights. For configuration format information, see Custom Media Server configurations for OCR task analysis.
Update Media Server
You must update your Media Server implementation to include the necessary components to interact with OpenText Core Data Discovery & Risk Insights.
- On a Media Server host machine that will be connected to a processing agent host, ensure the necessary parameters and values are present in the Media Server configuration file, mediaserver.cfg.
  - Under [Paths], verify the following parameters and values are present. If not present, add them.
    - LuaDirectory=configurations/lua
    - XslDirectory=configurations/xsl
    - TemplateDirectory=configurations/xsl/response_templates
  - Under [Modules], verify the Enable parameter includes the ocr and objectrecognition modules in the list of values.
  - Under [Channels], verify at least one channel of type Visual is enabled.
    Example:
    VisualChannels=1
- Copy the required files from the OpenText Core Data Discovery & Risk Insights processing agent host machine to the Media Server host machine.
  - From the installation path for the processing agent, copy getMediaBoundary.lua from the mediaserver/lua directory to the /configurations/lua directory in the installation path for Media Server.
  - From the installation path for the processing agent, copy all files from the /mediaserver/xsl directory to the /configurations/xsl directory in the installation path for Media Server.
- Restart the Media Server service.
- Repeat for each Media Server host that will be connected to a processing agent host for OCR.
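The checks in the first step can be scripted before you restart the service. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that mediaserver.cfg is INI-style text, that the Enable value is a comma-separated list, and that parameter names can be compared case-insensitively.

```python
# Values taken from the procedure above; keys are lowercased for comparison.
REQUIRED_PATHS = {
    "luadirectory": "configurations/lua",
    "xsldirectory": "configurations/xsl",
    "templatedirectory": "configurations/xsl/response_templates",
}
REQUIRED_MODULES = {"ocr", "objectrecognition"}


def parse_cfg(text):
    """Parse INI-style text into {section: {key: value}} with lowercase names."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "//", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1].strip().lower(), {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip().lower()] = value.strip()
    return sections


def check_mediaserver_cfg(text):
    """Return a list of problems; an empty list means every check passed."""
    cfg = parse_cfg(text)
    problems = []
    paths = cfg.get("paths", {})
    for key, expected in REQUIRED_PATHS.items():
        if paths.get(key) != expected:
            problems.append(f"[Paths] {key} should be {expected}")
    # Assumption: Enable holds a comma-separated list of module names.
    enable = cfg.get("modules", {}).get("enable", "")
    enabled = {m.strip().lower() for m in enable.split(",")}
    for module in sorted(REQUIRED_MODULES - enabled):
        problems.append(f"[Modules] Enable is missing {module}")
    visual = cfg.get("channels", {}).get("visualchannels", "0")
    if not visual.isdigit() or int(visual) < 1:
        problems.append("[Channels] VisualChannels should be at least 1")
    return problems
```

To use it, read mediaserver.cfg into a string and pass it to check_mediaserver_cfg; each returned entry names a section and parameter to review.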
Update OpenText Core Data Discovery & Risk Insights
You must update your OpenText Core Data Discovery & Risk Insights processing agent implementation to include the necessary components to use Media Server for OCR.
- Update Media Server connection information in the agent Administration UI.
  - On the agent host server that will connect to a Media Server host, open the agent Administration UI.
    From the Start menu, click OpenText Core Data Discovery & Risk Insights Agent > Agent Admin.
  - In the navigation pane, click Advanced Settings.
  - In the Category list on the Advanced Settings page, click Analyze.
  - Complete the following options for the Media Server you want to connect to.
    Option: Description
    - Media Server Host: Type the IP address or fully qualified domain name of the Media Server host that this agent will connect to.
      Defaults to localhost.
    - Media Server Port: Verify the port number for the Media Server host that this agent will connect to.
      Defaults to 14000. This is the default port for Media Server.
    - Media Server Configuration Directory: Verify the relative path to the directory on this agent host that contains the Media Server configuration files.
      Defaults to [Common\Base Path]\Config, where [Common\Base Path] is the agent installation path on this agent host. The Base Path can be seen on this page.
      OCR configuration files for Media Server analysis tasks must be in this directory.
      CAUTION: Do not change this unless you moved the directory where the Media Server configuration files for OCR analysis tasks are placed on this agent host.
      You can create a subfolder if desired; the subfolder is then considered the new root when you specify configuration files while creating a dataset.
    - Media Server Default Configuration File Name: Verify the name of the default Media Server configuration file for OCR analysis tasks.
      Defaults to OCR_Only.cfg.
      CAUTION: Do not change this unless you have placed a valid alternate configuration file in the directory defined by the Media Server Configuration Directory.
      If you have alternate configuration files, see Custom Media Server configurations for OCR task analysis.
    - Media Server Set Https: Verify this matches your Media Server implementation. Set to True only if your Media Server is set to use HTTPS.
      Defaults to False.
- If applicable, copy any custom Media Server configuration files for OCR analysis tasks to the agent host.
  Place the configuration files in the path defined by the Media Server Configuration Directory. By default this is C:\Program Files\OpenText\Core Data Discovery and Risk Insights\Agent\Config.
- Repeat for each processing agent host that will connect to a Media Server host to perform OCR.
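Before relying on the Media Server Host and Media Server Port values, it can help to confirm from the agent host that the name resolves and the port accepts connections. The following Python sketch is not part of the product; can_reach is a hypothetical helper, and it only checks that a TCP connection can be opened, not that the listener is actually Media Server.

```python
import socket


def can_reach(host, port=14000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    14000 is the default Media Server port described above. Name
    resolution failures, refused connections, and timeouts all
    surface as OSError and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, can_reach("localhost") corresponds to the default host and port settings. A False result points to name resolution, routing, or firewall issues to investigate before troubleshooting the OCR configuration itself.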
Custom Media Server configurations for OCR task analysis
If you choose to create your own Media Server configurations for OCR task analysis, follow all procedures, guidelines, and suggestions for OCR as documented in the Media Server documentation. The Media Server Administration Guide contains the necessary details and includes information for improving OCR and optimizing performance for object recognition.
If defining regions for OCR, you can use Media Server Visual Training to develop a template. A template consists of a series of regions which contain a name and set of region coordinates (in pixels). Each region defined in Media Server must have two respective engines in the processing configuration: a SetRectangle type engine and an ocr type engine. It is not a requirement for either engine to be named exactly the same as the respective Media Server metadata field, though there is a naming convention for the ocr type engine that must be followed to work correctly with the OpenText Core Data Discovery & Risk Insights agent. During document ingestion, OpenText Core Data Discovery & Risk Insights uses the name of each ocr type engine to define the respective value leveraged.
The name of each ocr type engine must use the format <OBJECT_TYPE>-<REGION_NAME>-<REGION_TYPE>. The <OBJECT_TYPE> portion must always be present and come first. Separate each portion with a hyphen (-). A name must contain a minimum of one hyphen (<OBJECT_TYPE>-<REGION_TYPE>) and a maximum of two (<OBJECT_TYPE>-<REGION_NAME>-<REGION_TYPE>).
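The naming rule can be checked mechanically before a configuration is deployed. The Python sketch below is a minimal illustration, not the agent's actual parser; parse_ocr_engine_name is a hypothetical helper, and it assumes that no portion of the name may be empty or contain a hyphen of its own.

```python
def parse_ocr_engine_name(name):
    """Split an ocr engine name into its hyphen-separated portions.

    Raises ValueError unless the name has one or two hyphens and
    every portion is non-empty.
    """
    parts = name.split("-")
    if len(parts) not in (2, 3) or any(not part for part in parts):
        raise ValueError(f"invalid ocr engine name: {name!r}")
    if len(parts) == 2:
        # One hyphen: <OBJECT_TYPE>-<REGION_TYPE>
        return {"object_type": parts[0], "region_name": None,
                "region_type": parts[1]}
    # Two hyphens: <OBJECT_TYPE>-<REGION_NAME>-<REGION_TYPE>
    return {"object_type": parts[0], "region_name": parts[1],
            "region_type": parts[2]}
```

A name such as DriversLicense-Address passes with one hyphen, while a name with no hyphen or with more than two is rejected.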
[Session]
# Image preparation
Engine0 = Source
Engine1 = DetectAnchor
Engine2 = RotateIdCard
Engine3 = IdCardRegion
Engine4 = CropIdCard
Engine5 = DetectCroppedImageAnchor
# Drivers License fields
Engine6 = DriversLicense-TextRegion_Address
Engine7 = DriversLicense-Address
Engine8 = DriversLicense-TextRegion_Forename
Engine9 = DriversLicense-Forename
Engine10 = DriversLicense-TextRegion_Surname
Engine11 = DriversLicense-Surname
# Results output
Engine12 = CombineResultsWithSources
Engine13 = Response

#
# Ingest
#
[Source]
Type = image

#
# Analysis
#
[DetectAnchor]
Type = ObjectRecognition
Database = IDCardTemplates
# Consider 2-dimensional rotations only
Geometry = SIM2

#
# Transform image on-the-fly
#
[RotateIdCard]
Type = Rotate
Input = DetectAnchor.ResultWithSource
LuaLine = function getAngle(x) return -x.ObjectRecognitionResultAndImage.inPlaneRotation end

[IdCardRegion]
Type = SetRectangle
Input = RotateIdCard.Output
LuaScript = getMediaBoundary.lua

[CropIdCard]
Type = Crop
Input = IdCardRegion.Output

[DetectCroppedImageAnchor]
Type = ObjectRecognition
Input = CropIdCard.Output
Database = IDCardTemplates
# Consider 2-dimensional rotations only
Geometry = SIM2

#
# OCR by region
#
[DriversLicense-TextRegion_Address]
Type = SetRectangle
Input = DetectCroppedImageAnchor.ResultWithSource
LuaScript = getTurkey_DriverLicense_OCR_Address.lua

[DriversLicense-Address]
Type = ocr
OCRMode = document
Input = DriversLicense-TextRegion_Address.Output
Region = input
Languages = tr
CharacterTypes = alpha,digit

[DriversLicense-TextRegion_Forename]
Type = SetRectangle
Input = DetectCroppedImageAnchor.ResultWithSource
LuaScript = getTurkey_DriverLicense_OCR_Forename.lua

[DriversLicense-Forename]
Type = ocr
OCRMode = document
Input = DriversLicense-TextRegion_Forename.Output
Region = input
Languages = tr
CharacterTypes = alpha

[DriversLicense-TextRegion_Surname]
Type = SetRectangle
Input = DetectCroppedImageAnchor.ResultWithSource
LuaScript = getTurkey_DriverLicense_OCR_Surname.lua

[DriversLicense-Surname]
Type = ocr
OCRMode = document
Input = DriversLicense-TextRegion_Surname.Output
Region = input
Languages = tr
CharacterTypes = uppercase

#
# Output
#
[CombineResultsWithSources]
Type=or
Input0 = DriversLicense-Address.ResultWithSource
Input1 = DriversLicense-Forename.ResultWithSource
Input2 = DriversLicense-Surname.ResultWithSource

[Response]
Type=response
Input=CombineResultsWithSources.Output
Additional engine types, such as passport, can be added as desired. Ensure engine names are unique and use sequential ordering where appropriate.
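When engines are added to a [Session] section, numbering gaps and duplicate names are easy to introduce. The Python sketch below shows one way to spot them. It is illustrative only: check_session_engines is a hypothetical helper, and it assumes that a complete [Session] section numbers its engines from 0 upward without gaps, as in the full example above.

```python
import re

ENGINE_LINE = re.compile(r"^Engine(\d+)\s*=\s*(\S+)$")


def check_session_engines(session_lines):
    """Return a list of problems found in the EngineN entries.

    Lines that are not EngineN assignments (comments, blanks) are ignored.
    """
    entries = []
    for line in session_lines:
        match = ENGINE_LINE.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    problems = []
    numbers = [number for number, _ in entries]
    # Assumption: numbering starts at 0 and increases by one per engine.
    if numbers != list(range(len(numbers))):
        problems.append("engine numbers are not sequential from 0")
    names = [name for _, name in entries]
    for name in sorted({n for n in names if names.count(n) > 1}):
        problems.append(f"duplicate engine name: {name}")
    return problems
```

Pass the lines of the [Session] section to the function; an empty list means no numbering or naming problems were found.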
If you use a template or region-based configuration for grammar value extraction and the metadata fields within the OCR results must be in a language context other than English, the ocr engine names must be URL encoded throughout the configuration file. For example, a Turkish driver's license includes "Sürücü belgesi", which translates to "driver's license" in English. The engine must be declared with the non-English characters in "Sürücü" URL encoded as follows:
Engine12 = DriversLicense-S%C3%BCr%C3%BCc%C3%BC_belgesi
ocr engine type configuration sections with URL encoding for Turkish driver's license
ocr type engine definition section
[DriversLicense-S%C3%BCr%C3%BCc%C3%BC_belgesi]
Type = ocr
OCRMode = document
Input = DriversLicense-TextRegion_Code5.Output
RestrictToInputRegion = true
Languages = tr
CharacterTypes = digit
ocr type engine usage section for event processing results
[CombineResultsWithSources]
Type=or
Input0 = DriversLicense-Address.ResultWithSource
Input1 = DriversLicense-Code4d.ResultWithSource
Input2 = DriversLicense-S%C3%BCr%C3%BCc%C3%BC_belgesi.ResultWithSource
Input3 = DriversLicense-Date_Of_Birth.ResultWithSource
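The encoded engine name shown above can be produced with standard percent-encoding of the UTF-8 bytes. The Python sketch below uses urllib.parse.quote from the standard library. The helper name encode_engine_name is hypothetical, and replacing spaces with underscores is an assumption based on the "Sürücü_belgesi" example rather than a documented rule.

```python
from urllib.parse import quote


def encode_engine_name(object_type, field_label):
    """Build an ocr engine name with non-ASCII characters URL encoded."""
    # Assumption: spaces become underscores, as in the example above.
    field = field_label.replace(" ", "_")
    # quote() percent-encodes the UTF-8 bytes of each non-ASCII character
    # and leaves ASCII letters, digits, and underscores unchanged.
    return f"{object_type}-{quote(field, safe='')}"


print(encode_engine_name("DriversLicense", "Sürücü belgesi"))
# prints DriversLicense-S%C3%BCr%C3%BCc%C3%BC_belgesi
```

Use the same encoded string everywhere the engine is referenced: the EngineN declaration, the section header, and any Input lines.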
The LuaScript file of each SetRectangle type engine includes the actual name of the respective Media Server metadata field and therefore serves as the mapping between the configuration and the Media Server metadata region field. The region coordinates defined in Media Server are used to obtain the region rectangle on the image that was cropped on the fly, and the subsequent ocr engine uses this rectangle to scan only the contents of that region. You must create a distinct Lua file for each SetRectangle type engine or field. Except for the fieldName parameter value passed to getOcrRegion (the second function parameter), the Lua code is the same in each file. Save the Lua files to the configurations/lua path on the Media Server host, as defined by the LuaDirectory parameter in the first step of the Update Media Server procedure.
SetRectangle type
CAUTION: In the following lua example, the output[i] line must not contain line breaks. Any breaks in the output[i] line in the example below are a result of line wrapping for viewing.
If you copy and paste the contents below into your lua, you must manually remove line breaks in the output[i] line.
-- create new rectangles relative to recognized item's region
function getOcrRegion(record, fieldName)
  local w = record.ImageData.width
  local h = record.ImageData.height
  local output = {}
  local i = 0
  for key,value in pairs(record.IdentityData.metadata) do
    log(key, value)
    if key == fieldName then
      -- parse the whitespace-separated numbers stored in the metadata field
      local roi = {}
      local j = 0
      for token in string.gmatch(value, "%S+") do
        j = j + 1
        roi[j] = tonumber(token)
        if not roi[j] then break end
      end
      log(i+1, roi)
      -- a region needs four values: left, top, width, height
      if roi[4] then
        i = i + 1
        output[i] = { left = w * roi[1] / 100, top = h * roi[2] / 100, width = w * roi[3] / 100, height = h * roi[4] / 100 }
      end
    end
  end
  -- return nil when no matching region was found
  if next(output) == nil then
    return nil
  end
  return output
end

-- entry point called by the SetRectangle engine; change only the field name per file
function rectangle(record)
  return getOcrRegion(record, 'OCR_Address')
end
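To make the arithmetic in getOcrRegion concrete, the following Python sketch mirrors the calculation for a single metadata value. It is an illustration, not a replacement for the Lua file. The function name ocr_region is hypothetical, and reading the four numbers as percentages of the image width and height is an inference from the division by 100 in the Lua code.

```python
def ocr_region(image_width, image_height, metadata_value):
    """Convert "left top width height" values into a pixel rectangle.

    Returns None when fewer than four numeric values are present,
    matching the Lua code, which emits no region in that case.
    """
    tokens = metadata_value.split()[:4]
    try:
        roi = [float(token) for token in tokens]
    except ValueError:
        return None
    if len(roi) < 4:
        return None
    left, top, width, height = roi
    # Horizontal values scale with image width, vertical with height.
    return {
        "left": image_width * left / 100,
        "top": image_height * top / 100,
        "width": image_width * width / 100,
        "height": image_height * height / 100,
    }


print(ocr_region(1000, 500, "10 20 30 40"))
# prints {'left': 100.0, 'top': 100.0, 'width': 300.0, 'height': 200.0}
```

The image size and metadata value in the usage line are made-up inputs chosen so that the scaling is easy to follow.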
If fields in the layout of the card, template, or image generally need to be combined into one field for successful grammar value extraction, such as forename and surname, you can group them. Within the configuration, group these fields into a single field (an or type engine) and use the group output as a single input for the Media Server response.
# Drivers License fields
Engine19 = DriversLicense-TextRegion_Forename
Engine20 = DriversLicense-Forename
Engine21 = DriversLicense-TextRegion_IssueDate
Engine22 = DriversLicense-Issue_Date
Engine23 = DriversLicense-TextRegion_Surname
Engine24 = DriversLicense-Surname
# National Id fields
Engine29 = NationalId-TextRegion_Forename
Engine30 = NationalId-Forename
Engine31 = NationalId-TextRegion_Surname
Engine32 = NationalId-Surname
# Drivers License sub-groups (combines)
Engine43 = DriversLicense-Name
# National Id sub-groups (combines)
Engine44 = NationalId-Name

[DriversLicense-Name]
Type=or
Input0 = DriversLicense-Forename.ResultWithSource
Input1 = DriversLicense-Surname.ResultWithSource

[NationalId-Name]
Type=or
Input0 = NationalId-Forename.ResultWithSource
Input1 = NationalId-Surname.ResultWithSource

[CombineResultsWithSources]
Type=or
Input0 = DriversLicense-Address.ResultWithSource
Input1 = DriversLicense-Code4d.ResultWithSource
Input2 = DriversLicense-S%C3%BCr%C3%BCc%C3%BC_belgesi.ResultWithSource
Input3 = DriversLicense-Date_Of_Birth.ResultWithSource
Input4 = DriversLicense-Place_Of_Birth.ResultWithSource
Input5 = DriversLicense-Expiry_Date.ResultWithSource
Input6 = DriversLicense-Name.Output
Input7 = DriversLicense-Issue_Date.ResultWithSource
Input8 = DriversLicense-Vehicle_Types.ResultWithSource
Input9 = NationalId-Identity_No.ResultWithSource
Input10 = NationalId-Name.Output
Input11 = NationalId-Date_Of_Birth.ResultWithSource
Input12 = NationalId-Gender.ResultWithSource
Input13 = NationalId-Document_No.ResultWithSource
Input14 = NationalId-Nationality.ResultWithSource
Input15 = NationalId-Valid_Until.ResultWithSource

[Response]
Type=response
Input=CombineResultsWithSources.Output
For the sub-grouping to be picked up and handled correctly, copy the updated toAgentDataResponse.xsl file from the agent host (<agentInstallDir>\mediaserver\xsl\response_templates\toAgentDataResponse.xsl) to the \response_templates directory on each Media Server host managed by this agent (<MediaServerInstallDir>\configurations\xsl\response_templates), overwriting the existing file if necessary. This XSL transform concatenates the detected OCR text value for each item in a group, in the order specified in the respective group or type engine (Input0, Input1 … InputN). It also generates an average confidence value across all the engine results in the group for the single output record.
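The effect of the grouping on a single output record can be sketched as follows. This Python example only illustrates the concatenate-and-average behavior described above and is not the XSL transform itself. The combine_group name, the text and confidence keys, and the single-space separator are all assumptions made for illustration.

```python
def combine_group(results, separator=" "):
    """Merge ordered engine results into one record.

    results: list of dicts with "text" and "confidence" keys, in the
    order of the group's Input0..InputN entries.
    """
    if not results:
        raise ValueError("a group needs at least one result")
    # Assumption: values are joined with a single space.
    text = separator.join(result["text"] for result in results)
    confidence = sum(result["confidence"] for result in results) / len(results)
    return {"text": text, "confidence": confidence}


print(combine_group([
    {"text": "Ada", "confidence": 90},
    {"text": "Lovelace", "confidence": 80},
]))
# prints {'text': 'Ada Lovelace', 'confidence': 85.0}
```

The names and confidence values are invented sample data. The order of the list stands in for the Input0, Input1 order of the group engine.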
For further guidance on custom Media Server configuration, contact Support.