ObjectClassRecognition
Runs object class recognition on the file(s) associated with an IDOL document FlowFile, and adds information about any recognized objects to the IDOL document.
To run object class recognition you must have a recognizer. OpenText provides some pre-trained recognizers, and you can create your own. For more information about object class recognition, including how to train a recognizer, refer to the Media Server Administration Guide.
The processor can handle video files.
The processor can handle the following image formats:
- TIFF
- JPEG
- JPEG 2000
- PNG
- GIF
- BMP (compressed BMP files are not supported) and ICO
- PBM, PGM, and PPM
- WebP
Additionally, if you configure your MediaServiceImpl controller service to use a KeyView Export Service, the processor can handle document formats, including:
- Adobe PDF
- Microsoft Word Document (.DOC and .DOCX)
- Microsoft Excel Sheet (.XLS and .XLSX)
- Microsoft PowerPoint Presentation (.PPT and .PPTX)
- OpenDocument Text (.ODT)
- OpenDocument Spreadsheet (.ODS)
- OpenDocument Presentation (.ODP)
- Rich Text (RTF)
Properties
Name | Default Value | Description |
---|---|---|
IDOL License Service | | An IdolLicenseServiceImpl that provides a way to communicate with an IDOL License Server. |
Media Service | | A MediaServiceImpl that manages media analysis resources. |
Video Sample Interval | 100 | The interval between video frames that are selected to be analyzed, in milliseconds. |
Recognizer File | | The path of a file that contains the recognizer to use. Set this property to use a recognizer that you exported from Media Server, using the action `ExportObjectClassRecognizer`. |
Shared Recognizer | | The name of the recognizer to use for object class recognition. Set this property to use a recognizer that is stored in the external database specified by the Media Service (see the "Media Service" property). |
Detection Threshold | 30 | The minimum confidence score necessary to output a result. |
World Coordinate Units | | When you set this property, the processor performs perspective analysis and outputs real-world 3D coordinates for recognized objects. The processor expects each object class in the recognizer to have a metadata field named `dimensions(UNITS)`. For example, if you set this property to "m", the processor expects each class in the recognizer to have a metadata field named `dimensions(m)`. For more information about setting object dimensions, refer to the Media Server Administration Guide. |
Camera View Angle | | The horizontal angle of view of the camera, in degrees. If you want to use perspective analysis, OpenText recommends setting this property from the camera specifications (if known). |
Image Output Format | | When set, the processor saves the full image or best video frame for each recognized object, in the specified format (BMP, GIF, JPEG, PDF, PNG, PPM, TIFF or WEBP). |
Video Clip Quality | | When set to "low", "medium", or "high", the processor saves a video clip of the recognized object at the specified quality level. |
Video Clip Height | 300 | The height of the video clip that is generated when you set Video Clip Quality, in pixels. |
Relationships
Name | Description |
---|---|
success | Processing was successful. |
failure | Processing failed. |
Example Output
The following examples show the metadata that object class recognition adds to an IDOL document. The first example shows the result for an image or document, and the second shows the result for a video.
    <idol_media>
      <objectclasses>
        <objectclass page="1">
          <recognizer>surveillance</recognizer>
          <class>car</class>
          <bestRegion>
            <region height="507" left="119" page="1" top="108" width="710"/>
            <imagefile>/opt/nifi/idol_repository/Files/ObjectClassRecognition_d696c556-0180-1000-2a61-7c313ce09bf8/0359bf72-2025-4b65-95c3-4ce8c4c59ab5.bmp</imagefile>
          </bestRegion>
          <region height="507" left="119" page="1" top="108" width="710"/>
        </objectclass>
      </objectclasses>
    </idol_media>

    <idol_media>
      <objectclasses>
        <objectclass duration="3.88" start="3.28">
          <recognizer>surveillance</recognizer>
          <class>car</class>
          <videoFile>/opt/nifi/idol_repository/Files/ObjectClassRecognition_d696c556-0180-1000-2a61-7c313ce09bf8/89954d48-98cd-43f9-b027-4e4f6845ddaf.mp4</videoFile>
          <bestRegion>
            <region height="92" left="487" timestamp="3.92" top="-1" width="130" x="3.96" y="83.92" z="0"/>
            <imagefile>/opt/nifi/idol_repository/Files/ObjectClassRecognition_d696c556-0180-1000-2a61-7c313ce09bf8/e5c819c5-1dcd-4929-af2b-3bf4e8acd1a0.bmp</imagefile>
          </bestRegion>
          <region duration="0.04" height="86" left="479" start="3.28" top="-31" width="122" x="3.94" y="89.17" z="0"/>
          <region duration="0.04" height="89" left="486" start="3.6" top="-16" width="126" x="4.01" y="86.47" z="0"/>
          <region duration="0.04" height="92" left="487" start="3.92" top="-1" width="130" x="3.96" y="83.92" z="0"/>
          <region duration="0.04" height="99" left="494" start="4.24" top="17" width="136" x="4.01" y="80.61" z="0"/>
          <region duration="0.04" height="106" left="501" start="4.56" top="35" width="142" x="4.05" y="77.55" z="0"/>
          <region duration="0.04" height="113" left="508" start="4.88" top="53" width="148" x="4.09" y="74.7" z="0"/>
          ...
        </objectclass>
        ...
      </objectclasses>
    </idol_media>
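As a minimal sketch of how a downstream consumer might read this metadata, the following Python uses the standard library XML parser on a trimmed copy of the image example. The function name `read_object_classes` and the shortened file path are hypothetical, and the element names are taken from the example output rather than from a published schema.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the image example; the image path is a hypothetical short one.
SAMPLE = """
<idol_media>
  <objectclasses>
    <objectclass page="1">
      <recognizer>surveillance</recognizer>
      <class>car</class>
      <bestRegion>
        <region height="507" left="119" page="1" top="108" width="710"/>
        <imagefile>/tmp/example.bmp</imagefile>
      </bestRegion>
      <region height="507" left="119" page="1" top="108" width="710"/>
    </objectclass>
  </objectclasses>
</idol_media>
"""


def read_object_classes(xml_text):
    """Return one dict per <objectclass> element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for obj in root.iter("objectclass"):
        # findall() matches direct children only, so the <region> nested
        # inside <bestRegion> is not counted twice.
        regions = [
            {name: float(value) for name, value in region.attrib.items()}
            for region in obj.findall("region")
        ]
        results.append({
            "recognizer": obj.findtext("recognizer"),
            "class": obj.findtext("class"),
            "regions": regions,
        })
    return results


objects = read_object_classes(SAMPLE)
print(objects[0]["class"], objects[0]["regions"][0]["width"])
```

All region attributes in the examples are numeric, so the sketch converts them to floats. A consumer that meets non-numeric attributes would need to handle them separately.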
The XML contains an `objectclass` element for each object that is recognized.

- The `recognizer` element provides the name of the recognizer that was used to recognize the object.
- The `class` element provides the name of the object class.
- The `region` element describes the position of the recognized object.

  When you analyze an image or document, the `page` attribute specifies the page on which the object was detected. When you analyze a video and an object is tracked across multiple frames, there can be multiple `region` elements with `start` and `duration` attributes that provide video timestamps, in seconds.

  The `left`, `top`, `width`, and `height` attributes provide the position and size of the region in pixels (`left` specifies the distance from the left side of the image to the left side of the region, and `top` specifies the distance from the top of the image to the top of the region). In the video example you can see that the car's `height` and `width` are increasing as the `start` time increases. This indicates that the car is moving towards the camera.

  When you analyze a video and set World Coordinate Units, the `region` element can include `x`, `y`, and `z` attributes that provide the position of the object in real-world 3D coordinates. The measurements are in the same units that were used to specify the object size (metres, in this example). In the example you can see that the car is traveling on a flat road (`z=0`). The x-coordinate is relatively constant and the y-coordinate is decreasing, because the car is moving in a straight line towards the camera.
- The `bestRegion` element represents the best video frame in which the object was visible. It includes the following elements:
  - `region` - The position of the recognized object in the selected frame.
  - `imageFile` - The path of the image that is generated when you set the property Image Output Format.
- The `videoFile` element contains the path of the video file that is generated when you set the property Video Clip Quality.
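The reasoning about the car's motion in the video example can be checked with a few lines of Python. This is only an illustrative sketch: the values are copied from the six `region` elements shown in the example, the helper names `is_approaching` and `mean_approach_speed` are hypothetical, and the speed estimate assumes that the y-coordinate measures distance from the camera along the direction of travel.

```python
# (start in seconds, width in px, height in px, y in metres),
# copied from the six <region> elements of the video example.
REGIONS = [
    (3.28, 122, 86, 89.17),
    (3.60, 126, 89, 86.47),
    (3.92, 130, 92, 83.92),
    (4.24, 136, 99, 80.61),
    (4.56, 142, 106, 77.55),
    (4.88, 148, 113, 74.70),
]


def is_approaching(regions):
    """True if the region area grows and y shrinks at every step."""
    areas = [width * height for _, width, height, _ in regions]
    ys = [y for _, _, _, y in regions]
    area_grows = all(a < b for a, b in zip(areas, areas[1:]))
    y_shrinks = all(a > b for a, b in zip(ys, ys[1:]))
    return area_grows and y_shrinks


def mean_approach_speed(regions):
    """Average decrease of y per second between the first and last region."""
    (t_first, _, _, y_first) = regions[0]
    (t_last, _, _, y_last) = regions[-1]
    return (y_first - y_last) / (t_last - t_first)


print(is_approaching(REGIONS))
print(round(mean_approach_speed(REGIONS), 2))
```

For the example data, the region area grows and y falls at every step, and y drops by about 14.5 metres over 1.6 seconds, which corresponds to roughly 9 metres per second under the stated assumption.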