Analyze Media

Your connectors might retrieve images, audio, and video from data repositories. File Content Extraction can extract metadata from the media but cannot process its content. To enrich documents that represent media files, you can use Knowledge Discovery Rich Media analytics.

For example, you can run optical character recognition (OCR) to extract text from scanned documents, or face recognition to identify faces in photographs. You might tag all images and documents that contain your company logo by running object recognition. You can use speech-to-text to convert spoken words into text, and speaker identification to recognize who is speaking. The extracted information can be added to the document's metadata or content, where it can then be used by other Knowledge Discovery operations.

Some types of analysis work out-of-the-box, but others require training. For example, to use face recognition or object recognition, you must train Knowledge Discovery to recognize specific faces or objects.

There are several ways to configure media analysis. In NiFi Ingest, set up a KeyViewRouteOnFileType processor to identify documents that represent rich media files, and route these documents to one of the following processors:

  • An AnalyzeMedia processor, which performs media analysis within NiFi. If you use this processor, you do not need to set up a separate Media Server, but media analysis consumes CPU cycles and memory on the NiFi host.
  • A MediaAnalysis processor, which sends the associated media files to your Media Server (which might be deployed on a separate machine). Media Server processes the files and extracts useful information. The MediaAnalysis processor retrieves analysis results from the ACI response (so your session configuration must output data using the ACI response output engine) and writes the data into the Content document.
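
If you use the MediaAnalysis processor, your Media Server session configuration must include an output engine that writes results to the ACI response. The following is a minimal sketch for an OCR task, for illustration only: the section names (`OCRTask`, `ResponseOut`) are arbitrary, and the engine types and track names shown here are assumptions that you should check against your Media Server documentation and adapt to the analyses you run:

```
[Session]
Engine0 = Ingest
Engine1 = OCRTask
Engine2 = ResponseOut

// Ingest engine that reads the media file sent by the processor
[Ingest]
Type = Image

// Example analysis task (replace with your own analyses)
[OCRTask]
Type = OCR

// Output engine that returns the results in the ACI response,
// where the MediaAnalysis processor retrieves them
[ResponseOut]
Type = Response
Input = OCRTask.Result
```

The key point is the final section: because the MediaAnalysis processor reads analysis results from the ACI response, the session must end with a `Response` output engine rather than one that writes to a file or an external index.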