Extract Speech
You can use CFS and Media Server to extract speech from audio and video.
When you index audio and video files, KeyView extracts metadata but cannot process the content. The documents produced by CFS therefore have no content. To enrich these documents, you can send the audio and video files to a Media Server. Media Server can extract speech from the audio and add a transcription to the document content.
To extract speech, use the MediaServerAnalysis import task in CFS. For this task, you must identify the documents that require speech-to-text processing; CFS then sends these documents to Media Server. You can use a Media Server configuration to determine which analysis tasks to run, for example, to run both speech-to-text and speaker identification on a document.
When you use CFS and Media Server to extract speech, Media Server returns the transcription to CFS, which adds the transcription to the document content.
NOTE: Media Server runs all the configured tasks on your input document in parallel.
If you want to use the result of one media analysis task in another, you might need to use Lua functions to run the tasks. For example, if you want to run language identification to determine the language to use for speech-to-text transcription, you must use a Lua function to run the tasks one after the other.
For more information about how to configure Media Server analysis tasks, refer to the CFS Administration Guide.
Identify Documents to Process
CFS runs media analysis (including speech extraction) only on documents that have the AUTN_NEEDS_MEDIA_SERVER_ANALYSIS field. You must add this field to every document that you want to process.
Micro Focus recommends that you add this field using a Lua script. You can include conditions in the script to filter documents based on the document source, file type, or metadata extracted by KeyView.
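The following is a minimal sketch of such a script, not a definitive implementation. It assumes that the document reference is available in the DREREFERENCE field, that a file extension is an adequate filter for your sources, and that only the presence of the AUTN_NEEDS_MEDIA_SERVER_ANALYSIS field matters, not its value. The extension list and the is_media_reference helper are hypothetical names introduced for illustration.

```lua
-- Extensions treated as audio or video (illustrative list, adjust as needed).
local MEDIA_EXTENSIONS = {
    mp3 = true, wav = true, wma = true,
    mp4 = true, avi = true, mov = true, wmv = true,
}

-- Hypothetical helper: returns true if the reference ends in a listed extension.
function is_media_reference(reference)
    if type(reference) ~= "string" then
        return false
    end
    local extension = reference:match("%.(%w+)$")
    return extension ~= nil and MEDIA_EXTENSIONS[extension:lower()] == true
end

-- Entry point that CFS calls for each document processed by a Lua import task.
function handler(document)
    -- Assumption: the reference (file path or URL) is stored in DREREFERENCE.
    local reference = document:getFieldValue("DREREFERENCE")
    if is_media_reference(reference) then
        -- Assumption: the field value is ignored; an empty string is used here.
        document:addField("AUTN_NEEDS_MEDIA_SERVER_ANALYSIS", "")
    end
    -- Returning true lets the document continue through the import pipeline.
    return true
end
```

To take effect, a script like this would need to run as a Lua import task before the MediaServerAnalysis task. You could replace the extension check with a test on metadata that KeyView extracts, if that is a more reliable indicator for your data.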