Introduction
Media Server uses IDOL KeyView to process images and documents. Media Server can ingest most image files, including multi-page images, and also accepts documents such as PDF files and PowerPoint presentations.
There are some differences in the way that these files are ingested:
- When you ingest an image, the ingest engine creates a single record that contains the image.
- When you ingest a multi-page image, the ingest engine creates a record for each page, and each record contains the image from the relevant page.
- When you ingest a presentation file (.PPT, .PPTX, or .ODP), the ingest engine creates a record for each slide in the presentation. Each record contains an image of the slide.
- When you ingest a PDF file, the ingest engine creates a record for each page of the document. Each record contains an image of the full page and information about any text elements that are associated with the page. PDF files can include embedded text, which is stored as encoded data (such as UTF-8) instead of as an image of the text. For each text element, Media Server extracts the text, its position on the page, and its orientation.
- For other document formats, Media Server does not extract information about pages or the position of text elements. In some cases this is because the file formats do not contain this information. When you ingest a document in a format other than PDF, the ingest engine creates a record for each image that is embedded in the document. Each record contains the embedded image and any text that follows the image, as extracted by KeyView. Media Server still provides co-ordinates for the text elements in these documents, but only so that the information has a consistent form for all input file formats. These co-ordinates do not represent the position of the text in the original document.
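The rules above can be summarized in a short sketch. The following Python example is illustrative only and is not part of Media Server or KeyView. The function name `expected_records`, its parameters, and the extension sets are hypothetical, and the image extensions are an arbitrary subset rather than the list of formats that Media Server supports. The function returns the number of records you would expect from a file, and whether the text co-ordinates in those records represent real positions on the page.

```python
from pathlib import PurePath

# Illustrative subsets only (assumption); not Media Server's supported-format list.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".gif", ".tif", ".tiff"}
PRESENTATION_EXTENSIONS = {".ppt", ".pptx", ".odp"}


def expected_records(filename, pages=1, slides=0, embedded_images=0):
    """Return (record_count, text_positions_real) for one source file.

    Hypothetical helper that models the ingest rules described above.
    text_positions_real is True only when text co-ordinates reflect the
    position of the text on the page, which applies to PDF files only.
    """
    ext = PurePath(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        # One record per page; a single-page image has pages=1.
        return pages, False
    if ext in PRESENTATION_EXTENSIONS:
        # One record per slide, each containing an image of the slide.
        return slides, False
    if ext == ".pdf":
        # One record per page, with text, position, and orientation.
        return pages, True
    # Any other document: one record per embedded image. Co-ordinates are
    # present for consistency but do not describe the original layout.
    return embedded_images, False


if __name__ == "__main__":
    print(expected_records("photo.jpg"))                      # (1, False)
    print(expected_records("scan.tiff", pages=3))             # (3, False)
    print(expected_records("deck.pptx", slides=12))           # (12, False)
    print(expected_records("report.pdf", pages=5))            # (5, True)
    print(expected_records("notes.docx", embedded_images=2))  # (2, False)
```

A check like this can help when you estimate how many records a batch of mixed files produces, or when you decide whether downstream processing can rely on text positions.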
When you ingest images and documents, Media Server does not perform tracking between images. Each image is treated as independent, not as part of a sequence.
For information about which analysis operations you can run on images and documents, see Image and Video Processing.