Render an Image, Thumbnail, or PDF for Ingested Pages
The Web Connector can render an image, thumbnail image, or PDF of each Web page that it ingests.
By default, when you configure the connector to create one or more of these files, each file is indexed as the content of a separate document, alongside the indexed Web page. Alternatively, you can configure the connector to write the files to a folder.
If you write rendered images, PDF files, and thumbnails to a folder, the connector adds metadata fields to each document that contain the paths of the associated files. The fields are named:
RENDITION_FILE_IMAGE
RENDITION_FILE_PDF
RENDITION_FILE_THUMBNAIL
When the connector sends ingest-removes for deleted documents, it also deletes any rendered images, PDF files, and thumbnails associated with those documents.
To render an image, thumbnail, or PDF for each ingested page
- Stop the connector and open the configuration file.
- Modify your fetch task by adding the following parameters:
  - CreateImageRendition: To render an image for each ingested page, set this parameter to true.
  - CreateThumbnailRendition: To render a thumbnail image for each ingested page, set this parameter to true.
  - CreatePDFRendition: To render a PDF copy of each ingested page, set this parameter to true.
  - RenditionsFilePath: The path of the folder to write rendered images, PDF files, and thumbnails to. The folder must already exist, and the user running the connector must have permission to write files to the folder. If you don't set this parameter, the connector indexes each file as the content of a separate document.
  - FullPageRender: Specifies whether images and thumbnail images show the full page (true), or only the top part of the page (false), that you would see when viewing the page in a web browser.
  - RenditionImageFormat: The image format for images and thumbnail images.
  - RenditionImageQuality: The image quality for JPEG and PNG images and thumbnail images. Specify an integer value from 0 to 100, where lower values represent higher compression (usually resulting in a smaller file size), and higher values represent higher quality.
  - ThumbnailRenditionWidth: The maximum width for thumbnail images, in pixels.
  - ThumbnailRenditionHeight: The maximum height for thumbnail images, in pixels.
- Save and close the configuration file.
Example
The following example renders a thumbnail for each page, as a PNG image that has a maximum width of 350 pixels:
[MyTask]
Url=http://www.example.com/
...
CreateThumbnailRendition=true
RenditionImageFormat=png
ThumbnailRenditionWidth=350
FullPageRender=FALSE
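As a further illustration, the following sketch shows how a task might render all three file types and write them to a folder instead of indexing them as separate documents. It uses only the parameters described above. The folder path C:\Renditions, the quality value of 80, and the thumbnail dimensions are hypothetical values chosen for illustration. The lines that begin with // are annotations for the reader; remove them if your configuration file does not accept comments.

```ini
[MyTask]
Url=http://www.example.com/
...
// Render an image, a thumbnail, and a PDF for each ingested page.
CreateImageRendition=true
CreateThumbnailRendition=true
CreatePDFRendition=true
// Hypothetical folder: it must already exist and be writable by the
// user running the connector.
RenditionsFilePath=C:\Renditions
// Illustrative image settings (assumed values, adjust as needed).
RenditionImageFormat=png
RenditionImageQuality=80
ThumbnailRenditionWidth=350
ThumbnailRenditionHeight=250
FullPageRender=true
```

With a configuration along these lines, each ingested document would be expected to carry the RENDITION_FILE_IMAGE, RENDITION_FILE_PDF, and RENDITION_FILE_THUMBNAIL fields, each containing the path of the corresponding file in the folder.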