RenderHTML

A processor that replaces an HTML file in an IDOL document FlowFile with an image of the rendered HTML. You can choose the output image format.

TIP: Use this processor after a KeyViewExportToHtml processor to obtain an image or preview of any document (as long as the document format is supported by the KeyView HTML Export SDK).

Properties

Name Default Value Description
Idol License Service   An IdolLicenseServiceImpl that provides a way to communicate with an IDOL License Server.
Proxy Configuration Service   A ProxyConfigurationServiceImpl that specifies the proxy server to use.
Allow Navigations false Specifies whether to allow navigation away from the source HTML document.
Clip CSS select on failure SelectAll Specifies what to do when the CSS selector specified by Clip Page Using CSS: Select does not match any elements on the page. "Fail" means that processing fails and the page is not ingested. "SelectAll" means that the page is not clipped and all of the page content is ingested.
Clip CSS unselect on failure SelectNone Specifies what to do when the CSS selector specified by Clip Page Using CSS: Unselect does not match any elements on the page. "Fail" means that processing fails and the page is not ingested. "SelectNone" means that no elements are unselected, so clipping removes only the elements that were not selected by Clip Page Using CSS: Select.
ClippingMode NONE

Clipping removes uninteresting parts of a page, such as navigation bars and advertisements, to prevent irrelevant information from being added to the IDOL index. Choose one of the following options:

  • NONE - Do not clip pages.
  • CSSCLIPPING - Clip pages using CSS selectors.
  • READABILITY - Clip pages using the Mozilla readability library. You can configure the behavior of this library by setting options in a JSON file specified by "Readability options file".

    NOTE: This option is not available on FIPS-compliant platforms.

  • SMARTPRINT - Clip pages using the SmartPrint algorithm.

    DEPRECATED: The SMARTPRINT option is deprecated in NiFi Ingest 23.3 and later. It will be removed in a future major release.

Clip Page Using CSS: Select   A CSS selector to specify the parts of a page to keep when the page is clipped with ClippingMode=CSSCLIPPING. The processor also keeps all descendants of these elements.
Clip Page Using CSS: Unselect  

A CSS selector to specify the parts of a page to remove when the page is clipped with ClippingMode=CSSCLIPPING. The processor also removes all descendants of these elements.

The Clip Page Using CSS: Select property is applied before Clip Page Using CSS: Unselect, so you can use this property to remove unwanted descendants of elements identified by Clip Page Using CSS: Select.

Full Page Render false Specifies whether to render the full page, or just the initial viewport.
Page timeout 120s The maximum amount of time to spend processing a page. Specify a time duration, for example "15 seconds".
Render Format png The output image format.
Render Quality 100 The quality of the output image, for image formats that support compression. The value can range from 0 (smallest file size, greatest compression) to 100 (largest file size, best image quality).
Max Render Height 25000 The maximum height (in pixels) for rendered images. Any content that overflows the maximum size is not included in the output.
Max Render Width 2500 The maximum width (in pixels) for rendered images. Any content that overflows the maximum size is not included in the output.
WKOOP Path (Default: the path to an included version of WKOOP) The path to the embedded web browser that is used to render the HTML.

The processor supports many additional parameters that you can use to configure WKOOP (the embedded web browser that is used to render the HTML). For more information about these parameters, right-click the processor and click View Usage, or refer to the documentation for the IDOL Web Connector.
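
The order in which Clip Page Using CSS: Select and Clip Page Using CSS: Unselect are applied, together with the two "on failure" properties, can be sketched in a few lines of Python. This is only an illustrative model, not the processor's implementation. The function name clip and its parameters are hypothetical. Because the Python standard library has no CSS selector engine, the sketch assumes ElementTree path expressions as a stand-in for CSS selectors, and assumes the page is well-formed XML.

```python
import xml.etree.ElementTree as ET


def clip(page, select=None, unselect=None,
         select_on_failure="SelectAll", unselect_on_failure="SelectNone"):
    """Hypothetical model of ClippingMode=CSSCLIPPING; returns the kept text."""
    root = ET.fromstring(page)

    # Step 1: Select. Keep the matching elements and all their descendants.
    kept = root.findall(select) if select else []
    if not kept:
        if select and select_on_failure == "Fail":
            raise ValueError("Select matched no elements")
        kept = [root]  # SelectAll (or no selector): the page is not clipped

    # Step 2: Unselect. Applied after Select, so it can remove unwanted
    # descendants of the elements that Select kept.
    if unselect:
        removed = 0
        for top in kept:
            targets = {id(el) for el in top.findall(unselect)}
            for parent in list(top.iter()):
                for child in list(parent):
                    if id(child) in targets:
                        parent.remove(child)  # also drops its descendants
                        removed += 1
        if removed == 0 and unselect_on_failure == "Fail":
            raise ValueError("Unselect matched no elements")
        # SelectNone: nothing is removed beyond what Select already excluded.

    return " ".join("".join(el.itertext()) for el in kept)


# Assumed sample page: a navigation bar, and a main block with an advert.
PAGE = ("<html><body><nav>Menu</nav>"
        "<div id='main'><p>Story</p><aside>Ad</aside></div>"
        "</body></html>")
```

With this sample page, clip(PAGE, ".//div[@id='main']", ".//aside") keeps only the text "Story": Select discards the navigation bar, and Unselect then removes the advertisement from inside the selected element.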

Relationships

Name Description
rendered Receives new FlowFiles that contain the rendered images. If you connect this relationship, the input FlowFiles are transferred to the "success" relationship unmodified.
success Processing was successful. When the "rendered" relationship is connected, the input FlowFiles are sent to this relationship unmodified. If you do not connect the "rendered" relationship, the HTML file in the input FlowFile is replaced with an image of the rendered HTML.
failure Processing failed.
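
The way the output depends on whether the "rendered" relationship is connected can be modelled as follows. This is a minimal sketch under stated assumptions, not the NiFi API: the FlowFile class and the route function are hypothetical, a FlowFile is reduced to a name and its content, and the sketch assumes that the input FlowFile is the one sent to "failure".

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FlowFile:
    """Hypothetical stand-in for a FlowFile: just a name and its content."""
    name: str
    content: bytes


def route(flowfile: FlowFile, image: Optional[bytes],
          rendered_connected: bool) -> Dict[str, List[FlowFile]]:
    """Return the FlowFiles sent to each relationship (illustrative only)."""
    if image is None:
        # Assumption: a failed render sends the input FlowFile to "failure".
        return {"failure": [flowfile]}
    if rendered_connected:
        # A new FlowFile carries the image; the input passes through unmodified.
        return {"rendered": [FlowFile(flowfile.name, image)],
                "success": [flowfile]}
    # "rendered" is not connected: the HTML in the input is replaced by the image.
    return {"success": [replace(flowfile, content=image)]}
```

In this model, connecting "rendered" lets a flow keep the original HTML on "success" while handling the image separately. Leaving it unconnected yields a single FlowFile whose HTML has been replaced by the image.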