KVHTMLConfig()

This function is called directly and provides a way to configure options prior to document conversion. You can use this function to:

  • Enable PDF conversion to JPEG or PNG

    Enable the graphic-based PDF reader kppdf2rdr to convert PDF documents to JPEG files.

  • Configure PDF bookmarks

    Specify whether bookmarks in a PDF file are used to create a table of contents in the HTML output.

  • Configure rotated text

    Specify whether rotated text is displayed in its original position or at the bottom of the page. Currently, this option applies only to PDF files.

  • Designate temporary directory

    Specify a directory in which temporary files created during the conversion process are stored.

  • Configure XML conversion

    Specify the elements and attributes extracted from an XML document based on a file's document type.

  • Enable PDF logical reading order

    Convert paragraphs in PDF files in the order in which they appear on the PDF page and with left-to-right or right-to-left paragraph direction. See Convert PDF Files to a Logical Reading Order.

  • Configure PDF soft hyphens

    Specify whether soft hyphens in a PDF file are removed from the HTML output. See Control Hyphenation.

  • Enable revision marks

    Convert text and graphics that were deleted from a document with revision tracking enabled and include revision information in the HTML output. See Include Revision Information.

  • Enable empty image tags

    Prevent graphics from being converted and generate image tags with empty src attributes. This makes the conversion faster, but, because placeholders are generated for the graphics, maintains the text flow of the original document. This is similar to the bNoPictures parameter; however, bNoPictures does not generate an image tag. See bNoPictures.

  • Toggle hidden data output from Microsoft Word, Excel, and PowerPoint documents

    Show or hide information from hidden sources such as comments or slides. See Show Hidden Data.

  • Enable a PDF invisible text toggle button

    Enable a JavaScript button that toggles the display of invisible text and regular content in exported PDF documents. See Toggle Invisible Text.

  • Specify opacity of invisible text in PDFs

    Specify the opacity of invisible text in exported PDF documents, from 0 (invisible) to 100 (fully visible). See Specify Opacity of Invisible Text.

  • Protected file password

    Specify the password to use to open a password-protected file for export.

  • Specify output character set for summary information

    Specify the output character set for the document's metadata, when using fpGetSummaryInfo().

  • Enable tabbed spreadsheet view

    Enable a tabbed navigation view for spreadsheets.

  • Enable previews for large spreadsheets

    Limit the number of rows, columns, and sheets that are exported to HTML.

  • Enable or disable Optical Character Recognition (OCR)

    Specify whether KeyView performs Optical Character Recognition (OCR) on raster image files. See KVCFG_OCR in the table below.

Syntax

KVErrorCode pascal KVHTMLConfig( 
    void    *pContext,
    int      nType,
    int      nValue,
    void    *p );

Arguments

pContext

A pointer to a KeyView Export session that you initialized by calling fpInit().

nType

The configuration flag. This is a symbolic constant defined in kvtypes.h. The available options are described in Configuration Flags for KVHTMLConfig().

nValue

The integer value defined for the flags above. This is TRUE or FALSE for all flags except:

  • For KVCFG_LOGICALPDF, nValue is one of the paragraph direction options defined in the LPDF_DIRECTION enumerated type in kvtypes.h. See LPDF_DIRECTION.
  • For KVCFG_SETTEMPDIRECTORY, nValue is not set.
  • For KVCFG_SETXMLCONFIGINFO, nValue is not set.
  • For KVCFG_SETPDFINVISTEXTTOGGLE, nValue is not set.
  • For KVCFG_SETPDFINVISTEXTOPACITY, nValue is an integer that specifies text opacity, from 0 (invisible) to 100 (fully visible).
  • For KVCFG_SETMETADATACHARSET, nValue is a character set enumerated in KVCharSet in kvcharset.h. See Convert Character Sets.

p

The data for the configuration flag. This is NULL for all flags except:

  • KVCFG_SETTEMPDIRECTORY—This is a pointer to a path to the directory where temporary files are stored.

  • KVCFG_SETXMLCONFIGINFO—This is a pointer to the KVXConfigInfo structure. See KVXConfigInfo.

  • KVCFG_INCLREVISIONMARK—This is a pointer to the KVRevisionMark structure. See KVRevisionMark.

  • KVCFG_SETPDFINVISTEXTTOGGLE—This is a null-terminated string that determines the toggle button name.

  • KVCFG_SETPASSWORD—This is the source file password.

Returns

The return value is one of the error codes defined in KVErrorCode in kverrorcodes.h.
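
The argument rules and the return value can be combined in a thin wrapper that rejects a missing or unexpected p before the call and passes the error code back to the caller. The following is a minimal sketch under stated assumptions, not part of the KeyView API: DemoErrorCode, the DEMO_ constants, DemoConfigFn, and every demo_ function are hypothetical stand-ins so that the block compiles without the KeyView headers. Real code would use KVErrorCode from kverrorcodes.h and the KVCFG_ constants from kvtypes.h, and the actual success value is whatever kverrorcodes.h defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; real code uses kverrorcodes.h and kvtypes.h. */
typedef int DemoErrorCode;
enum { DEMO_SUCCESS = 0, DEMO_BAD_ARGUMENT = 1 };
enum {
    DEMO_FLAG_DELSOFTHYPHEN    = 1,  /* p is NULL for this flag   */
    DEMO_FLAG_SETTEMPDIRECTORY = 2,  /* p points to a path        */
    DEMO_FLAG_SETPASSWORD      = 3   /* p points to a password    */
};

/* Same parameter list as KVHTMLConfig(): context, flag, value, data. */
typedef DemoErrorCode (*DemoConfigFn)(void *pContext, int nType,
                                      int nValue, void *p);

/* Returns 1 if the flag expects data in p, 0 if p should be NULL. */
int demo_flag_needs_pointer(int nType)
{
    return nType == DEMO_FLAG_SETTEMPDIRECTORY ||
           nType == DEMO_FLAG_SETPASSWORD;
}

/* Checks p against the flag, then calls through the function pointer
   and returns the callee's error code unchanged. */
DemoErrorCode demo_checked_config(DemoConfigFn fn, void *pContext,
                                  int nType, int nValue, void *p)
{
    assert(fn != NULL);
    if (demo_flag_needs_pointer(nType) != (p != NULL)) {
        return DEMO_BAD_ARGUMENT;
    }
    return fn(pContext, nType, nValue, p);
}

/* Stub that stands in for the real function; records the last flag. */
int demo_last_flag = 0;

DemoErrorCode demo_stub_config(void *pContext, int nType,
                               int nValue, void *p)
{
    (void)pContext; (void)nValue; (void)p;
    demo_last_flag = nType;
    return DEMO_SUCCESS;
}
```

In an application, the stub would be replaced by the function pointer obtained for KVHTMLConfig(), and the lookup would cover every flag that takes data in p.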

Discussion

  • You must call this function after the call to fpInit() and before the call to fpConvertStream() or KVHTMLConvertFile().

  • This function runs in-process or out of process. See Convert Files Out of Process.

  • When converting out of process, you must call this function after the call to KVHTMLStartOOPSession() and before the call to KVHTMLEndOOPSession(). The exception is KVCFG_SETTEMPDIRECTORY: to set this flag, call this function before the call to KVHTMLStartOOPSession().

  • The configuration flags are described in the following table.
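
The call-order rules in this list can be expressed as a small state check. This is an illustrative sketch only: DemoPhase, DemoSession, and demo_can_configure() are hypothetical names that do not exist in KeyView, and the application is assumed to update the session record itself as it calls fpInit(), KVHTMLStartOOPSession(), and the conversion functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phases that an application tracks for one session. */
typedef enum {
    DEMO_PHASE_NONE,         /* fpInit() not yet called               */
    DEMO_PHASE_INITIALIZED,  /* fpInit() done, conversion not started */
    DEMO_PHASE_CONVERTING    /* conversion started; too late to configure */
} DemoPhase;

typedef struct {
    DemoPhase phase;
    int       oop_session_open;  /* nonzero after the OOP session starts */
} DemoSession;

/* Returns 1 if a configuration call is allowed at this point, else 0.
   out_of_process:   nonzero when converting out of process.
   is_temp_dir_flag: nonzero when the flag is the temporary-directory one. */
int demo_can_configure(const DemoSession *s, int out_of_process,
                       int is_temp_dir_flag)
{
    assert(s != NULL);
    if (s->phase != DEMO_PHASE_INITIALIZED) {
        return 0;                       /* before init or after convert */
    }
    if (!out_of_process) {
        return 1;                       /* in process: any flag is fine */
    }
    if (is_temp_dir_flag) {
        return s->oop_session_open == 0; /* must precede the OOP start  */
    }
    return s->oop_session_open != 0;     /* other flags need the session */
}
```

A caller could run this check in debug builds immediately before each configuration call to catch ordering mistakes early.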

Configuration Flags for KVHTMLConfig()

Flag Description

KVCFG_SETHIFIPDF

Enables the graphic-based PDF reader kppdf2rdr to convert PDF documents. However, to convert the pages of a PDF file into raster images, OpenText recommends using the pdf2sr reader instead, as described in Use the pdf2sr Reader.

To see which platforms kppdf2rdr and pdf2sr are available on, see the platform differences section.

KVCFG_SETMETADATACHARSET

This flag enables you to specify the output character set for metadata when using fpGetSummaryInfo(). nValue is a character set enumerated in KVCharSet in kvcharset.h. See Convert Character Sets. Call KVHTMLConfig() with this flag before you call fpGetSummaryInfo().

KVCFG_SUPPRESSTOCPRINTIMAGE

If you set KVCFG_SUPPRESSTOCPRINTIMAGE, bookmarks in a PDF file are not used to generate a table of contents in the HTML output. By default, the table of contents is generated from bookmarks within the PDF file. See Generate a Table of Contents from PDF Bookmarks.

KVCFG_SETTEXTROTATE

If you set KVCFG_SETTEXTROTATE, rotated text in a file is displayed at 0 degrees at the bottom of the page on which it appears. The page is enlarged to accommodate the text.

By default, rotated text in a file is displayed in its original position, at the original font size, and at 0 degrees rotation. Because the text is the original size, but might be displayed in a smaller space, the text might overlap adjacent text in the HTML output. You use the KVCFG_SETTEXTROTATE option to avoid this problem. See Convert Rotated Text.

HTML markup does not support text rotation.

KVCFG_SETTEMPDIRECTORY

This flag enables you to specify the directory in which temporary files created during conversion processes are stored. By default, the system temporary directory is used.

To define a directory for temporary files generated during an out-of-process conversion, set the tempfilepath parameter in the formats_e.ini file. See Convert Files Out of Process.

On Windows, p must be in the local Windows code page.

To set KVCFG_SETTEMPDIRECTORY when converting out-of-process, call this function before you call KVHTMLStartOOPSession().

KVCFG_SETXMLCONFIGINFO

This flag enables you to define which elements and attributes are extracted from XML documents with a specified format ID or root element. You can use this to override the default settings for the supported XML formats (see Convert XML Files), or to define settings for custom XML document types.

The settings are defined in the KVXConfigInfo structure (see KVXConfigInfo). To set custom settings for more than one document type, call the KVHTMLConfig() function once for each type.

You can also modify element extraction settings by using the kvxconfig.ini file. See Configure Element Extraction for XML Documents.

KVCFG_LOGICALPDF

This flag converts paragraphs in a PDF file in the order in which they appear on the page (logical reading order). The nValue argument specifies the paragraph direction. See Convert PDF Files to a Logical Reading Order.

KVCFG_DELSOFTHYPHEN

If you set this flag, soft hyphens in the source document are removed, and the hyphenated words are joined in the HTML output. By default, soft hyphens are maintained. See Control Hyphenation.

OpenText recommends that you remove soft hyphens if you use Export to generate text output for an indexing engine or are not concerned with maintaining the document's layout. See fpConvertStream() or KVHTMLConvertFile() for more information on running Export in index mode.

KVCFG_INCLREVISIONMARK

If you set this flag to TRUE, text and graphics that were deleted from a document with revision tracking enabled are converted, and revision information (revision title, reviewer name, and revision date and time) is included in the HTML output.

To reset the flag and exclude deleted content and revision information from the HTML output, set the flag to FALSE. See Include Revision Information.

The default is FALSE.

KVCFG_BLANKPICTURE

If you set this flag to TRUE, graphics in a document are not converted, but an image tag is generated with an empty src attribute, creating an empty placeholder for the graphic. For example:

<img src="" height="136" width="101">

This allows you to generate output without graphics, but still maintain the text flow of the original document.

This option applies to word processing formats only. The default is FALSE.

KVCFG_WP_NOCOMMENTS

Set this flag to TRUE to prevent text from comments in Microsoft Word documents from being exported. Comment text is exported by default from Microsoft Word 97 to 2003 files.

You can also toggle the display of comment output by modifying the formats_e.ini file. See Show Hidden Data.

KVCFG_WP_SHOWHIDDENTEXT

Set this flag to TRUE to export hidden text from Microsoft Word documents.

KVCFG_WP_SHOWDATEFIELDCODE

Set this flag to TRUE to export date field codes from Microsoft Word documents.

KVCFG_WP_SHOWFILENAMEFIELDCODE

Set this flag to TRUE to export the file name field code from Microsoft Word documents.

KVCFG_SS_SHOWHIDDENINFOR

Set this flag to TRUE to export hidden information from Microsoft Excel files.

KVCFG_SS_SHOWCOMMENTS

Set this flag to TRUE to export comments from Microsoft Excel files.

KVCFG_SS_SHOWFORMULA

Set this flag to TRUE to export formulas from Microsoft Excel files.

KVCFG_PG_HIDEHIDDENSLIDE

Set this flag to TRUE to prevent hidden slides in Microsoft PowerPoint files from being exported.

KVCFG_PG_HIDECOMMENT

Set this flag to TRUE to prevent comments in Microsoft PowerPoint files from being exported. Comments are exported by default from PowerPoint 97 to 2000 files.

KVCFG_PG_SHOWCOMMENTSSLIDE

Set this flag to TRUE to export comments slides from Microsoft PowerPoint 2003 and 2007 files.

KVCFG_PG_SHOWSLIDENOTES

Set this flag to TRUE to export slide notes from Microsoft PowerPoint files.

You can also toggle slide note output by modifying the formats_e.ini file. See Show Hidden Data.

KVCFG_SETPDFINVISTEXTTOGGLE

This flag enables a JavaScript button in exported PDF documents, which you can use to show and hide invisible text.

Invisible text is hidden by default. See Toggle Invisible Text.

KVCFG_SETPDFINVISTEXTOPACITY

This flag allows you to specify the opacity of invisible text in exported PDFs, from 0 (invisible) to 100 (opaque). Use this option if you want to view both the invisible text and the rasterized image in the document.

Invisible text opacity is set to 0 by default. See Specify Opacity of Invisible Text.

KVCFG_SETPASSWORD

This flag enables you to define a password used to open a password-protected file for export. See Export Password Protected Files. For a list of supported file types, see Supported Password Protected File Types.

nValue is TRUE.

p is the source file password, a null-terminated string of at most 255 characters.

KVCFG_TABNAVIGATION

If you set this flag to TRUE, a tabbed navigation view is enabled for spreadsheets. A row of tabs is displayed at the bottom of the browser window, which enables the user to switch between the sheets in a workbook.

NOTE: JavaScript must be enabled.

KVCFG_SS_PREVIEW

Specifies whether to export a preview for large spreadsheets rather than exporting all content. Web browsers might take a long time to render spreadsheets with large numbers of cells, or fail to render them at all. If you set this flag to TRUE, KeyView limits the number of rows, columns, and sheets that are exported to HTML.

KVCFG_OCR

Specifies whether to perform Optical Character Recognition (OCR) on raster image files, to extract machine-printed text from the image. The output from HTML Export includes the original image, exported to the format you specify in KVHTMLOptionsEx, and a layer containing the extracted text. This means that when the output is viewed in a web browser, you can search for words and copy the text.

OCR is available only on certain platforms (see Optical Character Recognition in the platform differences section). OCR processes only standalone raster images and not subfiles, such as images embedded in a Word document.

If your license includes OCR, it is enabled by default. To disable OCR, set this flag to FALSE.
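
Two of the flags in the table constrain their arguments: KVCFG_SETPDFINVISTEXTOPACITY takes a value from 0 to 100, and KVCFG_SETPASSWORD takes a password of at most 255 characters. An application might check these limits before it calls KVHTMLConfig(). The helpers below are a hedged sketch with hypothetical names (nothing prefixed demo_ or DEMO_ exists in KeyView), and they assume that the password is a single-byte, null-terminated string so that one character equals one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_MAX_PASSWORD_CHARS = 255 };

/* Clamps a requested opacity into the documented 0..100 range. */
int demo_clamp_opacity(int requested)
{
    int result = requested;
    if (result < 0)   result = 0;
    if (result > 100) result = 100;
    return result;
}

/* Returns 1 if password is non-NULL and its terminator appears within
   the first 256 bytes, that is, the string has at most 255 characters. */
int demo_password_fits(const char *password)
{
    size_t i;
    if (password == NULL) {
        return 0;
    }
    for (i = 0; i <= DEMO_MAX_PASSWORD_CHARS; ++i) {
        if (password[i] == '\0') {
            return 1;
        }
    }
    return 0;
}

/* Copies src, including its terminator, into a 256-byte buffer.
   Returns 1 on success, or 0 (leaving dest untouched) if src is
   NULL or too long. */
int demo_copy_password(char dest[DEMO_MAX_PASSWORD_CHARS + 1],
                       const char *src)
{
    assert(dest != NULL);
    if (!demo_password_fits(src)) {
        return 0;
    }
    memcpy(dest, src, strlen(src) + 1);
    return 1;
}
```

The clamped opacity would then be passed as nValue, and the copied password as p, in the corresponding KVHTMLConfig() calls.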

Examples

  • To specify that the graphic-based PDF reader is used to convert PDF files:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETHIFIPDF, TRUE, NULL);
  • To specify that bookmarks in a PDF file are not used to generate a table of contents in the HTML output:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SUPPRESSTOCPRINTIMAGE, TRUE, NULL);
  • To specify that rotated text in a file is displayed at 0 degrees at the bottom of the page on which it appears:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETTEXTROTATE, TRUE, NULL);
  • To set a directory for temporary files:

    char    tmpDir[] = "c:\\temp\\htmlexport";    /* null-terminated path to the temporary directory */
    (*fpHTMLConfig)(pKVHTML, KVCFG_SETTEMPDIRECTORY, 0, tmpDir);
  • To specify custom extraction settings for conversion of an XML file:

    KVXConfigInfo    xinfo;
    /* Populate the members of xinfo before the call. See KVXConfigInfo. */
    (*fpHTMLConfig)(pKVHTML, KVCFG_SETXMLCONFIGINFO, 0, &xinfo);
    
  • To specify that PDF files are converted to a logical reading order, and the paragraph direction for the PDF output is left to right:

    (*fpHTMLConfig)(pKVHTML, KVCFG_LOGICALPDF, LPDF_LTR, NULL);
  • To specify that PDF files are converted to a logical reading order, and the paragraph direction for the PDF output is right to left:

    (*fpHTMLConfig)(pKVHTML, KVCFG_LOGICALPDF, LPDF_RTL, NULL);
  • To specify that PDF files are converted to a logical reading order, and the paragraph direction for the PDF output is determined automatically for each page:

    (*fpHTMLConfig)(pKVHTML, KVCFG_LOGICALPDF, LPDF_AUTO, NULL);
  • To specify that soft hyphens are removed from the HTML output:

    (*fpHTMLConfig)(pKVHTML, KVCFG_DELSOFTHYPHEN, TRUE, NULL);
  • To convert text and graphics that are identified by revision marks:

    KVRevisionMark    RMark;
    (*fpHTMLConfig)(pKVHTML, KVCFG_INCLREVISIONMARK, TRUE, &RMark);
    
  • To generate a placeholder for all pictures:

    (*fpHTMLConfig)(pKVHTML, KVCFG_BLANKPICTURE, TRUE, NULL);
  • To toggle hidden data output from Microsoft Word documents, use one of the KVCFG_WP flags:

    (*fpHTMLConfig)(pKVHTML, KVCFG_WP_NOCOMMENTS, TRUE, NULL);
  • To toggle hidden data output from Microsoft Excel documents, use one of the KVCFG_SS flags:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SS_SHOWHIDDENINFOR, TRUE, NULL);
  • To toggle hidden data output from Microsoft PowerPoint documents, use one of the KVCFG_PG flags:

    (*fpHTMLConfig)(pKVHTML, KVCFG_PG_HIDEHIDDENSLIDE, TRUE, NULL);
  • To enable an invisible text toggle button in exported PDF documents:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETPDFINVISTEXTTOGGLE, 0, szButtonName);

    where szButtonName is a null-terminated string that determines the button name.

  • To specify the opacity of invisible text in exported PDF documents:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETPDFINVISTEXTOPACITY, iInvisOpacity, NULL);

    where iInvisOpacity is an integer from 0 (invisible) to 100 (fully visible).

  • To specify a password to open a password-protected file for export:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETPASSWORD, TRUE, password);

    where password is a null-terminated string of 255 or fewer characters.

  • To produce summary information in UTF-8:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SETMETADATACHARSET, KVCS_UTF8, NULL);
  • To export only a preview of spreadsheets to HTML:

    (*fpHTMLConfig)(pKVHTML, KVCFG_SS_PREVIEW, TRUE, NULL);
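
The examples above ignore the return value for brevity. When several options are set in a row, one possible pattern is to keep them in a table and stop at the first call that fails. The sketch below assumes hypothetical stand-ins throughout: SeqErrorCode, SEQ_SUCCESS, SeqConfigFn, SeqOption, and the seq_ functions are not KeyView names, and the stub only imitates a configuration function so that the block can be compiled on its own.

```c
#include <assert.h>
#include <stddef.h>

typedef int SeqErrorCode;        /* stand-in for KVErrorCode          */
enum { SEQ_SUCCESS = 0 };        /* placeholder for the success value */

/* Same parameter list as KVHTMLConfig(). */
typedef SeqErrorCode (*SeqConfigFn)(void *pContext, int nType,
                                    int nValue, void *p);

/* One row per configuration call: flag, integer value, data pointer. */
typedef struct {
    int   nType;
    int   nValue;
    void *p;
} SeqOption;

/* Applies the options in order and stops at the first failure.
   Returns how many options were applied successfully; *err holds the
   failing error code, or SEQ_SUCCESS if every call succeeded. */
size_t seq_apply_options(SeqConfigFn fn, void *pContext,
                         const SeqOption *opts, size_t count,
                         SeqErrorCode *err)
{
    size_t i;
    assert(fn != NULL && err != NULL);
    *err = SEQ_SUCCESS;
    for (i = 0; i < count; ++i) {
        *err = fn(pContext, opts[i].nType, opts[i].nValue, opts[i].p);
        if (*err != SEQ_SUCCESS) {
            break;
        }
    }
    return i;
}

/* Stub for illustration: reports error 7 for a negative flag value. */
SeqErrorCode seq_stub_config(void *pContext, int nType,
                             int nValue, void *p)
{
    (void)pContext; (void)nValue; (void)p;
    return nType < 0 ? 7 : SEQ_SUCCESS;
}
```

With the real function pointer in place of the stub, each row would hold a KVCFG_ constant with the nValue and p that the flag requires.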