KVXMLConfig()

This function is called directly and configures options before the document is converted. The configurations that it currently supports are described in Configuration Flags.

Syntax

KVErrorCode pascal KVXMLConfig( 
    void    *pContext,
    int      nType,
    int      nValue,
    void    *p );

Arguments

pContext

A pointer returned from fpInit() or fpInitWithLicenseData().

nType

The configuration flag. This is a symbolic constant defined in kvtypes.h. The available options are described in Configuration Flags.

nValue

The integer value for the flag specified in nType.

This is TRUE or FALSE for all flags except KVCFG_LOGICALPDF, KVCFG_SETMETADATACHARSET, KVCFG_SETTEMPDIRECTORY, and KVCFG_SETXMLCONFIGINFO.

For KVCFG_LOGICALPDF, this is one of the paragraph direction options defined in the LPDF_DIRECTION enumerated type in kvtypes.h. See LPDF_DIRECTION.

For KVCFG_SETTEMPDIRECTORY and KVCFG_SETXMLCONFIGINFO, nValue is not used.

For KVCFG_SETMETADATACHARSET, nValue is a character set enumerated in KVCharSet in kvcharset.h. See Convert Character Sets.

p

    The data for the configuration flag.

    This is NULL for all flags except KVCFG_SETTEMPDIRECTORY, KVCFG_SETXMLCONFIGINFO, and KVCFG_SETPASSWORD.

    For KVCFG_SETTEMPDIRECTORY, this is the path to the directory where temporary files are stored.

    For KVCFG_SETXMLCONFIGINFO, this is a pointer to the KVXConfigInfo structure. See KVXConfigInfo.

    For KVCFG_SETPASSWORD, this is the source file password.
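The argument rules above produce three calling shapes: a boolean flag (nValue is TRUE or FALSE, p is NULL), a data flag (nValue unused, p carries the data), and the password flag (nValue is TRUE, p carries the password). The following is a minimal sketch of those shapes, not SDK sample code. So that it compiles without the SDK, it declares stand-ins for KVErrorCode, KVERR_Success, the flag constants, and KVXMLConfig() itself; all numeric values are placeholders, the helper names are hypothetical, and passing 0 for an unused nValue is an assumption.

```c
#include <stddef.h>

/* ---- Stand-ins so this sketch compiles without the KeyView SDK. ----
 * In a real build, remove this section and include kvtypes.h and
 * kverrorcodes.h instead. All numeric values below are placeholders. */
typedef int KVErrorCode;
#define KVERR_Success 0
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif
enum {
    KVCFG_DELSOFTHYPHEN    = 1,
    KVCFG_SETTEMPDIRECTORY = 2,
    KVCFG_SETPASSWORD      = 3
};

/* Records the most recent call so the argument shapes can be inspected. */
struct { int nType; int nValue; void *p; } g_last;

/* Stub with the documented parameter list (the SDK's pascal macro is omitted). */
KVErrorCode KVXMLConfig(void *pContext, int nType, int nValue, void *p)
{
    (void)pContext;
    g_last.nType  = nType;
    g_last.nValue = nValue;
    g_last.p      = p;
    return KVERR_Success;
}
/* ---- End of stand-ins. ---- */

/* Boolean flag: nValue carries TRUE or FALSE, p is NULL. */
KVErrorCode remove_soft_hyphens(void *pContext)
{
    return KVXMLConfig(pContext, KVCFG_DELSOFTHYPHEN, TRUE, NULL);
}

/* Data flag: nValue is unused (0 is an assumption), p carries the path. */
KVErrorCode set_temp_directory(void *pContext, char *path)
{
    return KVXMLConfig(pContext, KVCFG_SETTEMPDIRECTORY, 0, path);
}

/* Password flag: nValue is TRUE and p carries the password. */
KVErrorCode set_password(void *pContext, char *password)
{
    return KVXMLConfig(pContext, KVCFG_SETPASSWORD, TRUE, password);
}
```

In an application, pContext would be the pointer returned from fpInit() or fpInitWithLicenseData(), and each helper would be called before the conversion starts.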

    Configuration Flags

    The following flags are available for the nType argument in KVXMLConfig(). These flags are defined in kvtypes.h.

    Flag

    Description

    KVCFG_SUPPRESSIMAGES

    If you set KVCFG_SUPPRESSIMAGES, the XML output includes verbose markup, but no images. If you do not set this option, embedded images in a document are regenerated as separate files and stored in the output directory. To generate output with minimal markup (ID and style paragraph attributes) and without images, set the bIndexOnly member of the KVXMLOptions structure to TRUE. See KVXMLOptions.

    KVCFG_ENABLEPOSITIONINFO

    If you set KVCFG_ENABLEPOSITIONINFO, a position element is included in the markup for PDF documents. The position element defines the absolute position of the text relative to the bottom left corner of the page, and includes additional information such as font and color.

    KVCFG_SETMETADATACHARSET

    This option enables you to specify the output character set for metadata when using fpGetSummaryInfo(). nValue is a character set enumerated in KVCharSet in kvcharset.h. See Convert Character Sets. This function should be called before fpGetSummaryInfo().

    KVCFG_SUPPRESSTOCPRINTIMAGE

    If you set KVCFG_SUPPRESSTOCPRINTIMAGE, bookmarks in a PDF file are not converted to simple XLinks in the XML output. By default, PDF bookmarks are converted to source and destination anchors. For example,

    <a xmlns:xlink="http://www.w3.org/TR/xlink" xlink:href="#bmk1">Highlight File Format</a>
    <a xmlns:xlink="http://www.w3.org/TR/xlink" name="bmk1"><img src="pdf14640.jpg"/>
    

    KVCFG_DISABLEZONE

    If you set KVCFG_DISABLEZONE, the conversion of Microsoft Word bookmarks to zone elements (<zone name="xxx">) in the output XML is disabled.

    A bookmark in Microsoft Word documents is a name given to a selected area of the document. The bookmark might enclose words, paragraphs, tables, table cells, lists, list items, or the entire document. In XML Export, bookmarks are converted to zone elements (<Zone name="xxx">) by using the KeyView KVT_ZONE token.

    Depending on how bookmarks are defined in the original document, the creation of zone elements might result in malformed XML. In this case, you can disable zone creation to avoid these validity errors. Zone element creation is enabled by default.

    KVCFG_SETTEMPDIRECTORY

    The KVCFG_SETTEMPDIRECTORY flag enables you to specify the directory in which temporary files created during conversion processes are stored. By default, the system temporary directory is used.

    To define a directory for temporary files generated during an out-of-process conversion, set the tempfilepath parameter in the formats_e.ini file. See Convert Files Out of Process.

    On Windows, p must be in the local Windows code page.

    KVCFG_SETXMLCONFIGINFO

    The KVCFG_SETXMLCONFIGINFO flag enables you to define which elements and attributes are extracted from XML documents with a specified format ID or root element. You can use this to override the default settings for the supported XML formats (see Convert XML Files), or to define settings for custom XML document types.

    The settings are defined in the KVXConfigInfo structure (see KVXConfigInfo). To set custom settings for more than one document type, call the KVXMLConfig() function once for each type.

    You can also modify element extraction settings by using the kvxconfig.ini file. See Configure Element Extraction for XML Documents.

    KVCFG_LOGICALPDF

    The KVCFG_LOGICALPDF flag converts paragraphs in a PDF file in the order in which they appear on the page (logical reading order). The nValue argument specifies the paragraph direction. See Convert PDF Files to a Logical Reading Order.
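For this flag, nValue carries an enumerated direction instead of TRUE or FALSE. The sketch below illustrates that by mapping a hypothetical application setting (a string such as "rtl") to a direction and passing it to KVXMLConfig(). It is an illustration under stated assumptions: the LPDF_DIRECTION member names shown here are assumed and should be confirmed in kvtypes.h, the numeric values are placeholders, and KVXMLConfig() is replaced by a stub so that the example compiles without the SDK.

```c
#include <stddef.h>
#include <string.h>

/* ---- Stand-ins so this sketch compiles without the KeyView SDK. ----
 * In a real build, remove this section and include kvtypes.h and
 * kverrorcodes.h instead. All numeric values below are placeholders. */
typedef int KVErrorCode;
#define KVERR_Success 0
enum { KVCFG_LOGICALPDF = 1 };
/* Assumed member names for LPDF_DIRECTION; confirm them in kvtypes.h. */
typedef enum { LPDF_RAW, LPDF_LTR, LPDF_RTL, LPDF_AUTO } LPDF_DIRECTION;

int g_last_nValue = -1;   /* last nValue the stub received */

KVErrorCode KVXMLConfig(void *pContext, int nType, int nValue, void *p)
{
    (void)pContext; (void)nType; (void)p;
    g_last_nValue = nValue;
    return KVERR_Success;
}
/* ---- End of stand-ins. ---- */

/* Maps a hypothetical application setting to a direction.
 * Returns 1 on a match, 0 if the text is not recognized. */
int parse_direction(const char *text, LPDF_DIRECTION *out)
{
    static const struct { const char *text; LPDF_DIRECTION dir; } table[] = {
        { "raw", LPDF_RAW }, { "ltr", LPDF_LTR },
        { "rtl", LPDF_RTL }, { "auto", LPDF_AUTO }
    };
    size_t i;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(text, table[i].text) == 0) {
            *out = table[i].dir;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if the flag was set, 0 if the text was invalid or the call failed. */
int enable_logical_pdf(void *pContext, const char *directionText)
{
    LPDF_DIRECTION dir;
    if (directionText == NULL || !parse_direction(directionText, &dir))
        return 0;
    /* The direction travels in nValue; p is NULL for this flag. */
    return KVXMLConfig(pContext, KVCFG_LOGICALPDF, (int)dir, NULL) == KVERR_Success;
}
```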

    KVCFG_DELSOFTHYPHEN

    If you set KVCFG_DELSOFTHYPHEN, soft hyphens in the source document are removed, and the hyphenated words are joined in the XML output. By default, soft hyphens are maintained. See Control Hyphenation.

    Micro Focus recommends that you remove soft hyphens if you use Export to generate text output for an indexing engine or are not concerned with maintaining the document's layout. See fpConvertStream() or KVXMLConvertFile() for more information on running Export in index mode.

    KVCFG_INCLREVISIONMARK

    If you set this flag to TRUE, text and graphics that were deleted from a document with a revision tracking feature enabled are converted, and revision tracking information is included in the XML output.

    To reset the flag and exclude deleted content and revision tracking information from the XML output, set the flag to FALSE. See Convert Revision Tracking Information. The default is FALSE.

    KVCFG_WP_NOCOMMENTS

    Set KVCFG_WP_NOCOMMENTS to TRUE to prevent text in comments from being exported from Microsoft Word documents. Comment text is exported by default from Microsoft Word 97 to 2003 files.

    You can also toggle comment output by modifying the formats_e.ini file. See Show Hidden Data.

    KVCFG_WP_SHOWHIDDENTEXT

    Set KVCFG_WP_SHOWHIDDENTEXT to TRUE to export hidden text from Microsoft Word documents.

    KVCFG_WP_SHOWDATEFIELDCODE

    Set KVCFG_WP_SHOWDATEFIELDCODE to TRUE to export date field codes from Microsoft Word documents.

    KVCFG_WP_SHOWFILENAMEFIELDCODE

    Set KVCFG_WP_SHOWFILENAMEFIELDCODE to TRUE to export the file name field code from Microsoft Word documents.

    KVCFG_SS_SHOWHIDDENINFOR

    Set KVCFG_SS_SHOWHIDDENINFOR to TRUE to export hidden information from Microsoft Excel files.

    KVCFG_SS_SHOWCOMMENTS

    Set KVCFG_SS_SHOWCOMMENTS to TRUE to export comments from Microsoft Excel files.

    KVCFG_SS_SHOWFORMULA

    Set KVCFG_SS_SHOWFORMULA to TRUE to export formulas from Microsoft Excel files.

    KVCFG_PG_HIDEHIDDENSLIDE

    Set KVCFG_PG_HIDEHIDDENSLIDE to TRUE to prevent hidden slides from being exported from Microsoft PowerPoint files.

    KVCFG_PG_HIDECOMMENT

    Set KVCFG_PG_HIDECOMMENT to TRUE to prevent comments from being exported from Microsoft PowerPoint files. Comments are exported by default from PowerPoint 97 to 2000 files.

    KVCFG_PG_SHOWCOMMENTSSLIDE

    Set KVCFG_PG_SHOWCOMMENTSSLIDE to TRUE to export comments slides from Microsoft PowerPoint 2003 and 2007 files.

    KVCFG_PG_SHOWSLIDNOTES

    Set KVCFG_PG_SHOWSLIDNOTES to TRUE to export slide notes from Microsoft PowerPoint files.

    You can also toggle slide note output by modifying the formats_e.ini file. See Show Hidden Data.
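The Word, Excel, and PowerPoint flags above all take TRUE or FALSE in nValue and NULL in p, so an application that sets several of them can drive the calls from a table. The following is one possible sketch of that pattern. The FlagSetting type, the kHiddenDataFlags table, and apply_flags() are hypothetical names; the flag values are placeholders and KVXMLConfig() is a stub so that the example compiles without the SDK. KVERR_Success is assumed to be the success code in kverrorcodes.h.

```c
#include <stddef.h>
#include <stdio.h>

/* ---- Stand-ins so this sketch compiles without the KeyView SDK. ----
 * In a real build, remove this section and include kvtypes.h and
 * kverrorcodes.h instead. All numeric values below are placeholders. */
typedef int KVErrorCode;
#define KVERR_Success 0
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif
enum {
    KVCFG_WP_SHOWHIDDENTEXT = 1,
    KVCFG_SS_SHOWFORMULA,
    KVCFG_PG_SHOWSLIDNOTES,
    KVCFG_PG_HIDEHIDDENSLIDE
};

int g_calls = 0;   /* number of times the stub was called */

/* Stub: counts calls and rejects flag values it does not know. */
KVErrorCode KVXMLConfig(void *pContext, int nType, int nValue, void *p)
{
    (void)pContext; (void)nValue; (void)p;
    g_calls++;
    if (nType >= KVCFG_WP_SHOWHIDDENTEXT && nType <= KVCFG_PG_HIDEHIDDENSLIDE)
        return KVERR_Success;
    return 1;   /* placeholder error code */
}
/* ---- End of stand-ins. ---- */

/* One row per boolean flag: the flag, its TRUE/FALSE value, and a label for messages. */
typedef struct { int nType; int nValue; const char *name; } FlagSetting;

const FlagSetting kHiddenDataFlags[] = {
    { KVCFG_WP_SHOWHIDDENTEXT,  TRUE, "KVCFG_WP_SHOWHIDDENTEXT"  },
    { KVCFG_SS_SHOWFORMULA,     TRUE, "KVCFG_SS_SHOWFORMULA"     },
    { KVCFG_PG_SHOWSLIDNOTES,   TRUE, "KVCFG_PG_SHOWSLIDNOTES"   },
    { KVCFG_PG_HIDEHIDDENSLIDE, TRUE, "KVCFG_PG_HIDEHIDDENSLIDE" }
};

/* Applies every row and returns the number of calls that did not succeed. */
int apply_flags(void *pContext, const FlagSetting *settings, size_t count)
{
    int failures = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        KVErrorCode rc = KVXMLConfig(pContext, settings[i].nType,
                                     settings[i].nValue, NULL);
        if (rc != KVERR_Success) {
            fprintf(stderr, "%s failed with error %d\n", settings[i].name, (int)rc);
            failures++;
        }
    }
    return failures;
}
```

Keeping the settings in a table makes it straightforward to report which flag failed, because each return value is checked individually.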

    KVCFG_SETPASSWORD

    This flag enables you to define a password used to open a password-protected file for export. See Export Password Protected Files.

    nValue is TRUE.

    p is the source file password, which can have a maximum length of 255 characters (the final byte is null).
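Because the password is limited to 255 characters plus the terminating null, an application might check the length before it calls KVXMLConfig(). The following is a minimal sketch of such a check. It treats the limit as 255 bytes, which is an assumption for multibyte encodings. MAX_PASSWORD_CHARS and set_source_password() are hypothetical names, the flag value is a placeholder, and KVXMLConfig() is a stub so that the example compiles without the SDK.

```c
#include <stddef.h>
#include <string.h>

/* ---- Stand-ins so this sketch compiles without the KeyView SDK. ----
 * In a real build, remove this section and include kvtypes.h and
 * kverrorcodes.h instead. All numeric values below are placeholders. */
typedef int KVErrorCode;
#define KVERR_Success 0
#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif
enum { KVCFG_SETPASSWORD = 1 };

int    g_calls = 0;         /* number of times the stub was called */
size_t g_seen_length = 0;   /* length of the last password received */

KVErrorCode KVXMLConfig(void *pContext, int nType, int nValue, void *p)
{
    (void)pContext; (void)nType; (void)nValue;
    g_calls++;
    g_seen_length = (p != NULL) ? strlen((const char *)p) : 0;
    return KVERR_Success;
}
/* ---- End of stand-ins. ---- */

/* Limit taken from the flag description: 255 characters plus the final null. */
#define MAX_PASSWORD_CHARS 255

/* Returns 1 if the password was accepted and passed on, 0 otherwise. */
int set_source_password(void *pContext, const char *password)
{
    if (password == NULL || strlen(password) > MAX_PASSWORD_CHARS)
        return 0;   /* reject before calling into the SDK */
    /* The cast drops const only because the documented parameter is void *. */
    return KVXMLConfig(pContext, KVCFG_SETPASSWORD, TRUE, (void *)password)
           == KVERR_Success;
}
```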

    KVCFG_POSITIONINFOOUTPUTTYPE

    This flag enables you to extend the existing <p> tags to include bounding box information.

    KVCFG_OCR

    Specifies whether to perform Optical Character Recognition (OCR) on raster image files, to extract machine-printed text from the image. The output from XML Export includes the original image, exported to the format you specify in KVXMLOptions, and any text extracted by OCR inside <ocr> tags. If OCR detects that some of the text forms a table, that text is included in the output as a <table>.

    OCR is available only on certain platforms (see Optical Character Recognition in the platform differences section). OCR processes only standalone raster images and not subfiles, such as images embedded in a Word document.

    If your license includes OCR, it is enabled by default. To disable OCR, set this flag to FALSE.

    Returns

    The return value is one of the error codes defined in KVErrorCode in kverrorcodes.h.

    Discussion

    Examples