Convert Files Out-of-Process

Export can run independently of the calling application. This is called out-of-process conversion. Out-of-process conversion protects the stability of the calling application in the rare case that a malformed document causes Export to fail. You can also run Export in the same process as the calling application. This is called in-process conversion. However, it is strongly recommended that you convert documents out-of-process whenever possible.

The Export out-of-process framework uses a client-server architecture. The calling application sends an out-of-process conversion request to the Service Request Broker in the main Export process. The Broker then creates, monitors, and manages a Servant process for the request. Each request is handled by one independent Servant process. Data is exchanged between the application thread and the Servant through TCP/IP sockets. The source data is sent to the Servant process as a data stream or file, converted in the Servant, and then returned to the application thread. At that point, the application can either terminate the Servant process or send more data for conversion.

Multiple threads in the calling application can send conversion requests simultaneously. All requests sent from one thread are processed by the Servant mapped to that thread. In other words, each thread can have only one Servant to process its conversion requests.
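The one-Servant-per-thread rule can be pictured as a map from thread to Servant. The following sketch only models that rule and is not Export code: the class name ServantRegistry and the "servant-N" labels are hypothetical and were invented for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the thread-to-Servant mapping; not part of the Export API.
final class ServantRegistry {
    private final Map<Thread, String> servantByThread = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    // Returns the Servant label mapped to the calling thread, creating one on first use.
    String servantForCurrentThread() {
        return servantByThread.computeIfAbsent(
                Thread.currentThread(),
                thread -> "servant-" + nextId.getAndIncrement());
    }

    // Number of threads that currently have a Servant mapped to them.
    int servantCount() {
        return servantByThread.size();
    }
}
```

Repeated calls from one thread return the same label, and a call from a second thread produces a second label.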

Any standard conversion errors generated by the Servant are sent to the application.

NOTE: Currently, the main Export process and Servant processes must run on the same host.

The following are requirements for running Export out-of-process:

  • Internet Protocol (TCP/IP) must be installed
  • Multithreaded processing must be supported on the operating system platform
  • The user application must be built with a multithreaded runtime library

The following methods can run either in-process or out-of-process:

  • convert

  • convertTo

  • getSummaryInfo

    NOTE: When converting out-of-process, these methods must be called after the call to start an out-of-process session and before the call to end an out-of-process session.

Other HTML Export methods and the File Extraction methods always run in-process.

Configure Out-of-Process Conversions

Although most components of the out-of-process conversion are transparent, the following parameters are configurable:

  • File-size threshold/temporary file location

  • Conversion time-out

  • Listener port numbers and time-out

  • Connection time-out and retry

  • Servant process name

These parameters are defined internally, but you can override a default value by defining the parameter in the formats_e.ini file. The formats_e.ini file is in the directory install\OS\bin, where install is the path name of the Export installation directory and OS is the name of the operating system.

To set the parameters, add the following section to the formats_e.ini file:

[KVExportOOPOptions]
TempFileSizeMark=
TempFilePath=
WaitForConvert=
WaitForConnectionTime=
ListenerPortList=
ListenerTimeout=
ConnectRetryInterval=
ConnectRetry=
ServantName=
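To check which overrides a given formats_e.ini actually defines, the section can be read with a few lines of code. The following is a minimal sketch, not the parser Export uses: the class OopOptionsReader is hypothetical, and treating an empty value as "keep the internal default" and a leading semicolon as a comment are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative reader for one INI section; Export reads formats_e.ini itself.
final class OopOptionsReader {
    // Returns the key=value pairs found under [sectionName].
    // Assumption: a key with an empty value is skipped (the default stays in effect).
    static Map<String, String> readSection(String iniText, String sectionName) {
        Map<String, String> options = new LinkedHashMap<>();
        boolean inSection = false;
        for (String rawLine : iniText.split("\\R")) {
            String line = rawLine.trim();
            if (line.startsWith("[") && line.endsWith("]")) {
                // A section header switches the reader on or off.
                inSection = line.substring(1, line.length() - 1).equals(sectionName);
                continue;
            }
            int eq = line.indexOf('=');
            if (!inSection || eq <= 0 || line.startsWith(";")) {
                continue; // outside the section, blank, comment, or not key=value
            }
            String value = line.substring(eq + 1).trim();
            if (!value.isEmpty()) {
                options.put(line.substring(0, eq).trim(), value);
            }
        }
        return options;
    }
}
```

For example, a section that sets only TempFileSizeMark=20 and WaitForConvert=900 yields a map with exactly those two entries. Parameters that are left blank do not appear.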

Each parameter is described in the following table.

The default values for these parameters are set to ensure reasonable performance on most systems. If you are processing a large number of files, or running Export on a slow machine, you might need to increase some of the time-out and retry values.

Parameters for Out-of-Process Conversion

Parameter Description

TempFileSizeMark

unit = megabytes

default = 10

The file-size threshold. If the input file received by the Servant is larger than this value, temporary files are created to store the data. The directory in which the temporary files are stored is defined by the TempFilePath parameter. If the file received is smaller than this value, the data is stored in memory in the Servant. This applies only when the input is a stream.

TempFilePath

type = file path

default = current working directory

The directory in which temporary files are stored. Temporary files are created when the input file surpasses the file-size threshold (TempFileSizeMark). If the Servant cannot access the file path, an error is generated.

This applies only when converting in stream mode.

WaitForConvert

unit = seconds

default = 1800

range = 30~3600

The length of time to wait for a Servant to convert a file. If the conversion is not completed within the specified time, the error code "Wait for child process failed" is generated.

WaitForConnectionTime

unit = seconds

default = 180

range = 15~600

The length of time to wait for the Servant to connect to the application thread after the application has sent a conversion request to the Broker. If the Servant does not connect within the specified time, the error code "Wait for child process failed" is generated. If there are many Servant processes running simultaneously, you might need to increase this value.

ListenerPortList

type = integer

default = 9985, 9986, 9987, 9988, 9989

The TCP/IP port number or numbers used for communication between the calling application and the Servant. You can specify a single port number, or a series of numbers separated by commas.

ListenerTimeout

unit = seconds

default = 10

range = 5~30

The length of time to wait for the Servant listener thread to get a process ID from the Servant after the connection is established. If the ID is not obtained within the specified time, the error code "Wait for child process failed" is generated. During this time, no other Servant can connect with the application.

ConnectRetryInterval

unit = microseconds

default = 100000 (0.1 seconds)

range = 50000~500000

The length of time to wait after a Servant has failed to connect to the application before it retries the connection. A Servant might be unable to connect because the application is waiting for another Servant to send a process ID.

To calculate the total retry interval, the value set here is added to the platform-specific TCP retry value (on Windows, this is 1 second).

ConnectRetry

type = integer

default = 120

range = 30~600

The number of attempts the Servant makes to connect to the calling application. This value and the total retry interval determine the total delay time. The total delay is calculated as follows:

(ConnectRetryInterval + platform-specific_TCP_retry_value) * ConnectRetry

For example, if the ConnectRetryInterval is set to 2 seconds, and the Export process is running on Windows (the default TCP retry value on Windows is 1 second), the total delay would be:

(2 + 1) * 120 = 360

The Servant would attempt to connect to the application every 3 seconds for 120 attempts for a total of 360 seconds.

ServantName

type = string

default = servant

The name of the Servant process. To move the Servant to another location, enter a fully qualified path.
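The ranges and the total-delay arithmetic in the table can be checked before a value is written to formats_e.ini. The following is a minimal sketch under stated assumptions: the class OopTimeouts and its methods are hypothetical helpers, not part of Export. The total delay is read as (ConnectRetryInterval + TCP retry value) * ConnectRetry, which is what the worked result of 360 seconds implies.

```java
import java.util.Map;

// Hypothetical helper that mirrors the parameter table; not part of the Export API.
final class OopTimeouts {
    // Inclusive {min, max} ranges copied from the table.
    // ConnectRetryInterval is in microseconds, ConnectRetry is a count,
    // and the other three are in seconds.
    static final Map<String, long[]> RANGES = Map.of(
            "WaitForConvert", new long[] {30, 3600},
            "WaitForConnectionTime", new long[] {15, 600},
            "ListenerTimeout", new long[] {5, 30},
            "ConnectRetryInterval", new long[] {50_000, 500_000},
            "ConnectRetry", new long[] {30, 600});

    // True if the value lies inside the documented range for the parameter.
    static boolean inRange(String name, long value) {
        long[] range = RANGES.get(name);
        if (range == null) {
            throw new IllegalArgumentException("No documented range for " + name);
        }
        return value >= range[0] && value <= range[1];
    }

    // (ConnectRetryInterval + platform TCP retry value) * ConnectRetry, in seconds.
    static double totalDelaySeconds(long retryIntervalMicros, double tcpRetrySeconds, int connectRetry) {
        return (retryIntervalMicros / 1_000_000.0 + tcpRetrySeconds) * connectRetry;
    }
}
```

With the values from the worked example, OopTimeouts.totalDelaySeconds(2_000_000, 1.0, 120) returns 360.0. Note that a 2-second interval is 2000000 microseconds, which falls outside the documented range of 50000~500000, so inRange("ConnectRetryInterval", 2_000_000) returns false. The worked example is best read as an illustration of the arithmetic only.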

Run Export Out-of-Process — Overview

To convert files out-of-process

  1. If required, set parameters for the out-of-process conversion in the formats_e.ini file. See Configure Out-of-Process Conversions.

  2. Instantiate an HtmlExport object.

  3. Define the conversion options.

  4. Initialize an out-of-process session.

  5. Convert the input, or call other methods that can run out-of-process.

  6. Shut down the out-of-process session.

  7. Repeat Step 3 to Step 6 for additional files.

  8. Terminate the out-of-process session and the Servant process.

  9. Shut down the Export session.
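The order of the steps above can be sketched as a loop. This is an outline under stated assumptions, not the Export API: the interface OopExporter and all of its method names are hypothetical stand-ins for the real HtmlExport calls. Keeping the Servant alive between files and terminating it only after the last file is one reading of steps 6 and 8.

```java
import java.util.List;

// Hypothetical stand-in for the Export object; method names are illustrative only.
interface OopExporter {
    void setOptions(String inputFile);             // step 3
    void startOopSession();                        // step 4
    void convert(String inputFile);                // step 5
    void endOopSession(boolean keepServantAlive);  // steps 6 and 8
    void shutdown();                               // step 9
}

final class OopWorkflow {
    // Runs steps 3 to 9 for a list of input files.
    static void convertAll(OopExporter exporter, List<String> inputFiles) {
        try {
            for (int i = 0; i < inputFiles.size(); i++) {
                String file = inputFiles.get(i);
                boolean lastFile = (i == inputFiles.size() - 1);
                exporter.setOptions(file);
                exporter.startOopSession();
                try {
                    exporter.convert(file);
                } finally {
                    // Keep the Servant for the next file; terminate it after the last one.
                    exporter.endOopSession(!lastFile);
                }
            }
        } finally {
            // The Export session is shut down even if a conversion throws.
            exporter.shutdown();
        }
    }
}
```

The try/finally blocks are a design choice of the sketch: they ensure that a session that was started is always ended, and that the Export session is shut down on every path.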

Recommendations

  • To ensure that multithreaded conversions are thread-safe, you must create a unique context pointer for every thread by instantiating an HtmlExport object. In addition, threads must not share context pointers, and the same context pointer must be used for all API calls in the same thread. Creating a context pointer for every thread does not affect performance because the context pointer uses minimal resources.

  • All methods that can run in out-of-process mode must be called within the out-of-process session (that is, after the call to initialize the out-of-process session and before the call to end the out-of-process session).

  • When ending an out-of-process session, keep the Servant process running by setting the Boolean flag bKeepServantAlive in the endOOPSession method. If the Servant process remains active, subsequent conversion requests are processed more quickly because the Servant process is already prepared to receive data. Terminate the Servant only when there are no more out-of-process requests.

  • To recover from a failure in the Servant process, start a new out-of-process session. This creates a new Servant process for the next conversion.
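The first recommendation (one context per thread, never shared, reused for every call on that thread) maps naturally onto a thread-local holder. The following is a minimal sketch: PerThreadExporter is a hypothetical helper, and the factory passed to it stands in for whatever code instantiates an HtmlExport object in the application.

```java
import java.util.function.Supplier;

// Hypothetical holder that gives each thread its own exporter instance.
final class PerThreadExporter<T> {
    private final ThreadLocal<T> perThread;

    // The factory is called once per thread, the first time that thread calls get().
    PerThreadExporter(Supplier<T> factory) {
        this.perThread = ThreadLocal.withInitial(factory);
    }

    // Same instance for every call on one thread; never handed to another thread.
    T get() {
        return perThread.get();
    }
}
```

A worker thread would call get() at the start of each task and make all of its API calls on the returned object. With a thread pool, each pooled thread keeps its instance for as long as the thread lives.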