Control Hyphenation

There are two types of hyphens in a PDF document:

  • A soft hyphen is added to a word by a word processor to divide the word across two lines. This is a discretionary hyphen and is used to ensure proper text flow in justified text.
  • A hard hyphen is intentionally added to a word regardless of the word's position in the text flow. It is required by the rules of grammar or word usage. For example, compounds such as three-week (as in three-week vacation) and self-confident contain hard hyphens.

By default, KeyView omits the source document's soft hyphens from the Filter output, which makes the text content easier to search. However, if you want to maintain the document layout, you can keep soft hyphens in the Filter output.

To output soft hyphens from PDF documents

  • In the Java API, call the method setKeepSoftHyphen(boolean) with the value true. For example:

    objFilter.setKeepSoftHyphen(true);

    The FilterTest sample program demonstrates this method. See FilterTest.

  • In the formats.ini configuration file, set the following parameter:

    [pdf_flags]
    keepsofthyphen=TRUE

    (This is an alternative approach. You do not need to set this parameter if you have already configured this feature through the API.)
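
If you keep soft hyphens, code that searches the filtered text may need to remove them first. The following is a minimal sketch of that idea, not part of the KeyView API. It assumes that a kept soft hyphen appears in the output text as the Unicode SOFT HYPHEN character (U+00AD), which this topic does not confirm; check your actual Filter output. The class name SoftHyphenDemo and the method stripSoftHyphens are hypothetical names used only for illustration.

```java
public class SoftHyphenDemo {
    // U+00AD is the Unicode SOFT HYPHEN character.
    // Assumption: kept soft hyphens appear as this character in the text.
    private static final String SOFT_HYPHEN = "\u00AD";

    // Hypothetical helper: removes every soft hyphen and leaves
    // hard hyphens (the ordinary '-' character) untouched.
    static String stripSoftHyphens(String text) {
        return text.replace(SOFT_HYPHEN, "");
    }

    public static void main(String[] args) {
        // "hyphenation" is split by a soft hyphen; "self-confident"
        // contains a hard hyphen.
        String kept = "hyphen\u00ADation in a self-confident document";

        // The soft hyphen interrupts the word, so a plain search misses it.
        System.out.println(kept.contains("hyphenation"));        // false

        String stripped = stripSoftHyphens(kept);
        System.out.println(stripped.contains("hyphenation"));    // true
        System.out.println(stripped.contains("self-confident")); // true
    }
}
```

The sketch removes only the soft hyphen character, so hard hyphens in compounds such as self-confident remain in place.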