Use the pdf2sr Reader

The pdf2sr reader is an alternative to the pdfsr reader for filtering PDF files. It uses a different parsing technology and might yield better results for some files.

The pdf2sr reader has the following features:

  • supports standard and custom metadata (non-XMP)
  • supports basic text extraction
  • supports password-protected PDFs
  • supports table detection (see Table Detection for PDF Files)

The pdf2sr reader has the following limitations:

  • does not support logical order
  • does not support bidi PDFs
  • does not extract subfiles
  • does not extract bookmarks from PDFs
  • does not estimate the percentage of embedded fonts that match the displayed glyphs
  • does not support XMP metadata
  • does not support headers or footers
  • supports annotations only in the raster output, not as searchable text
  • does not support content access stream
  • does not support tagged PDF content
  • does not filter text from XFA-based PDF forms
  • does not report document restrictions (see Document Restrictions)
  • cannot reconstruct missing information from Arabic text in converted PDFs. When you use Microsoft Print to PDF to convert Word documents that contain Arabic text in the Calibri font, the resulting PDF is often incomplete because it lacks information that is required to interpret the text content. The pdfsr reader can reconstruct the missing information, but pdf2sr cannot.

To use the pdf2sr reader

  1. Open the formats.ini file with a text editor.
  2. In the [Formats] section, set the following:

    230=pdf2
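
If you prefer to script this change rather than edit the file by hand, the following is a minimal sketch in Python. It is not part of the product. The function name set_format_reader and the sample file contents are hypothetical, and the sketch assumes that formats.ini uses plain key=value lines under bracketed section names. It edits the text line by line so that any other entries and comments are preserved.

```python
def set_format_reader(ini_text, key="230", value="pdf2", section="Formats"):
    """Return ini_text with key=value set inside [section].

    Hypothetical helper: assumes plain key=value lines under [Section]
    headers. All other lines are passed through unchanged.
    """
    out = []
    in_section = False
    done = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving the target section without finding the key: add it.
            if in_section and not done:
                out.append(f"{key}={value}")
                done = True
            in_section = stripped[1:-1].strip().lower() == section.lower()
        elif in_section and not done and "=" in stripped:
            if stripped.split("=", 1)[0].strip() == key:
                # Replace the existing entry for this key.
                line = f"{key}={value}"
                done = True
        out.append(line)
    if not done:
        # Target section was last in the file, or missing altogether.
        if not in_section:
            out.append(f"[{section}]")
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Made-up sample contents, for illustration only.
sample = "[Formats]\n229=foo\n230=pdfsr\n231=bar\n[Other]\nx=1\n"
updated = set_format_reader(sample)
```

To apply it to a real file, read formats.ini into a string, pass it to the function, and write the result back. Keep a backup copy of the original file first.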