Filter XML Files

KeyView can detect many types of XML file, including:

  • Generic XML
  • Microsoft Office 2003 XML (Word, Excel, and Visio)
  • StarOffice/OpenOffice XML (text document, presentation, and spreadsheet)

When you filter XML, you can specify which elements KeyView treats as content and which as metadata, or you can filter the whole file as plain text. By default, KeyView behaves as follows:

  • For StarOffice/OpenOffice and Microsoft Office 2003 XML formats, KeyView uses special handling to extract content and metadata, based on knowledge of how those formats use XML.

  • For Generic XML, KeyView ignores element names and attributes, and filters all other text. This approach favors useful data by minimizing the amount of structural data in the filtered output.
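To illustrate the effect of the Generic XML default, the following Python sketch drops element names and attributes and keeps only the character data. It uses the Python standard library, not KeyView. The function name extract_text and the sample document are assumptions made for illustration, and the whitespace handling is a simplification.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_source: str) -> str:
    """Return the character data of an XML document.

    Element names and attribute names/values are not included,
    which mimics (but does not reproduce) the Generic XML default.
    """
    root = ET.fromstring(xml_source)
    # itertext() yields element text and tail strings in document order.
    pieces = (piece.strip() for piece in root.itertext())
    # Drop whitespace-only pieces and join the rest with single spaces.
    return " ".join(piece for piece in pieces if piece)


# Hypothetical sample document, for illustration only.
sample = '<note priority="high"><to>Alice</to><body>Meeting at noon.</body></note>'
print(extract_text(sample))
```

In this sketch, the attribute value "high" and the element names "note", "to", and "body" do not appear in the result. Only the text between the tags is kept.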

You can customize how KeyView filters text and metadata from XML, including from the known formats such as Microsoft Office 2003 XML.

Alternatively, you might want to configure KeyView to filter XML files as plain text, with all markup unchanged. In this case, you must change the formats.ini configuration file to use the plain text reader (af) instead of the XML reader (xml). You can do this for all formats that are routed to the XML reader, or only for particular formats. For more information about how to route detected formats to the desired readers, see File Formats and Document Readers.
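The rerouting described above amounts to replacing the reader value xml with af on the relevant entries. The following Python sketch shows that substitution on an INI-style text, under the assumption that each entry has the form key=reader. The section name, the keys, and the pdf reader value in the sample are hypothetical placeholders, not real formats.ini entries. Check the actual layout in File Formats and Document Readers before you edit your own file.

```python
def reroute_to_plain_text(ini_text, keys=None):
    """Replace the reader 'xml' with 'af' on key=value lines.

    keys=None reroutes every entry that uses the xml reader.
    Otherwise only entries whose key is in `keys` are changed.
    """
    out = []
    for line in ini_text.splitlines():
        stripped = line.strip()
        # Leave comment lines and section headers untouched.
        if stripped.startswith((";", "#", "[")):
            out.append(line)
            continue
        key, sep, value = line.partition("=")
        if sep and value.strip() == "xml" and (keys is None or key.strip() in keys):
            line = f"{key.strip()}=af"
        out.append(line)
    return "\n".join(out)


# Hypothetical entries, for illustration only (not real formats.ini content).
sample = """[ExampleSection]
hypothetical_generic_xml=xml
hypothetical_office_xml=xml
hypothetical_other=pdf"""

# Reroute only one format to the plain text reader.
print(reroute_to_plain_text(sample, keys={"hypothetical_generic_xml"}))
```

Passing keys=None corresponds to rerouting every format that currently uses the XML reader. Passing a set of keys corresponds to rerouting only particular formats.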