Input and Output Methods

The previous examples show KeyView taking input from a file (keyview::io::InputFile) and sending output to a file (keyview::io::OutputFile). The C++ API allows you to take input from and write output to other data sources by making use of generic types for input and output. For example, the signature of the Session::open method is:

template <typename InputType>
Document open(InputType& input);

...and the signature of the Document::filter method is:

template <typename OutputType>
void filter(OutputType& output);

Some input and output types are defined in Keyview_IO.hpp. These are InputFile, OutputFile, and InMemoryFile. You can create your own input and output types to read from and write to any data source you like.

To create a custom input type, create a class whose read, seek, and tell methods read from your data source. These methods must conform to the example signatures in the keyview::io::InputFile class defined in Keyview_IO.hpp. You can then pass instances of your class into Session::open.
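As a minimal sketch of a custom input type, the class below serves bytes from a buffer held in memory. MemoryInput is a hypothetical name, and the read, seek, and tell signatures shown are assumptions made for illustration (a C-style offset and whence pair for seek, 64-bit counts elsewhere). Check the InputFile class in Keyview_IO.hpp for the signatures your version actually requires.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>    // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>
#include <string>
#include <utility>

// Hypothetical custom input type that reads from an in-memory buffer.
// ASSUMPTION: these method signatures are illustrative only; the
// authoritative ones are those of InputFile in Keyview_IO.hpp.
class MemoryInput
{
   public:
      explicit MemoryInput(std::string data) : data_(std::move(data)), pos_(0) {}

      // Copy up to count bytes into ptr and return the number copied
      // (0 once the end of the buffer has been reached).
      int64_t read(char* ptr, int64_t count)
      {
         const int64_t size = static_cast<int64_t>(data_.size());
         const int64_t n = std::max<int64_t>(0, std::min(count, size - pos_));
         std::memcpy(ptr, data_.data() + pos_, static_cast<std::size_t>(n));
         pos_ += n;
         return n;
      }

      // Move the read position relative to the start, the current
      // position, or the end. Returns false if the target is out of range.
      bool seek(int64_t offset, int whence)
      {
         const int64_t size = static_cast<int64_t>(data_.size());
         int64_t base = 0;
         switch (whence)
         {
            case SEEK_SET: base = 0;    break;
            case SEEK_CUR: base = pos_; break;
            case SEEK_END: base = size; break;
            default: return false;
         }
         const int64_t target = base + offset;
         if (target < 0 || target > size)
            return false;
         pos_ = target;
         return true;
      }

      // Report the current read position in bytes from the start.
      int64_t tell() const { return pos_; }

   private:
      std::string data_;
      int64_t pos_;
};
```

If the signatures match those that KeyView expects, an instance of such a class can be passed to Session::open in place of an InputFile.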

To create a custom output type, create a class with a write method that writes to your data source. Your write method must conform to the example signature in the keyview::io::OutputFile class, also defined in Keyview_IO.hpp. You can then pass instances of this class into any KeyView function that takes an OutputType (such as Document::filter or Subfile::extract).

For example: 

class MyOutput
{
   public:
      int64_t write(const char* ptr, int64_t count)
      {
         // process the output
         return count;
      }
};

You then pass an instance of this class when you call filter, for example:

MyOutput output; // Create your custom output object
auto doc = session.open(input);
doc.filter(output);

KeyView calls the write function you have implemented once for each chunk of data that it filters. The process is:

  1. Your code calls session.open then doc.filter.

  2. KeyView opens the file and starts to read it.

  3. KeyView finds some text and calls output.write (your custom code).

  4. Your code now has control again. The arguments to write give you the text (ptr) and its length in bytes (count). You can process the text in any way you like, and then return to KeyView. You can either request more text by returning the number of bytes written, or return 0 to stop the filtering process (see Partial Filtering).

  5. If there is more text to filter, and output.write requested more, KeyView returns to step 2. Otherwise, it returns from doc.filter.
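The control flow in the steps above can be sketched without KeyView at all. In the following illustration, mock_filter is a hypothetical stand-in for Document::filter (it is not part of the KeyView API): it hands each chunk to output.write and stops as soon as write returns 0. CollectingOutput is likewise a hypothetical output type that appends each chunk to a string.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical output type: collects every chunk it receives.
class CollectingOutput
{
   public:
      int64_t write(const char* ptr, int64_t count)
      {
         text.append(ptr, static_cast<std::size_t>(count));
         return count;   // returning the byte count asks for the next chunk
      }

      std::string text;
};

// Hypothetical stand-in for Document::filter, for illustration only.
// It calls output.write once per chunk and stops early if write returns 0.
// Returns the number of times write was called.
template <typename OutputType>
int mock_filter(OutputType& output, const std::vector<std::string>& chunks)
{
   int calls = 0;
   for (const std::string& chunk : chunks)
   {
      ++calls;
      if (output.write(chunk.data(), static_cast<int64_t>(chunk.size())) == 0)
         break;   // the output asked to stop
   }
   return calls;
}
```

Because CollectingOutput always returns count, the mock loop runs until every chunk has been delivered.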

A class can be valid as both an InputType and OutputType.

Partial Filtering

In some cases you might not want to filter all the text from a file, for example because the information you require is in the first half of the file. In this case you can save time by stopping the filtering process after you have what you need.

If you are reading text from the std::istream returned by Document::text, you can perform partial filtering by reading as much text as you need: KeyView only processes as much of the document as it needs to output the text you read.

If you are writing the document's text to an output using the Document::filter method, you can enable partial filtering by implementing a custom OutputType as shown in the previous example. KeyView calls the write method you implement once for each chunk of text that it filters. The count argument is the number of bytes in that chunk. Chunks can vary significantly in size.

After you have the text you need, return 0 from your write method to stop the filtering process.
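A minimal sketch of such an output type follows. LimitedOutput is a hypothetical class that keeps the first limit bytes it receives and returns 0 from write once it has them. The write signature is copied from the MyOutput example. Because this section does not say how KeyView treats the chunk on which 0 is returned, the class stores what it needs from that chunk before returning.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical output type that stops filtering after `limit` bytes.
class LimitedOutput
{
   public:
      explicit LimitedOutput(int64_t limit) : limit_(limit) {}

      int64_t write(const char* ptr, int64_t count)
      {
         // Keep only as much of this chunk as is still needed.
         const int64_t remaining = limit_ - static_cast<int64_t>(text_.size());
         const int64_t take = std::max<int64_t>(0, std::min(count, remaining));
         text_.append(ptr, static_cast<std::size_t>(take));

         // Returning 0 asks KeyView to stop; otherwise request more text.
         if (static_cast<int64_t>(text_.size()) >= limit_)
            return 0;
         return count;
      }

      const std::string& text() const { return text_; }

   private:
      int64_t limit_;
      std::string text_;
};
```

An instance would be passed to Document::filter in the same way as MyOutput, for example doc.filter(output), and the collected text read back through text() afterwards.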