Prepare the Vertica Database
Indexing documents into a standard database is problematic because documents do not have a fixed schema. A document that represents an image has different metadata fields from a document that represents an e-mail message. Vertica databases solve this problem with flex tables. You can create a flex table without any column definitions, and you can insert a record regardless of whether a referenced column exists.
You must create a flex table before you index data into Vertica.
When creating the table, consider the following:
- Flex tables store entire records in a single column named __raw__. The default maximum size of the __raw__ column is 128K. You might need to increase the maximum size if you are indexing documents with large amounts of metadata.
- Documents are identified by their DREREFERENCE. OpenText recommends that you do not restrict the size of any column that holds this value, because this could result in values being truncated. As a result, rows that represent different documents might appear to represent the same document. If you do restrict the size of the DREREFERENCE column, ensure that the length is sufficient to hold the longest DREREFERENCE that might be indexed.
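If the default __raw__ size turns out to be too small, one way to raise it is to alter the column on an existing flex table. The following is a minimal sketch, assuming a flex table named my_table already exists; the size of 1000000 bytes is an illustrative value, not a recommendation.

```sql
-- Raise the maximum size of the __raw__ column for one existing flex table.
-- The size is given in bytes; 1000000 is an illustrative value only.
alter table my_table alter column __raw__ set data type long varbinary(1000000);
```

Check the Vertica documentation for the limits that apply to your version before you choose a value.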
To create a flex table without any column definitions, run the following query:
create flex table my_table();
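As a rough illustration of this schema-less behavior, the following sketch inserts two records with different fields into the empty flex table and then reads them back. The field names DRETITLE and AUTHOR and all of the values are hypothetical. In practice the indexing component performs the inserts, so treat this as a way to experiment rather than as part of the setup.

```sql
-- Hypothetical records: neither referenced column is defined on the table.
insert into my_table (DREREFERENCE, DRETITLE) values ('doc-001', 'Quarterly report');
insert into my_table (DREREFERENCE, AUTHOR) values ('doc-002', 'J. Smith');
commit;

-- Query a field by name; records that lack the field return NULL for it.
select DREREFERENCE, DRETITLE from my_table;

-- Inspect the full record stored in __raw__ for each row.
select maptostring(__raw__) from my_table;
```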
To improve query performance, create real columns for the fields that you query frequently. For documents indexed by a connector, this is likely to include the DREREFERENCE:
create flex table my_table(DREREFERENCE varchar NOT NULL);
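Be aware that, according to the Vertica data type documentation, a varchar declared without a length uses a default length of 80, which might be shorter than some references. If that applies to your version, a variant with an explicit length avoids truncation. The length of 2000 below is an assumption for illustration; choose a value that fits the longest DREREFERENCE you expect.

```sql
-- Alternative to the statement above, not an additional step:
-- the same table definition with an explicit length for DREREFERENCE.
-- 2000 is an illustrative length; size it for your longest reference.
create flex table my_table(DREREFERENCE varchar(2000) NOT NULL);
```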
You can add new column definitions to a flex table at any time. Vertica automatically populates new columns with values for existing records. The values for existing records are extracted from the __raw__ column.
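For example, a field that starts to appear in frequent queries could be promoted to a real column as sketched below. The field name DRETITLE and the length of 1000 are hypothetical; substitute a field that exists in your indexed documents.

```sql
-- Promote a hypothetical DRETITLE field to a real column.
alter table my_table add column DRETITLE varchar(1000);

-- Spot-check that existing rows show values taken from __raw__.
select DREREFERENCE, DRETITLE from my_table limit 10;
```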
For more information about creating and using flex tables, refer to the Vertica Documentation or contact Vertica technical support.