Indexing documents into a standard database is problematic because documents do not have a fixed schema. A document that represents an image has different metadata fields from a document that represents an e-mail message. Vertica databases solve this problem with flex tables. You can create a flex table without any column definitions, and you can insert a record regardless of whether the columns it references exist.
You must create a flex table before you index data into Vertica.
When creating the table, consider the following:
__raw__. The default maximum size of the __raw__ column is 128K. You might need to increase the maximum size if you are indexing documents with large amounts of metadata.
DREREFERENCE. HPE recommends that you do not restrict the size of any column that holds this value, because this could result in values being truncated. As a result, rows that represent different documents might appear to represent the same document. If you do restrict the size of the DREREFERENCE column, ensure that the length is sufficient to hold the longest DREREFERENCE that might be indexed.
To create a flex table without any column definitions, run the following query:
create flex table my_table();
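If the documents carry more metadata than the default __raw__ size allows, the column can be enlarged after the table exists. The following is a minimal sketch, assuming your Vertica version permits changing the data type of __raw__ with ALTER TABLE; the size of 1000000 bytes is an illustrative value, not a recommendation. Check the Vertica documentation for the limits that apply to your release.

```sql
-- Assumption: 1000000 bytes is a placeholder size chosen for illustration.
-- __raw__ is the LONG VARBINARY column that holds the flex table's raw data.
alter table my_table alter column __raw__ set data type long varbinary(1000000);
```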
To improve query performance, create real columns for the fields that you query frequently. For documents indexed by a connector, this is likely to include the DREREFERENCE:
create flex table my_table(DREREFERENCE varchar NOT NULL);
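In Vertica, a varchar column declared without a length takes a default length of 80 bytes, which might be shorter than some references. If truncation is a concern, one option is to state the length explicitly. The sketch below assumes a length of 1024, which is a placeholder rather than a recommended value; choose a length that fits the longest DREREFERENCE you expect to index.

```sql
-- Assumption: 1024 is an illustrative length, not a value from the product documentation.
-- Size the column to hold the longest DREREFERENCE that might be indexed.
create flex table my_table(DREREFERENCE varchar(1024) NOT NULL);
```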
You can add new column definitions to a flex table at any time. Vertica automatically populates new columns for existing records, using values extracted from the __raw__ column.
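As a rough illustration of adding a column later, the statements below promote one more field to a real column and then read it back. The field name DREDBNAME and the length 256 are assumptions made for this example; substitute a field that your documents actually contain.

```sql
-- Assumption: DREDBNAME is a hypothetical field name used only for illustration.
-- Add a real column for a field that is queried frequently.
alter table my_table add column DREDBNAME varchar(256);

-- Rows indexed before the column was added should return values
-- for the new column, taken from the data held in __raw__.
select DREREFERENCE, DREDBNAME from my_table limit 10;
```

Promoting only the fields that appear in frequent query predicates keeps the table flexible while avoiding repeated lookups in the __raw__ data.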
For more information about creating and using flex tables, refer to the HPE Vertica Documentation or contact HPE Vertica technical support.