Dataset templates
Dataset templates let you create datasets based on common criteria. A dataset template must contain at least one dataset attribute, custom grammar and entity, or have scheduling enabled. When you create a dataset based on a template, the criteria defined in the template are prepopulated and can be edited for the new dataset as needed.
Changes you make to the options in a dataset template apply going forward and do not affect existing datasets based on the template. Dataset templates that are associated with existing datasets can be deleted at any time.
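The relationship between a template and the datasets created from it can be pictured as copy-on-create. The following Python sketch is illustrative only: the names `DatasetTemplate`, `Dataset`, and `create_dataset` are hypothetical and do not correspond to a product API, and custom grammar and entity selections are simplified to a list of grammar set names.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DatasetTemplate:
    """Hypothetical stand-in for a dataset template."""
    name: str
    attributes: dict = field(default_factory=dict)
    grammar_sets: list = field(default_factory=list)
    scheduling_enabled: bool = False

    def is_valid(self) -> bool:
        # A template needs at least one attribute, one grammar selection,
        # or scheduling enabled.
        return bool(self.attributes or self.grammar_sets or self.scheduling_enabled)


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset created from a template."""
    name: str
    attributes: dict
    grammar_sets: list
    scheduling_enabled: bool


def create_dataset(name: str, template: DatasetTemplate) -> Dataset:
    # Deep-copy the criteria so the dataset starts from the template's
    # values but is independent of later template edits.
    return Dataset(
        name=name,
        attributes=copy.deepcopy(template.attributes),
        grammar_sets=copy.deepcopy(template.grammar_sets),
        scheduling_enabled=template.scheduling_enabled,
    )


template = DatasetTemplate("HR shares", attributes={"Department": "HR"})
existing = create_dataset("hr-2023", template)

# Editing the template affects only datasets created afterwards.
template.attributes["Department"] = "Finance"
later = create_dataset("fin-2024", template)
```

In this sketch, `existing` keeps the original attribute value while `later` picks up the edited one, which mirrors the "apply going forward" behavior described above.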
-
From the primary navigation pane, click Sources > Dataset Templates.
The Dataset Templates page opens.
-
Click NEW DATASET TEMPLATE.
The New Dataset Template dialog opens.
-
Complete the General options for the new dataset template and then follow the dialog prompts for the remaining options.
Dataset template name: Enter a meaningful, unique name for the dataset template. Limits: Maximum 50 characters.
Description: (Optional) Enter a meaningful description of the template. Limits: Maximum 250 characters.
Click NEXT.
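If template names and descriptions are prepared outside the interface (for example, in a planning spreadsheet), the General limits can be checked ahead of time. This is a minimal sketch: the function `validate_general_options` is hypothetical, and both the "name is required" check and the case-sensitive uniqueness comparison are assumptions rather than documented behavior.

```python
MAX_NAME_LENGTH = 50
MAX_DESCRIPTION_LENGTH = 250


def validate_general_options(name, description="", existing_names=()):
    """Return a list of problems; an empty list means the values pass these checks."""
    problems = []
    if not name.strip():
        # Assumption: the name is required because only the description is marked optional.
        problems.append("name is required")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    if name in existing_names:
        # Assumption: uniqueness is compared case-sensitively.
        problems.append("name is not unique")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append(f"description exceeds {MAX_DESCRIPTION_LENGTH} characters")
    return problems
```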
-
(Optional; applies to unstructured datasets and custom adapters only) Complete the Advanced Capture Rules for this template to refine the analysis options. You can refine document analysis based on create date, modify date, file extension, and file size so that specific documents are analyzed in different ways. After the primary capture rules are applied and the initial list of files is fetched from the source, the advanced capture rules evaluate each fetched document, one document at a time. You rank the advanced capture rules, and the first rule that matches is applied to the document.
-
Complete the advanced capture rule options.
Select action: Select one of the following actions to take when processing documents that match the filter criteria.
-
Analyze. Specifies to process and index the metadata and the body content of individual files that match the filter criteria. You can preview the document content in a plain text view in Analyze and Manage, but you cannot preview any attachments or embedded images.
-
Store content as text. Specify whether to store the contents of the individual files as text; the file itself is not stored. Click the toggle to select or deselect the option.
-
When enabled (selected), the contents of the files are stored in the index as text.
-
When not enabled (deselected), tag values are identified but the file content cannot be viewed.
NOTE: Changes to grammar, tag, and workspace keyword search criteria will not be automatically updated for files that have already been analyzed. Storing content increases the size of your index.
-
-
Extract grammar rules. Specify whether to extract, index, and count the values identified by the grammar rules. Click the toggle to select or deselect the option.
-
When enabled (selected), grammar values are identified, extracted, and indexed.
-
When not enabled (deselected), grammar values are not extracted and a masked view of the file content cannot be shown.
TIP: You can extract grammar values later for files within a workbook.
-
-
-
Metadata only. Specifies to process and index only the metadata from individual files that match the filter criteria. You cannot preview the document content in Analyze or Manage.
-
Skip. Specifies to skip processing for individual files matching the filter criteria.
Filter: Click Filter and define the desired filter criteria.
To add criteria to a section, click the add icon and select the desired filter fields.
-
For fields added to the ANY section, each entry is combined with the OR operator.
-
For fields added to the ALL section, each entry is combined with the AND operator. For extensions, each entry in the field is combined with the OR operator, then combined with any other fields with the AND operator.
-
For fields added to the NONE section, each entry is combined with the OR operator.
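The way the ANY, ALL, and NONE sections combine can be sketched in Python. This is an illustration under stated assumptions, not the product's implementation: the `Filter` class and helper functions are hypothetical, a document is assumed to be excluded when it matches any entry in the NONE section, and the three sections are assumed to be combined with AND.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A criterion is any function that takes a document (a dict here) and returns True or False.
Criterion = Callable[[dict], bool]


@dataclass
class Filter:
    any_of: List[Criterion] = field(default_factory=list)
    all_of: List[Criterion] = field(default_factory=list)
    none_of: List[Criterion] = field(default_factory=list)

    def matches(self, doc: dict) -> bool:
        # ANY: entries are ORed; an empty section places no restriction.
        any_ok = not self.any_of or any(c(doc) for c in self.any_of)
        # ALL: entries are ANDed.
        all_ok = all(c(doc) for c in self.all_of)
        # NONE: entries are ORed, and the document must not match the result (assumption).
        none_ok = not any(c(doc) for c in self.none_of)
        # Assumption: the three sections are combined with AND.
        return any_ok and all_ok and none_ok


def extension_in(*extensions: str) -> Criterion:
    # Several extensions in one field are ORed together.
    allowed = {e.lower() for e in extensions}
    return lambda doc: doc["extension"].lower() in allowed


def larger_than(size_bytes: int) -> Criterion:
    return lambda doc: doc["size"] > size_bytes


# ALL section: (pdf OR docx) AND larger than 1 MB; NONE section: tmp files.
rule_filter = Filter(
    all_of=[extension_in("pdf", "docx"), larger_than(1_000_000)],
    none_of=[extension_in("tmp")],
)
```

Here the extension field in the ALL section shows the special case described above: its entries are ORed with each other first, and the result is then ANDed with the size criterion.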
-
-
Set the priority order for the advanced capture rules. The first rule that matches is applied to the document.
Click the drag icon for the desired rule and then drag the rule up or down to the desired position.
Click NEXT.
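The first-match behavior of ranked advanced capture rules can be sketched as follows. The `CaptureRule` class and `pick_action` function are hypothetical names used for illustration. What happens when no rule matches is not covered in this topic, so the sketch simply returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CaptureRule:
    action: str                      # "analyze", "metadata_only", or "skip"
    matches: Callable[[dict], bool]  # filter criteria for this rule


def pick_action(doc: dict, ranked_rules: List[CaptureRule]) -> Optional[str]:
    """Return the action of the first matching rule, or None if no rule matches."""
    for rule in ranked_rules:  # list order is the priority order
        if rule.matches(doc):
            return rule.action
    return None


ranked_rules = [
    CaptureRule("skip", lambda d: d["extension"] == "tmp"),
    CaptureRule("metadata_only", lambda d: d["size"] > 50_000_000),
    CaptureRule("analyze", lambda d: d["extension"] in {"pdf", "docx"}),
]

documents = [
    {"name": "scratch.tmp", "extension": "tmp", "size": 100},
    {"name": "archive.pdf", "extension": "pdf", "size": 80_000_000},
    {"name": "memo.docx", "extension": "docx", "size": 20_000},
]

# Each document is evaluated one at a time against the ranked rules.
actions = {d["name"]: pick_action(d, ranked_rules) for d in documents}
```

Note that `archive.pdf` also satisfies the third rule, but the higher-ranked size rule is reached first. This is why the order you set by dragging rules matters.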
-
-
Complete the Security options to define whether you want to limit access to datasets based on this template to specific users and groups.
CAUTION: If you limit access to a source and an underlying dataset, users without access cannot view workspaces with a data source that includes the dataset, and cannot view individual items that originated in the dataset.
Inherit from Source: Select to inherit dataset access from the source. (default)
Grant access to all users: Select to not limit access to this dataset.
Specify the users and groups that will have access: Select to limit access to the dataset to only the defined users and groups.
List of Users/Groups: Define the users and groups that will have access to items originating from this dataset.
-
In the Enter name or email address box, begin typing a group name or a name or email address of a user. As you enter a string in the field, the interface displays names or email addresses matching the string.
-
Click Add to add the selected user or group to the source access list.
To remove a user or group from the source access list, hover over the name in the User/Group column and then click the corresponding remove icon.
Click NEXT.
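The three Security options can be read as a simple access decision. The sketch below is a hedged illustration only: `AccessMode` and `can_access` are hypothetical names, and treating a source with no access list as open to everyone is an assumption that this topic does not confirm.

```python
from enum import Enum
from typing import Optional, Set


class AccessMode(Enum):
    INHERIT_FROM_SOURCE = "Inherit from Source"
    ALL_USERS = "Grant access to all users"
    SPECIFIED = "Specify the users and groups that will have access"


def can_access(principals: Set[str], mode: AccessMode,
               dataset_list: Optional[Set[str]] = None,
               source_list: Optional[Set[str]] = None) -> bool:
    """principals: the user's name or email address plus the user's groups."""
    if mode is AccessMode.ALL_USERS:
        return True
    if mode is AccessMode.INHERIT_FROM_SOURCE:
        # Assumption: a source with no access list is open to everyone.
        return source_list is None or bool(principals & source_list)
    # SPECIFIED: only users and groups on the dataset's list have access.
    return bool(principals & (dataset_list or set()))


alice = {"alice@example.com", "Legal"}
```

For example, `can_access(alice, AccessMode.SPECIFIED, dataset_list={"Legal"})` grants access through group membership, while a list containing only `"Finance"` does not.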
-
-
Select the Grammar Sets to include in the analysis of documents in this dataset.
-
In the Grammar Sets list, do one of the following:
-
Click the desired grammar set name.
-
Type the name of the desired grammar set. As you type, grammar set names that match your entry display. Click the desired grammar set name.
The grammar set is added to the top of the field. To remove a grammar set, click the associated remove icon.
-
-
Repeat as necessary to add more grammar sets.
-
Click outside of the Grammar Sets field to view the grammar rules for the selected grammar sets.
-
Click NEXT.
-
-
Complete the Schedule options to define the schedule for processing datasets based on this template.
NOTE: Schedule options do not apply to Exchange datasets. Exchange datasets based on this template will skip these options.
Enable scheduling: Specify whether to enable or disable scheduling. Click the toggle to select or deselect the option.
When enabled (selected), processing of this dataset occurs as defined by the default schedule set by the associated agent cluster.
When not enabled (deselected), processing does not run on a schedule. You must manually start processing for this dataset.
When not enabled, all other scheduling options are unavailable. Click NEXT to continue.
Override default schedule: Specify whether to override the default schedule defined by the associated agent cluster.
-
When enabled (selected), the selections you make in the schedule on this page override the default schedule defined by the associated agent cluster.
-
When not enabled (deselected), the default schedule for the associated agent cluster takes precedence.
Only run during this time range: Define the details for when the scan runs.
-
Select the days of the week that the scan will run.
-
Define the Daily Start Time (GMT) and Daily End Time (GMT) for the scan.
The time defined is relative to the server on which you will install the agent.
At the End Time, any task currently running is stopped; items in mid-processing are allowed to complete.
-
Specify to Run once every day or Run continuously with delay interval.
If running continuously with a delay interval, the task runs repeatedly during the defined days and times and pauses between runs for the defined interval (runs every n minutes).
Limits: This option does not show if you selected "Cloud Cluster" as the agent cluster.
Blackout period (never run during this time range): (Optional) Define the details for when no new scans will run. Scans already in progress will continue until completed.
-
Select the days of the week that the scan will not run.
-
Define the Daily start time (GMT) and Daily end time (GMT) for the scan.
The time defined is relative to the server on which you will install the agent.
Click NEXT.
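The interaction between the run time range and the blackout period can be sketched as a check on whether a new scan may start at a given moment. This is a minimal sketch under stated assumptions: `Window` and `may_start_scan` are hypothetical names, time ranges are assumed not to cross midnight, and the blackout period is assumed to take precedence when the two ranges overlap.

```python
from dataclasses import dataclass
from datetime import datetime, time, timezone
from typing import Optional, Set


@dataclass
class Window:
    days: Set[int]  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start: time     # daily start time (GMT)
    end: time       # daily end time (GMT)

    def contains(self, moment: datetime) -> bool:
        # Simplification: the window does not cross midnight.
        return moment.weekday() in self.days and self.start <= moment.time() < self.end


def may_start_scan(moment: datetime, run_window: Window,
                   blackout: Optional[Window] = None) -> bool:
    """Return True if a new scan may start at `moment` (a GMT/UTC datetime)."""
    if blackout is not None and blackout.contains(moment):
        return False  # no new scans start during the blackout period
    return run_window.contains(moment)


# Run on weekdays between 01:00 and 05:00 GMT, except Wednesdays 02:00 to 03:00.
weeknights = Window(days={0, 1, 2, 3, 4}, start=time(1, 0), end=time(5, 0))
backup_slot = Window(days={2}, start=time(2, 0), end=time(3, 0))

monday = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)      # a Monday
wednesday = datetime(2024, 1, 3, 2, 30, tzinfo=timezone.utc)  # a Wednesday
saturday = datetime(2024, 1, 6, 2, 0, tzinfo=timezone.utc)    # a Saturday
```

The check covers only whether a new scan may start. As described above, scans already in progress when a blackout period begins continue until completed, whereas tasks still running at the End Time of the run range are stopped.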
-
Complete the Attributes options to associate the desired attributes and values with datasets based on this template. For more information about dataset attributes, see Dataset attributes.
-
From the Select attribute list, do one of the following:
-
Select the desired attribute from the list.
-
To create a new attribute, click the New Attribute link in the list.
In the New Attribute dialog, type a name for the new attribute and then click CREATE.
The list of values for the selected attribute populates the value list.
NOTE: Any attributes you create here are automatically saved even if you cancel out of creating the dataset.
-
-
From the Select value list, do one of the following:
-
Select the desired value from the list.
-
To create a new value, or if the selected attribute does not yet have any values, click the Add Value link in the list.
In the New Value dialog, type a name for the new value and then click CREATE.
NOTE: Any values you create here are automatically saved even if you cancel out of creating the dataset.
-
-
To add more attributes, click the add icon and then select the desired attribute and corresponding value.
Repeat until all desired attribute values are selected.
To remove a selected attribute and value pair, click the associated remove icon.
-
-
Click FINISH.
The new dataset template is created and can be selected when creating a new dataset.
-
On the Dataset Templates page, click the name of the dataset template you want to edit.
TIP: You can also click or hover over the row for the dataset template and then click the edit icon.
The Edit Dataset Template dialog opens.
-
Make the desired changes and then click FINISH.
The changes to the dataset template are saved. Changes apply to datasets based on this template going forward and do not affect previously created datasets based on this template.
-
On the Dataset Templates page, click or hover over the dataset template you want to delete.
Additional icons display in the right column.
-
Click the delete icon associated with the desired dataset template.
-
In the confirmation dialog, click YES to confirm the action.
The dataset template is deleted. Previously created datasets based on this template are not affected.