Data deletion

From time to time, data items may be deleted from the source. This may be because the items were deleted at the source location itself or deleted from the source through a workbook action within OpenText Core Data Discovery & Risk Insights.

OpenText Core Data Discovery & Risk Insights tracks file deletions made at the original source location when a processing job runs against a dataset. A job run occurs whenever a dataset is updated, either on a schedule or manually from the Manage Datasets page in Connect (click the inline update icon for the dataset or the Update button in the dataset detail pane).

When the workbook Delete action is enabled for the workspace, you can use it to delete unstructured data items from the source location. Items deleted in this manner are removed from both the source location and the OpenText Core Data Discovery & Risk Insights index as soon as processing for the delete action completes.

You cannot delete items from a Content Manager source location using the workbook Delete action. Deleting from the Content Manager original source is not supported in OpenText Core Data Discovery & Risk Insights by any method.

File systems

OpenText Core Data Discovery & Risk Insights tracks file deletions at the source location by directly comparing with the original file system location identified by the dataset path. Items are removed from the application index seven days after the deletion from the source location is detected. If an item within a container file (such as ZIP) is deleted in the original file system location, the item is removed from the index as part of updating the container file when the job run occurs. In this case, the item may be removed from the index sooner than seven days after deletion is detected.
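As a rough illustration of the timing described above, the following Python sketch computes the date on which a standalone file would leave the index. The helper name and the GRACE_PERIOD constant are hypothetical. Only the seven-day delay comes from the text, and this is not product code. The clock starts when a job run detects the deletion, not when the file was actually deleted at the source.

```python
from datetime import date, timedelta

# Seven-day delay between detection and index removal, as stated in the text.
GRACE_PERIOD = timedelta(days=7)


def index_removal_date(deletion_detected_on: date) -> date:
    """Return the date a standalone file is removed from the index.

    Hypothetical helper: 'deletion_detected_on' is the date of the job run
    that noticed the file was missing from the dataset path.
    """
    return deletion_detected_on + GRACE_PERIOD


print(index_removal_date(date(2024, 3, 1)))  # 2024-03-08
```

Items inside container files are not covered by this sketch, because they can be removed sooner when the container itself is updated.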

Exchange

Deletions made in Exchange are not detected. OpenText Core Data Discovery & Risk Insights retains items it has already processed until a delete action is initiated from the application.

SharePoint

OpenText Core Data Discovery & Risk Insights tracks the deletion of managed SharePoint items made at the original source location using the SharePoint change logs. Each time processing is run on a dataset—on a schedule, or on demand—OpenText Core Data Discovery & Risk Insights checks the SharePoint change logs for deleted items. For each managed item that is deleted in SharePoint, that item is deleted from OpenText Core Data Discovery & Risk Insights. If an item within a container file (such as ZIP) is deleted in SharePoint, the item is removed from the application as part of updating the container file when the job run occurs.
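The change-log approach described above can be pictured with a minimal Python sketch. The ChangeLogEntry type, its field names, and the "delete" value are assumptions made for illustration; they do not reflect the SharePoint change log schema or the application's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeLogEntry:
    """Hypothetical, simplified change log record."""
    item_id: str
    change_type: str  # illustrative values: "add", "update", "delete"


def apply_deletions(index: set[str], change_log: list[ChangeLogEntry]) -> set[str]:
    """Return the index without the items the change log reports as deleted.

    Entries for items that are not in the index (unmanaged items) have no effect.
    """
    deleted = {e.item_id for e in change_log if e.change_type == "delete"}
    return index - deleted


index = {"a.docx", "b.xlsx", "c.pdf"}
log = [
    ChangeLogEntry("b.xlsx", "delete"),
    ChangeLogEntry("c.pdf", "update"),
    ChangeLogEntry("z.txt", "delete"),  # never indexed, so ignored
]
print(sorted(apply_deletions(index, log)))  # ['a.docx', 'c.pdf']
```

The sketch also shows why the log matters: if the delete entry for b.xlsx had been purged before the job ran, nothing would signal that the item should leave the index.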

To track items deleted from SharePoint accurately, make sure that the SharePoint datasets in OpenText Core Data Discovery & Risk Insights are updated at intervals shorter than the number of days that SharePoint change logs are kept. OpenText Core Data Discovery & Risk Insights uses information from the SharePoint change logs to identify deleted SharePoint items to be removed from the application. Without this information, items deleted from your SharePoint environment cannot be removed from the application. SharePoint items that have been added or modified are still updated appropriately in OpenText Core Data Discovery & Risk Insights.

For example, if your SharePoint change logs are configured to be stored for 60 days, verify that your SharePoint datasets are updated at least every 59 days.

CAUTION: Failure to rescan SharePoint datasets before SharePoint change logs are purged results in items being tracked incorrectly in OpenText Core Data Discovery & Risk Insights. Using the same example, if your SharePoint change logs are configured to be stored for 60 days and your SharePoint datasets are updated every 90 days, you lose 30 days of information about deleted items. Items deleted during this 30-day time frame are not removed from the application.

The loss of information cannot be reconciled in OpenText Core Data Discovery & Risk Insights; you would have to create a new dataset and start over.
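The retention arithmetic in the example above can be checked with a short Python sketch. Both function names are hypothetical and the code only restates the guidance in the text: the update interval must be strictly shorter than the change log retention period.

```python
def is_schedule_safe(retention_days: int, update_interval_days: int) -> bool:
    """True when datasets are updated more often than change logs are purged."""
    return update_interval_days < retention_days


def untracked_deletion_days(retention_days: int, update_interval_days: int) -> int:
    """Days per update cycle whose deletion records are purged before a rescan."""
    return max(0, update_interval_days - retention_days)


# Change logs kept for 60 days, datasets updated every 59 days.
print(is_schedule_safe(60, 59), untracked_deletion_days(60, 59))  # True 0

# Change logs kept for 60 days, datasets updated every 90 days.
print(is_schedule_safe(60, 90), untracked_deletion_days(60, 90))  # False 30
```

The same check applies to any source whose deletion records expire, with the retention value replaced by that source's own purge interval.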

Content Manager

OpenText Core Data Discovery & Risk Insights tracks the deletion of managed Content Manager items at the original source location using the Content Manager delete events. Each time processing is run on a dataset—on a schedule, or on demand—the application checks the delete events. For each managed item that is deleted in Content Manager, the application deletes that item from the index. If an item within a container file (such as ZIP) is deleted from Content Manager, the item is removed from the index as part of updating the container file when the job run occurs.

To track items deleted from Content Manager accurately, make sure that the Content Manager datasets in OpenText Core Data Discovery & Risk Insights are updated more often than the Content Manager administrator purges delete events. For example, if your Content Manager administrator purges delete events every 60 days, verify that your Content Manager datasets are updated at least every 59 days.

You cannot delete items from a Content Manager source location using the workbook Delete action. Deleting from the Content Manager original source is not supported in OpenText Core Data Discovery & Risk Insights by any method.

Google Drive

OpenText Core Data Discovery & Risk Insights tracks the deletion of managed Google Drive items at the original source location using the change log for the Google Drive defined by the source in Connect. Each time processing is run on a dataset (on a schedule or on demand), the application checks the change log for deleted items. For each managed item that is deleted directly from Google Drive, OpenText Core Data Discovery & Risk Insights deletes that item from its index. If an item within a container file (such as ZIP) is deleted in Google Drive, the item is removed from the index as part of updating the container file when the job run occurs.

To track items deleted from Google Drive accurately, make sure that the Google Drive datasets in Connect are updated at intervals shorter than the number of days that Google Drive change logs are kept. For example, the default retention for change logs is 30 days, so verify that your Google Drive datasets are updated at least every 29 days.

Extended ECM

Deletions made in Extended ECM are not detected. OpenText Core Data Discovery & Risk Insights retains items it has already processed until a delete action is initiated from the application.

Documentum

OpenText Core Data Discovery & Risk Insights tracks deletions of managed Documentum items using the Documentum audit trail. The application uses the dm_destroy event from the Documentum audit trail to identify deleted Documentum items to be removed from the application. Without this information, items deleted from your Documentum environment cannot be removed from the application. Documentum items that have been added or modified are still updated appropriately in OpenText Core Data Discovery & Risk Insights.

For example, if your Documentum audit trail is configured to be purged after 90 days, verify that your Documentum datasets are updated at least every 89 days.

Failure to rescan Documentum datasets before the Documentum audit trail is purged results in items being tracked incorrectly in OpenText Core Data Discovery & Risk Insights. For instance, if your Documentum audit trail is configured to be purged after 60 days and your Documentum datasets are updated every 90 days, you lose 30 days of information about deleted items. Items deleted during this 30-day time frame are not removed from the application.

The loss of information cannot be reconciled in OpenText Core Data Discovery & Risk Insights; you would have to create a new dataset and start over.