OneDrive connection

If you will be creating sources and datasets that process OneDrive or OneDrive Online (O365) data, you must complete additional tasks to enable processing by the processing agent.

NOTE: “OneDrive” is the implementation of personal sites when using on-premises SharePoint. “OneDrive Online” is the implementation of OneDrive personal sites when using SharePoint Online.

Complete the tasks for OneDrive or OneDrive Online as appropriate for the data to be managed by OpenText Core Data Discovery & Risk Insights. If you will process data from both OneDrive and OneDrive Online, complete the tasks for both implementations.

OneDrive connection tasks

Complete the following to process data from OneDrive.

  1. Update the logon account for OpenText Core Data Discovery & Risk Insights services

OneDrive Online connection tasks

OpenText Core Data Discovery & Risk Insights requires non-user-based access to OneDrive Online using the Microsoft Entra ID access method.

Configure web proxy settings (optional)

OneDrive uses the SharePoint processor. This processor service, which is controlled by the processing agent, requires connectivity to the OpenText Core Data Discovery & Risk Insights cloud components, which are often located outside the local network that hosts the agent servers. Although direct connectivity is ideal, some environments may require a web proxy for the agent systems to reach the OpenText Core Data Discovery & Risk Insights cloud.

NOTE: Authenticated proxies are supported for OneDrive Online (O365) only.

OneDrive processing

When processing Microsoft Office items from OneDrive, OpenText Core Data Discovery & Risk Insights uses the last modified date in the de-duplication calculation. As a result, OpenText Core Data Discovery & Risk Insights does not identify documents with different modified dates as duplicates. Whether the modified date changes depends on how the Office item is added to OneDrive:

  • When an Office item is uploaded through the OneDrive site web interface, the item's modified date is changed.

  • When an Office item is added to a local system and is then synchronized to the OneDrive site, the item's modified date is not changed.
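The effect of the modified date on de-duplication can be illustrated with a minimal sketch. The actual de-duplication calculation is not described in this section; the dedup_key function, and the choice of a SHA-256 hash over the content plus the timestamp, are assumptions made only for illustration.

```python
import hashlib
from datetime import datetime


def dedup_key(content: bytes, last_modified: datetime) -> str:
    """Hypothetical de-duplication key: content combined with the modified date."""
    digest = hashlib.sha256()
    digest.update(content)
    digest.update(last_modified.isoformat().encode("utf-8"))
    return digest.hexdigest()


content = b"quarterly report"
original_date = datetime(2024, 1, 15, 9, 30)
upload_date = datetime(2024, 3, 1, 14, 0)  # a web upload changes the modified date

local_copy = dedup_key(content, original_date)
synced_copy = dedup_key(content, original_date)      # sync keeps the modified date
web_uploaded_copy = dedup_key(content, upload_date)  # upload changes the modified date

print(local_copy == synced_copy)        # True: same date, treated as duplicates
print(local_copy == web_uploaded_copy)  # False: different date, not duplicates
```

Under these assumptions, two copies with identical content are matched only when their modified dates also agree, which is why a web-uploaded copy is not identified as a duplicate of a synchronized copy.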

SharePoint Lists consist of form records, called items in OneDrive, that contain various text fields and can have attachments. When deleting OneDrive files, OpenText Core Data Discovery & Risk Insights does not delete attachments to SharePoint List items.

Item counts

The document and item counts in OpenText Core Data Discovery & Risk Insights may differ from the "item" count shown in the OneDrive site interface. The difference arises for the following reasons.

  • In OpenText Core Data Discovery & Risk Insights, a document is an original file processed by OpenText Core Data Discovery & Risk Insights and an item is an attachment to an original file. In OneDrive, an item is a row in a table or a record in a database, and a document is a type of item.

  • In OneDrive, item counts are derived from the total number of folders, documents, and items (each entry in a SharePoint Item List). In OpenText Core Data Discovery & Risk Insights, document counts are derived from the total number of documents from SharePoint Document Libraries and attachments in a SharePoint Item List. OpenText Core Data Discovery & Risk Insights does not process the fields, or entries, in an Item List; it processes only the attachments from the Item List.

    For example, if a SharePoint Item has zero attachments, OneDrive records this as one item. If a SharePoint Item has 10 attachments, OneDrive also records this as one item.

  • The item count listed on the OneDrive Site Contents page for libraries includes all items in the library, including folders. Folders that contain files in OneDrive are not included in the counts in OpenText Core Data Discovery & Risk Insights.

  • When processing OneDrive content, OpenText Core Data Discovery & Risk Insights does not process library items that include the UIVersion field. These SharePoint items are SharePoint UI elements and are skipped. For example, the Form Templates and List Template Gallery library items are UI elements and therefore not processed by OpenText Core Data Discovery & Risk Insights. However, when viewing the OneDrive Site Contents page, these items are included in the item count.
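The counting differences described above can be worked through with a small sketch. It is a simplified model, not the product's implementation: the ListItem class, both functions, and the sample numbers are hypothetical, and the model assumes a single count that combines folders, documents, list entries, and UI element items.

```python
from dataclasses import dataclass


@dataclass
class ListItem:
    """Hypothetical model of one entry in a SharePoint Item List."""
    attachments: int


def onedrive_item_count(folders, documents, list_items, ui_element_items):
    # OneDrive counts folders, documents, each list entry (regardless of how
    # many attachments it has), and UI element items.
    return folders + documents + len(list_items) + ui_element_items


def product_document_count(documents, list_items):
    # The application counts library documents plus list attachments only;
    # folders, list entries themselves, and UI elements are not counted.
    return documents + sum(item.attachments for item in list_items)


list_items = [ListItem(attachments=0), ListItem(attachments=10)]

onedrive_total = onedrive_item_count(
    folders=3, documents=20, list_items=list_items, ui_element_items=2
)
product_total = product_document_count(documents=20, list_items=list_items)

print(onedrive_total)  # 27 = 3 folders + 20 documents + 2 list entries + 2 UI items
print(product_total)   # 30 = 20 documents + 10 attachments
```

In this made-up example the two totals differ in both directions: OneDrive counts folders, list entries, and UI elements that the application skips, while the application counts each attachment that OneDrive folds into a single list entry.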

Deletion tracking

OpenText Core Data Discovery & Risk Insights uses the SharePoint change logs to track deletions of managed OneDrive items at the original source location. Each time processing is run on a dataset, whether on a schedule or on demand, OpenText Core Data Discovery & Risk Insights checks the SharePoint change logs for deleted items. Each managed item that has been deleted in OneDrive is also deleted from OpenText Core Data Discovery & Risk Insights. If an item within a container file (such as a ZIP file) is deleted in OneDrive, the item is removed from the application when the container file is updated during the job run.

To track items deleted from OneDrive accurately, ensure that the OneDrive datasets in OpenText Core Data Discovery & Risk Insights are updated more often than the maximum number of days that SharePoint change logs are kept. OpenText Core Data Discovery & Risk Insights uses information from the SharePoint change logs to identify deleted OneDrive items to be removed from the application. Without this information, items deleted from your SharePoint environment cannot be removed from the application. OneDrive items that have been added or modified are updated appropriately in OpenText Core Data Discovery & Risk Insights.

For example, if your SharePoint change logs are configured to be stored for 60 days, verify that your OneDrive datasets are updated at least every 59 days.

CAUTION: Failure to rescan OneDrive datasets before SharePoint change logs are purged will result in items being tracked incorrectly in OpenText Core Data Discovery & Risk Insights. Using the same example, if your SharePoint change logs are configured to be stored for 60 days and your OneDrive datasets are updated every 90 days, you will lose 30 days of important information about deleted items: items deleted during this 30-day time frame will not be removed from the application.

The loss of information cannot be reconciled in OpenText Core Data Discovery & Risk Insights; you would have to create a new dataset and start over.
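The arithmetic behind the 60-day examples can be expressed as a short check. This is an illustrative sketch only; the function names are hypothetical and are not part of the product, and the retention and update values must come from your own SharePoint and dataset settings.

```python
def is_update_interval_safe(retention_days: int, update_interval_days: int) -> bool:
    """True when datasets are updated more often than change logs are kept."""
    return update_interval_days < retention_days


def untracked_deletion_days(retention_days: int, update_interval_days: int) -> int:
    """Days of deletion information lost between two dataset updates."""
    return max(0, update_interval_days - retention_days)


# Change logs kept for 60 days, datasets updated every 59 days.
print(is_update_interval_safe(60, 59))   # True
print(untracked_deletion_days(60, 59))   # 0

# Change logs kept for 60 days, datasets updated every 90 days.
print(is_update_interval_safe(60, 90))   # False
print(untracked_deletion_days(60, 90))   # 30
```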

Capture owner information from OneDrive Online (optional)

If you are processing documents from OneDrive Online (O365) datasets using Microsoft Entra ID authentication, you can choose to capture document owner information. For OneDrive, the user is the owner.