Microsoft Graph Data Connect for SharePoint Blog

Links about Microsoft Graph Data Connect for SharePoint

Jose_Barreto
Microsoft
Feb 27, 2024

Introduction

Microsoft Graph Data Connect for SharePoint delivers rich data assets to OneDrive and SharePoint tenants, so they can run their own analytics, derive insights from their data and understand how they use these products. The data is transferred to an Azure account owned by the tenant, where they can use tools like Azure Synapse, Power BI or Microsoft Fabric to transform this data into insightful reports and dashboards. Microsoft Graph Data Connect for SharePoint is also known by its early codename "Project Archimedes".

The three main scenarios are Security (information oversharing, external sharing), Capacity (understanding site lifecycle and storage) and Sync Health (OneDrive Sync, device health, folder backup). SharePoint currently offers 3 datasets via Microsoft Graph Data Connect: Sites, Groups and Permissions. This solution is available in 15 Microsoft 365 regions. 

Security Scenario

The Security Scenario focuses on understanding the permissions in your SharePoint and OneDrive tenant. By looking at the Sites, Permissions and Groups datasets, you can understand if your content is properly protected.

Here are a few common questions that you will be able to answer:

  • Is oversharing happening?
  • Is external sharing happening?
  • Is sensitive data being shared?
  • How much sharing is there per sensitivity label?
  • Is sensitive data shared with external users?
  • What external domains are being shared with?
  • Which sites have been shared the most?
  • What roles (levels of sharing) are being used?
  • What permissions does a specific user have?
  • What file extensions are most shared?
  • How much sharing happens at the Web, Folder, List or File level?
  • A combination of any of the questions above…
Sample Power BI dashboards for Security
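
As a rough illustration of how one of the questions above ("What external domains are being shared with?") could be answered once the data has landed in your Azure storage, here is a minimal Python sketch. The row shape (a `SharedWith` list with `Email` entries), the sample rows and the internal domain list are assumptions made for this example, not the documented schema. Check the Permissions dataset schema for the actual column names.

```python
from collections import Counter

# Hypothetical, simplified permission rows. Real rows come from the
# Permissions dataset and use the column names in its published schema.
SAMPLE_PERMISSIONS = [
    {"ItemURL": "sites/hr/doc1.docx", "SharedWith": [{"Email": "ana@contoso.com"}]},
    {"ItemURL": "sites/hr/doc2.docx", "SharedWith": [{"Email": "bob@fabrikam.com"}]},
    {"ItemURL": "sites/it/plan.xlsx", "SharedWith": [{"Email": "eve@fabrikam.com"},
                                                      {"Email": "joe@Adatum.com"}]},
]


def external_domain_counts(rows, internal_domains):
    """Count sharing recipients per email domain, skipping internal domains."""
    internal = {d.lower() for d in internal_domains}
    counts = Counter()
    for row in rows:
        for recipient in row.get("SharedWith", []):
            email = recipient.get("Email") or ""
            if "@" not in email:
                continue  # entries without an email address are ignored here
            domain = email.rsplit("@", 1)[1].lower()
            if domain not in internal:
                counts[domain] += 1
    return counts


if __name__ == "__main__":
    counts = external_domain_counts(SAMPLE_PERMISSIONS, ["contoso.com"])
    for domain, total in counts.most_common():
        print(domain, total)
```

In practice you would more likely do this kind of aggregation in Azure Synapse or Power BI. The sketch is only meant to show the shape of the question.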

Capacity Scenario

The Capacity Scenario focuses on understanding site lifecycle, ownership and storage used by your SharePoint sites and OneDrives. By looking at the Sites and Groups datasets, you can understand how much content you have and how it is being used.

Here are a few common questions that you will be able to answer:

  • What SharePoint sites are the largest?
  • What type of site uses the most storage?
  • What’s the current storage for sensitive sites?
  • How much is used by previous versions?
  • Which sites were updated in the last few months?
  • Which sites have just one owner?
  • How many sites were created over 2 years ago?
  • How many sites haven’t changed in 1 year?
  • How many sites have over 1TB of files?
  • A combination of any of the questions above…
Sample Power BI dashboards for Capacity
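
To make a couple of the questions above more concrete ("How many sites have over 1TB of files?" and "How many sites haven't changed in 1 year?"), here is a minimal Python sketch. The flat field names (`Url`, `TotalSize`, `LastModified`), the assumption that sizes are in bytes and the sample rows are all hypothetical. The real Sites dataset schema may name or nest these values differently.

```python
from datetime import date, timedelta

TB = 1024 ** 4  # assumes sizes are reported in bytes

# Hypothetical, simplified site rows. Real rows come from the Sites dataset.
SAMPLE_SITES = [
    {"Url": "https://contoso.sharepoint.com/sites/hr",
     "TotalSize": 3 * TB, "LastModified": date(2024, 1, 10)},
    {"Url": "https://contoso.sharepoint.com/sites/archive",
     "TotalSize": 2 * TB, "LastModified": date(2021, 6, 1)},
    {"Url": "https://contoso.sharepoint.com/sites/it",
     "TotalSize": 200 * 1024 ** 3, "LastModified": date(2022, 12, 15)},
]


def sites_over(rows, threshold_bytes):
    """Return sites larger than the threshold, largest first."""
    big = [r for r in rows if r["TotalSize"] > threshold_bytes]
    return sorted(big, key=lambda r: r["TotalSize"], reverse=True)


def sites_unchanged_since(rows, as_of, days=365):
    """Return sites whose last modification is older than `days` before `as_of`."""
    cutoff = as_of - timedelta(days=days)
    return [r for r in rows if r["LastModified"] < cutoff]


if __name__ == "__main__":
    for site in sites_over(SAMPLE_SITES, TB):
        print("Over 1TB:", site["Url"])
    for site in sites_unchanged_since(SAMPLE_SITES, date(2024, 2, 27)):
        print("Unchanged for a year:", site["Url"])
```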

Sync Health Scenario

The OneDrive Sync Health scenario is about understanding if users are properly using OneDrive for Business to protect their files by synchronizing them with the cloud. It includes details about the devices configured with a OneDrive Sync client, including whether they are backing up important folders and if there are any errors.

Here are a few common questions that you will be able to answer:

  • How many devices are healthy?
  • How many devices have opted in for Folder Backup?
  • Which Folders are most selected for Folder Backup?
  • What is the breakdown of unhealthy devices by OS version?
  • What is the breakdown of unhealthy devices by OneDrive Sync client version?
  • Is the device for user X reporting as healthy?
  • How many devices are showing errors?
  • Which types of errors are making most devices unhealthy?
  • Which devices are showing a specific error?
  • What are the errors occurring on a specific device?
OneDrive Sync Health Scenario
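
The breakdown questions above (unhealthy devices by OS version or by OneDrive Sync client version) boil down to a filter followed by a group-and-count. Here is a minimal Python sketch of that idea. The field names (`DeviceId`, `OSVersion`, `SyncVersion`, `IsHealthy`) and the sample rows are hypothetical, since the Sync Health dataset schema is not described in this post.

```python
from collections import Counter

# Hypothetical, simplified device rows. Field names are assumptions.
SAMPLE_DEVICES = [
    {"DeviceId": "d1", "OSVersion": "10.0.19045", "SyncVersion": "24.020", "IsHealthy": True},
    {"DeviceId": "d2", "OSVersion": "10.0.19045", "SyncVersion": "23.246", "IsHealthy": False},
    {"DeviceId": "d3", "OSVersion": "10.0.22631", "SyncVersion": "23.246", "IsHealthy": False},
    {"DeviceId": "d4", "OSVersion": "10.0.22631", "SyncVersion": "24.020", "IsHealthy": False},
]


def unhealthy_by(rows, field):
    """Count unhealthy devices grouped by the given field."""
    return Counter(r[field] for r in rows if not r["IsHealthy"])


if __name__ == "__main__":
    print(unhealthy_by(SAMPLE_DEVICES, "OSVersion"))
    print(unhealthy_by(SAMPLE_DEVICES, "SyncVersion"))
```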

Links to resources

Here are some useful links related to the OneDrive and SharePoint data available via Microsoft Graph Data Connect. This is meant as a convenient set of pointers to specific and relevant content, which you can bookmark for future reference.

Step-by-step guides

Dashboard with Storage Used by Site Type
Sample Dashboard from the Information Oversharing Template

SharePoint Datasets and Schemas

  • Dataset description: SharePoint Sites
    Dataset name: BasicDataSet_v0.SharePointSites_v1
    Link to dataset schema: data-connect-dataset-sharepointsites.md
    Old dataset name (deprecated): SharePointSitesDataset_v0_Preview

  • Dataset description: SharePoint Groups
    Dataset name: BasicDataSet_v0.SharePointGroups_v1
    Link to dataset schema: data-connect-dataset-sharepointgroups.md
    Old dataset name (deprecated): SharePointGroupsDataset_v0_Preview

  • Dataset description: SharePoint Sharing Permissions
    Dataset name: BasicDataSet_v0.SharePointPermissions_v1
    Link to dataset schema: data-connect-dataset-sharepointpermissions.md
    Old dataset name (deprecated): DocumentSharingDataset_v0_Preview

  • Dataset description: SharePoint Files
    Dataset name: BasicDataSet_v0.SharePointFiles_v1
    Link to dataset schema: data-connect-dataset-sharepointfiles.md 
    Note: Coming soon (not yet available publicly)

  • Dataset description: SharePoint File Actions
    Dataset name: BasicDataSet_v0.SharePointFileActions_v1
    Link to dataset schema: data-connect-dataset-sharepointfileactions.md 
    Note: Coming soon (not yet available publicly)

  • Dataset description: SharePoint Sync Health
    Dataset name: BasicDataSet_v0.OneDriveSyncHealth_v1
    Link to dataset schema: data-connect-dataset-onedrivesynchealth.md 
    Note: Coming soon (not yet available publicly)

  • Dataset description: SharePoint Sync Errors
    Dataset name: BasicDataSet_v0.OneDriveSyncErrors_v1
    Link to dataset schema: data-connect-dataset-onedrivesyncerrors.md 
    Note: Coming soon (not yet available publicly)

You can see the official list of datasets at https://aka.ms/SharePointDatasets
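
If you have older pipelines or scripts that still reference the deprecated preview names, a small lookup table can help when updating them. The sketch below uses only the three renames listed above. The helper function name is hypothetical, and nothing here calls a Microsoft Graph Data Connect API.

```python
# Deprecated preview names mapped to the current dataset names listed above.
RENAMED_DATASETS = {
    "SharePointSitesDataset_v0_Preview": "BasicDataSet_v0.SharePointSites_v1",
    "SharePointGroupsDataset_v0_Preview": "BasicDataSet_v0.SharePointGroups_v1",
    "DocumentSharingDataset_v0_Preview": "BasicDataSet_v0.SharePointPermissions_v1",
}


def current_dataset_name(name):
    """Return the current dataset name, translating deprecated names.

    Names that are not in the rename table are returned unchanged.
    """
    return RENAMED_DATASETS.get(name, name)


if __name__ == "__main__":
    for old_name in RENAMED_DATASETS:
        print(old_name, "->", current_dataset_name(old_name))
```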

Frequently Asked Questions


Links to the Microsoft Graph Data Connect for SharePoint FAQ blog series:

Microsoft Graph Data Connect for SharePoint – Official Announcements

  • Official blog post announcing new SharePoint datasets – Blog
  • Official blog post announcing the public preview and dataset renames – Blog
  • Official blog post announcing the pricing update for SharePoint datasets – Blog

Microsoft Graph Data Connect main links

Microsoft Graph Data Connect

Partner Content on Microsoft Graph Data Connect for SharePoint

  • Title: Setup Prerequisites in Microsoft 365 & Azure for Project Archimedes & Microsoft Graph Data Connect
    By: Antonio Maio, Microsoft MVP, Managing Director and Senior Enterprise Architect with Protiviti.
    Link: https://youtu.be/ym3BGYkbXhM 

  • Title: Configure and Run an Azure Synapse Data Pipeline for Project Archimedes & Microsoft Graph Data Connect
    By: Antonio Maio, Microsoft MVP, Managing Director and Senior Enterprise Architect with Protiviti.
    Link: https://youtu.be/g4-kx5KVVNA 

  • Title: Big Data Analytics for Microsoft 365 using Microsoft Graph Data Connect and Microsoft Fabric
    By: Rakesh Chenchery, CTO at Proventeq
    Link: https://youtu.be/H2D6HxHLekQ 

Other Blogs about SharePoint on Data Connect

Presentations

Updated Nov 27, 2024
Version 46.0
  • Hi Justin!

    In general, a SharePoint feature needs to be available in all Microsoft 365 regions to be considered Generally Available (GA). MGDC for SharePoint has been working at this, going from 3 regions to 5 regions, then 9 regions and now 15 regions. We are still missing a few regions and the plan is to get there.

    For more detail, please review this blog: MGDC for SharePoint FAQ: Which regions are supported?

    Jose

  • Hi J_Justin

    Yes, the "Capacity Scenario Template" is there in the MGDC solutions GitHub waiting for the public release of the Files dataset, which is coming soon.

  • J_Justin
    Copper Contributor

    Hi Team,

    I believe these SharePoint datasets are in public preview since Jul'23. When will they become GA?

    J Justin

  • J_Justin
    Copper Contributor

    Hi Jose_Barreto

    Thank you for your prompt and accurate response to my earlier questions. Last year, Microsoft gave a hint on "Project Archimedes" in the Syntex Tech Community blog, about the arrival of data analytics of SharePoint and OneDrive content in two categories, viz., security and capacity. I believe the Information Oversharing template covers both. I can see one more "Capacity scenario template" at the GitHub location https://github.com/microsoftgraph/dataconnect-solutions/tree/main/ARMTemplates/Archimedes%20Capacity%20Scenario. Is this something that is coming soon with additional capacity-related insights?

    J Justin

  • J_Justin
    Copper Contributor

    Hi Jose_Barreto

    Based on your recent blog post, I understand that Files dataset is in private preview, with Public ETA expected in a few months. With this understanding, can we assume that "Capacity scenario template" will also be published hand-in-hand in GitHub?

    J Justin

  • Hi, J_Justin! Yes, that is the plan. Note that a lot of capacity-related questions can be answered today using only the SharePoint Sites dataset. The SharePoint Files dataset will complete the picture.

  • Vin_Cent
    Copper Contributor

    Looks like MGDC is not available in GCC High. Can someone confirm?