Microsoft Graph Data Connect for SharePoint Blog

Book of Archimedes, Copilot and SharePoint Agents

Jose_Barreto
Microsoft
Apr 17, 2025

How to use the MGDC for SharePoint blogs with your favorite AI tools

 

1. Introduction

 

The Book of Archimedes is a single PDF that contains all posts in the Microsoft Graph Data Connect for SharePoint blog: 40 articles and nearly 400 pages. Having everything in one file makes it easy to search for specific keywords across all the articles.

The PDF is provided as an attachment to the main MGDC for SharePoint article at https://aka.ms/SharePointData. You can also download it directly from Book of Archimedes.

 

2. Using the Book with Copilot

 

Beyond searching inside the PDF, you can feed the Book of Archimedes as grounding to your favorite AI tool to get answers to your questions about MGDC for SharePoint. Let's look at an example.

You can use this with Copilot or any AI tool that accepts attached files. In my case, I used https://copilot.cloud.microsoft/ to access my work Copilot, the secure and compliant Copilot integrated with my enterprise account.

 

 

I started with the “+” button and then the “add content” button.

 

 

Then I used the up-arrow icon to “upload from this device” and selected the local copy of the Book of Archimedes I downloaded earlier.

 

 

After that, I added my question as a prompt. I tried “What is Project Archimedes?”

 

 

I got a summary of Microsoft Graph Data Connect for SharePoint, including key scenarios. From there, I could ask any question. If the book contains the answer, Copilot will find it for you.

 

Project Archimedes is the early codename for Microsoft Graph Data Connect for SharePoint. This project aims to deliver rich data assets to OneDrive and SharePoint tenants, enabling them to run their own analytics, derive insights from their data, and understand how they use these products. The data is transferred to an Azure account owned by the tenant, where tools like Azure Synapse, Power BI, or Microsoft Fabric can be used to transform this data into insightful reports and dashboards [1].

Key Scenarios of Project Archimedes:

  1. Security: Understanding permissions in SharePoint and OneDrive tenants to ensure content is properly protected.
  2. Capacity: Understanding site lifecycle, ownership, and storage used by SharePoint sites and OneDrives.
  3. Sync Health: Ensuring users are properly using OneDrive for Business to protect their files by synchronizing them with the cloud [1].

Datasets Provided:

  • Sites: Information about SharePoint sites.
  • Groups: Details about SharePoint groups.
  • Permissions: Information about sharing permissions.
  • Files: Data about files stored in SharePoint and OneDrive.
  • File Actions: Details about actions performed on files.
  • Sync Health: Information about devices running OneDrive Sync.
  • Sync Errors: Details about errors encountered by OneDrive Sync clients [1].

Example Use Cases:

  • Security Scenario: Answering questions about oversharing, external sharing, sensitive data sharing, and permissions.
  • Capacity Scenario: Tracking storage usage, site lifecycle, and identifying large or inactive sites.
  • Sync Health Scenario: Monitoring device health, folder backup status, and sync errors [1].

For more detailed information, you can refer to the Book of Archimedes which contains comprehensive guides, FAQs, and step-by-step instructions related to Microsoft Graph Data Connect for SharePoint [1].

[1]: Book of Archimedes.pdf

 

The more impressive capabilities of Copilot are shown when you ask for something that is not just a direct quote from the Book. For instance, I asked about pricing for a specific scenario: "If a tenant has 1,000 SharePoint sites, how much would it cost to get the SharePoint Sites and SharePoint permissions for that tenant using MGDC?" 

 

To estimate the cost of getting the SharePoint Sites and SharePoint Permissions datasets for a tenant using Microsoft Graph Data Connect (MGDC), you can follow these guidelines:

Pricing Information:

  • SharePoint Sites and Permissions: The price to pull 1,000 objects from the available MGDC for SharePoint datasets (Sites, Groups, and Permissions) in the US is $0.75, plus the cost for infrastructure like Azure Storage, Azure Data Factory, or Azure Synapse [1].
  • SharePoint Files: This dataset is billed at $0.75 per 50,000 objects due to its typical high volume [1].

Estimating Costs:

  1. Number of Objects:
    • Sites: The number of objects in your SharePoint Sites dataset will be equal to the number of sites in your tenant.
    • Permissions: The average number of SharePoint permissions per site is around 61 [1].
  2. Cost Calculation:

Example Calculation:

For a tenant with 1,000 sites:

  • Sites Dataset Cost: $0.75
  • Permissions Dataset Cost: $45.75
  • Total Cost: $0.75 + $45.75 = $46.50

Additional Costs:

  • Azure Storage: Costs for storing the data in Azure Storage.
  • Azure Synapse: Costs for running pipelines to move the data.

For more precise estimates, you can sample the dataset to get a total object count without pulling the entire dataset [1].

[1]: Book of Archimedes.pdf
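
The calculation step in the response above is easy to reproduce by hand, which is a good way to double-check the answer. Here is a minimal Python sketch of that arithmetic. It assumes the rate and the average quoted in the response ($0.75 per 1,000 objects and 61 permissions per site) and simple pro-rata billing with no rounding; the function and variable names are my own, for illustration only.

```python
# Figures quoted in the Copilot response above (US pricing). Treat them as
# assumptions and confirm current MGDC pricing before relying on them.
RATE_USD = 0.75                # price per billing unit
OBJECTS_PER_UNIT = 1_000       # Sites, Groups and Permissions datasets
AVG_PERMISSIONS_PER_SITE = 61  # average quoted in the response


def estimate_cost(object_count: int) -> float:
    """Pro-rata cost in USD; assumes no rounding to whole billing units."""
    return object_count / OBJECTS_PER_UNIT * RATE_USD


site_count = 1_000
sites_cost = estimate_cost(site_count)
permissions_cost = estimate_cost(site_count * AVG_PERMISSIONS_PER_SITE)
total_cost = sites_cost + permissions_cost

print(f"Sites dataset:       ${sites_cost:.2f}")
print(f"Permissions dataset: ${permissions_cost:.2f}")
print(f"Total:               ${total_cost:.2f}")
```

The result matches the $46.50 total in the response. It covers only the MGDC extraction charge, not the Azure Storage or Azure Synapse costs listed under additional costs.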

 

Copilot will also answer questions in other languages. Just ask the question in whatever language you want, and Copilot detects it. Here’s an example using Brazilian Portuguese. I asked "Quais são os datasets que fazem parte do cenário de segurança?" (in English: "Which datasets are part of the security scenario?")

 

Os datasets que fazem parte do cenário de segurança no Project Archimedes são:

  1. Sites: Informações sobre os sites do SharePoint.
  2. Permissions: Informações sobre as permissões de compartilhamento.
  3. Groups: Detalhes sobre os grupos do SharePoint.

Esses datasets ajudam a entender se o conteúdo está devidamente protegido, respondendo a perguntas como:

  • Está ocorrendo compartilhamento excessivo?
  • Está ocorrendo compartilhamento externo?
  • Dados sensíveis estão sendo compartilhados?
  • Quanto compartilhamento ocorre por rótulo de sensibilidade?
  • Dados sensíveis são compartilhados com usuários externos?
  • Quais domínios externos estão sendo compartilhados?
  • Quais sites foram mais compartilhados?
  • Quais níveis de compartilhamento estão sendo usados?
  • Quais permissões um usuário específico possui?
  • Quais extensões de arquivo são mais compartilhadas?
  • Quanto compartilhamento ocorre no nível de Web, Pasta, Lista ou Arquivo?

Esses datasets são fundamentais para garantir a segurança e a conformidade dos dados dentro do ambiente do SharePoint e OneDrive [1].

[1]: Book of Archimedes.pdf

 

3. Using the Book to create a SharePoint agent

 

You can also use the book to create a SharePoint agent. This will make it easy to ask questions about Microsoft Graph Data Connect for SharePoint without having to upload the PDF every time.

To start, upload the Book of Archimedes to a SharePoint document library.

 

 

Next, select the file and use the option to “create an agent”.

 

 

That’s it. Your agent is created, and you can start using it immediately.

 

 

The agent shows up in the document library as a file with the “.agent” extension. You can simply open that file in SharePoint to run the agent.

 

 

Once you open the agent, it offers you a text box to ask questions.

 

 

Here is the agent response to “What are the main questions about SharePoint capacity that MGDC for SharePoint can answer?”

 

The Book of Archimedes provides detailed answers to several key questions about SharePoint capacity using Microsoft Graph Data Connect (MGDC). Here are the main questions it addresses:

Key Questions about SharePoint Capacity:

  1. How much storage is being used by SharePoint sites?
    • This includes understanding the total size of the latest versions of files, previous versions, metadata size, and deleted files in the recycle bin [1].
  2. What are the different storage sizes reported for a SharePoint site?
    • Storage Used: Includes all storage, metadata, previous versions, and recycle bins.
    • Total Size: Includes metadata and previous versions but excludes recycle bins.
    • Metadata Size: Storage used by file metadata.
    • Total File Stream Size: Size of the latest versions of files.
    • Previous Versions Size: Size of previous versions of files.
    • Recycle Bin Size: Size of deleted files in the recycle bins [1].
  3. How can administrators check the amount of storage being used by SharePoint sites?
    • SharePoint client UX: Viewing storage while using SharePoint as a site user or Site Collection administrator.
    • PowerShell cmdlets: Using command-line tools to query site storage information.
    • SharePoint Admin Center reports: Viewing usage reports in the Admin Center.
    • Microsoft Graph Data Connect: Pulling detailed datasets for SharePoint sites and OneDrives [1].
  4. How can administrators create test sites to understand storage reporting tools?
    • Creating different types of test sites (empty site, site with files, site with deleted files, site with previous versions) and analyzing how storage sizes are reported [1].
  5. How can administrators estimate Azure costs for MGDC usage?
    • Estimating the number of objects (sites, groups, permissions, files) and understanding the billing rates for different datasets [1].
  6. How can administrators use MGDC to run detailed analysis of SharePoint and OneDrive storage metrics?
    • Using detailed data in MGDC to analyze storage metrics and understand the impact of metadata, versions, and deleted files [1].

Tools and Templates:

  • Capacity Template: Includes instructions, ARM template, Azure Synapse pipeline template, and Power BI template to help administrators use MGDC datasets for capacity analysis [1].

Conclusion:

The Book of Archimedes provides comprehensive guidance on leveraging MGDC to gather, store, and analyze SharePoint data, ensuring effective data management and security. It is a valuable resource for administrators and developers working with SharePoint and OneDrive data using MGDC [1].

Would you like more specific information on any of these questions or tools?

[1]: Book of Archimedes
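
The storage sizes in the agent's answer build on each other, and a small sketch can make that relationship concrete. The Python below is only an illustration under two assumptions: that the sizes are strictly additive, as the descriptions in the answer suggest, and that the field names (which I made up for this example) stand in for whatever columns the actual Sites dataset uses. The byte counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteStorage:
    """Hypothetical per-site storage figures, in bytes."""
    total_file_stream_size: int   # latest versions of files
    previous_versions_size: int   # older versions of files
    metadata_size: int            # file metadata
    recycle_bin_size: int         # deleted files still in the recycle bins

    @property
    def total_size(self) -> int:
        # Latest versions + previous versions + metadata, excluding recycle bins.
        return (self.total_file_stream_size
                + self.previous_versions_size
                + self.metadata_size)

    @property
    def storage_used(self) -> int:
        # Everything, including the recycle bins.
        return self.total_size + self.recycle_bin_size


site = SiteStorage(
    total_file_stream_size=800,
    previous_versions_size=150,
    metadata_size=50,
    recycle_bin_size=100,
)
print(site.total_size, site.storage_used)
```

Comparing these two derived values against what a real site reports is one way to check the agent's description, in the same spirit as the test sites mentioned in the answer.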

 

As with Copilot, you can ask any question on a topic addressed in the book, in English or in other languages.

You can experiment with editing the agent to rename it, change the description, add more sources, adjust its behavior, offer sample starter prompts, include specific instructions or even customize it with Copilot Studio. Just use the option to "edit the agent". You can also share the agent just like you share any file in SharePoint.

You can learn more about SharePoint agents in this YouTube video by Vesa Juvonen: https://www.youtube.com/watch?v=pYvK76SVEv4

 

4. Conclusion

 

I hope you enjoy using the Book of Archimedes as grounding for your AI tools.

As usual, remember that AI tools may make mistakes and you should always double-check the answers you receive.

There is also a chance that a particular AI tool might not be able to handle a document as large as the Book of Archimedes, which currently weighs in at around 10 MB.

Updated Jun 12, 2025
Version 5.0