Microsoft Graph Data Connect for SharePoint Blog
MGDC for SharePoint FAQ: Why does the file count not match?

Jose_Barreto
Nov 16, 2024

I am frequently asked why the file count reported in the Sites dataset does not match the number of files found in the Files dataset. It is true that the two sometimes differ. Here are the counts we are talking about:

Sites: 

SELECT Id as SiteId,
       StorageMetrics.TotalFileCount AS FilesInSite
FROM Sites

Files:

SELECT SiteId,
       COUNT(*) AS FilesInSite
FROM Files
GROUP BY SiteId
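
To see the two counts side by side for each site, you could join the two queries. This is only a minimal sketch: it assumes both datasets are loaded as tables named Sites and Files, in a SQL dialect that accepts the nested StorageMetrics.TotalFileCount reference used above. The output column names and the FileCounts alias are made up for illustration.

```sql
-- Compare the per-site file count reported by the Sites dataset
-- with the number of rows per site in the Files dataset.
-- Assumption: tables are named Sites and Files, as in the queries above.
SELECT s.Id AS SiteId,
       s.StorageMetrics.TotalFileCount AS FilesInSitesDataset,
       -- Sites with no rows in Files show up as 0 instead of NULL
       COALESCE(FileCounts.FilesInSite, 0) AS FilesInFilesDataset,
       s.StorageMetrics.TotalFileCount
           - COALESCE(FileCounts.FilesInSite, 0) AS Difference
FROM Sites AS s
LEFT JOIN (
    SELECT SiteId,
           COUNT(*) AS FilesInSite
    FROM Files
    GROUP BY SiteId
) AS FileCounts
    ON FileCounts.SiteId = s.Id
ORDER BY Difference DESC
```

The LEFT JOIN keeps sites that have no rows at all in the Files dataset, so they appear with a count of 0 rather than dropping out of the result.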

 

The main reasons for the discrepancy are:

  • The Files dataset is collected weekly, while the Sites dataset is collected daily, making it difficult to capture the exact same state.
  • The Files dataset includes only items inside Document Libraries, whereas the Sites dataset counts all files, including those in other list types.
  • The Files dataset does not include pages (files ending in .ASPX), while the Sites dataset counts all files.
  • The Files dataset does not include items in the primary and secondary recycle bins, whereas the Sites dataset counts all files.

 

Here are a few examples:

  • A new site was created, and a few files were uploaded to a document library in this new site. Two days later, you get the SharePoint Sites dataset and find the new site with the right count of files. However, you cannot find these files in the SharePoint Files dataset. This is because the Files dataset may take one week to refresh. Wait a week and try pulling the Files dataset again.
  • A team site was created with a few lists, where some of the items have file attachments. The SharePoint Files dataset does not show these file attachments. This is because the SharePoint Files dataset will only show files in document libraries.
  • You deleted files from a document library in a SharePoint site a few weeks ago. Now the count of files in the SharePoint Sites dataset is larger than the count of files in the SharePoint Files dataset. That is because files in the recycle bins are excluded from the Files dataset but still counted in the Sites dataset. Once these deleted files have passed through both the primary and secondary recycle bins, the count in the Sites dataset will also reflect that reduction.

 

[Image: Archimedes comparing file counts]

 

Updated Nov 27, 2024
Version 3.0