Forum Discussion
Analyze Metadata for All Unstructured Files in SharePoint and OneDrive Using Microsoft Services
Hi everyone,
I’m trying to find a reliable way to collect and analyze metadata for all unstructured files across our Microsoft 365 tenant — specifically in SharePoint Online and OneDrive for Business.
The metadata we’re interested in includes:
- File creation date
- File size
- File type or extension
The goal is to get a comprehensive view across our entire Microsoft tenant so that we can implement retention policies and labels accordingly.
Ideally, we’d like to do this using Microsoft Purview or any other Microsoft-native service or tool. We’re open to using Graph API, Power BI, Defender for Cloud Apps, or other reporting/export tools if necessary.
Has anyone done this successfully? If so, what was your approach, and what tools or scripts did you use?
Any advice or recommendations would be really appreciated.
Thanks in advance.
1 Reply
- BrianStephen
Microsoft
You could try this:
1. Use Content Search to query on each of those properties and generate statistics.
Example query:
(created>2025-01-01)(filetype=docx)(size>1)
https://learn.microsoft.com/en-us/purview/ediscovery-content-search
You can then export a report of the items it found.
2. Run auto-labeling policies, based on your conditions, in simulation mode to see what they would detect.
https://learn.microsoft.com/en-us/purview/apply-retention-labels-automatically
3. Use Microsoft Graph to programmatically enumerate files and extract metadata like:
- name – file name
- size – file size (in bytes)
- file.mimeType – MIME type (e.g., application/pdf)
- createdDateTime – when the file was created
- lastModifiedDateTime – when the file was last modified
- createdBy, lastModifiedBy – user info
- webUrl – file URL (handy for reference)
https://learn.microsoft.com/en-us/graph/overview
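As a rough illustration of option 3, here is a minimal Python sketch that walks one drive through the Graph `children` endpoints, follows `@odata.nextLink` paging, and yields the properties listed above. The function name `iter_drive_files` and the injected `get_json` callable are hypothetical, made up for this example. It assumes you already have a drive ID and some authenticated way to GET a Graph URL and parse the JSON; token acquisition, throttling, and error handling are left out.

```python
import os
from typing import Any, Callable, Dict, Iterator

GRAPH = "https://graph.microsoft.com/v1.0"
# Only request the driveItem properties we actually report on.
FIELDS = ("id,name,size,file,folder,createdDateTime,lastModifiedDateTime,"
          "createdBy,lastModifiedBy,webUrl")


def iter_drive_files(get_json: Callable[[str], Dict[str, Any]],
                     drive_id: str) -> Iterator[Dict[str, Any]]:
    """Yield one metadata record per file in a drive.

    get_json is a hypothetical helper supplied by the caller: it takes a
    Graph URL and returns the decoded JSON body (authentication included).
    """
    pending = [f"{GRAPH}/drives/{drive_id}/root/children?$select={FIELDS}"]
    while pending:
        url = pending.pop()
        while url:
            page = get_json(url)
            for item in page.get("value", []):
                if "folder" in item:
                    # Folders carry a 'folder' facet; queue their children.
                    pending.append(
                        f"{GRAPH}/drives/{drive_id}/items/{item['id']}"
                        f"/children?$select={FIELDS}")
                elif "file" in item:
                    name = item.get("name", "")
                    yield {
                        "name": name,
                        "extension": os.path.splitext(name)[1].lower(),
                        "size": item.get("size"),
                        "mimeType": item["file"].get("mimeType"),
                        "createdDateTime": item.get("createdDateTime"),
                        "lastModifiedDateTime": item.get("lastModifiedDateTime"),
                        "createdBy": item.get("createdBy", {})
                                         .get("user", {}).get("displayName"),
                        "lastModifiedBy": item.get("lastModifiedBy", {})
                                              .get("user", {}).get("displayName"),
                        "webUrl": item.get("webUrl"),
                    }
            # Graph returns @odata.nextLink while more pages remain.
            url = page.get("@odata.nextLink")
```

In practice `get_json` could be a thin wrapper around an HTTP client that adds a bearer token, and you would call the function once per drive (for example, each site's document libraries and each user's OneDrive), then load the records into Power BI or a CSV for analysis. For a large tenant, expect this to take a while and to need retry handling for throttled requests.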
I hope that helps!