Forum Discussion
How to find files over certain limit in SharePoint Library with hundreds of folders
I set the item limit to only 4900 and still receive the "something went wrong" error. What I'm puzzled about is that SharePoint supports hundreds of thousands of files in a library, and also supports syncing up to 300,000 files to OneDrive with good performance. Why can't we filter, search, or create views on libraries of this size without it failing, if it's supposed to store a lot more than 5000?
- Dorje-McKinnon · Jul 29, 2021 · Iron Contributor
Hi luvsql
The 5000 limit is really a legacy issue: SharePoint used to be a web front end to a SQL database on on-premises servers.
This legacy 5000-item issue still exists, I believe, because code from those days was carried over into SharePoint Online.
You're right: libraries can hold 20 million files (last time I checked the limits).
What the 5000-item limit means in practice is that when you have very big libraries you have to use PowerShell and paginate the results, hence the code mentioned in this conversation has:
Get-PnPListItem -List $List -PageSize 500
That tells PowerShell to get the first 500 items, then follow the generated link to the next 500, and so on until there are no more items to get. If there are 10,000 items in the library, the script will page (10,000 / 500 = 20) 20 times and get all the items.
If the script didn't have -PageSize 500, it would use the default page size, which I think is 100.
I know this seems odd, but Microsoft does this to reduce the cost (processor time and electricity) of us running scripts.
As users we just have to work with this foible to get what we need.
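To make the paging visible, Get-PnPListItem also accepts a -ScriptBlock that runs once per retrieved page. A rough sketch (an assumption on my part that it fits your setup; $List is a library object from Get-PnPList, as in the script later in this thread):

```powershell
# Sketch: -PageSize fetches 500 items per round trip, and the
# -ScriptBlock runs once per page, so you can watch the pagination.
# Requires an existing PnP connection; $List comes from Get-PnPList.
Get-PnPListItem -List $List -PageSize 500 -ScriptBlock {
    param($items)
    Write-Host ("Fetched a page of {0} items" -f $items.Count)
} | Out-Null
```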
- Paul_HK_de_Jong · Jul 30, 2021 · Iron Contributor
Minor addition: the 5000-item limit is not a legacy thing. It is caused by the lock escalation mechanism in SQL Server. It is one of the top requests on UserVoice (https://sharepoint.uservoice.com/forums/329214-sites-and-collaboration/suggestions/8463457-remove-the-list-view-threshold-5000-by-default). It basically means we have to deal with it, for example by creating indexes, requesting data in sets, ...
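Creating an index, as Paul suggests, can also be scripted with PnP PowerShell. A minimal sketch, assuming an existing connection; the library name "Documents" and column "Modified" are placeholder examples, not names from this thread:

```powershell
# Sketch: mark a column as indexed so filtered views on large
# libraries can stay under the 5000-item list view threshold.
# "Documents" and "Modified" are example names only.
$field = Get-PnPField -List "Documents" -Identity "Modified"
$field.Indexed = $true
$field.Update()
Invoke-PnPQuery   # push the pending CSOM change to SharePoint
```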
- Dorje-McKinnon · Aug 01, 2021 · Iron Contributor
Thanks Paul_HK_de_Jong,
To me, the fact that SharePoint Online still uses SQL rather than a NoSQL back end is the result of architectural decisions made by the SharePoint team years ago, when the product was deployed to on-premises servers. The consequence of that decision, the 5000-item limit, has been an issue for users since at least 2015 (the date of your link) and continues to be one (the 2021 comments on your link).
I agree with you that this constraint is just something we (the paying customers) have to deal with.
- luvsql · Jul 28, 2021 · Iron Contributor
I'm trying the PowerShell; however, since we use MFA, I first had to use Connect-SPOService -Url https://mytenant-admin.sharepoint.com, which prompted me to log in and authenticate, but then this script asked me to log in again, and my admin account won't work:
#Config Variables
$SiteURL = "https://mytenant.sharepoint.com/Estimating"
$CSVFilePath = "C:\Temp\LargeFiles.csv"

#Connect to PnP Online
Connect-PnPOnline -Url $SiteURL -Credentials (Get-Credential)

#Get all document libraries
$FileData = @()
$DocumentLibraries = Get-PnPList | Where-Object {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $False}

#Iterate through document libraries
ForEach ($List in $DocumentLibraries)
{
    Write-Host "Processing Library:" $List.Title -f Yellow

    #Get all files of the library with size > 100 MB
    $Files = Get-PnPListItem -List $List -PageSize 500 | Where-Object {($_.FieldValues.FileLeafRef -like "*.*") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)}

    #Collect data from each file
    ForEach ($File in $Files)
    {
        $FileData += [PSCustomObject][ordered]@{
            Library  = $List.Title
            FileName = $File.FieldValues.FileLeafRef
            URL      = $File.FieldValues.FileRef
            Size     = [math]::Round(($File.FieldValues.SMTotalFileStreamSize/1MB), 2)
        }
    }
}

#Export files data to CSV file
$FileData | Sort-Object Size -Descending
$FileData | Export-Csv -Path $CSVFilePath -NoTypeInformation
- Dorje-McKinnon · Jul 28, 2021 · Iron Contributor
Hi luvsql,
Sorry to hear you've run into an issue.
My work place has MFA setup as well.
So what I do is change the following line to use -UseWebLogin:
Connect-PnPOnline -Url $SiteURL -UseWebLogin
This should pop up a web browser window allowing you to log on with your admin account and then enter your MFA code or approve the prompt in the MFA app on your phone.
Everything else should then run.
The -UseWebLogin option only works for scripts you run interactively. You have to use other methods if you want to set the script up to run on its own at a scheduled time, but from what you said I think you'll be running this script in person.
Dorje
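As a side note, newer versions of the PnP.PowerShell module also offer an -Interactive switch for MFA sign-in (an alternative to -UseWebLogin; whether it's available depends on your installed module version). A sketch, reusing the example site URL from earlier in this thread:

```powershell
# MFA-friendly sign-in with the newer PnP.PowerShell module.
# -Interactive opens a browser window for the full Azure AD login,
# including the MFA prompt. URL is the example site from this thread.
Connect-PnPOnline -Url "https://mytenant.sharepoint.com/Estimating" -Interactive
```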
- luvsql · Jul 29, 2021 · Iron Contributor
I can now run the script; however, it's only showing me large folders over 100 MB and not the individual files. It's also only going two folders deep (we have some folders that go 5-6 subfolders deep).
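One possible cause (an assumption, not verified against this tenant): folders also carry a SMTotalFileStreamSize value, and the *.* filename filter can match folders whose names contain a dot. Get-PnPListItem returns every item in a library as a flat list regardless of nesting depth, so a stricter filter on the item type may be all that's needed. A sketch, replacing the $Files line in the script above:

```powershell
# Keep only actual files (not folders) larger than 100 MB.
# FileSystemObjectType distinguishes files from folders, so this
# avoids matching folders whose names happen to contain a dot.
$Files = Get-PnPListItem -List $List -PageSize 500 | Where-Object {
    $_.FileSystemObjectType -eq "File" -and
    ($_.FieldValues.SMTotalFileStreamSize / 1MB) -gt 100
}
```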