Jul 22 2021 07:46 AM
Adding a file size column to the library doesn't help, as I would still have to open every possible folder, and our hierarchy is crazy. We are not at our storage limit, but I know many users are uploading large attachments rather than overwriting existing files with newer versions, so we are using far more storage than we need.
I know Storage Metrics shows the total size, but we would still have to spend hours opening every single folder > subfolder > subfolder and back again.
I checked the Audit Log Search, but there isn't anything there for this kind of search. We do not use the OneDrive sync client for this library, as there are too many files for it to support.
There must be a way?
Jul 22 2021 05:40 PM - edited Jul 22 2021 05:40 PM
Hi Mercedes,
In Site Settings > Storage Metrics you can see the size of each library, and you can click through the folders to see where the large ones are.
Jul 22 2021 07:16 PM
Hi @luvsql
You could also create a new view within the library: save the All Items view as a new view called "File Size".
Then edit the properties of this new view to sort by the File Size column in descending order.
This view will then show your biggest files at the top of the list.
To show files over a certain size, you would have to add a calculated column I think.
Jul 23 2021 03:00 AM
@luvsql
Have you considered using PowerShell scripts to harvest the information of interest? For example, getting a complete list of all documents and their properties in a document library.
Another option is to look for a reporting tool. There are apps that provide this capability (e.g. here) and more: for example, finding encrypted documents, checked-out documents, documents with versions consuming storage space, and so on.
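If it helps, here is a minimal sketch of the PowerShell approach using PnP PowerShell. The site URL and library name below are placeholders (not anything from this thread), and the -Interactive switch assumes the newer PnP.PowerShell module:

```powershell
# Requires the PnP.PowerShell module: Install-Module PnP.PowerShell
Connect-PnPOnline -Url "https://yourtenant.sharepoint.com/sites/YourSite" -Interactive

# List every file in one library with its size, largest first
Get-PnPListItem -List "Documents" -PageSize 500 |
    Where-Object { $_.FileSystemObjectType -eq "File" } |
    ForEach-Object {
        [PSCustomObject]@{
            Name   = $_.FieldValues.FileLeafRef
            URL    = $_.FieldValues.FileRef
            SizeMB = [math]::Round($_.FieldValues.SMTotalFileStreamSize / 1MB, 2)
        }
    } |
    Sort-Object SizeMB -Descending
```

Pipe the result into Export-Csv if you want it in a spreadsheet.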
Paul
Jul 23 2021 06:10 AM
Thank you all for the suggestions. I shall try them out. My experience with PowerShell is limited to copying and pasting existing scripts, so I'm not sure I could figure that out.
Jul 23 2021 06:41 AM
@Dorje McKinnon I just get the "something went wrong" message after saving the new view. I think we have too many files; any time I try to use that File Size column it craps out.
Jul 23 2021 08:53 PM
Jul 25 2021 01:35 PM
Hi @luvsql ,
@Steve Knutson is right: the thing to do is to add an item limit, e.g. 4900. Then, because of the ordering, you may get the largest files at the top.
I say "may" because I'm not sure whether SharePoint views go through all the files first, find the largest ones according to the sort rules you've set, and then show you the first 4900 of them,
OR
whether the view grabs the first 4900 files according to its own rules and then orders those by size.
The other route you could take is as @Paul_HK_de_Jong said, use a PowerShell script.
I understand your comment about needing one that is pre-built.
On the following page there are two scripts:
https://www.sharepointdiary.com/2018/11/sharepoint-online-find-all-large-files-using-powershell.html
The one called "PnP PowerShell: Find Large Files in SharePoint Online Site" is the simplest.
It will get you a list of files and their sizes for every library in the site you point it at. That's more than you want, but you'll be able to filter the spreadsheet export easily enough.
The last script on the page, "Get All Documents Inventory from a SharePoint Online Document Library and Export to a CSV", works on just one library and will give you almost what you want; you just need to add one row, between rows 35 and 36, so that it also captures each file's size.
You will need to change the parameters in rows 2, 3 and 4 so that they refer to your site, your library, and your export file location. NOTE: I think the library row takes the name of the library rather than its URL, i.e. "Documents" (the name of the library) rather than "Shared Documents" (the URL for that library).
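For instance, the top of the script would end up looking something like this (the variable names may differ slightly from the script's, and all three values are placeholders you replace with your own):

```powershell
#Config Variables
$SiteURL     = "https://yourtenant.sharepoint.com/sites/YourSite"  # your site URL
$ListName    = "Documents"      # the library's name, not its URL segment ("Shared Documents")
$CSVFilePath = "C:\Temp\DocInventory.csv"                          # where the export is saved
```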
Jul 27 2021 05:55 AM - edited Jul 27 2021 05:55 AM
I set the item limit to only 4900 and still receive the "something went wrong" message. What puzzles me is that SharePoint supports hundreds of thousands of files in a library, and even supports syncing 300,000 files to OneDrive with good performance. Why can't we filter, search, or create views on libraries of this size without them failing, if a library is supposed to store far more than 5000 items?
Jul 28 2021 09:47 AM - edited Jul 28 2021 09:48 AM
I'm trying the PowerShell route; however, since we use MFA, I first had to run Connect-SPOService -Url https://mytenant-admin.sharepoint.com, which prompted me to log in and authenticate. But this script then asked me to log in again, and my admin account won't work:
#Config Variables
$SiteURL = "https://mytenant.sharepoint.com/Estimating"
$CSVFilePath = "C:\Temp\LargeFiles.csv"
#Connect to PnP Online
Connect-PnPOnline -Url $SiteURL -Credentials (Get-Credential)
#Get all document libraries
$FileData = @()
$DocumentLibraries = Get-PnPList | Where-Object {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $False}
#Iterate through document libraries
ForEach ($List in $DocumentLibraries)
{
    Write-Host "Processing Library:"$List.Title -f Yellow

    #Get all files of the library with size > 100MB
    $Files = Get-PnPListItem -List $List -PageSize 500 | Where {($_.FieldValues.FileLeafRef -like "*.*") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)}

    #Collect data from each file
    ForEach ($File in $Files)
    {
        $FileData += [PSCustomObject][ordered]@{
            Library  = $List.Title
            FileName = $File.FieldValues.FileLeafRef
            URL      = $File.FieldValues.FileRef
            Size     = [math]::Round(($File.FieldValues.SMTotalFileStreamSize/1MB),2)
        }
    }
}
#Export the file data to a CSV file
$FileData | Sort-Object Size -Descending
$FileData | Export-Csv -Path $CSVFilePath -NoTypeInformation
Jul 28 2021 01:36 PM
Hi @luvsql,
sorry to hear you've run into an issue.
My work place has MFA setup as well.
So what I do is change the following line to use -UseWebLogin:
Connect-PnPOnline -Url $SiteURL -UseWebLogin
This should pop up a web browser window allowing you to log on with your admin account AND then enter your MFA code or approve the prompt in the MFA app on your phone.
Everything else should then run.
The -UseWebLogin option only works for scripts you run interactively; you have to use other methods if you want the script to run on its own at a scheduled time. But from what you said, I think you'll be running this script in person.
Dorje
Jul 29 2021 06:11 AM - edited Jul 29 2021 06:16 AM
I can now run the script; however, it's only showing me large items over 100 MB and not the individual files. It also only seems to go 2 folders deep (we have some folders that go 5-6 subfolders deep).
Jul 29 2021 02:51 PM - edited Jul 29 2021 02:57 PM
Hi @luvsql,
Great that it is working.
The following line is the one that will need to change:
#Get All Files of the library with size > 100MB
$Files = Get-PnPListItem -List $List -PageSize 500 | Where {($_.FieldValues.FileLeafRef -like "*.*") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)}
The # means the line is commented out, i.e. it tells PowerShell to ignore that row.
So I suggest you change the script to look like this, removing the part after the -and statement:
#Get All Files of the library with size > 100MB
#$Files = Get-PnPListItem -List $List -PageSize 500 | Where {($_.FieldValues.FileLeafRef -like "*.*") -and ($_.FieldValues.SMTotalFileStreamSize/1MB -gt 100)}
#Get all files, of any size
$Files = Get-PnPListItem -List $List -PageSize 500 | Where {$_.FieldValues.FileLeafRef -like "*.*"}
I can't see anything in the script that would limit it to just 2 folders deep.
It may simply be that there are NO files larger than 100 MB in folders more than 2 levels deep.
Microsoft has some training about PowerShell,
as do a number of prominent SharePoint Online authorities:
https://www.google.co.nz/search?q=how+to+learn+powershell+for+sharepoint
Everyone who does our kind of work can get value out of knowing something about PowerShell, so I highly recommend learning a little about it. I prefer dealing with people and the social anthropology side of my job, but there are times when PowerShell makes things possible that wouldn't be otherwise.
Jul 29 2021 03:05 PM
Hi @luvsql
The 5000 limit is really a legacy issue, because SharePoint used to be a web front end to a SQL database on on-premises servers.
I believe this legacy 5000-item issue still exists because code from the on-premises days was reused for SharePoint Online.
You're right: libraries can hold 20 million files (last time I checked the limits).
What the 5000-item limit means in practice is that with VERY big libraries you have to use PowerShell and PAGINATE the results; hence the code mentioned in this conversation has
Get-PnPListItem -List $List -PageSize 500
That tells PowerShell to get the first 500 items, then follow the generated link to the next 500, and so on until there are no more items. If there are 10,000 items in the library, the loop runs 10,000 / 500 = 20 times and gets all the items.
If the script didn't have -PageSize 500, it would use the default page size, which I think is 100.
I know this seems weird but Microsoft does this to reduce the cost (processor time and electricity) of us running scripts.
As users we just have to work with this foible to get what we need.
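As a rough illustration of the paging arithmetic described above (the numbers are just examples, not from any real library):

```powershell
$ItemCount = 10000   # items in the library
$PageSize  = 500     # items returned per request, as in -PageSize 500
$Requests  = [math]::Ceiling($ItemCount / $PageSize)   # 10000 / 500 = 20 requests
```

Each paged request stays well under the 5000-item threshold, which is why the paged script succeeds where a single large view fails.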
Jul 29 2021 11:28 PM
Aug 01 2021 02:25 PM
Thanks @Paul_HK_de_Jong ,
To me, the fact that SharePoint Online still uses SQL rather than a NoSQL back end is the result of architectural decisions the SharePoint team made years ago, when the product was deployed to on-premises servers. The impact of that decision, the 5000-item limit, has been (your link was from 2015) and continues to be (comments on your link from 2021) an issue for users.
I agree with you that this constraint is just something we (the paying customers) have to deal with.