Forum Discussion
OneDrive Client, Files on Demand and Syncing large libraries
I thought I'd post some observations about the OneDrive sync client that aren't documented anywhere but that we needed to figure out when planning a massive move to SharePoint from on-premises file servers:
Limits:
Microsoft documents that you shouldn't sync more than 300,000 files across all libraries that the client is connected to, but there was no documentation about Files on Demand limits, and we have observed the following:
The OneDrive client will fail when the dat file that stores object metadata reaches exactly 2GB in size (%localappdata%\Microsoft\OneDrive\settings\Business1). Now, while Microsoft says you shouldn't sync more than 300,000 files, you can connect using files on demand to libraries that contain more than this. The trick here is that in this case, the total number of files and folders matters; let's call them collectively "objects". (Interestingly, when you first connect to a library and the client says "Process changes" and gives you a count, "changes" is the total number of objects in the library that it's bringing down using files on demand and storing in the dat file.)
My suspicion is that since the OneDrive client is still 32bit, it's still subject to certain 32bit process restrictions, but I don't really know. What matters in this case is that up until build 19.033.0218.0009 (19.033.0218.0006 insiders build), the client would fill up the dat file and reach the 2GB limit after about 700-800,000 objects. After build 19.033.0218.0009, it appears that the client has been optimized and no longer needs to store quite as much metadata about each object, "increasing" the upper limit of files on demand. (It seems that in general, each object takes up just over 1KB of data in the dat file, putting the limit somewhere just under 2 million objects). Keep in mind, this is not per library, this is across all libraries, including OneDrive for Business (personal storage), SharePoint Document Libraries, etc.
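To put rough numbers on that, here is a small Python sketch of the arithmetic. The 2GB ceiling and the object counts come from the observations above; treating 2GB as 2 GiB and using 1,100 bytes as a stand-in for "just over 1KB" are assumptions, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate of how many objects fit in the dat file.
# Assumption: the observed 2GB ceiling is 2 GiB (2 * 1024**3 bytes).
DAT_LIMIT_BYTES = 2 * 1024**3


def max_objects(bytes_per_object: int, limit_bytes: int = DAT_LIMIT_BYTES) -> int:
    """Number of whole objects that fit before the file reaches the limit."""
    if bytes_per_object <= 0:
        raise ValueError("bytes_per_object must be positive")
    return limit_bytes // bytes_per_object


def implied_bytes_per_object(object_count: int, limit_bytes: int = DAT_LIMIT_BYTES) -> float:
    """Average metadata size implied by hitting the limit at object_count objects."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return limit_bytes / object_count


if __name__ == "__main__":
    # Newer builds: assume about 1,100 bytes per object ("just over 1KB").
    print(max_objects(1100))
    # Older builds reportedly hit the limit at roughly 700,000-800,000 objects.
    print(round(implied_bytes_per_object(750_000)))
```

With those assumed inputs the newer builds land a little under 2 million objects, and the older builds work out to somewhere under 3KB per object. Swap in your own measured numbers before relying on either figure.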
Performance:
The client's performance has improved significantly and quickly as each new build is refined, but there are some things to be aware of before you start connecting clients to large libraries:
It. takes. forever.
The more objects in a library, the longer it's going to take for the client to build its local cache of files on demand copies of all the items in the library. It seems that in general, the client can process about 50 objects per second. If you were connecting to a library or multiple libraries that had 1.4 million objects, it would take around 8 hours before the client is "caught up".
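As a quick way to size this for your own libraries, here is a minimal Python sketch of that estimate. The 50 objects per second rate is the rough figure observed above, not a documented number, and the function name is hypothetical.

```python
def initial_sync_hours(object_count: int, objects_per_second: float = 50.0) -> float:
    """Hours needed to build the local Files on Demand cache at a steady rate."""
    if objects_per_second <= 0:
        raise ValueError("objects_per_second must be positive")
    return object_count / objects_per_second / 3600


if __name__ == "__main__":
    # 1.4 million objects at the observed ~50 objects per second.
    print(f"{initial_sync_hours(1_400_000):.1f} hours")
```

Real-world times will vary with hardware and with whatever else explorer.exe and the Search Indexer are doing, so treat the result as a lower bound for planning.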
During the time that the content is being built out locally, Windows processes will also consume a large quantity of system resources. Specifically, explorer.exe and the Search Indexer will consume a lot of CPU and disk as they process the data that the client is building out.
The more resources you have, the better this experience will be. On a moderately powered brand new Latitude with an i5, 8GB of Memory and an SSD OS Drive, the machine's CPU was pretty heavily taxed (over 80% CPU) for over 8 hours connecting to libraries with around 1.5 million objects. On a much more powerful PC with an i7 and 16GB of memory, the strain was closer to 30% CPU, which wouldn't cripple an end user while they wait for the client and Windows to finish processing data. But, most organizations don't deploy $2000 computers to everyone, so be mindful when planning your Team-Site automount policies.
Restarts can be painful. When the OS boots back up, OneDrive has to figure out what changed in the libraries in the cloud and compare that to its local cache. I've seen this process take anywhere from 15 minutes to over an hour after restarts, depending on how many objects are in the cache.
Also, if you're connected to a large number of objects in the local cache, you can expect OneDrive to routinely use about a third of CPU on an i5 processor trying to keep itself up to date. This doesn't appear to interfere with the overall performance of the client, but it's an expensive process.
Hopefully over time this will continue to improve, especially as more organizations like mine move massive amounts of data up into SharePoint and retire on premise file servers. If I had to make a design suggestion or two:
- If SharePoint could pre-build a generic metadata file that a client could download on first connection, it would significantly reduce the time it takes to set up a client initially.
- Roll the Activity Log into an API that would allow the client to poll for changes since the last restart (this could also significantly improve the performance of migration products, as they wouldn't have to scan every object in a library when performing delta syncs, and would reduce the load on Microsoft's API endpoints when organizations perform mass migrations)
- Windows, to the best of my knowledge, doesn't have a mechanism to track changes on disk, i.e. "what recursively changed in this directory tree in the last x timeframe". If it were possible to do this, Windows and SharePoint could eliminate most of the overhead that the OneDrive client has to shoulder on its own to keep itself up to date.
Speaking to OneDrive engineers at Ignite last year, support for larger libraries is high on their radar, and it's apparent in this latest production release that they are keeping their word on prioritizing iterative improvements for large libraries. If you haven't yet started mass data migrations into SharePoint, I can't stress enough the importance of deeply analyzing your data and understanding what people need access to and structuring your libraries and permissions accordingly. We used PowerBI to analyze our file server content and it was an invaluable tool in our planning.
Happy to chat with anyone struggling with similar issues and share what we did to resolve them. Happy SharePointing!
P.S., shoutout to the OneDrive Product Team, you guys are doing great, love what you've done with the OneDrive client, but for IT Pros struggling with competing product limits and business requirements, documenting behind the scenes technical data and sharing more of the roadmap would be incredibly valuable in helping our companies adopt or plan to adopt OneDrive and SharePoint.
- dustintadamIron Contributor
Update:
I recently got to have a conversation with some people from the OneDrive engineering team regarding an issue we were having that stemmed from a recent code update to SharePoint that impacted the way OneDrive handles document library content (our tenant is on First Release).
In short, our permissions model was specifically designed to take advantage of client behavior wherein, if a user had access to content, but NOT the parent folder, OneDrive would not try to represent that content in the local Files on Demand cache file and would only process the data that the user could navigate to.
We had various reasons for doing this, from effectively hiding content that was stale or archived, reducing load on OneDrive iOS clients, as well as reducing workload on the OneDrive Windows clients.
This behavior was driven by what Microsoft apparently saw as a problem that they referred to as Gap Folders, folders whose permissions had a gap from the parent, and because of this, the Sync Client couldn't synchronize content that had a gap in permissions.
Well, they recently pushed a code update to SharePoint that addressed this issue, allowing the client to have an understanding of all content in a library that a user has permissions to, which undoubtedly solved sync issues for many customers. Unfortunately, this introduced a huge issue for us: over 80% of our content is considered archived and should never be synced locally, even though it's routinely accessed for historical content. We had taken advantage of the SharePoint permissions model and the client to manage this in a way that, to us, was an elegant and easy-to-manage solution.
If you're using gapped permissions in a similar manner with clients actively deployed, be aware that there are code changes coming down the pike that could cause clients to get overloaded and fail if they are suddenly presented with significantly more data than had been planned for.
They are actively evaluating this issue now, but I don't have an update yet on what the path forward will look like, as there are legitimate use cases for both wanting the client to see gapped content and not.
- Faiza QadriIron Contributor
Thanks for the detailed findings and explanation. By any chance are you using RMS for your OneDrive for Business or SharePoint? Would love to hear your thoughts.
- dustintadamIron Contributor
Not at the moment, no, though it’s on our roadmap. We are in the early planning stages for an RMS rollout. Part of the problem is that we are leveraging a wide number of other Microsoft technologies such as Cloud App Security, and we want to make sure that we take advantage of the full ecosystem across CAS, Azure RMS, SharePoint, Windows Information Protection, etc. I’m sure once we get that far I’ll have more to share :)
- Joe McGowanIron Contributor
Any new updates on this topic? I have a Document Library with about 115,000 files in it and our users are having problems with the OneDrive sync getting stuck on "processing changes". Sometimes I can do a OneDrive reset, but other times it doesn't work. We're well under the 300k file limit.
- dustintadamIron Contributor
Nothing major that will simply "solve" the problem unfortunately.
Couple questions:
What Client version are your users running?
Does the library contain folders with broken inheritance?
How many users are syncing the library simultaneously?
Does the library have a high rate of change? (i.e. lots of files being modified in many different folders) or is it a lot of old static content?
- Joe McGowanIron Contributor
The latest Production ring client: 20.134.0705.0008
No broken inheritance.
Around 5-10 users syncing at the same time.
Yes, high rate of change.
I gave myself permissions and started syncing the library and I'm having similar issues. So I don't think it's machine related. I can't even reset OneDrive now; it gives an error that it couldn't shut down OneDrive.
- Mohcine ChaoukiCopper Contributor
Joe McGowan - The sync limit is a pain.
Do you know if your users have a lot of files in their personal OneDrive folders? Those usually sync by default, especially if the desktop protection option is enabled, in which case Documents, Pictures, and the Desktop get synced as well.
For me the easiest way to fix stuck syncing is to uninstall OneDrive, hard delete the OneDrive folder and then resync.
Hopefully this helps.
- Martijn SteffensIron Contributor
dustintadam thanks for sharing your findings. We have multiple customers struggling with OneDrive and Files on Demand. We see the same CPU peaks when 'syncing' large SPO libraries.
How to continue, that's the question....
Customers are thinking about leaving the OneDrive SPO sync and starting to use the old network connection feature, which also comes with other issues, like filenames and session time-outs. The last resort would be to go browser only. But this is a huge change for users, well, for most of them.
- dustintadamIron Contributor
I can understand why you'd have customers thinking twice about their OneDrive deployments. In our case, we were... fortunate? in learning in advance about the pitfalls and spent an inordinate amount of time in planning before making a transition. As powerful as the technologies are, they are still plagued with a large number of upper limits that serve to significantly raise the bar to entry compared with other cloud storage platforms:
List view thresholds, 50,000 ACL limit per library, API throttling, 1000 clients per library, sync limits, etc. Not only did we have to account for all these limits and understand them in great detail, we had to translate our on-premises content and find ways of making it fit inside SharePoint.
Not all organizations understand these limits, or are willing to expend the energy that we were to move their content in such a way as to make it work. Case in point, we had a custom Azure web application developed that would propagate and manage a very granular permissions model onto our content in SharePoint so that we could both reduce our risk profile by restricting what content could be shared externally and have a mechanism to reduce the load on a sync client. It was an elegant solution to a complicated problem, but again... the bar to entry was high.
So far for us, the "answer" has been to place an exceptionally heavy emphasis on permissions that grants us the administrative control needed to manage the limits.
- Myles JefferyBrass Contributor
Martijn Steffens If you are considering the drive mapping approach take a look at Zee Drive. Zee Drive maps network drives to OneDrive for Business and SharePoint Online. Zee Drive will keep users authenticated and it also provides a number of useful productivity features off the File Explorer context menu.
Myles
- Cian AllnerSilver Contributor
Thanks for sharing these insights Dustin, it's interesting to hear how OneDrive performs under these conditions and how it might work even better.
Out of interest, do all your users sync document libraries, as an alternative to the mapped network drives they used when on-premises?
- dustintadamIron Contributor
Actually no, as I am not a glutton for punishment and I didn't want our Service Desk to plot to murder me. For most people, we've switched to using the browser to navigate the content libraries, and reserved the sync client to workloads that require it (CAD drawings, etc). Interestingly though, this has obviously made mobile access much easier and a lot of our users will be getting in through the iOS OneDrive client.
We haven't seen it yet, but one fear we have that I'm sure will present itself at some point is the fact that you can't granularly control who has the right to sync a library, it's either on or off for the entire document library, so I'm sure we will be fielding helpdesk requests for broken clients that tried to connect to too much data, even though they should have been browser-only users.
As a side note, for anyone interested:
If you have a client that has become over-subscribed, you can recover it and manually remove specific libraries from the sync scope by:
1.) forcefully close the OneDrive client in task manager
2.) navigate to C:\Users\user\AppData\Local\Microsoft\OneDrive\settings\Business1
3.) open the {GUID}.ini file
4.) find the sync target entry that begins with either "libraryScope" or "libraryFolder" and delete it and save the ini file
5.) next, open regedit as the local user and navigate to Computer\HKEY_CURRENT_USER\Software\Microsoft\OneDrive\Accounts\Business1\ScopeIdToMountPointPathCache
6.) locate the REG_SZ entry that corresponds to the SharePoint location that you deleted from the ini file and delete it.
7.) perform a reset on the OneDrive client (onedrive.exe /reset)
This will force OneDrive to re-map all local files in scope, but it will ignore the location that was manually removed. It is now safe to delete the orphaned files from the library that was deleted from the ini file and the registry.
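For anyone who would rather preview step 4 before editing by hand, here is a minimal Python sketch that filters the text of the ini file. It is only an illustration: the function name and the sample lines are made up, the real layout of the {GUID}.ini file is not documented in this thread, and the sketch does not touch the registry or the file itself. Work on a copy, and check the file's encoding before reading or writing it.

```python
from typing import List, Tuple

# Prefixes named in step 4 above.
SCOPE_PREFIXES = ("libraryScope", "libraryFolder")


def drop_sync_target(ini_text: str, marker: str) -> Tuple[str, List[str]]:
    """Return (new_text, removed_lines).

    A line is removed when it starts with one of the sync target prefixes
    and also contains `marker`, a string that identifies the library
    (for example part of its name or URL).
    """
    kept: List[str] = []
    removed: List[str] = []
    for line in ini_text.splitlines(keepends=True):
        if line.lstrip().startswith(SCOPE_PREFIXES) and marker in line:
            removed.append(line)
        else:
            kept.append(line)
    return "".join(kept), removed


if __name__ == "__main__":
    # Made-up lines for illustration only; the real file layout may differ.
    sample = (
        'libraryScope = 1 "Archive"\n'
        'libraryFolder = 2 "Projects"\n'
    )
    new_text, removed = drop_sync_target(sample, "Archive")
    print("Would remove:", removed)
```

Review the removed lines before saving anything; steps 5 to 7 (the registry entry and the reset) still have to be done as described above.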
- Cian AllnerSilver Contributor
Thanks, that makes sense.
I think it's amazing how far the sync client has come from the old days of Groove.exe, and that workloads that wouldn't have been feasible before can now be done with ease.
- Matt_IgniteCopper Contributor
Great post dustintadam - thanks very much for taking the time to write it all up.
I've been struggling with a migration that we did for a client earlier in the year. We moved all their documents from an old on-prem file server into various SharePoint libraries, and have them synced down to the client workstations with OneDrive / Files on Demand. The libraries are big (199k items), but still well inside the limits of the service. However, we've been experiencing all manner of problems with using it in practice - OneDrive sync agent errors, very slow sync, etc. What you've outlined above goes a long way to explaining why we've been seeing the problems that we have.
I'm glad that you've been working with the OneDrive team on this, and I'm REALLY looking forward to the product updates that you've alluded to. They can't come quickly enough for me.
Cheers,
Matt
- dustintadamIron Contributor
They are working towards higher limits. Part of the problem stems from the fact that they were trying to bring the Cloud Experience down to the desktop with the highest fidelity possible. By most measurements, they succeeded. For example, renaming or moving files can be completed without downloading the file to the desktop, it simply brokers the API calls that complete the action in OD or SharePoint. This is super cool, but it begins to fall over when the number of objects starts to increase.
Windows itself was never designed to really support this sort of integration, so there isn't any robust or elegant way to poll the drive and ask "What Changed?". This leaves it up to the client itself to have to scan periodically for changes to Files on Demand content, even if it hasn't been downloaded, because action can still be taken on those objects.
While I can't give you any specific number, I can tell you that the total number of objects their client is about to be able to handle without crashing is significantly higher. But that's really just the first step, while the client is getting more bullet-proof when handling large numbers of objects, there is still more work to do on performance. The more objects you bring down, the longer it will take for the client to detect changes and complete a replication either to or from the cloud.
They are making significant headway, and I can assure you they care more than you might imagine about the limits the client can handle, hang in there, it's getting better ;)
- Diego VasquezCopper Contributor
Thank you very much @dustin adams for starting this thread!
Could you post the link to the "documented" 300,000 library ceiling recommendation?
Our computers are blue-screening from the load of OneDrive, and this is without hydrating any files!
Also, does anybody have other references on so-called "gap folders" and the behavior of "restricted view" permissions?
- dustintadamIron Contributor
All the current technical documentation regarding the Sync Client and its limits can be found here:
Interestingly, and I'm assuming this is based on feedback, the upper sync limit has been revised down from 300,000 files to 100,000.
There isn't any formal documentation regarding the behavior of Restricted View and its impact on the sync client, but in our testing it does honor the underlying permission limits defined by the permission level:
The critical thing to understand with this is that this is a recursive permission, this permission will cascade down the rest of the directory tree underneath where this permission is defined, regardless of any other permission a user may have (Even if you have broken inheritance further down). In our use case, we applied this permission to our "Archive" folders inside our document libraries, effectively allowing our colleagues to access archived content in the browser, but restricting their ability to download it with the sync client. Not only does this help protect the archived content, it relieves stress on the client.
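To make the cascading behavior concrete, here is a small Python model of the rule as described in this post: anything at or below a folder with Restricted View is kept out of sync, whatever is granted further down. This is only a sketch of the observed behavior, not a SharePoint API; the function name and the example paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_sync_blocked(item_path: str, restricted_folders: Iterable[str]) -> bool:
    """True if item_path is at or below any folder that has Restricted View applied."""
    item = PurePosixPath(item_path)
    # The item itself plus every ancestor folder up to the root.
    chain = [item, *item.parents]
    return any(PurePosixPath(folder) in chain for folder in restricted_folders)


if __name__ == "__main__":
    restricted = {"/Docs/Archive"}
    print(is_sync_blocked("/Docs/Archive/2015/plan.dwg", restricted))
    print(is_sync_blocked("/Docs/Active/plan.dwg", restricted))
```

The comparison is done on whole path components, so a sibling folder such as /Docs/ArchiveOld is not treated as being under /Docs/Archive.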
I would recommend that anyone currently struggling with overloaded clients take a second look at their cloud content and determine whether any of the data in a Document Library really NEEDS to be synced. If not, there is probably a way to help relieve client stress by implementing Restricted View on parent folders that hold content that can effectively remain cloud-only, and then re-assigning contribute permission below that level.
- jab365cloudSteel Contributor
We still face issues with large lists even if we sync a single folder. Any update on this?
- _Chris_GCopper Contributor
dustintadam is there a way to enforce online/cloud only when using OneDrive vs Files On Demand? I know this is a completely different architecture, but when dealing with all these issues and user complaints, and comparing it to a Google Drive implementation for enterprise, Google seems to have gone with a 'make it look/work like a mapped network drive' approach. They don't need to constantly sync and check what's changed as far as I can tell. Staff who do want to use OneDrive instead of the browser really just want the Explorer view if they are in that transactional type of role. If there are staff that want an offline option, they can just right-click and keep files offline ad hoc (basically as it is now).
- dustintadamIron Contributor
Hey Chris;
I'm not sure if this is exactly what you are looking for, but through MDM or ADMX/ADML templates you can force the OneDrive client to use Files On Demand by default:
https://docs.microsoft.com/en-us/onedrive/use-group-policy#FilesOnDemandEnabled
If I misunderstood your question let me know.
- _Chris_GCopper Contributor
dustintadam thanks for the reply. I was actually meaning the opposite and to prevent any local download/offline files using the OneDrive client and keep it as 'cloud only' access. This would be an attempt to prevent performance syncing issues on the client as well as the general conflicts/issues that can occur. I understand the trade-off would be to have reliable internet access. I basically want to replicate the map network drive and file server architecture as in the past but instead use the OneDrive Client and SharePoint online in its place. I feel this would prevent all the issues in this thread (until at least the sync client is reliable and fast when picking up changes). I suspect that is not an option and 'Files on Demand' is our only choice? I want 'Files Cloud Only' in OneDrive.
- JonnaPCopper Contributor
The Synology NAS with Cloud Sync solution seems to be a viable option. But I am strongly encouraging all customers, and even our company internally, to not rely on the OneDrive Desktop App Sync feature anymore. Library issues aside, I have also observed detrimental performance on PCs, and the handling of file conflicts among many users is simply unmanageable.
- BaboonCZCopper Contributor
Hi, I have a problem with the 2GB limit on the .dat file. I sync 1,499 GB across over 2,000,000 files, so I made a simple batch file to rename the .dat file.
I really need a 64-bit version of OneDrive, because the first init will build a 2GB .dat file again and shut down OneDrive 😕
SET Today=%Date:~8,2%%Date:~3,2%%Date:~0,2%
goto menu
:menu
@echo off
cls
cd /d %LOCALAPPDATA%\Microsoft\OneDrive
echo OneDrive limit over 2GB
echo ________________________________________________________________________
echo 1 - Stop OneDrive
echo 2 - Rename .dat file to _%Today%.dat
echo 3 - Restart OneDrive
echo 4 - exit
echo ________________________________________________________________________
SET /p source=
if %source% == 1 goto kill
if %source% == 2 goto rename
if %source% == 3 goto restart
if %source% == 4 goto exit
:kill
@echo on
start OneDrive /shutdown
goto menu
:rename
cd /d %USERPROFILE%\AppData\Local\Microsoft\OneDrive\settings\Business1\
rename "IDofOneDrive".dat "IDofOneDrive"_%Today%.dat
goto menu
:restart
@echo on
start OneDrive /restart
goto menu
:exit
exit
- JonasBackSteel Contributor
Just keeping this thread active since it contains a LOT of important and valuable information.
Anyone want to mention some new findings or if they have some new suggestions not already shared here to solve these scenarios?
I really wish Microsoft would add a feature to "Keep files cloud only" so the client doesn't need to sync or keep track of changes at all (just as we had with the traditional mapped network shares), but with the OneDrive client rather than the old WebDAV way of doing it. We are always going to have business areas that store a lot of files and still want to use Explorer to browse instead of a web browser.
- BaboonCZCopper Contributor
OneDrive is finally a native 64-bit app 🙂
https://winaero.com/onedrive-is-now-a-native-64-bit-app/
A few months ago I started transferring old backup folders to a SharePoint Backup folder, but there is a limit on how many files I can move. Then sometime around Christmas OneDrive x64 was released, so I installed it and started moving files back from SharePoint to OneDrive 😄
At the moment I have 1,709 GB on OneDrive and the sync .dat file is 1.73 GB. I'll wait to see what happens in the coming months.
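For anyone who wants to keep an eye on the .dat file the same way, here is a minimal Python sketch that lists the .dat files in the settings folder and shows how close each is to 2GB. The folder is the Business1 path mentioned earlier in this thread; treating the ceiling as 2 GiB is an assumption, the function name is made up, and whether that ceiling still applies to the 64-bit client is not established here.

```python
import os
from pathlib import Path
from typing import List, Tuple

# Assumption: the 2GB ceiling reported for the 32-bit client, taken as 2 GiB.
LIMIT_BYTES = 2 * 1024**3


def dat_usage(settings_dir) -> List[Tuple[str, int, float]]:
    """Return (file name, size in bytes, fraction of the limit), largest first."""
    rows = []
    for path in Path(settings_dir).glob("*.dat"):
        size = path.stat().st_size
        rows.append((path.name, size, size / LIMIT_BYTES))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    local_app_data = os.environ.get("LOCALAPPDATA")
    if local_app_data:
        folder = Path(local_app_data) / "Microsoft" / "OneDrive" / "settings" / "Business1"
        for name, size, fraction in dat_usage(folder):
            print(f"{name}: {size / 1024**3:.2f} GiB ({fraction:.0%} of 2 GiB)")
```

It only reads file sizes, so it is safe to run while OneDrive is active.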
- M1StorytellerCopper Contributor
dustintadam I'm seeking feedback on a recent mass file migration I attempted from OneDrive to SharePoint. I'm a video creator for my organization. I moved a large amount of media files - videos/photos - from my company OneDrive to a company SharePoint site. My colleague (also a video creator) and I both sync SharePoint back to our OneDrives, so this was an attempt to create a more collaborative, cloud-based workflow, rather than each of us having our work live in siloed OneDrive folders. I'm now realizing I may have overtaxed the platform, and it's slowly trying to play catch-up: a week later, my OneDrive client is still "preparing to upload" and certain files that I now try to add to OneDrive are stuck in a perpetual "Syncing" state.
The other byproduct of this attempt is that I now see a large number of duplicate files with a suffix ID in the filename based on my device's name.
I did not consult this site or others like it for advice before making this migration - obviously. Can anyone with a similar experience offer advice?
- MarWerNoBrass Contributor
M1Storyteller I had a similar issue when migrating a lot of data or re-distributing data over my 5 accounts, since 1TB is not enough these days. (I had finally uploaded 6TB over my slow connection to my old provider when they announced the end of the service... highly annoying).
There seems to be no clear way.
If there is a suffix ID it is always a duplicate, so these files can usually be deleted. The platform detects a change in timestamp or file size and duplicates the file. This could be due to an auto-save function in the software you are using or simply a problem when copying from one file system onto another.
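If there are a lot of these, a short script can list the candidates before anything gets deleted. The Python sketch below assumes the duplicates are named "<original name>-<device name>" with the same extension, which is an assumption based on the description above rather than a documented format, and the function name is made up. It only reports files; it never deletes anything.

```python
from pathlib import Path
from typing import List


def find_device_suffixed_copies(root, device_name: str) -> List[Path]:
    """Files named '<stem>-<device_name><ext>' whose '<stem><ext>' sibling also exists."""
    suffix = f"-{device_name}"
    matches = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or not path.stem.endswith(suffix):
            continue
        if len(path.stem) <= len(suffix):
            continue  # nothing left of the name once the suffix is removed
        original = path.with_name(path.stem[: -len(suffix)] + path.suffix)
        if original.is_file():
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical folder and device name; replace with your own.
    for copy in find_device_suffixed_copies(".", "EDITPC"):
        print(copy)
```

Compare each reported file with its original (size, date, contents) before removing it, since a duplicate created by a real edit conflict may hold the newer version.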
It is highly annoying that this does not work as it is supposed to.
First option:
Try resetting the OneDrive database (there are a number of tutorials online) and see if this fixes the issue. Leave the computer online for several days and see if this resolves it; don't make any file changes. Sometimes it works, sometimes not...
If not, I would do the following:
- Make backup of all files, make sure you have a "Master" where you know you have ALL files. Clean up duplicate files
- Disconnect the OneDrive service on your PC
- Delete all files in your local OneDrive
- Delete all files in the Online OneDrive
- Copy only one file or folder after another. Always wait until syncing has finished. In this time, I would not use it as a working directory (I know this is annoying...).
I have actually done it with a workaround as well: I left the files where they were and synced to an empty folder. Then I placed a symlink to the first folder at the original place and let it sync, then did the next one, and so on.
If you want, you can replace the symlinks with the actual folders. Just make sure you and OneDrive are offline when you do the moving around (remove the symlink, place the original folder inside the sync folder).
What I found is simple: OneDrive does not like too many files changed at once and it seems to have more of an issue with plenty of small files than with fewer large files.
Also, there is a path length restriction, so I placed the OneDrive folder on the top level of a drive letter I created specially for OneDrive. Since I have multiple accounts, I need to be at least one folder deep, so the OneDrive folder is in a one-letter folder 🙂
Like: W:\M\Onedrive
Where "Onedrive" is the automatically created folder by OneDrive
Other Accounts follow the same rule with other letters. I have multiple log-ons on the computer with each Account syncing its own folder.
Again, with symlinks, I pull the various places/accounts into one folder as a working folder on my PC. Having multiple accounts (Family) helps keep the databases smaller on each account, which so far works for me. So I have all the software and backups on my OneDrive account, and my wife has all the pictures and private files. My kids have all my music and videos.
Each has a folder with all the data pulled into one folder.
Since I had one account left, I created one OneDrive account for my relatives, in which we keep all the common pictures from the family gatherings, holidays and trips we made together. It is also used as an exchange folder for sharing larger files by e-mail. This way I do not need to have a share in my private files (location).
One thing I forgot:
I almost never switch off my PC nowadays, otherwise OneDrive seems to be unable to catch up... From a security point of view, this is of course a terrible practice...