Mar 18 2019 07:55 AM
I thought I'd post some observations about the OneDrive sync client that aren't documented anywhere but that we needed to figure out when planning a massive move to SharePoint from on-premises file servers:
Limits:
Microsoft documents that you shouldn't sync more than 300,000 files across all libraries that the client is connected to, but there was no documentation about Files on Demand limits, and we have observed the following:
The OneDrive client will fail when the dat file that stores object metadata reaches exactly 2GB in size (%localappdata%\Microsoft\OneDrive\settings\Business1). Now, while Microsoft says you shouldn't sync more than 300,000 files, you can connect using Files on Demand to libraries that contain more than this. The trick here is that in this case, the total number of files and folders matters; let's call them collectively "objects". (Interestingly, when you first connect to a library and the client says "Processing changes" and gives you a count, "changes" is the total number of objects in the library that it's bringing down using Files on Demand and storing in the dat file.)
My suspicion is that since the OneDrive client is still 32-bit, it's still subject to certain 32-bit process restrictions, but I don't really know. What matters in this case is that up until build 19.033.0218.0009 (19.033.0218.0006 insiders build), the client would fill up the dat file and reach the 2GB limit after about 700,000-800,000 objects. After build 19.033.0218.0009, it appears that the client has been optimized and no longer needs to store quite as much metadata about each object, "increasing" the upper limit of Files on Demand. (It seems that in general, each object takes up just over 1KB of data in the dat file, putting the limit somewhere just under 2 million objects.) Keep in mind, this is not per library, this is across all libraries, including OneDrive for Business (personal storage), SharePoint Document Libraries, etc.
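As a rough back-of-the-envelope check of those numbers, here is a minimal sketch. The 2GB ceiling and the "just over 1KB per object" figure are the observations above; the exact value of 1,100 bytes and the function name are assumptions for illustration only, not anything Microsoft documents.

```python
# Rough estimate of how many objects fit in the OneDrive dat file
# before it reaches the observed 2GB ceiling.
DAT_FILE_LIMIT_BYTES = 2 * 1024**3  # observed ceiling, not a documented limit


def max_objects(bytes_per_object: int) -> int:
    """Return how many objects fit before the dat file hits the ceiling."""
    return DAT_FILE_LIMIT_BYTES // bytes_per_object


# "Just over 1KB" per object; 1,100 bytes is an assumed figure.
estimate = max_objects(1100)
print(f"Approximate object ceiling: {estimate:,}")
```

Plugging in a larger per-object size (for example, something near 2.8KB) lands in the 700,000-800,000 range seen on the older builds, which is consistent with the idea that newer builds store less metadata per object.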
Performance:
The client's performance has improved significantly and quickly as the team refines each new build, but there are some things to be aware of before you start connecting clients to large libraries:
It. takes. forever.
The more objects in a library, the longer it's going to take for the client to build its local cache of Files on Demand copies of all the items in the library. It seems that in general, the client can process about 50 objects per second. If you were connecting to a library or multiple libraries that had 1.4 million objects, it would take around 8 hours before the client is "caught up".
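The 8-hour figure follows directly from that rate. A minimal sketch of the arithmetic, assuming the roughly 50 objects per second observed above holds steady (the function name is just for illustration):

```python
def hours_to_catch_up(object_count: int, objects_per_second: float = 50.0) -> float:
    """Estimate initial processing time, in hours, at a steady rate."""
    seconds = object_count / objects_per_second
    return seconds / 3600


# 1.4 million objects at ~50 objects per second
print(round(hours_to_catch_up(1_400_000), 1))  # 7.8
```

In practice the rate will vary with hardware and with whatever else the machine is doing, so treat this as a planning estimate rather than a guarantee.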
During the time that the content is being built out locally, Windows processes will also consume a large quantity of system resources. Specifically, explorer.exe and the Search Indexer will consume a lot of CPU and disk as they process the data that the client is building out.
The more resources you have, the better this experience will be. On a moderately powered brand new Latitude with an i5, 8GB of Memory and an SSD OS Drive, the machine's CPU was pretty heavily taxed (over 80% CPU) for over 8 hours connecting to libraries with around 1.5 million objects. On a much more powerful PC with an i7 and 16GB of memory, the strain was closer to 30% CPU, which wouldn't cripple an end user while they wait for the client and Windows to finish processing data. But, most organizations don't deploy $2000 computers to everyone, so be mindful when planning your Team-Site automount policies.
Restarts can be painful. When the OS boots back up, OneDrive has to figure out what changed in the libraries in the cloud and compare that to its local cache. I've seen this process take anywhere from 15 minutes to over an hour after restarts, depending on how many objects are in the cache.
Also, if you're connected to a large number of objects in the local cache, you can expect OneDrive to routinely use about a third of CPU on an i5 processor trying to keep itself up to date. This doesn't appear to interfere with the overall performance of the client, but it's an expensive process.
Hopefully over time this will continue to improve, especially as more organizations like mine move massive amounts of data up into SharePoint and retire on-premises file servers. If I had to make a design suggestion or two:
- If SharePoint could pre-build a generic metadata file that a client could download on first connection, it would significantly reduce the time it takes to set up a client initially.
- Roll the Activity Log into an API that would allow the client to poll for changes since the last restart (this could also significantly improve the performance of migration products, as they wouldn't have to scan every object in a library when performing delta syncs, and would reduce the load on Microsoft's API endpoints when organizations perform mass migrations)
- Windows, to the best of my knowledge, doesn't have a mechanism to track changes on disk, i.e. "what recursively changed in this directory tree in the last x timeframe". If it were possible to do this, Windows and SharePoint could eliminate most of the overhead that the OneDrive client has to shoulder on its own to keep itself up to date.
Speaking to OneDrive engineers at Ignite last year, support for larger libraries is high on their radar, and it's apparent in this latest production release that they are keeping their word on prioritizing iterative improvements for large libraries. If you haven't yet started mass data migrations into SharePoint, I can't stress enough the importance of deeply analyzing your data and understanding what people need access to and structuring your libraries and permissions accordingly. We used PowerBI to analyze our file server content and it was an invaluable tool in our planning.
Happy to chat with anyone struggling with similar issues and share what we did to resolve them. Happy SharePointing!
P.S., shoutout to the OneDrive Product Team, you guys are doing great, love what you've done with the OneDrive client, but for IT Pros struggling with competing product limits and business requirements, documenting behind the scenes technical data and sharing more of the roadmap would be incredibly valuable in helping our companies adopt or plan to adopt OneDrive and SharePoint.
Jan 09 2020 02:50 PM
@Dustin Adam I am wondering if you could share any updates on your experience with using OneDrive for Business Sync for large libraries. We have a use case where the company is attempting to replace certain network drives used by many users, for example, a drive with 5,000+ folders and 7,500+ files.
Would you be able to offer any thoughts or input on such a plan looking forward into 2020? Any input would be greatly appreciated!
Jan 09 2020 03:26 PM - edited Jan 09 2020 03:45 PM
Jan 10 2020 07:08 AM
A couple things to look for and consider:
One of the things we've discovered that isn't really documented anywhere is that the more content you shove into a single Document Library, the worse that library performs. Adding additional Indexes to the Library manually can help, but in general, the fewer items you put into a library the better. This becomes apparent even when browsing the library via the UI: a library with fewer total objects browses faster than one with hundreds of thousands. The Sync client will ultimately be affected by that increased overhead as well: when it makes API calls to detect or replicate changes, it's going to take longer to complete. We learned this the hard way ourselves and are actively working on breaking up our libraries. If you haven't migrated data yet, find a way to break up your content into as many libraries as possible to reduce the volume in each one. Also bear in mind that, as is the nature of all file storage, it never gets smaller because nobody ever deletes anything. If you start with an overly large library, your experience will never get better from that point.
I know that the eventual goal is to get the OD Client to gracefully handle syncing up to a million objects, but that hasn't been publicly communicated and there is no timeline for when that might be realized.
Jan 29 2020 01:16 PM
Feb 13 2020 11:23 PM
@Dustin Adam is there a way to enforce online/cloud only when using OneDrive vs Files On Demand? I know this is a completely different architecture, but when dealing with all these issues and user complaints, comparing it to a Google Drive implementation for enterprise, Google seems to have gone with a 'make it look/work like a mapped network drive' approach. They don't need to constantly sync and check what's changed as far as I can tell. Staff who do want to use OneDrive instead of the browser really just want the explorer view if they are in that transactional type of role. If there are staff that want an offline option, they can just do the right-click - keep offline ad hoc (basically as it is now).
Feb 14 2020 07:17 AM
Hey Chris;
I'm not sure if this is exactly what you are looking for, but through MDM or ADMX/ADML templates you can enforce the OneDrive Client to use Files On Demand by default:
https://docs.microsoft.com/en-us/onedrive/use-group-policy#FilesOnDemandEnabled
If I misunderstood your question let me know.
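For anyone who wants to see what that policy amounts to on a machine, here is a small sketch that only renders the text of a .reg file. The registry path and value shown are my understanding of what the FilesOnDemandEnabled policy on the linked page sets; treat them as an assumption and verify against that page before deploying anything. The function name is purely illustrative.

```python
def files_on_demand_reg() -> str:
    """Render .reg file text that enables the Files On Demand policy.

    Assumption: the policy maps to a DWORD named FilesOnDemandEnabled
    under the OneDrive policies key, as described in the linked docs.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\OneDrive]",
        '"FilesOnDemandEnabled"=dword:00000001',
        "",
    ]
    # .reg files conventionally use Windows line endings
    return "\r\n".join(lines)


print(files_on_demand_reg())
```

In a managed environment you would normally set this through Group Policy or MDM rather than importing a .reg file; the sketch is only meant to make the setting concrete.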
Feb 16 2020 08:24 PM
@Dustin Adam thanks for the reply. I actually meant the opposite: to prevent any local download/offline files using the OneDrive client and keep it as 'cloud only' access. This would be an attempt to prevent sync performance issues on the client as well as the general conflicts/issues that can occur. I understand the trade-off would be needing reliable internet access. I basically want to replicate the mapped network drive and file server architecture of the past, but use the OneDrive Client and SharePoint Online in its place. I feel this would prevent all the issues in this thread (at least until the sync client is reliable and fast when picking up changes). I suspect that is not an option and 'Files on Demand' is our only choice? I want 'Files Cloud Only' in OneDrive.
Feb 17 2020 12:58 AM
So we took the plunge of moving a file server to OneDrive for Business Plan 2, on request from a client. The migration spanned approx. 10 parent folders (Shared Drives) and roughly 600 000 - 800 000 files in total - 2.4TB. There were one or two folders in excess of 100 000 files which we split out as we learned about the 100 000 limit. All users have the Files on Demand feature enabled and we shared folders from the OBP2 account to respective users (approx. 20 users).
Unfortunately it has been a disaster. The most common issue is that the end users cannot even sync 1 shared folder to their PCs, with Files on Demand enabled. It often just hangs in the "processing changes" state without any files appearing for days on end.
We raised it with Microsoft support (Premier support) - but here's the strange thing. While their communication has been absolutely dismal - what I've gathered between the radio silence and infrequent responses is that they have run some "diagnostics tool" on some of the affected accounts. Within less than an hour suddenly those affected accounts start syncing the shared folders immediately, things start appearing in the app at light speed. It works brilliantly. But then after say 48 hours the user's OneDrive account/sync is "broken" again and just hangs forever.
I've often struggled to gather any precise responses from MS Support team on the issue and what they did when, but the client is now cancelling with MS and wants us to find another solution. Perhaps the scope was too large for OneDrive for Business, or we did it wrong or missed the fine print, but we've also learned a hard lesson that support for the product is also poor and not business ready. I have subsequently cancelled all OneDrive migrations lined up in future for fear of this happening to others.
Feb 17 2020 06:44 AM
Feb 17 2020 06:47 AM
Are there a lot of broken inheritance permissions in the content that your client is syncing? Or does everyone have access to everything? There is a performance switch that can be enabled in the Sync Client if it's handling read-only folders or folders with broken inheritance.
Feb 17 2020 09:24 PM
Feb 18 2020 12:47 AM
@JonnaP Hi. Interested in your case. What is your largest document library in terms of number of files and what is your largest site in terms of number of files (answers may be the same if you only have one doc library in each site).
Feb 18 2020 06:45 AM
Feb 18 2020 12:58 PM
Hi @Dustin Adam ,
Kudos on some valuable insight here. I was wondering if there were any new updates regarding the OneDrive sync limitation increase... I am dying and hoping that something will get released soon.
We have spent most of last year planning our SharePoint collaboration environment and we have re-structured our large department libraries into active (sync'able) and archive (non-sync'able) libraries. We slimmed down the "active" libraries to about 60K sync'able files for our large departments (from millions of files) and only recommend syncing 1 library per department.
However, we still notice a lot of sync delays, conflicts, and performance overhead.
Please let us know if there is a light at the end of the tunnel.
Feb 18 2020 01:22 PM
Feb 18 2020 01:27 PM
@Dustin Adam let's hope it comes even in incremental steps and not with a major release going from a 100k limit to a 1M file limit.
What I've heard from a friend is that there is a partner beta that tries to offload the sync client load by identifying stale files and ignoring them.
We are also struggling with sync issues and we ended up splitting large libraries into many smaller ones 7 months after the initial migration...
Feb 18 2020 01:30 PM - edited Feb 18 2020 01:30 PM
Let's hope so; the product team isn't great at discussing performance roadmap objectives.
Feb 18 2020 09:43 PM - edited Feb 18 2020 09:44 PM
There's a lot of good knowledge in this thread - keep them coming.
So basically, if you have a huge library with 1 million files you will probably run into problems. If you decided to split them into 20 of them with 50k each you will be better off. But if a user decides they need to sync (even though just on-demand) all 20 of them, the user will still run into problems because the total volume of files sync'ed on this particular user's computer is 1 million?
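Going by the first post's observation that the limit applies across all connected libraries rather than per library, the scenario in this question can be sketched as a simple sum. The library names are hypothetical, the 300,000 figure is the Microsoft sync guidance cited at the top of the thread, and folders would add to the object count on top of the file counts used here.

```python
SYNC_GUIDANCE = 300_000  # Microsoft's documented guidance, per the first post


def total_objects(libraries: dict[str, int]) -> int:
    """Sum object counts across every library the client is connected to."""
    return sum(libraries.values())


# 20 hypothetical libraries of 50,000 files each, all connected on one PC
libraries = {f"Library {i}": 50_000 for i in range(1, 21)}
total = total_objects(libraries)

print(f"Total across libraries: {total:,}")
print("Exceeds sync guidance:", total > SYNC_GUIDANCE)
```

So splitting helps per-library performance on the SharePoint side, but a single client connected to all 20 libraries would still be carrying the full 1 million in its local cache.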
Feb 18 2020 09:59 PM - edited Feb 18 2020 10:36 PM
@eddablin Hi there. If I understand the terminology correctly... so all folders/files only exist on a single OneDrive for Business Plan 2 account. That means there's only 1 library. This plan 2 account has 9 parent folders. Unfortunately OneDrive doesn't seem to detail or report the number of files (or I just can't find where in the UI), but it should be around 600 000 - 800 000 files in total.
Each user who has O365 Business Premium license (About 20 users total) has been given shared links to certain folders in the above library, but in most cases the shared link is actually a subfolder and not the entire parent folder, see below...
So in at least 2 parent folders in the library there exist subfolders that initially contained in excess of 100 000 files. Because this caused issues with sharing links to users, and after finding out about the 100 000 limitation, we split out alphabetized variants of the subfolders and shared those instead.
Parent Folder A >
Subfolder A-F (Shared)
Subfolder G-K (Shared)
etc...
I'm now wondering whether, despite us overcoming the shared permission issue with the alphabetized split of the subfolders, the fact that they exist in a parent folder containing in excess of 100 000 objects could still cause the issue we are observing (Files on Demand sync at the client end just hanging and changes taking forever to process).
I'm now considering moving the subfolders out of the parent folder and re-sharing to users so they can sync to their PCs and see what happens...
UPDATE: I signed into a user's OneDrive account, accessed his "Shared With Me" section, clicked sync on a folder shared from the OneDrive Plan 2 library/account that contains 20 000 items only - unfortunately it's not syncing to my PC. Hangs on processing changes, nothing comes through - occasionally says processing 501 or 502 changes then back to nothing again. There goes that theory :( So moving things out of a parent folder to remove the 100 000 concern probably won't help.
My next plan is to move one of the folders to another OneDrive account and try sharing/syncing that to the user's account.
Feb 18 2020 10:37 PM