OneDrive Client, Files on Demand and Syncing large libraries


I thought I'd post some observations about the OneDrive sync client that aren't documented anywhere, but that we needed to figure out when planning a massive move from on-premises file servers to SharePoint:

 

Limits:

 

Microsoft documents that you shouldn't sync more than 300,000 files across all libraries the client is connected to, but there is no documentation about Files on Demand limits. We have observed the following:

 

The OneDrive client will fail when the .dat file that stores object metadata (%localappdata%\Microsoft\OneDrive\settings\Business1) reaches exactly 2GB in size. Now, while Microsoft says you shouldn't sync more than 300,000 files, you can connect using Files on Demand to libraries that contain more than this. The trick here is that in this case, the total number of files and folders matters; let's call them collectively "objects". (Interestingly, when you first connect to a library and the client says "Process changes" and gives you a count, "changes" is the total number of objects in the library that it's bringing down using Files on Demand and storing in the .dat file.)

 

My suspicion is that since the OneDrive client is still 32-bit, it's still subject to certain 32-bit process restrictions, but I don't really know. What matters in this case is that up until build 19.033.0218.0009 (19.033.0218.0006 Insiders build), the client would fill up the .dat file and reach the 2GB limit after about 700,000-800,000 objects. After build 19.033.0218.0009, it appears that the client has been optimized and no longer needs to store quite as much metadata about each object, "increasing" the upper limit of Files on Demand. (It seems that, in general, each object takes up just over 1KB of data in the .dat file, putting the limit somewhere just under 2 million objects.) Keep in mind, this is not per library; this is across all libraries, including OneDrive for Business (personal storage), SharePoint document libraries, etc.
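If you want to keep an eye on this on a deployed client, here's a quick back-of-envelope check in Python. Both the 2GB ceiling and the ~1KB-per-object figure are our observations above, not documented numbers, so treat the output as a rough estimate:

```python
# Back-of-envelope: how close is the Files on Demand metadata (.dat) file to
# the observed 2GB ceiling, and roughly how much headroom is left at ~1KB of
# metadata per object? Both figures are observations, not documented limits.
import glob
import os

settings = os.path.expandvars(r"%LOCALAPPDATA%\Microsoft\OneDrive\settings\Business1")
LIMIT = 2 * 1024 ** 3       # observed hard ceiling: 2GB
PER_OBJECT = 1100           # observed: just over 1KB of metadata per object

for dat in glob.glob(os.path.join(settings, "*.dat")):
    size = os.path.getsize(dat)
    print(f"{os.path.basename(dat)}: {size / 1024 ** 2:,.0f} MB "
          f"({size / LIMIT:.0%} of the ceiling, roughly "
          f"{(LIMIT - size) // PER_OBJECT:,} objects of headroom)")
```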

 

Performance:

 

The client's performance has improved significantly as each new build is refined, but there are some things to be aware of before you start connecting clients to large libraries:

 

It. takes. forever. 

 

The more objects in a library, the longer it's going to take for the client to build its local cache of Files on Demand copies of all the items in the library. It seems that, in general, the client can process about 50 objects per second, so if you were connecting to a library or multiple libraries totaling 1.4 million objects, it would take around 8 hours before the client is "caught up".
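The planning arithmetic is simple, assuming the ~50 objects/second rate we observed holds on your hardware:

```python
# 1.4 million objects at ~50 objects/second (an observed rate, not a spec):
objects, rate = 1_400_000, 50
print(f"{objects / rate / 3600:.1f} hours")   # -> 7.8 hours of initial processing
```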

 

During the time that the content is being built out locally, Windows processes will also consume a large quantity of system resources. Specifically, explorer.exe and the Search Indexer will consume a lot of CPU and disk as they process the data that the client is building out.

 

The more resources you have, the better this experience will be. On a moderately powered, brand-new Latitude with an i5, 8GB of memory, and an SSD OS drive, the machine's CPU was pretty heavily taxed (over 80% CPU) for over 8 hours while connecting to libraries with around 1.5 million objects. On a much more powerful PC with an i7 and 16GB of memory, the strain was closer to 30% CPU, which wouldn't cripple an end user while they wait for the client and Windows to finish processing data. But most organizations don't deploy $2,000 computers to everyone, so be mindful when planning your Team-Site automount policies.

 

Restarts can be painful. When the OS boots back up, OneDrive has to figure out what changed in the libraries in the cloud and compare that to its local cache. I've seen this process take anywhere from 15 minutes to over an hour after restarts, depending on how many objects are in the cache.

 

Also, if you're connected to a large number of objects in the local cache, you can expect OneDrive to routinely use about a third of the CPU on an i5 processor just trying to keep itself up to date. This doesn't appear to interfere with the overall performance of the client, but it's an expensive process.
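If you want to put a number on that on your own hardware, a quick sample with the third-party psutil package works; nothing here is OneDrive-specific, it just measures the process over a five-second window:

```python
# Generic measurement of what OneDrive.exe costs in CPU, using the
# third-party psutil package (pip install psutil).
import time
import psutil

onedrive = [p for p in psutil.process_iter(attrs=["name"])
            if (p.info["name"] or "").lower() == "onedrive.exe"]

for p in onedrive:
    p.cpu_percent(None)        # first call primes the per-process counter

time.sleep(5)                  # sample window

for p in onedrive:
    # cpu_percent() is relative to a single core; divide by the core count
    # to get the share of the whole machine.
    share = p.cpu_percent(None) / psutil.cpu_count()
    print(f"PID {p.pid}: {share:.1f}% of total CPU")
```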

 

Hopefully over time this will continue to improve, especially as more organizations like mine move massive amounts of data up into SharePoint and retire on-premises file servers. If I had to make a design suggestion or two:

 

- If SharePoint could pre-build a generic metadata file that a client could download on first connection, it would significantly reduce the time it takes to set up a client initially.

- Roll the Activity Log into an API that would allow the client to poll for changes since the last restart (see the sketch after this list). This could also significantly improve the performance of migration products, as they wouldn't have to scan every object in a library when performing delta syncs, and it would reduce the load on Microsoft's API endpoints when organizations perform mass migrations.

- Windows, to the best of my knowledge, doesn't have a mechanism to track changes on disk, i.e. "what recursively changed in this directory tree in the last x timeframe". If that were possible, Windows and SharePoint could eliminate most of the overhead that the OneDrive client has to shoulder on its own to keep itself up to date.
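For reference, Microsoft Graph does expose a per-drive delta endpoint that points in this direction; a minimal sketch of what client-side polling could look like, with the drive id and access token as placeholders and no OAuth or error handling:

```python
# Minimal sketch of delta polling against Microsoft Graph. drive_id and
# access_token are placeholders; real code needs proper auth and retries.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def poll_changes(drive_id, access_token, delta_link=None):
    """One polling pass; returns (changed_items, delta_link for the next pass)."""
    url = delta_link or f"{GRAPH}/drives/{drive_id}/root/delta"
    headers = {"Authorization": f"Bearer {access_token}"}
    items = []
    while url:
        page = requests.get(url, headers=headers).json()
        items.extend(page.get("value", []))
        delta_link = page.get("@odata.deltaLink", delta_link)  # save for next pass
        url = page.get("@odata.nextLink")                      # more pages this pass
    return items, delta_link
```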

 

From speaking with OneDrive engineers at Ignite last year, support for larger libraries is high on their radar, and it's apparent in this latest production release that they are keeping their word on prioritizing iterative improvements for large libraries. If you haven't yet started mass data migrations into SharePoint, I can't stress enough the importance of deeply analyzing your data, understanding what people need access to, and structuring your libraries and permissions accordingly. We used Power BI to analyze our file server content, and it was an invaluable tool in our planning.

 

Happy to chat with anyone struggling with similar issues and share what we did to resolve them. Happy SharePointing!

 

P.S. Shoutout to the OneDrive product team: you guys are doing great, and I love what you've done with the OneDrive client. But for IT pros struggling with competing product limits and business requirements, documenting behind-the-scenes technical data and sharing more of the roadmap would be incredibly valuable in helping our companies adopt, or plan to adopt, OneDrive and SharePoint.

 

 

69 Replies

Thanks for sharing these insights, Dustin; it's interesting to hear how OneDrive performs under these conditions and how it might work even better.

 

Out of interest, do all your users sync document libraries, as an alternative to the mapped network drives they previously had on-premises?

@Cian Allner 

Actually, no, as I am not a glutton for punishment and didn't want our Service Desk to plot to murder me. For most people, we've switched to using the browser to navigate the content libraries and reserved the sync client for workloads that require it (CAD drawings, etc.). Interestingly, though, this has made mobile access much easier, and a lot of our users will be getting in through the iOS OneDrive client.

 

We haven't seen it yet, but one fear that I'm sure will present itself at some point is the fact that you can't granularly control who has the right to sync a library; it's either on or off for the entire document library. So I'm sure we will be fielding helpdesk requests for broken clients that tried to connect to too much data, even though they should have been browser-only users.

 

As a side note, for anyone interested:

 

If you have a client that has become over-subscribed, you can recover it and manually remove specific libraries from the sync scope by:

 

1.) forcefully close the OneDrive client in task manager

2.) navigate to C:\Users\user\AppData\Local\Microsoft\OneDrive\settings\Business1

3.) open the {GUID}.ini file

4.) find the sync target entry that begins with either "libraryScope" or "libraryFolder" and delete it and save the ini file

5.) next, open regedit as the local user and navigate to Computer\HKEY_CURRENT_USER\Software\Microsoft\OneDrive\Accounts\Business1\ScopeIdToMountPointPathCache

6.) locate the REG_SZ entry that corresponds to the SharePoint location that you deleted from the ini file and delete it.

7.) perform a reset on the OneDrive client (onedrive.exe /reset)

 

This will force OneDrive to re-map all local files in scope, but it will ignore the location that was manually removed. It is now safe to delete the orphaned files from the library that was removed from the ini file and the registry.
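If you have to do this on more than a couple of machines, the steps above could be scripted. Here's a rough Python sketch; the {GUID}.ini file name, the ini encoding (UTF-16 assumed), and the match string are all placeholders you'd need to verify, so test on a throwaway profile first:

```python
# Hypothetical sketch of steps 1-7 above; the ini name, its encoding, and
# the match string are placeholders. Run at your own risk.
import os
import winreg

settings = os.path.expandvars(r"%LOCALAPPDATA%\Microsoft\OneDrive\settings\Business1")
ini_path = os.path.join(settings, "YOUR-GUID.ini")   # placeholder file name
needle = "TeamSiteToRemove"                          # placeholder: part of the library URL

os.system("taskkill /f /im OneDrive.exe")            # step 1: kill the client

# Step 4: drop the libraryScope/libraryFolder line that references the library.
with open(ini_path, encoding="utf-16") as f:         # encoding is an assumption
    lines = f.readlines()
with open(ini_path, "w", encoding="utf-16") as f:
    f.writelines(l for l in lines
                 if not (l.startswith(("libraryScope", "libraryFolder")) and needle in l))

# Steps 5-6: delete the matching value under ScopeIdToMountPointPathCache.
key_path = r"Software\Microsoft\OneDrive\Accounts\Business1\ScopeIdToMountPointPathCache"
with winreg.OpenKey(winreg.HKEY_CURRENT_USER, key_path, 0, winreg.KEY_ALL_ACCESS) as key:
    stale, i = [], 0
    while True:
        try:
            name, value, _ = winreg.EnumValue(key, i)
        except OSError:                              # no more values
            break
        if needle in str(value):
            stale.append(name)
        i += 1
    for name in stale:
        winreg.DeleteValue(key, name)

# Step 7: reset the client (exe path is the usual per-user install location).
os.system(r'"%LOCALAPPDATA%\Microsoft\OneDrive\OneDrive.exe" /reset')
```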

Thanks, that makes sense.  

 

I think it's amazing how far the sync client has come from the old days of Groove.exe; workloads that wouldn't have been feasible before can now be handled with ease.

@Cian Allner 

Oh yeah, the progress has been spectacular, especially with their aggressive release cadence.

@Dustin Adam thanks for sharing your findings. We have multiple customers struggling with OneDrive and Files on Demand. We see the same CPU peaks when 'syncing' large SPO libraries.
How to continue, that's the question....

Customers are thinking about leaving OneDrive SPO sync and going back to the old network connection approach, which also comes with other issues, like filename restrictions and session time-outs. The last resort would be to go browser-only, but this is a huge change for users, well, for most of them.

@Martijn Steffens 

I can understand why you'd have customers thinking twice about their OneDrive deployments. In our case, we were... fortunate? in learning about the pitfalls in advance, and we spent an inordinate amount of time planning before making the transition. As powerful as the technologies are, they are still plagued with a large number of upper limits that significantly raise the bar to entry compared with other cloud storage platforms:

 

List view thresholds, the 50,000-ACL limit per library, API throttling, 1,000 clients per library, sync limits, etc. Not only did we have to account for all these limits and understand them in great detail, we had to translate our on-premises content and find ways of making it fit inside SharePoint.

 

Not all organizations understand these limits, or are willing to expend the energy that we did to move their content in such a way as to make it work. Case in point: we had a custom Azure web application developed that would propagate and manage a very granular permissions model across our content in SharePoint, so that we could both reduce our risk profile by restricting what content could be shared externally and have a mechanism to reduce the load on a sync client. It was an elegant solution to a complicated problem, but again... the bar to entry was high.

 

So far, the "answer" for us has been to place an exceptionally heavy emphasis on permissions, which grants us the administrative control needed to manage within the limits.

@Martijn Steffens If you are considering the drive mapping approach take a look at Zee Drive. Zee Drive maps network drives to OneDrive for Business and SharePoint Online. Zee Drive will keep users authenticated and it also provides a number of useful productivity features off the File Explorer context menu.

 

Myles

Update:

 

I recently got to have a conversation with some people from the OneDrive engineering team regarding an issue we were having that stemmed from a recent code update to SharePoint that impacted the way OneDrive handles document library content (our tenant is on First Release).

 

In short, our permissions model was specifically designed to take advantage of client behavior wherein, if a user had access to content, but NOT the parent folder, OneDrive would not try to represent that content in the local Files on Demand cache file and would only process the data that the user could navigate to.

 

We had various reasons for doing this, from effectively hiding content that was stale or archived to reducing the load on the OneDrive iOS and Windows clients.

 

This behavior was driven by what Microsoft apparently saw as a problem, which they referred to as Gap Folders: folders whose permissions had a gap from the parent, and which the Sync Client therefore couldn't synchronize.

 

Well, they recently pushed a code update to SharePoint that addressed this issue, allowing the client to understand all content in a library that a user has permissions to, which undoubtedly solved sync issues for many customers. Unfortunately, this introduced a huge issue for us, as over 80% of our content is considered archived and should never be synced locally, even though it's routinely accessed as historical content. We had taken advantage of the SharePoint permissions model and the client to manage this in a way that, to us, was an elegant and easy-to-manage solution.

 

If you're using gapped permissions in a similar manner with clients actively deployed, be aware that there are code changes coming down the pike that could cause clients to get overloaded and fail if they are suddenly presented with significantly more data than had been planned for.

 

They are actively evaluating this issue now, but I don't have an update yet on what the path forward will look like, as there are legitimate use cases both for wanting the client to see gapped content and for not.

Thanks for the detailed findings and explanation. By any chance, are you using RMS for your OneDrive for Business or SharePoint? Would love to hear your thoughts.
Not at the moment, no, though it's on our roadmap. We are in the early planning stages of an RMS rollout; part of the challenge is that we are leveraging a wide range of other Microsoft technologies, such as Cloud App Security, and we want to make sure we take advantage of the full ecosystem across CAS, Azure RMS, SharePoint, Windows Information Protection, etc. I'm sure once we get that far I'll have more to share :)
@Dustin Adam This is a great write-up. I am wondering if there is a way to prevent users from syncing libraries with the ODFB client, or maybe a way to allow syncing for some libraries and not others.

Document library cleanup is something I have been dealing with for years. My thought is that if you have large libraries with so many objects in them, it would make sense to start moving stale content to a different library.
Now that I am thinking about it, would creating an active document library and an archive document library serve you? For example, you could technically use Information Management Policies to move documents that haven't been touched in X amount of time to an archive library, thus freeing up the total number of objects. With the new persistent URLs in SPO, you wouldn't need to worry about links breaking.
Perhaps I am completely oversimplifying the complexity of your library. But if Microsoft isn't providing a way to limit what you can sync, then we would have to MacGyver this out?

Nando
Hey Nando,

As it happens I managed to get in contact with the product development team and they’ve been working with us over the last month or so regarding sync and large libraries.

Long story short: they are actively making changes to the sync client, based on our feedback and use case, that will see the upper limits rise fairly substantially. In addition, there is a neat little trick we learned from the SharePoint product team:

If you don’t want to restrict sync for an entire library, but only a subset of content inside a library, you can set “Restricted View” permissions on a parent folder, yet still add higher permissions lower down in the tree. This causes the sync client to ignore all content below the Restricted View permission, effectively allowing you to segregate content inside a library that you want to allow the local sync client to bring down.

We have an NDA with Microsoft so I can’t disclose any detail, but large library sync will be getting better within the next few months, with more to come later in the year.

@Dustin Adam 

@Microsoft OneDrive team

 

Hi Dustin,

 

Many thanks for your write-up and the effort you've taken. Funnily enough, I do not have an issue with the Business client, as I use Office 365 with OneDrive privately. I have close to 900GB of data and over 400,000 files (and that still does not count my recorded TV library, with another 7TB).

 

I am regularly having issues with the sync hanging and the CPU loaded to 50-70% (on a 3.2GHz quad-core machine with 16GB of RAM...). Restarting the client takes a long time before it picks up its work again, and is only a temporary solution.

Since I am on a slow network (not by choice; faster service just isn't available in the small village where I live), Files on Demand is not an option. Also, I really would like to have almost all the files on all machines (5 total) at all times (with the exception of the video and music libraries on my 2 laptops).

One thing that seems to hang the OD client for sure:
It really doesn't like similar file names containing the German character "ß" (which happens often in my music library).

i.e.

fussball.jpg

fußball.jpg 

 

Such names seem to look identical to the OD client, and it will hang! Once I deleted one of the similar files (it doesn't matter whether I delete the file with "ss" or the one with "ß"!), syncing went on smooth(er). This is a real problem in German-speaking countries!

Maybe other countries have similar letters which also cause this issue.
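A plausible explanation (just my guess, nothing Microsoft has confirmed): if the client compares names case-insensitively with full Unicode case folding, "ß" and "ss" become the same name. Python shows the collision:

```python
# Assumed mechanism, not confirmed: under full Unicode case folding,
# "ß" folds to "ss", so the two names below look identical to any
# case-insensitive comparison that uses it.
a, b = "fussball.jpg", "fußball.jpg"
print(a == b)                        # False: the raw strings differ
print(a.casefold() == b.casefold())  # True: 'ß'.casefold() == 'ss'
```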

 

It also seems to hang often on my ".odg" files (the OpenOffice/LibreOffice Draw compressed vector drawing format), which I use extensively for my renovation work. It's highly annoying to discover that a file has not been synced, leaving you with two drawing files (with 20 layers each) and no way to find out what you have changed in either of them... And we are not talking large files; they are just below 200KB!

TeamViewer has saved my a.. on many occasions after a mess-up by OneDrive. For this reason I often leave my computers on all the time. Not really great for money, materials, or the environment.

 

Which brings me to another topic:

Why isn't it possible for OneDrive to "see" which folder you are in or which files you used last, and sync those folders/files with higher priority?

Enable marking certain folders as priority over others manually?

Or, even better, make a "force sync on this file" option available. This would really be useful if you are on a slow network or have to shut down your laptop (sometimes due to low battery and/or plane boarding), especially in a business environment. I have often had to wait around for 30 minutes or more for OD to pick up the one particular file that I needed on a business trip overseas or wanted to have online to share with my colleagues.

Great post @Dustin Adam - thanks very much for taking the time to write it all up.

I've been struggling with a migration that we did for a client earlier in the year. We moved all their documents from an old on-prem file server into various SharePoint libraries and have them synced down to the client workstations with OneDrive / Files on Demand. The libraries are big (199k items) but still well inside the limits of the service. However, we've been experiencing all manner of problems using it in practice: OneDrive sync agent errors, very slow sync, etc. What you've outlined above goes a long way toward explaining the problems we've been seeing.

I'm glad that you've been working with the OneDrive team on this, and I'm REALLY looking forward to the product updates that you've alluded to. They can't come quickly enough for me.

Cheers,
Matt

@Matt_Ignite 

 

They are working towards higher limits. Part of the problem stems from the fact that they were trying to bring the Cloud Experience down to the desktop with the highest fidelity possible. By most measurements, they succeeded. For example, renaming or moving files can be completed without downloading the file to the desktop, it simply brokers the API calls that complete the action in OD or SharePoint. This is super cool, but it begins to fall over when the number of objects starts to increase.

 

Windows itself was never designed to support this sort of integration, so there isn't any robust or elegant way to poll the drive and ask, "what changed?" This leaves the client itself to scan periodically for changes to Files on Demand content, even if it hasn't been downloaded, because action can still be taken on those objects.

 

While I can't give you any specific number, I can tell you that the total number of objects their client is about to be able to handle without crashing is significantly higher. But that's really just the first step: while the client is getting more bullet-proof at handling large numbers of objects, there is still more work to do on performance. The more objects you bring down, the longer it will take for the client to detect changes and complete a replication either to or from the cloud.

 

They are making significant headway, and I can assure you they care more than you might imagine about the limits the client can handle. Hang in there, it's getting better ;)

Thank you very much @dustin adams for starting this thread!
Could you post the link to the "documented" 300,000-file ceiling recommendation?
Our computers are blue-screening from the load of OneDrive, and this is without hydrating any files!

Also, does anybody have other references on the so-called "gap folders" and the behavior of "Restricted View" permissions?

@Diego Vasquez

 

All the current technical documentation regarding the Sync Client and its limits can be found here:

 

https://support.office.com/en-us/article/invalid-file-names-and-file-types-in-onedrive-onedrive-for-...

 

Interestingly, and I'm assuming this is based on feedback, the upper sync limit has been revised down from 300,000 files to 100,000.

 

There isn't any formal documentation regarding the behavior of Restricted View and its impact on the sync client, but in our testing it does honor the underlying permission limits defined by the permission level:

 

[Attached screenshot: Restricted View.PNG, the "Restricted View" permission level definition]

The critical thing to understand is that this is a recursive permission: it will cascade down the rest of the directory tree beneath where it is defined, regardless of any other permissions a user may have (even if you have broken inheritance further down). In our use case, we applied this permission to the "Archive" folders inside our document libraries, effectively allowing our colleagues to access archived content in the browser while restricting their ability to download it with the sync client. Not only does this help protect the archived content, it relieves stress on the client.

 

I would recommend that anyone currently struggling with overloaded clients take a second look at their cloud content and determine whether the data in a Document Library really NEEDS to be synced. If not, there is probably a way to relieve client stress by implementing Restricted View on parent folders that hold content that can effectively remain cloud-only, and then re-assigning Contribute permissions below that level (see the sketch below).
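If you want to script that across many folders, here is a rough sketch against the SharePoint REST API. The site URL, bearer token, folder path, and principal ID are all placeholders, and this is a sketch of the approach, not production code; you'd still need to re-grant Contribute further down the tree as described above:

```python
# Rough sketch: grant "Restricted View" on a parent folder via the SharePoint
# REST API. SITE, the token, the folder path, and the principal id are
# placeholders; verify behavior on a test site before running at scale.
import requests

SITE = "https://contoso.sharepoint.com/sites/TeamSite"   # placeholder tenant
HEADERS = {
    "Authorization": "Bearer <access-token>",            # placeholder token
    "Accept": "application/json;odata=verbose",
}

def restrict_folder(folder_url: str, principal_id: int) -> None:
    # Look up the id of the built-in "Restricted View" role definition.
    role_id = requests.get(
        f"{SITE}/_api/web/roledefinitions/getbyname('Restricted View')",
        headers=HEADERS).json()["d"]["Id"]

    item = (f"{SITE}/_api/web/GetFolderByServerRelativeUrl('{folder_url}')"
            "/ListItemAllFields")
    # Break inheritance on the folder, then assign Restricted View to the principal.
    requests.post(f"{item}/breakroleinheritance(copyRoleAssignments=false,"
                  "clearSubscopes=true)", headers=HEADERS)
    requests.post(f"{item}/roleassignments/addroleassignment"
                  f"(principalid={principal_id},roledefid={role_id})",
                  headers=HEADERS)

# e.g. restrict_folder("/sites/TeamSite/Shared Documents/Archive", 12)
```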

We still face issues with large lists, even if we sync a single folder. Any update on this?

@jab365cloud

 

There are a lot of variables involved that can impact sync performance. One thing to note however:

 

The OneDrive Team has been working on a method to sync individual folders that will be more performant; it's related to the notion of a "Sync Root". Essentially, SharePoint currently only allows the Sync Root to be the root of a Document Library. How this manifests in practice is that the Sync Client will download ALL the metadata for the entire library, even if you're only syncing a single subfolder. When they do manage to finish the work to allow SharePoint to set the Sync Root at a different level, you'll see improved performance when synchronizing a single subfolder.

 

However, another thing to keep in mind (that my organization learned the hard way), is that large libraries place additional pressure on the SQL infrastructure behind the scenes. Even if everything is working, you'll notice that all operations against a larger library will be slower across the board.

 

We are in the process of breaking up our larger libraries into smaller ones to ensure that we can maintain better performance in web browsing, sync, etc.