SOLVED

Migrating from network drive to SharePoint

Copper Contributor

We currently work with a 5TB network drive with 2,400,000 files, adding 15,000 files per month. The drive is organized with one folder per year, then one subfolder per month, etc.

 

I was looking into migrating the network drive to one SharePoint library with Files On-Demand, but I found this page that says "for optimum performance we recommend syncing no more than 300,000 files across all document libraries".

 

So... if I can't use one library and I can't split it in multiple libraries, how do I organize it?

Is SharePoint not the right tool?

18 Replies

@stenci The key word in that statement is SYNCING: don't use the OneDrive sync client to pull down libraries where the combined total of items exceeds 300,000, as that is a known performance issue. Document libraries in SharePoint Online are built to handle large amounts of content, but there are best practices and setup considerations before migrating a large number of files to a single library.

@Timothy Balk Smart caching / on-demand file sync is a requirement for us, so is SharePoint not the right tool?

 

Is it possible to configure the library with on-demand sync for the last year and use hand-picked sync for the older folders?

 

For example, we could keep the last year (about 180,000 files) available with on-demand caching, while the older content, without on-demand caching, would only be available if the user manually syncs the folder.

Use the SharePoint Migration Tool for this scenario, but be sure you have enough space in your SPO tenant to hold 5 TB of data.

@stenci It all depends on your use case. Please provide more details on why this large a sync is needed.

 

Honestly, I wouldn't consider syncing 2 million plus files a valid use case, because other technologies O365 provides should be considered first. When engineered properly, they will provide a more robust, scalable solution than brute-forcing SharePoint into a dumping ground.

@stenci 

What you'll need is to structure your libraries/folders so that you don't have to sync or access folders/files that would go over the limit. Take into consideration how many folders/files you produce in a year, month, etc., separating them into libraries if necessary.
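As a rough sketch of that idea: the helper below greedily packs the existing year/month folders into libraries so no single library exceeds a chosen cap. The folder names, counts, and the 100,000-file per-library budget are assumptions for illustration, not figures from this thread.

```python
# Hypothetical sketch: group existing month folders into libraries so that
# no single library exceeds a per-library file cap.
SYNC_CAP = 100_000  # assumed per-library budget, well under the 300k total

def plan_libraries(folder_counts, cap=SYNC_CAP):
    """Greedily pack (folder, file_count) pairs into libraries under `cap`."""
    libraries, current, total = [], [], 0
    for folder, count in folder_counts:
        if current and total + count > cap:
            libraries.append(current)
            current, total = [], 0
        current.append(folder)
        total += count
    if current:
        libraries.append(current)
    return libraries

# Example: twelve monthly folders at roughly 15,000 files each
months = [(f"2019-{m:02d}", 15_000) for m in range(1, 13)]
print(plan_libraries(months))  # folders grouped into libraries of <= 100k files
```

With 15,000 files per month and a 100k cap, each library ends up holding about half a year of folders.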

 

Also, other storage may be better suited for files/projects that you think are no longer going to be accessed.

 

Fred

@Timothy Balk 

Thanks for your comment. Here are a few more details:

 

Every month we have about 5 new projects, each with 1,000 to 50,000 files; most of them are CAD drawings, plus a small percentage of PDF, JPG and Excel files (with VBA macros, so they cannot be used with Office 365). I would say 5,000 to 30,000 files per month are added.

 

Each project lasts 3 to 12 months. We have about 20 live projects at any given time, requiring syncing of 200,000-300,000 files. The syncing is required because many of our CAD, CAM and PLM tools look for the files on the file system.

 

After a project is completed we could archive it; after that we would access it only if the client asks for a spare part or an addition. No automatic syncing is required here; something like manual un-archiving would work just fine.

 

A solution with two areas would work:

- one for the live projects, with on-demand file syncing; this would still play well with our tools

- one for archived projects, not automatically synced; we would need a way to bring a project back to life if requested

 

This would be an acceptable compromise to work around the 300,000 file limitation.
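As a back-of-envelope check that the live/archive split stays under the guidance: with the figures quoted above (20 concurrent projects, 1,000 to 50,000 files each), an assumed average of 12,000 files per project keeps the live area under the 300,000-file recommendation. The average is a hypothetical midpoint, not a measured figure.

```python
# Sanity check: does syncing only the live projects stay under the
# 300,000-file recommendation? Figures are from the thread; the
# per-project average is an assumption.
SYNC_LIMIT = 300_000            # Microsoft's "across all libraries" guidance
live_projects = 20              # concurrent live projects
avg_files_per_project = 12_000  # assumed average in the 1,000-50,000 range

synced = live_projects * avg_files_per_project
print(f"Estimated synced files: {synced:,}")      # Estimated synced files: 240,000
print(f"Under the limit: {synced < SYNC_LIMIT}")  # Under the limit: True
```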

 

Redesigning an infrastructure with all the software tools that rely on files being available on a network drive is unthinkable.

> separating them as libraries if necessary

If I understand the documentation correctly, the limitation applies even if the files are separated into multiple libraries.

Have you looked at using one Office 365 Group per project? You could automate creation using a SharePoint 'projects master' list and a Flow, and perhaps a site design / site scripts to give a customised structure / config to the Group sites.
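For illustration, a minimal sketch of the request body such an automated provisioning Flow or script could send to Microsoft Graph (`POST https://graph.microsoft.com/v1.0/groups`) to create one Microsoft 365 Group per project. Authentication and the actual HTTP call are omitted, and the project name and mailNickname scheme are hypothetical.

```python
# Sketch: build the Microsoft Graph payload for creating a Microsoft 365
# Group per project. The HTTP call and auth token are intentionally omitted.
import json
import re

def group_payload(project_name):
    # Derive a mail nickname from the project name (hypothetical scheme):
    # lowercase, with anything that isn't a letter or digit stripped out.
    nickname = re.sub(r"[^a-z0-9]", "", project_name.lower())
    return {
        "displayName": project_name,
        "mailNickname": nickname,
        "mailEnabled": True,
        "securityEnabled": False,
        "groupTypes": ["Unified"],  # "Unified" = Microsoft 365 Group
    }

print(json.dumps(group_payload("Project Falcon 2020"), indent=2))
```

A Flow watching the 'projects master' list could post one such payload whenever a new project row is added.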

@Rob Ellis 

I can organize and split the files into different libraries, sites or groups, but I want to be sure that it will solve the 300,000 file limit.

 

Do you know if splitting in groups would work?

I just found this, which talks about new features being added to the synchronization system that will allow picking which folders are visible to the syncing engine.

 

This could be the solution to my problem, once it becomes available.

It will not solve the 300,000 limit, no. One way to solve that would be to have users only sync libraries for projects that they are actually working on - although depending on which users work on which projects, you may find that some users will struggle to keep below the limit.

Because, for an end user, sync is simplest to configure at the library level (a user either syncs a given library or does not), it would make sense to split the content into multiple libraries rather than keep a single library.
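The per-user budget idea can be sketched as a trivial check before a user syncs another project library. The limit comes from the guidance quoted in this thread; the example counts are made up.

```python
# Hypothetical helper: before a user syncs one more project library,
# verify their combined item count stays under the 300k guidance.
SYNC_LIMIT = 300_000

def can_sync(current_counts, new_library_count, limit=SYNC_LIMIT):
    """True if adding `new_library_count` items keeps the user under `limit`."""
    return sum(current_counts) + new_library_count <= limit

synced = [90_000, 120_000]        # libraries this user already syncs
print(can_sync(synced, 50_000))   # True  (260,000 total)
print(can_sync(synced, 150_000))  # False (360,000 total)
```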

I'm starting to think that SharePoint is not the right tool for us.

Both the 300,000 file limit and the impossibility of forcing check-out are show stoppers for us.

Hi @stenci 

Due to the large amount of data, you may run into the problems below:

1) Maintaining a very large database

2) Long backup times

3) Search indexing completing with long delays

4) Inappropriate search results for content that is not yet indexed

 

It is always better to categorize and maintain the required folder hierarchy for your repositories. By dividing the documents into multiple categories, you can store them across multiple sites / libraries. In any technology, instead of maintaining one large database, it is better to divide it into multiple chunks. SharePoint search has shown better results for regular users this way.

best response confirmed by stenci (Copper Contributor)
Solution

@stenci To get around the 300,000 file limit you have to throw hardware at it: balance the load across multiple machines using multiple sync clients.

 

SharePoint may not be the best place for this. As I said, it sounds like this process is just using SharePoint as a dumping ground.

 

I would say that if you're required to use SharePoint, this approach incurs a lot of technical debt because of the need to use the OneDrive client to sync contents as if they were on a local drive. Re-engineer the process to work with the place where you are storing the files; there are plenty of ways to automate the file upload process. I would also be mindful of how many files are "versioned" by appending something to the file name, because those could be de-duplicated by using versioning in SharePoint.

 

Whatever the outcome may be, document what is going on, because IMO this isn't a process or development that I would want to inherit and have to figure out.

After reading your response to Timothy: I've seen files that have dependencies relying on network paths. This is also something to consider when using SharePoint as a repository.

This may also involve re-thinking how your files are accessed in order to work around the 300K limit, along the lines you've already identified: live and archive.

I agree with a lot of responses here. SharePoint isn't a direct replacement for a file share. If you were going to use SharePoint as your document management system for sharing and collaborating directly on documents, I could see it being a potential option if it had a solid information architecture. 

However, it does seem like you are just trying to get rid of a file share and use SharePoint instead. I would recommend against this; it's also not the most cost-effective storage.

@Rob Ellis 

> it would make sense to split the content into multiple libraries, rather than a single library

 

I am trying to consider different options:

- More smaller libraries or fewer larger libraries?

- Is there a maximum number of libraries?

 

My understanding is that if you have more than 2,000 subsites or 2,000 libraries in a site collection, you may experience performance issues.
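For scale, some quick arithmetic: splitting the 2.4 million existing files into libraries of a hypothetical target size of 100,000 files each needs far fewer libraries than that 2,000-per-site-collection threshold.

```python
# Quick arithmetic: how many ~100k-file libraries would the 2.4M files need?
# The per-library target size is an assumption, not official guidance.
import math

total_files = 2_400_000
files_per_library = 100_000  # assumed target size per library

libraries_needed = math.ceil(total_files / files_per_library)
print(libraries_needed)  # 24 -- comfortably below the ~2,000 threshold
```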