How many documents are too many for a Document Set and Record Center?

Hi

 

Hoping someone can advise on large Document Centers/Record Centers (correction: I originally wrote document set). How many documents are too many?

I am architecting a solution for a client and just want to try and tap into some brain power for relatively large volumes of documents.

SharePoint 2016 Standard Edition.

The volume of invoices they receive per month is ±2 500 to 3 000.

 

At this point the solution looks like this: 

The invoice lands in a Document Library with certain metadata.
This kicks off a Nintex workflow for approvals, which also runs some stored procedures with details exported from their accounting system.

The stage after approval is where I'm planning for all the limits and thresholds, and where I need some advice, preferably from practical experience with volume.

 

I'm thinking of archiving the invoices after approval to either a Document Center or a Record Center, and then configuring some filtering so that the list view threshold does not become an issue.

What are the differences in limits and thresholds between a Document Center and a Record Center, and which one is better suited?
If we have to keep invoices for 7 to 10 years, which one will better handle ±400 000 documents, and should I plan for some more archiving? Maybe move documents from a Document Center to a Record Center after 5 years?
Or keep them in the original document library and move them to a Record Center or document library after ±6 months? (18 000 documents in the document library)
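
As a rough check on the volumes above, here is a minimal Python sketch of the arithmetic. It assumes the monthly intake stays flat at the quoted 2 500 to 3 000 invoices (growth is not modeled), and the function name is made up for illustration.

```python
def total_documents(per_month: int, months: int) -> int:
    """Documents accumulated after `months` at a flat monthly intake."""
    return per_month * months

# Assumption: flat intake at the figures quoted in the post.
LOW, HIGH = 2_500, 3_000

for years in (7, 10):
    low = total_documents(LOW, years * 12)
    high = total_documents(HIGH, years * 12)
    print(f"{years} years: {low:,} to {high:,} documents")

# Six months at the upper figure, matching the 18 000 mentioned above.
print(f"6 months: {total_documents(HIGH, 6):,} documents")
```

On these assumptions the 10-year upper bound comes to 360,000 documents, so planning for roughly 400 000 leaves some headroom.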

 

How would you design this solution?

4 Replies
In my experience, a document library with document sets is easily doable for this volume.

We do this in SharePoint Online, and have a single library with hundreds of thousands of documents, with root-level document sets, standard folders within each document set, and then the subsequent documents inside of those folders. We also have a few other similar implementations, with tens of thousands of docs.

The keys are these:
1) Make sure you define any indexes you need on the front end. Use all 20 of them, and if you can't fill them up, consider creating some placeholder columns just in case a future requirement drives the need for a new one.

2) Know how to script (in PowerShell) and how to iterate over documents, in case you need to make mass changes.

3) Leverage Search and Refiners as the primary interface.

4) Make sure your root-level document set has views to avoid threshold limits, and that whatever folder structures you have underneath do the same. I know that on-prem you can change the threshold values and in the cloud you can't, but if you do it right it may not matter. I'm not sure whether any recent on-prem updates make it easier. If you are going to automatically archive stuff, consider using subfolders by year or some other type of value.
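
To illustrate the subfolder idea in point 4, here is a rough Python sketch that checks whether a folder per year or per month would stay within the list view threshold. It assumes the default threshold of 5,000 items and the 3 000 invoices per month from the original post; the folder naming scheme, the sample date, and the function names are hypothetical, and in practice the folders would be created in SharePoint itself (for example by a workflow or a PowerShell script).

```python
from datetime import date

LIST_VIEW_THRESHOLD = 5_000  # SharePoint default; can be changed on-prem

def archive_folder(approved: date, by_month: bool = True) -> str:
    """Hypothetical folder path for an invoice, keyed on its approval date."""
    if by_month:
        return f"{approved.year}/{approved.month:02d}"
    return str(approved.year)

def fits_threshold(per_month: int, by_month: bool = True) -> bool:
    """True if a full folder stays within the list view threshold."""
    per_folder = per_month if by_month else per_month * 12
    return per_folder <= LIST_VIEW_THRESHOLD

print(archive_folder(date(2018, 3, 14)))       # monthly folder path
print(fits_threshold(3_000, by_month=True))    # one folder per month
print(fits_threshold(3_000, by_month=False))   # one folder per year
```

At 3 000 invoices a month, a folder per year would hold about 36,000 items, well over the default threshold, while a folder per month stays under it.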

You could move things to a record center if you want, but technically you don't HAVE to if you don't have a good reason to. A well-architected solution will allow you to continue to grow.

Technically, it will work, just takes some thought on the front end.

Hi,

First off, Brent's reply is really good. Second, it looks like you did your homework.

I've been working with several (IMHO) rather large environments, e.g.:
- an environment containing millions of documents, with 5,000 new documents every month
- collaboration environments with tens of millions of documents

What I would like to emphasize is: know your limits! They are to your solution what a foundation is to a building. If you mess that up... #DOH.
Since you didn't mention the version of SharePoint you are using, here they are:
SharePoint Online - https://technet.microsoft.com/en-us/library/mt842345.aspx
SharePoint 2016 - https://docs.microsoft.com/nl-nl/SharePoint/install/software-boundaries-and-limits-0
SharePoint 2013 - https://docs.microsoft.com/nl-nl/SharePoint/install/software-boundaries-and-limits

Your story isn't clear about your information architecture.
Is it perhaps possible to create multiple archives?
Let's say you need to keep documents of a certain type for 7 years.
Perhaps in the first 2 years there is (e.g.) a 15% chance that you actually need one.
When the document is 3-7 years old, the chances that you need it diminish.
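
The age-based split described above could be sketched like this in Python. The 2-year and 7-year cutoffs come from the example in this reply; the tier names and the function are hypothetical, and in SharePoint the move itself would typically be handled by a retention policy rather than custom code.

```python
from datetime import date

def archive_tier(created: date, today: date,
                 active_years: int = 2, retention_years: int = 7) -> str:
    """Classify a document by age into a hypothetical storage tier."""
    age_years = (today - created).days / 365.25
    if age_years < active_years:
        return "active"    # likely to be needed: keep in the working library
    if age_years < retention_years:
        return "archive"   # rarely needed: candidate for a separate archive
    return "expired"       # past the retention period

# Hypothetical dates, for illustration only.
today = date(2020, 1, 1)
for created in (date(2019, 6, 1), date(2016, 1, 1), date(2012, 1, 1)):
    print(created, archive_tier(created, today))
```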


Some last ramblings:
Why are you using Document Sets? If you have a good reason: perfect. But they could give you nightmares. Suppose you need to restore 1 document from a document set?
Are you allowed to modify the document set because you need 1 document?

To elaborate on the answer Brent gave: indexes are a very good idea (if you are in SPO you don't need to worry about them, columns are already indexed).
Make sure SP automatically creates (sub)folders for archived documents. Then make views based on your metadata (and use the indexed columns).

My 2 cents: check the limits and create an information architecture (including a document life cycle plan).
Maybe these worksheets will help:
http://www.sharepointgeoff.com/planning-worksheets/

Thanks Brent for the great reply. Sorry for the late reply, had a public holiday yesterday. 

 

Going to keep that in mind... Going to have a read through the other replies.
So... indexes and views are really, really important.

Still, it is possible, but is it recommended?

 

Hi Sander

Thanks for the detailed response. Yes, need to do your research, this is my bread and butter. (:


The client is running SharePoint 2016 Standard Edition. Sorry, mentioned it in my first post but somehow that got marked as Spam so had to redo the post.
It is a new clean environment. We did not implement it so still have to verify the documentation we received.

Can we have multiple archives? Yes, we are designing it, and if that is the best way we will recommend it to the client.

 

"Perhaps in the first 2 years there is (e.g) a 15% chance that you actually need it."
So maybe I must do the following:

Create a document library and index the metadata fields.

Create views.

(Just realized that I actually did mention document set in the original post; it was supposed to be a document center.)

Leave invoices in the document library for 2 years, then set a retention policy to move documents to a Document Center on a different content database. Not sure if I want a Content Type Syndication Hub as well, so maybe just keep it in one DB...

 
Or must I forget about it and just leave in one document library?


So then the number of documents in the document library will be:
74 000+, probably increasing a little every year
450 000+ documents in the Document Center? (Separate DB, hopefully)


So it's good to know that the document library can handle the volume, but is it the best way, since I'm designing the solution at the moment?

Thanks