Need some advice on working with a large document library


Hi all

 

We are on SharePoint Online and want to store an archive of 50,000 PDF files. The library should be searchable by document title. I have already tried a few things, but I'm running into problems.

 

I tried storing all the documents in a single, separate library called archive. Before uploading, I created an indexed column called documentTitle. I also created a Flow that triggers on document creation and fills the documentTitle column with the document name. When I started uploading, this worked, but only for the first xx documents (I didn't count them). I guess I hit a SharePoint threshold limit with the Flow. There were no error messages, and the Flow runs are reported as successful.

 

So, the next day I created a Flow that selects 500 documents with an empty documentTitle column and fills the column. This worked for 2500 documents, and I assumed I had hit the same SharePoint threshold. On day 3 I tried to rerun the Flow from day 2. Although it reports success, nothing happens, because SharePoint doesn't return any documents with an empty documentTitle.

 

Also, none of the documents are returned when searching the site.

 

It appears that you can't really work with libraries with more than 5000 documents. I searched this community but didn't find a fitting answer to my problem.

 

Right now, I'm thinking of creating about 30 document libraries, one per letter of the alphabet, and splitting the big letters so that none of the libraries contains more than 5000 documents. Since the archive grows every year, that would mean I'd have to keep shuffling documents around, which sounds like a nightmare...
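
To make the splitting idea concrete, here is a minimal Python sketch of how such a plan could be computed from a list of file names before uploading anything. It only works on names in memory and does not talk to SharePoint; the function name `plan_buckets`, the `#` bucket for names that don't start with a letter, and the `S-1`, `S-2` naming for split letters are all assumptions made for illustration.

```python
from collections import defaultdict

LIMIT = 5000  # the item limit discussed in this thread


def plan_buckets(filenames, limit=LIMIT):
    """Group file names by first letter; split any letter that exceeds `limit`."""
    by_letter = defaultdict(list)
    for name in filenames:
        first = name[:1].upper()
        # Names that do not start with a letter go to a catch-all bucket (assumption).
        key = first if first.isalpha() else "#"
        by_letter[key].append(name)

    buckets = {}
    for letter in sorted(by_letter):
        names = sorted(by_letter[letter], key=str.lower)
        if len(names) <= limit:
            buckets[letter] = names
            continue
        # Split an oversized letter into numbered parts: "S-1", "S-2", ...
        for part, start in enumerate(range(0, len(names), limit), start=1):
            buckets[f"{letter}-{part}"] = names[start:start + limit]
    return buckets
```

Rerunning this on the full list each year would show which letters are about to outgrow their bucket, but it would not avoid moving documents when a split changes.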

 

Do you have some advice for me on this? Something I haven't tried yet? How would you tackle this? Maybe a smarter way of splitting the documents? Any advice would be helpful!

5 Replies

Hi @Jille Floridor ,

 

We have document libraries with 250,000 items in them. Search works fine as long as no folder has more than 5000 items in it. My suggestion is to create a folder for each letter of the alphabet. I am pretty sure this will work, but it is worth testing on the first few letters. We have also done that in another large document library.

 

Andy

Yeah, I'm not sure why you are creating and populating the other column; search should pull the file name as-is from the Name column. I do try to split my documents into subfolders by date as well, but libraries are made to handle 30 million documents, so folders are not required.
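
As a rough illustration of date-based subfolders, the sketch below derives a folder path from a document date. The year/month layout and the function name `date_folder` are assumptions, not something described in this thread; the point is only that new documents land in new folders, so existing ones never need to be moved as the archive grows.

```python
from datetime import date


def date_folder(doc_date: date) -> str:
    """Return a 'YYYY/MM' folder path for a document date (assumed layout)."""
    return f"{doc_date.year:04d}/{doc_date.month:02d}"
```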

Hi @Chris Webb ,

 

I agree folders are not strictly required, but if a view or folder returns or contains more than (roughly) 5000 items, the following error is returned:

 

 

[Screenshot: Something Went Wrong.PNG]

I haven't seen that it is possible to have 50,000 files returned by the default "All Documents" view without putting those files in folders or applying metadata to cut down what is returned. I got this error by turning on "Show all items without folders" in the view settings.

Hmm, I have a library with well over 5000 items and the no-folder view works just fine. I do have a handful of indexes, though. Make sure that view does not have any filters applied.

Hi @Chris Webb and @Andrew Hodges

 

Thanks for your replies!

 

The idea behind the additional columns was to use them for filtering, since search didn't return any results. I didn't mention I'm filling two more columns. I split the filename to fill a name, firstname and a date column.
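
For reference, the filename split could look something like the Python sketch below. The naming pattern `Name_Firstname_YYYY-MM-DD.pdf` is purely an assumption for illustration (the actual pattern is not given here), and `parse_filename` and `ParsedName` are hypothetical names. Files that don't match the pattern return `None` instead of filling the columns with bad values.

```python
from datetime import date, datetime
from pathlib import PurePath
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    name: str
    firstname: str
    doc_date: date


def parse_filename(filename: str) -> Optional[ParsedName]:
    """Split 'Name_Firstname_YYYY-MM-DD.pdf' (assumed pattern) into its parts."""
    stem = PurePath(filename).stem  # drop the extension
    parts = stem.split("_")
    if len(parts) != 3:
        return None  # does not match the assumed pattern
    name, firstname, raw_date = parts
    try:
        parsed = datetime.strptime(raw_date, "%Y-%m-%d").date()
    except ValueError:
        return None  # last part is not a valid date
    return ParsedName(name, firstname, parsed)
```

Counting how many files come back as `None` before uploading gives a quick idea of how many documents would end up with empty columns.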

 

I'm going to create a set of folders and try uploading the files into them. I'll see how that works out.