Forum Discussion
Limitations of Predictive Indexing
Hey Dean,
Very interesting... We have not been made aware of any limitations.
In fact, we use the Microsoft Migration API to upload documents so we don't have much control over the import and the predictive indexing.
I am headed to Microsoft's office in 2 weeks, and I'll be sure to ask.
Essentially, Sharegate drops all content for the migration API to pick up and import into the various libraries with the right metadata. So I am not sure about the first point, "content migrated using a 3rd party tool". I could see a large number of items being created in quick succession as the reason for it, however. But again, there is not much we could do, as we are leveraging the migration API.
Keep us posted, I will do the same :)
- Dean_Gross, Feb 20, 2018, Silver Contributor
MS Support also stated the following:
Also, by default, you may not be able to add columns to indexed columns once the list or library threshold is crossed.
Hence, if you want indices to be created after the threshold is crossed, the only available option is to bring the item count below 5000 and create the index. You can then contact us to restore all the items from the recycle bin that were deleted by a specific account after a particular date and time.
I am sharing some best practices when using 3rd party tools for data migration. However, as this is not a supported scenario for our break fix team, we cannot assist you on this concern.
Please refer to: Avoid getting throttled or blocked in SharePoint Online
To avoid getting throttled when migrating data:
1) Don't retry in a tight loop
- SPO will charge for the small amount of time/CPU needed to determine if the usage is over quota.
- If the customer retries in a very tight loop, these small amounts can sum up to make them remain over quota.
- Generally the response headers have a back-off time that tells you how long to wait before the next retry.
2) If users keep hitting the throttle limit, see if there is a way to make lighter-weight calls. If not, implement a wait-and-retry loop.
3) Quotas are by user or app id. If they are not connecting via an app id, use a different user login.
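For anyone wiring this up by hand, the wait-and-retry advice above can be sketched roughly as follows. This is a minimal Python sketch, not Microsoft's or Sharegate's implementation. It assumes throttled responses come back as HTTP 429 or 503 and that the back-off time is carried in a `Retry-After` header given in seconds. The `call_with_backoff` helper, its parameters, and the `send` callable are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers).
Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    default_wait: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and wait before each retry instead of looping tightly."""
    for attempt in range(max_retries + 1):
        status, headers = send()
        # Assumption: 429 and 503 are the throttling status codes.
        if status not in (429, 503):
            return status, headers
        if attempt == max_retries:
            break
        try:
            # Prefer the back-off time the server put in the response headers.
            wait = float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            # Header missing or not a number: fall back to exponential back-off.
            wait = default_wait * (2 ** attempt)
        sleep(wait)
    raise RuntimeError("still throttled after %d retries" % max_retries)
```

Here `send` would wrap whatever HTTP call you are making. Passing `sleep` in as a parameter is only a convenience so the waiting can be replaced in tests.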
- Feb 21, 2018
I think the intention of the statement regarding 3rd parties is that you may be placing a large number of items within the list quickly (and hopefully through Azure, not directly to SPO). As noted, the Views/Filters need to be present for the automatic list index to kick in, but you also must be under LVT.
- Feb 21, 2018
I thought the list view threshold that applies when adding indexes is 20,000 now, and not 5000 as quoted by the support guy?
- Tom Franks, Apr 23, 2018, Copper Contributor
Was there actually ever any progress on this? We have several large lists we would love to move to online from on-prem!
- Kerem Yuceturk, Aug 23, 2018, Microsoft
Hey everyone,
As of Spring of 2018, you can add/remove indexes from lists of any size in SPO. Having more than 20,000 items should no longer block adding/removing of indexes. We are working on enabling predictive indexing of lists larger than 20K items as well so that any views for these lists will automatically add indexes in the background.
Adding indexes on the fly when you sort by an unindexed column will only work for lists smaller than 20,000 items for the moment. We are trying to see how we can make that happen in the coming months.
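For readers who want to add an index manually rather than wait for predictive indexing, one possible route is the SharePoint REST API. The Python sketch below only builds the request and does not send it. It assumes the field can be updated through a MERGE request that sets the `Indexed` property. Authentication (a bearer token or request digest) is deliberately left out. The `build_index_request` helper and the site, list, and column names in the usage note are hypothetical.

```python
import json
from typing import Dict, Tuple


def build_index_request(
    site_url: str, list_title: str, field_name: str
) -> Tuple[str, Dict[str, str], str]:
    """Build (url, headers, body) for a request that marks a column as indexed."""

    def esc(value: str) -> str:
        # Single quotes inside OData string literals are doubled.
        return value.replace("'", "''")

    url = (
        f"{site_url.rstrip('/')}/_api/web/lists/getbytitle('{esc(list_title)}')"
        f"/fields/getbyinternalnameortitle('{esc(field_name)}')"
    )
    headers = {
        "Accept": "application/json;odata=verbose",
        "Content-Type": "application/json;odata=verbose",
        "X-HTTP-Method": "MERGE",  # update the existing field in place
        "IF-MATCH": "*",           # skip the version check on the field
    }
    # Assumption: setting Indexed to true asks SharePoint to index the column.
    body = json.dumps({"__metadata": {"type": "SP.Field"}, "Indexed": True})
    return url, headers, body
```

For example, `build_index_request("https://contoso.sharepoint.com/sites/docs", "Documents", "Modified")` returns the pieces to hand to an HTTP client as a POST. The URL is not percent-encoded here, so the client is expected to handle that, and the auth headers still need to be added.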
- micro99999, Oct 01, 2019, Copper Contributor
In SharePoint Online, we have two document libraries with 7,000 and 12,000 items respectively, and we are unable to use indexes to sort or filter on the following column types: managed metadata, name (e.g. Modified By).
We have tried the following: created an index for these column types prior to upload of files; created a view with a sort against one of these column types and then added files, triggering automated indexing; created an index after upload of files. In all of these cases, we are unable to sort by these column types, and if we switch to a view which sorts against these column types, we get an error referencing the list view threshold.
We are successfully able to do both things with indexed columns based on date or text.
Is there any known limitation for these field types in reference to indexing? Would the method of import (all at once via API) be a potential source of the issue? I doubted the latter since the behavior is not consistent across column types.
Thanks for your help with this!