Limitations of Predictive Indexing


I noticed that automatic indexes were not present in several document libraries, so I opened a ticket with MS Support (SRX618020694115495ID) and they gave me the following explanation:

 

" I am sharing the expected conditions for the feature to work since we do not have any documentation explaining this. " 

Expected conditions when predictive indexing may not work:

1) If content is migrated using 3rd party tools.

2) If a large number of items are created in quick succession without using filters / sorting / views.

3) If Automatic indexing is disabled in list or library settings from Advanced settings.

 

Needless to say, I was not expecting this limitation, and it is a major surprise to me. Has anyone else seen this behavior?

 

Can anyone from Metalogix, AvePoint and/or ShareGate @Benjamin Niaulin comment?

 

@Chris McNulty none of the recorded presentations I have watched gave me this expectation. Can you provide some more context?

 

The fact that the roll-out of this feature is complete and there is still no documentation on support.office.com is an all-too-typical and long-standing source of frustration for many of us.

13 Replies

Hey Dean,

 

Very interesting... We have not been made aware of any limitations.

In fact, we use the Microsoft Migration API to upload documents so we don't have much control over the import and the predictive indexing.

I am headed to Microsoft's office in 2 weeks, I'll be sure to ask.

 

Essentially, ShareGate drops all content for the migration API to pick up and import into the various libraries with the right metadata. So I am not sure about the first point, "content migrated using a 3rd party tool". I could see a large number of items being created in quick succession as the reason for it, however. But again, there is not much we could do, as we are leveraging the migration API.

 

Keep us posted, I will do the same :)

MS Support also stated the following:

 

Also, by default, you may not be able to add columns to indexed columns once the list or library threshold is crossed.

Hence, if you want indices to be created after the threshold is crossed, the only available options is to bring the items below 5000 and create the index. You can then contact us for restoring all the items from recycle bin deleted by a specific account after a particular date and time.

 

I am sharing some best practices when using 3rd party tools for data migration. However, as this is not a supported scenario for our break fix team, we cannot assist you on this concern.

Please refer to: Avoid getting throttled or blocked in SharePoint Online

To avoid getting throttled when migrating data:

1) Don't retry in a tight loop 

  • SPO will charge for the small amount of time/CPU needed to determine if the usage is over quota.
  • If the customer retries in a very tight loop, these small amounts can sum up to make them remain over quota.
  • Generally the response headers have a back-off time that tells you how long to wait before the next retry.

2) If users keep hitting the throttle limit, see if there is a way to make lighter-weight calls. If not, implement a wait-and-retry loop.

3) Quotas are by user or app id. If they are not connecting via an app id, use a different user login.
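For anyone scripting their own uploads, the wait-and-retry advice above might look roughly like the Python sketch below. It is only an illustration, not something support provided: `call_with_backoff` and its `send` callable are hypothetical names, and I am assuming the back-off time arrives in the standard HTTP `Retry-After` header on a 429 or 503 response, expressed in seconds.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (HTTP status code, response headers).
Response = Tuple[int, Mapping[str, str]]

THROTTLE_STATUSES = (429, 503)  # assumption: these statuses signal throttling


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    default_wait: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and wait between throttled attempts instead of retrying in a tight loop."""
    fallback = default_wait
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status not in THROTTLE_STATUSES:
            return status, headers
        if attempt == max_attempts:
            break  # give up and hand the throttled response back to the caller
        # Header names are case-insensitive, so normalise before looking up.
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            delay = float(lowered["retry-after"])
        except (KeyError, ValueError):
            delay = fallback  # no usable header: fall back to our own wait
        sleep(delay)
        fallback *= 2  # grow the fallback wait after every throttled attempt
    return status, headers
```

In real use, `send` would wrap whatever request your tool makes. The `sleep` parameter is only there so the waiting can be swapped out when testing.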

I think the intention of the statement regarding 3rd parties is that you may be placing a large number of items within the list quickly (and hopefully through Azure, not directly to SPO). As noted, the Views/Filters need to be present for the automatic list index to kick in, but you also must be under the LVT (list view threshold).

I thought the list view threshold that applies when adding indexes is now 20,000, and not 5,000 as quoted by the support guy?

LVT is still 5k. But you won't necessarily hit the LVT just because you have >5k items in a list (on-prem 2016 or online). Automatic indexing doesn't kick in until 2500 items.

No, I meant the threshold that applies to index creation is now 20,000. The support guy stated that you needed to bring the list items under 5,000, which isn't true.

 

"Hence, if you want indices to be created after the threshold is crossed, the only available options is to bring the items below 5000 and create the index. You can then contact us for restoring all the items from recycle bin deleted by a specific account after a particular date and time."

 

Paul.

Here's a reply from a product manager working on large list updates:

 

Predictive indexing applies to lists between 2,500 and 20,000 items. End-users may also manage indexes in lists as large as 20,000 items.

 

If a large number of items is created in a short time, whether by third-party tool or some other means, it's possible that the predictive indexing may miss the chance to manage indexes for the list. After the list has grown larger than 20,000 items, predictive indexing will no longer make changes to it.

 

Predictive auto-indexing is enabled by default. It may be disabled for a specific list in Advanced settings.
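To make the conditions in this reply easier to check against your own libraries, here is a small Python sketch. The function name is made up, and treating both limits as inclusive is my assumption, since the reply only says "between". It also only describes when predictive indexing may apply; as noted above, a burst of item creation can still cause it to miss a list.

```python
# Limits taken from the product manager's reply; inclusive bounds are an assumption.
AUTO_INDEX_MIN_ITEMS = 2_500
AUTO_INDEX_MAX_ITEMS = 20_000


def predictive_indexing_may_apply(item_count: int, auto_indexing_enabled: bool = True) -> bool:
    """Return True if a list falls inside the range where predictive indexing is described as active."""
    if not auto_indexing_enabled:
        # Disabled for this list in Advanced settings.
        return False
    return AUTO_INDEX_MIN_ITEMS <= item_count <= AUTO_INDEX_MAX_ITEMS


if __name__ == "__main__":
    for count in (1_000, 7_000, 12_000, 25_000):
        print(count, predictive_indexing_may_apply(count))
```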

I find it interesting that, for something that was made such a big deal of at Ignite, this feature, despite being fully deployed, is 1) a topic the help desk is ill-equipped to discuss with any level of knowledge and 2) something that still feels incomplete and "not as advertised".

 

Predictive indexing and its buzz seem like a big bait and switch to me.

Was there actually ever any progress on this? We have several large lists we would love to move online from on-prem!

Hey everyone,

 

As of spring 2018, you can add/remove indexes from lists of any size in SPO. Having more than 20,000 items should no longer block adding or removing indexes. We are working on enabling predictive indexing of lists larger than 20,000 items as well, so that any views for these lists will automatically add indexes in the background.

 

Adding indexes on the fly when you sort by an unindexed column will only work for lists smaller than 20,000 items for the moment. We are trying to see how we can make that happen in the coming months.

@Kerem Yuceturk 

 

In SharePoint Online, we have two document libraries with 7,000 and 12,000 items respectively, and we are unable to use indexes to sort or filter on the following column types: managed metadata, name (e.g. Modified By).

 

We have tried the following: created an index for these column types prior to upload of files; created a view with a sort against one of these column types and then added files, triggering automated indexing; created an index after upload of files. In all of these cases, we are unable to sort by these column types, and if we switch to a view which sorts against these column types, we get an error referencing the list view threshold.

 

We are successfully able to do both things with indexed columns based on date or text.

 

Is there any known limitation for these field types in reference to indexing? Would the method of import (all at once via API) be a potential source of the issue? I doubted the latter since the behavior is not consistent across column types.

 

Thanks for your help with this!

Hi @micro99999, this is indeed a limitation for these types of columns due to how they are implemented. Here's the relevant piece from our support article:

Columns with column types people, lookup or managed metadata can cause list view threshold errors when sorting. However, text, number, date and other column types can be used in the first sort.

 

Hopefully it is possible to work around this by sorting by ID or another field of type text, number or date, and then using filters for these values?
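If you keep an inventory of your views, a quick check like the Python sketch below could flag the ones likely to hit this. It is just an illustration of the rule quoted from the support article: the function name and the lower-case type labels are made up, and it only looks at the first sort column, so it says nothing about whether a secondary sort on these column types works.

```python
from typing import List, Optional, Tuple

# Column types the support article names as risky when sorting (labels are my own).
RISKY_FIRST_SORT_TYPES = {"people", "lookup", "managed metadata"}


def first_sort_warning(sort_columns: List[Tuple[str, str]]) -> Optional[str]:
    """Take (column name, column type) pairs in sort order; return a warning or None."""
    if not sort_columns:
        return None
    name, column_type = sort_columns[0]
    if column_type.lower() in RISKY_FIRST_SORT_TYPES:
        return (
            f"First sort on '{name}' ({column_type}) may cause a list view "
            "threshold error; consider sorting by ID or a text, number or date column first."
        )
    return None


if __name__ == "__main__":
    print(first_sort_warning([("Modified By", "people"), ("Title", "text")]))
    print(first_sort_warning([("ID", "number"), ("Modified By", "people")]))
```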

@Kerem Yuceturk 

 

Thanks for the reply. Just to verify, is it never possible to sort by managed metadata columns regardless of sort settings or indexes? For instance, the documentation states: "However, text, number, date and other column types can be used in the first sort." Does this mean a secondary sort may work?