Hello all,
Last week I did a brief presentation on SQLPort about hinting the database engine. What’s related here is that during the presentation, when I was discussing index hints, an attendee asked me about the usefulness of Filtered Indexes in certain scenarios, and how the database engine would behave differently in SQL Server 2008R2 and SQL Server 2012. I quickly pulled out my filtered indexes demo, but still he was adamant that something had to be different between engine versions.
Later, he was kind enough to send me a piece of code that he stated would generate different query plans in SS2008R2 and SS2012, respectively, namely where SS2008R2 would choose a non-filtered non-clustered index to resolve a simple query, whereas SS2012 would choose the filtered non-clustered index.
To test that, I used SQL Server 2008R2 Build 2769 and SQL Server 2012 Build 2100. I couldn’t find any differences between the generated plans on different engine versions for the given parameters.
Here is a close adaptation of what I was sent. All screenshots are from 2008R2 only.
Create a new database:
SET NOCOUNT ONThen create a table and populate:
Then create filtered and unfiltered indexes:
And the queries to test on, with the original comments:
So here are the plans for this:
Let’s not get carried away by the apparent query cost percentage. This is just an approximation done by SSMS, not the database engine itself.
That said, the I/O statistics show 157 logical reads for the 1st query, 154 for the 2nd. The time statistics show a total execution time of 13ms for the 1st query, and 23ms for the 2nd. Not that different in terms of I/O. What comes across right away is the type of operation. An Index Seek on IX_c2 and the expected Index Scan on IX_c3Filtered.
So let’s look at the statistics related to these indexes. I highlighted a few differences that will help explain why the query optimizer might have chosen to leverage the unfiltered vs. the filtered index:
What we see here is that there is almost no difference in the number of steps in the histograms. That is obviously because the index WHERE clause is not very narrow. Then we see the average key length, 66% larger in the filtered index, because NULLs do not exist here to average it down. Finally a very small difference in the density vector for the key column, with 0.047 for IX_c2 and 0.05 for IX_c2Filtered. Also bear in mind that the filtered index covers almost 40% of all data that is 20.000 rows out of 70.000. The filtered indexes guidelines only refer to the term “subset” when providing insight on filtered index design. In that sense, 40% is hardly a “subset”.
What can also be observed is that the indexes are string based, or tried tree based. Overall we might not have the best candidate for a filtered index. This condition is not narrow enough to leverage the full power of what filtered indexes brought us back in 2008, and is more costly to use it in this scenario.
I ran the example again, but changing one of the strings to an integer, to get a numeric based index.
Create a new database:
Then create a table and populate it:
The indexes again:
And the queries to test on:
Let’s check the execution plans:
The I/O statistics show 37 logical reads for the 1st query, 39 for the 2nd. The time statistics show a total execution time of 14ms for the 1st query, and 15ms for the 2nd. Not much difference both I/O and time wise. As expected, an Index Scan on IX_c2Filtered and an Index Seek on IX_c2.
Let’s look at the statistics related to these indexes. Again, I highlighted a few differences that will help explain why the query optimizer might have chosen to leverage the filtered vs. the unfiltered index.
Again we see almost no difference in the number of steps in the histograms. Then we see the average key length, only 33% larger in the filtered index. The same small difference in the density vector for the key column, with 0.047 for IX_c2 and 0.05 for IX_c2Filtered.
Much smaller differences between both indexes than in the previous example, this time the Query Optimizer leveraged after weighing all factors.
Of course, if we introduced a more narrow index, such as the following:
and queried on:
We would have little doubt the new filtered index would be chosen:
Having filtered indexes in place does not mean a query will always use them. You can get more information on how SQL Server uses statistics
here
.
Statistic information might show, as was the case earlier, that seeking the unfiltered non-clustered index would be effective enough, if the filtered index is not that narrow.
Bottom line, know your data, your index design, its distribution statistics, and test, test, test.
Until next time!
Disclaimer: I hope that the information on these pages is valuable to you. Your use of the information contained in these pages, however, is at your sole risk. All information on these pages is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Ezequiel. Further, Ezequiel shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.