A couple weeks ago while I was onsite, I was looking at some queries that had performance issues. To the customer, one in particular had become a conundrum that could not be easily answered with the usual approach of good indexing, simply because there was already a good covering index for that highly used query. Yet, massive index scans were being issued, even with good solid statistics in place. That’s what was puzzling to them.
To cut it short, the issue was neither in the table nor the index(es), but rather on the query itself. And yes, the query had a WHERE clause that was fairly narrow so nothing quite that obvious. The query was using a string function as the SARG and by design, in SQL Server, an index cannot be used to seek on some more complex expressions, expressions using functions or strings with a leading wildcard character.
Given that, the following expressions will always result in scans:
Let’s see this in more detail using the AdventureWorks2008 database. For this exercise I created an index in the Sales.SalesOrderHeader table. Remember that a statistics object was automatically created for the 1 st key column of the index, in this case “OrderDate”.
Take these these two queries, where the output is the exact same 184 rows each:
Yet, the 1 st will leverage a seek (5 logical and 2 physical reads) while the 2 nd will perform an index scan (200 logical and 4 physical reads). That’s because of the date functions in the WHERE clause that are restricting the optimization choices of the database engine. Here is the execution plan:
So refactoring a query in a less obvious way will yield better results. Let’s try another example. Take the two following queries, where the output is the exact same 73 rows each:
Notice how they both use date functions in the WHERE clause. Yet, the 1 st will leverage a seek (4 logical and 2 physical reads) while the 2 nd will perform an index scan (200 logical and 4 physical reads). Although they both use date functions in the WHERE clause, the optimization restriction will occur only when the function is the search argument, not the search condition itself, which is the case of the 2 nd query. Here is the execution plan:
In the last example, take these two queries, where the output is the exact same 1859 rows each:
At a glance, given what I’ve shown here, I would say the 1 st query will leverage a scan, while the 2 nd will leverage a seek. Let’s check the execution plan:
My guess was wrong here.
Both queries leveraged scans, with 200 logical and 4 physical reads each. In the 1 st query there is a string function as the search argument so the scan is expected. Yet, that is not the case in the 2 nd query, and there is no wildcard in the beginning of the search condition. So why did I have a scan?
Well, the catch is AccountNumber cannot be seeked like in the previous examples. Ranges are found in the date, on the index statistics used here, so a scan is the only choice. Remember what I referenced earlier about statistics ?
That was an interesting talk I though I should share.
Until next time!
Disclaimer: I hope that the information on these pages is valuable to you. Your use of the information contained in these pages, however, is at your sole risk. All information on these pages is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Ezequiel. Further, Ezequiel shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.