Tracking Down Missing Indexes in SQL Azure

Published Jan 15 2019 01:59 PM 1,500 Views
Microsoft
First published on MSDN on Jun 16, 2011

One of the challenges of SQL Azure is that not all of the TSQL that you are used to using is supported yet.  Since the underlying engine is plain ole’ SQL Server, the engine can understand the TSQL, but we just block its use because we haven’t yet made it work in the multi-tenant, multi-server environment that is SQL Azure.

One of the classic missing index scripts can be seen in Bart Duncan’s classic post .  For simplicity, I have reposted the TSQL below:

1: SELECT

2: migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure,

3: 'CREATE INDEX [missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle)

4: + '_' + LEFT (PARSENAME(mid.statement, 1), 32) + ']'

5: + ' ON ' + mid.statement

6: + ' (' + ISNULL (mid.equality_columns,'')

7: + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END

8: + ISNULL (mid.inequality_columns, '')

9: + ')'

10: + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement,

11: migs.*, mid.database_id, mid.[object_id]

12: FROM sys.dm_db_missing_index_groups mig

13: INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle

14: INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle

15: WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10

16: ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC



Unfortunately, if you try to use this TSQL, you immediately run into the problem that none of the DMVs are supported in SQL Azure.  So much for the easy way…



Since the DMVs are just ongoing collections of information that you can collect manually from dm_exec_query_stats, I decided to try to build this up manually.  This led me to generate the following query:






1: SELECT top (50) cast(replace(cast(qp.query_plan as nvarchar(max)),'xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"','') as xml),

2: qp.query_plan.value('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan"; (/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan/MissingIndexes/MissingIndexGroup/@Impact)[1]' , 'decimal(18,4)') * execution_count AS TotalImpact

3: FROM sys.dm_exec_query_stats qs cross apply sys.dm_exec_sql_text(sql_handle) st cross apply sys.dm_exec_query_plan(plan_handle) qp

4: WHERE qp.query_plan.exist('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan";/ShowPlanXML/BatchSequence/Batch/Statements..."]') = 1

5: ORDER BY TotalImpact DESC





This generates a list of ShowPlanXMLs ordered by “execution count * missing index group impact”.  Now that we have a list of ShowPlans ordered by overall impact, we need to parse the ShowPlanXML to pull out the missing indexes.  For those unfamiliar with the missing index information in ShowPlanXML data, here is an example:



<MissingIndexes><MissingIndexGroup Impact="98.6314"><MissingIndex Database="[BugCheck]" Schema="[dbo]" Table="[Watchlists]"><ColumnGroup Usage="EQUALITY"><Column Name="[ID]" ColumnId="1" /></ColumnGroup></MissingIndex></MissingIndexGroup></MissingIndexes>



As you can see, it contains all the information necessary to define the indexes the engine thinks are missing.



Now, for each ShowPlanXML row, we need to use XQuery to shred the MissingIndexes information into its key information.  In a classic case of copying good work already done instead of spending time doing it myself, I found that the Performance Dashboard Reports already do this shredding in one of their reports, so I could copy it:




1: SELECT cast(index_node.query('concat(string((./@Database)[1]),".",string((./@Schema)[1]),".",string((./@Table)[1]))') as nvarchar(100)) as target_object_name

2: ,replace(convert(nvarchar(max), index_node.query('for $colgroup in ./ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "EQUALITY" return string($col/@Name)')), '] [', '],[') as equality_columns

3: ,replace(convert(nvarchar(max), index_node.query('for $colgroup in ./ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "INEQUALITY" return string($col/@Name)')), '] [', '],[') as inequality_columns

4: ,replace(convert(nvarchar(max), index_node.query('for $colgroup in .//ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "INCLUDE" return string($col/@Name)')), '] [', '],[') as included_columns

5: from (select convert(xml, @query_plan) as xml_showplan) as t outer apply xml_showplan.nodes ('//MissingIndexes/MissingIndexGroup/MissingIndex') as missing_indexes(index_node)



By combining the above two queries with a cursor, I can stick each shredded missing index into a temporary table.  Then, I can use the equality, inequality, and included columns from the temporary table to generate CREATE INDEX statements as follows:




1: select distinct 'Create NonClustered Index IX_' + substring(replace(replace(target_object_name,'[',''),']',''), 0, charindex('.',replace(replace(target_object_name,'[',''),']',''))) +' On ' + target_object_name +

2: ' (' + IsNull(equality_columns,'') +

3: CASE WHEN equality_columns IS Null And inequality_columns IS Null THEN ',' ELSE '' END + IsNull(inequality_columns, '') + ')' +

4: CASE WHEN included_columns='' THEN

5: ';'

6: ELSE

7: ' Include (' + included_columns + ');'

8: END

9: from #results



DISCLAIMER:  As with all automated INDEX suggestion scripts, you need take a look at the CREATE INDEX statements suggested and decide if they make sense for you before you run out and apply them to your production instance!!



One important thing to point out is that even though I was designing this script for SQL Azure, it works just fine against an on-premise instance of SQL Server.  Since SQL Azure supports a subset of the overall SQL Server functionality, you will almost always find that a solution for SQL Azure works just fine against SQL Server.  Lastly, this functionality has been added to the CSS SQL Azure Diagnostics Tool (CSAD) so that you don’t have to worry about running this manually if you don’t want to.



For completeness, here is the TSQL statement in its entirety:




create table #results (target_object_name nvarchar(100), equality_columns nvarchar(100), inequality_columns nvarchar(100), included_columns nvarchar(100))



declare @query_plan as xml

declare @totalimpact as float



declare querycursor CURSOR FAST_FORWARD FOR

SELECT top (50) cast(replace(cast(qp.query_plan as nvarchar(max)),'xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"','') as xml),

qp.query_plan.value('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan"; (/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple/QueryPlan/MissingIndexes/MissingIndexGroup/@Impact)[1]' , 'decimal(18,4)') * execution_count AS TotalImpact

FROM sys.dm_exec_query_stats qs cross apply sys.dm_exec_sql_text(sql_handle) st cross apply sys.dm_exec_query_plan(plan_handle) qp

WHERE qp.query_plan.exist('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/showplan";/ShowPlanXML/BatchSequence/Batch/Statements..."]') = 1

ORDER BY TotalImpact DESC



OPEN querycursor

FETCH NEXT FROM querycursor

INTO @query_plan, @totalimpact --need to remove the namespace



WHILE @@FETCH_STATUS=0

BEGIN



insert into #results (target_object_name, equality_columns, inequality_columns, included_columns)

SELECT cast(index_node.query('concat(string((./@Database)[1]),".",string((./@Schema)[1]),".",string((./@Table)[1]))') as nvarchar(100)) as target_object_name

,replace(convert(nvarchar(max), index_node.query('for $colgroup in ./ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "EQUALITY" return string($col/@Name)')), '] [', '],[') as equality_columns

,replace(convert(nvarchar(max), index_node.query('for $colgroup in ./ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "INEQUALITY" return string($col/@Name)')), '] [', '],[') as inequality_columns

,replace(convert(nvarchar(max), index_node.query('for $colgroup in .//ColumnGroup,$col in $colgroup/Column where $colgroup/@Usage = "INCLUDE" return string($col/@Name)')), '] [', '],[') as included_columns

from (select convert(xml, @query_plan) as xml_showplan) as t outer apply xml_showplan.nodes ('//MissingIndexes/MissingIndexGroup/MissingIndex') as missing_indexes(index_node)



FETCH NEXT FROM querycursor

INTO @query_plan, @totalimpact



END



CLOSE querycursor

DEALLOCATE querycursor



select distinct 'Create NonClustered Index IX_' + substring(replace(replace(target_object_name,'[',''),']',''), 0, charindex('.',replace(replace(target_object_name,'[',''),']',''))) +' On ' + target_object_name +

' (' + IsNull(equality_columns,'') +

CASE WHEN equality_columns IS Null And inequality_columns IS Null THEN ',' ELSE '' END + IsNull(inequality_columns, '') + ')' +

CASE WHEN included_columns='' THEN

';'

ELSE

' Include (' + included_columns + ');'

END

from #results



drop table #results

2 Comments
Version history
Last update:
‎Jan 15 2019 01:59 PM
Updated by: