Lesson Learned #487: Identifying Parallel and High-Volume Queries in Azure SQL Database
Published May 15 2024 03:57 AM 1,291 Views

Today, I worked on a service request that our customer needs to monitor and optimize queries that can significantly impact performance. This includes both queries running in parallel and those that, although executed with a single thread (MAXDOP 1), have a high volume of executions. In this article, we will explain a query that helps us identify these two types of queries in an Azure SQL Database.

 

The provided SQL query uses Dynamic Management Views (DMVs) in SQL Server to group and analyze execution statistics for queries. Let's break down and explain each part of this query:

 

WITH QueryStats AS (
    SELECT 
        query_hash,
        SUM(total_worker_time) AS total_worker_time,
        SUM(total_elapsed_time) AS total_elapsed_time,
        SUM(execution_count) AS execution_count,
        MAX(max_dop) AS total_dop
    FROM 
        sys.dm_exec_query_stats
    GROUP BY 
        query_hash
)
SELECT 
    qs.query_hash,
    qs.execution_count,
    qs.total_worker_time,
    qs.total_elapsed_time,
    qs.total_dop,
    SUBSTRING(st.text, 
              (qs_statement.statement_start_offset/2) + 1, 
              ((CASE qs_statement.statement_end_offset 
                  WHEN -1 THEN DATALENGTH(st.text) 
                  ELSE qs_statement.statement_end_offset 
               END - qs_statement.statement_start_offset)/2) + 1) AS query_text
FROM 
    QueryStats qs
CROSS APPLY 
    (SELECT TOP 1 * 
     FROM sys.dm_exec_query_stats qs_statement 
     WHERE qs.query_hash = qs_statement.query_hash) qs_statement
CROSS APPLY 
    sys.dm_exec_sql_text(qs_statement.sql_handle) AS st
ORDER BY 
    qs.total_worker_time DESC;

Breakdown of the Query

  1. Common Table Expression (CTE): QueryStats

    • This part aggregates statistics from the sys.dm_exec_query_stats view, which contains performance statistics for cached query plans.
    • query_hash: A hash value used to identify queries that are similar in structure.
    • total_worker_time: Total CPU time used by the query.
    • total_elapsed_time: Total time taken for the query to execute.
    • execution_count: Number of times the query has been executed.
    • total_dop: Maximum degree of parallelism (DOP) used by the query.
  2. Main Query:

    • qs.query_hash, qs.execution_count, qs.total_worker_time, qs.total_elapsed_time, qs.total_dop: These columns are selected from the CTE QueryStats.
    • query_text: Extracts the text of the query using the substring function, which extracts a portion of the query text from the sys.dm_exec_sql_text.
  3. CROSS APPLY:

    • qs_statement: Retrieves the top row from sys.dm_exec_query_stats  where the query_hash matches, providing detailed information for each query hash.
    • st: Retrieves the SQL text of the query using sys.dm_exec_sql_text with the SQL handle from qs_statement. 
  4. ORDER BY:

    • The final result is ordered by total_worker_time in descending order, showing the most CPU-intensive queries at the top.

How This Helps Identify Performance-Impacting Queries

  • Parallel Queries: Queries with a high total_dop value indicate parallel execution. While parallel execution can speed up individual queries, it can also lead to resource contention, especially if many queries run in parallel simultaneously.
  • High-Volume Queries: Queries with a high execution_count value but low total_dop are typically executed with MAXDOP 1. If these queries are executed frequently, they can still significantly impact the overall performance of the database.

By identifying both types of queries, database administrators can take actions such as query optimization, indexing, or adjusting the degree of parallelism settings to improve overall database performance.

1 Comment
Version history
Last update:
‎May 15 2024 03:57 AM
Updated by: