prompt
2 TopicsLesson Learned #530: Comparing Execution Plans to Expose a Hidden Performance Anti-Pattern
One of the most powerful features of SSMS Copilot is how it lets you compare execution plans and immediately show you performance issues. In this case, I would like to share with you my lesson learned comparing two queries and how they behave very differently inside the engine. We have the following queries, these are using a table _x_y_z_MS_HighCPU that contains 4 millon of rows. The column TextToSearch is a varchar(200) datatype. -- Query 1 SELECT COUNT(*) FROM [MSxyzTest].[_x_y_z_MS_HighCPU] WHERE TextToSearch = N'Value: 9'; -- Query 2 SELECT COUNT(*) FROM [MSxyzTest].[_x_y_z_MS_HighCPU] WHERE TextToSearch = 'Value: 9'; Since the query texts are different, each will have a different query ID in Query Store. By running the following T-SQL, for example, I can identify the query IDs. SELECT qsqt.query_sql_text, qsq.query_id, qsp.plan_id, qsp.query_plan_hash, qsp.last_execution_time FROM sys.query_store_query_text qsqt JOIN sys.query_store_query qsq ON qsqt.query_text_id = qsq.query_text_id JOIN sys.query_store_plan qsp ON qsq.query_id = qsp.query_id WHERE qsqt.query_sql_text LIKE '%SELECT COUNT(*)%' -- FROM [[MSxyzTest]].[[_x_y_z_MS_HighCPU]]%' ORDER BY qsp.last_execution_time DESC; Queries 1 and 2 can be compared directly. Using Copilot, I ran the following prompt: Compare the execution plans for the two queries (query id 1 and query id 2 using Query Store. Highlight any differences in operators, estimated vs actual row counts, or implicit conversions. Running the following prompt : CPU Usage: Please, show the top resource-consuming queries in the current database using Query Store data. Include query text, execution count, duration, CPU time, and logical reads. We could see the impact of using an antipattern:154Views0likes0CommentsLesson Learned #525: Tracking Command Timeouts in Azure SQL: Beyond Query Store with Extended Events
A few days ago, we were working on a support case where our customer was intermittently experiencing command timeouts. What made the case interesting was that queries which usually completed in under one second suddenly started taking more than 10 seconds to execute. Since the application — developed in Python using the ODBC Driver 18 for SQL Server — had a command timeout set to 5 seconds, the following error was triggered every time the threshold was exceeded: Error executing command, retrying in 5 seconds. Attempt 1 of 3 with new timeout 5. Error: ('HYT00', '[HYT00] [Microsoft][ODBC Driver 18 for SQL Server]Query timeout expired (0) (SQLExecDirectW)') The application had built-in retry logic, dynamically increasing the timeout in each of the three retry attempts, to allow time for the query to complete and to log enough data for post-error analysis. Example logs from the retry logic: (RunCommandTimeout) - Thread: 39808 - Error executing command, retrying in 5 seconds. Attempt 1 of 3 with new timeout 5. Error: ('HYT00', '[HYT00] [Microsoft][ODBC Driver 18 for SQL Server]Query timeout expired (0) (SQLExecDirectW)') INFO:root:Connecting to the DB jmjuradotestdb1 - Thread id 39808 - (Attempt 1/3) INFO:root:Connected to the Database in jmjuradotestdb1 - Thread id 39808 - 0.0445 seconds --- (RunCommandTimeout) - Thread: 39808 - Error executing command, retrying in 9 seconds. Attempt 2 of 3 with new timeout 9. Error: ('HYT00', '[HYT00] [Microsoft][ODBC Driver 18 for SQL Server]Query timeout expired (0) (SQLExecDirectW)') INFO:root:Connecting to the DB jmjuradotestdb1 - Thread id 39808 - (Attempt 1/3) INFO:root:Connected to the Database in jmjuradotestdb1 - Thread id 39808 - 0.0532 seconds --- (RunCommandTimeout) - Thread: 39808 - Error executing command, retrying in 13 seconds. Attempt 3 of 3 with new timeout 13. Error: ('HYT00', '[HYT00] [Microsoft][ODBC Driver 18 for SQL Server]Query timeout expired (0) (SQLExecDirectW)') (RunCommandTimeout) - Thread: 39808 - Loop:2/5 Execution Time: 9.7537 seconds My first prompt using SSMS Copilot was this "Review the queries that experienced a command timeout or were aborted in the last 30 minutes. Include query text, queryid, duration, and the reason and code for the abort if available." and I got the following results. So, all points that the query 216 got command timeouts. My next question, was, for query ID 216, show the number of total executions reporting that is 28 executions. The response showed 28 executions, but this number didn’t match the number of aborted and non-aborted executions observed in the application logs, why this difference? Checking the table sys.query_store_runtime_stats I found 10 rows all having execution_type = 3, and total executions 28. So, that's mean that Query Store aggregates query execution data over a fixed interval. So, the execution_type is an indicator that at least an execution during this runtime interval was aborted. So, at least several of them were aborted and other not. To obtain a more granular and accurate picture, I created an Extended Events session to capture these events using ring_buffer target. CREATE EVENT SESSION [CommandAborted] ON DATABASE ADD EVENT sqlserver.attention( ACTION ( sqlserver.client_app_name, sqlserver.client_hostname, sqlserver.username, sqlserver.database_name, sqlserver.sql_text ) ) ADD TARGET package0.ring_buffer WITH (MAX_MEMORY = 4096KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS); GO ALTER EVENT SESSION [CommandAborted] ON DATABASE STATE = START; after reproducing the command timeout scenario again, I was able to see only the aborted executions. So, in this case, 28 executions were executed and 7 executions were aborted. WITH RingBufferXML AS ( SELECT CAST(t.target_data AS XML) AS target_data FROM sys.dm_xe_database_session_targets t JOIN sys.dm_xe_database_sessions s ON t.event_session_address = s.address WHERE t.target_name = 'ring_buffer' AND s.name = 'CommandAborted' ) SELECT x.value('@name', 'varchar(50)') AS event_name, x.value('@timestamp', 'datetime2') AS event_time, x.value('(action[@name="client_app_name"]/value)[1]', 'nvarchar(256)') AS client_app_name, x.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text, x.value('(data[@name="duration"]/value)[1]', 'bigint') AS duration_microseconds, CAST(x.value('(data[@name="duration"]/value)[1]', 'bigint') / 1000000.0 AS decimal(10,3)) AS duration_seconds FROM RingBufferXML CROSS APPLY target_data.nodes('//event') AS tab(x) WHERE x.value('@name', 'varchar(50)') = 'attention' and x.value('(action[@name="client_app_name"]/value)[1]', 'nvarchar(256)') = 'TEST-DataCon' ORDER BY event_time DESC;187Views0likes0Comments