Issue
Incorrect memory grants on certain queries can lead to extremely high CPU spikes in an Azure SQL MI. This is usually expected in high workload production Azure SQL MI systems that have high rate of query execution. The spike in CPU utilization also partially results from the SQL plan cache not getting utilized properly. All of this leads to poor performance from regularly running queries there by leading to poor overall system performance.
Error
Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Workaround/Fix
The sudden spike in CPU utilization in this case was caused by large memory grants being taken by end user queries, causing large memory pressure on the other in-memory caches like the SQL plan cache (caches sql execution plans) disallowing the plan cache from caching execution plans, causing large number of query recompilations which caused the CPU spikes.
The issue was fixed by increasing the compatibility level of the Azure SQL MI to 150 and thereby enabling the “Memory Grant feature”. This feature limits the memory grants by certain queries allowing the plan cache to grow in memory and thus lowering the compile CPU time which lowered the total CPU time.
Once the compatibility level was increased to 150, the Memory Grant Feedback feature was enabled that dynamically lowered the memory grants requested by the queries.
Here’s a sample of Memory distribution before & after enabling the feature:
We can see the MFG lowering the memory requirements of queries to their actual memory usage. We can also see a huge difference between The CPU usage before and after the compatibility level change.
Before Change CPU Utilization:
CPU utilization After the change (Both Node & Instance CPU utilization has come down):
For Parameter-sensitive plans, memory grant feature will disable itself on a query if it has unstable memory requirements. The memory grant feature is disabled after several repeated runs of the query and this can be observed by monitoring the memory_grant_feedback_loop_disabled extended event.
Memory grant feature can also be enabled without changing the compatibility level of the database, here’s the SQL command that needs to be executed (Currently available only on Azure SQL DB):
ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;
Two new query plan attributes have also been introduced with Row mode memory grant feedback. These are IsMemoryGrantfeedbackAdjusted & LastRequestedMemory which are added to the MemoryGrantInfo query plan XML element.
References:
Monitor performance using DMVs - Azure SQL Managed Instance | Microsoft Learn
Query processing feedback features - SQL Server | Microsoft Learn
Cardinality Estimation (SQL Server) - SQL Server | Microsoft Learn