Lesson Learned #483: Optimizing Performance in Azure SQL Database ReadScale Out Databases

Jose_Manuel_Jurado · ‎May 09 2024

Introduction:

Today, I worked on a service request that our customer has a Business Critical database with 4 vCores and Read-Scale Out enabled. Our customer noticed several performance issues using Read-Scale Out database and I would like to explain several lessons learned found during the troubleshooting steps.

Initial Script Execution and Synchronization:

The investigation began with the execution of a script on the primary database, which populated approximately 7,864,320 rows. Synchronization between the primary and secondary databases was successfully completed, ensuring consistency in data distribution.

DROP TABLE IF EXISTS [dbo].[Notes]

CREATE TABLE [dbo].[Notes](
	[ID] [int] NULL,
	[NAME] [varchar](200) NULL,
	[id2] [int] IDENTITY(1,1) NOT NULL,
 CONSTRAINT [PK_Notes] PRIMARY KEY CLUSTERED ([id2] ASC))

-------------------------------------------------
-- Create the store procedure
--------------------------------------------------
CREATE OR ALTER PROCEDURE dbo.GiveNotes
   @N AS INT = null
AS 
SELECT count(Name),name FROM Notes where ID<@n group by Name

--------------------------------------------------
-- Insert data
--------------------------------------------------
INSERT INTO Notes (ID,Name) SELECT RAND()*(100000 - 1) + 1, 'Info:'+convert(varchar(200),RAND()*(100000 - 1) + 1) 

--------------------------------------------------
-- Execute this query several times - 
--------------------------------------------------
INSERT INTO Notes (ID,Name) SELECT RAND()*(100000 - 1) + 1, 'Info:'+convert(varchar(200),RAND()*(100000 - 1) + 1) FROM Notes

Observations on Secondary Database:

Running the following script (we never executed this query previously in the primary) it was observed that temporary automatic statistics were created (sys.stats (Transact-SQL) - SQL Server | Microsoft Learn ) , indicating potential areas for optimization and needed for this workload and will be deleted in case of any reconfiguration/failover. You could see them using the column is_temporary in sys.stats.

Analysis of the execution plan highlighted a recommendation for index creation to improve query performance.

declare @n as int = 0
set @n=RAND()*(100000 - 1) + 1
EXEC dbo.GiveNotes @n

Identifying Performance Bottlenecks:

Further analysis revealed that the query execution was consuming a significant amount of CPU resources on the secondary database. To pinpoint the underlying issues, a query was executed to identify the top CPU-intensive operations.

-- Which Queries are taking the most time/cpu to execute
SELECT TOP 5
     total_worker_time, total_elapsed_time,
     total_worker_time/execution_count AS avg_cpu_cost, execution_count,
     (SELECT DB_NAME(dbid) + ISNULL('..' + OBJECT_NAME(objectid), '')
         FROM sys.dm_exec_sql_text([sql_handle])) AS query_database,
     (SELECT SUBSTRING(est.[text], statement_start_offset/2 + 1,
         (CASE WHEN statement_end_offset = -1
             THEN LEN(CONVERT(nvarchar(max), est.[text])) * 2
             ELSE statement_end_offset
             END - statement_start_offset) / 2
         )
         FROM sys.dm_exec_sql_text([sql_handle]) AS est) AS query_text,
     total_logical_reads/execution_count AS avg_logical_reads,
     total_logical_writes/execution_count AS avg_logical_writes,
     last_worker_time, min_worker_time, max_worker_time,
     last_elapsed_time, min_elapsed_time, max_elapsed_time,
     plan_generation_num, qp.query_plan
FROM sys.dm_exec_query_stats
     OUTER APPLY sys.dm_exec_query_plan([plan_handle]) AS qp
WHERE [dbid] >= 5 AND DB_NAME(dbid) IS NOT NULL
   AND (total_worker_time/execution_count) > 100
--ORDER BY avg_cpu_cost DESC;
--ORDER BY execution_count DESC;
ORDER BY total_worker_time DESC;

Proposed Solutions and Implementation: To address the performance challenges, several actions were proposed:

Removing temporary statistics generated on the secondary database.
Creating the recommended index on the primary replica to enhance query execution.
Executing necessary scripts on the primary replica to ensure optimal performance.

Actions done:

If the indexes is needed, the first topic is to remove statistics using the command: DROP STATISTICS NOTES.[_WA_Sys_00000002_5AEE82B9_readonly_database_statistics]

select * from sys.stats 
inner join sys.tables on sys.stats.object_id = sys.tables.object_id 
where is_temporary=1

Create the index needed in primary replica CREATE NONCLUSTERED INDEX [NOTES_IX1] ON [dbo].[Notes] ([ID] ASC) include (name)
Execute the script in the primary replica if any additional statistics is needed to create or any other topic:

declare @n as int = 0
set @n=RAND()*(100000 - 1) + 1
EXEC dbo.GiveNotes @n

We could see that right now the primary has all the details and secundary will not needed to create additional statistics and will use the index to improve the query performance.

Products (50)

Special Topics (27)

Video Hub (462)

Most Active Hubs