Today, we are announcing General Availability (GA) of native implementation of APPROX_PERCENTILE in Azure SQL Database and Azure SQL Managed Instance. We announced preview of these functions in October 2022. Since then, many customers have adopted these for the applications where response time of percentile calculation was more important than the accuracy of the result.
These aggregate functions (APPROX_PERCENTILE_CONT & APPROX_PERCENTILE_DISC) need lesser memory and lesser CPU as compared to their precise analytic counterparts (PERCENTILE_CONT & PERCENTILE_DISC). Here is a quick example to demonstrate this.
To follow along, you would need to use WideWorldImportersDW sample database and increase the number of rows using below script.
For this demo, we have imported WideWorldImportersDW-Full.bacpac to a Azure SQL Database – Business Critical tier and made it bigger by using script mentioned above. Let's start with looking at the 95th percentile for order total (including tax) by employee using PERCENTILE_CONT
DBCC DROPCLEANBUFFERS
GO
SET STATISTICS TIME,IO ON
GO
SELECT DISTINCT de.Employee
, PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY [Total Including Tax]) OVER (PARTITION BY de.Employee) AS [95 percentile CONT]
FROM Fact.OrderHistory AS foh
INNER JOIN Dimension.Employee AS de ON foh.[Salesperson Key] = de.[Employee Key]
ORDER BY de.Employee;
GO
SET STATISTICS TIME,IO OFF
Here is the output (only partial shown to save some electrons).
Statistics IO
Table 'Employee'. Scan count 1, logical reads 5, physical reads 1, page server reads 0, read-ahead reads 10, page server
Table 'OrderHistory'. Scan count 9, logical reads 101000, physical reads 0, page server reads 0, read-ahead reads 100937
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server
Table 'Worktable'. Scan count 44, logical reads 21189783, physical reads 0, page server reads 0, read-ahead reads 19097,
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server
Statistics time
SQL Server Execution Times:
CPU time = 50469 ms, elapsed time = 12969 ms.
Now let's look at how to achieve the same thing with APPROX_PERCENTILE
DBCC DROPCLEANBUFFERS
GO
SET STATISTICS TIME,IO ON
GO
SELECT de.Employee
, APPROX_PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY [Total Including Tax] ) AS [95 percentile APPROX_CONT]
FROM Fact.OrderHistory AS foh
INNER JOIN Dimension.Employee AS de ON foh.[Salesperson Key] = de.[Employee Key]
GROUP BY de.Employee
ORDER BY de.Employee;
GO
SET STATISTICS TIME,IO OFF
Statistics IO
Table 'OrderHistory'. Scan count 9, logical reads 101000, physical reads 0, page server reads 0, read-ahead reads 100937, page serve
Table 'Employee'. Scan count 0, logical reads 202, physical reads 3, page server reads 0, read-ahead reads 0, page server read-ahead
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 101, page server read-ahea
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead r
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead
Statistics time
SQL Server Execution Times:
CPU time = 11096 ms, elapsed time = 1583 ms.
If we compare the elapsed time (in Statistics time output), it is ~1.5 seconds vs. ~13 seconds. IO also has significantly reduced using approximate function (in Statistics IO output). Since each query plan has parallelism, we are seeing more CPU time than elapsed time.
Here is the memory used by percentile (taken from XML plan)
<MemoryGrantInfo SerialRequiredMemory="3072" SerialDesiredMemory="383784" RequiredMemory="27080" DesiredMemory="407816" RequestedMemory="407816" GrantWaitTime="0" GrantedMemory="407816" MaxUsedMemory="204872" MaxQueryMemory="6471920" />
and from approx_percentile query plan
<MemoryGrantInfo SerialRequiredMemory="1536" SerialDesiredMemory="2944" RequiredMemory="13512" DesiredMemory="14920" RequestedMemory="14920" GrantWaitTime="0" GrantedMemory="14920" MaxUsedMemory="3640" MaxQueryMemory="6471920" />
Above snippets show low memory requirement of approx_percentile family of function.
Here are the links to documentation for approximate percentile functions to learn more.