SQL Server Blog

6 MIN READ

Optimize OLTP Performance with SQL Server on Azure VM

Microsoft

Oct 16, 2019

Written by Mine Tokus, Senior Program Manager; Jamie Reding, Senior Program Manager and Sadashivan Krishnamurthy, Principal Architect

Reviewed by Bob Ward, Principal Architect

Most of you are running OLTP applications today with read heavy and complex transactions. The TPC-E benchmark is a reliable representation of modern OLTP workloads and highly recommended to measure OLTP workload performance. The TPC-E benchmark uses diverse transactions and utilizes wide range of data types with 12 transactions acting on 33 tables all exercising the ACID compliance tests. These transactions cover traditional OLTP processing along with some very light analytic queries. In this article, we will share recommendations to optimize SQL Server performance on Azure VMs based on performance testing with scaled down TPC-E [1] benchmark.

Azure Es_v3 series offer price/perf advantages with the optimized memory to v-core ratios and commonly selected by SQL Server customers. We used E64s_v3 VM size as the server for this tests which comes with 64 hyper threaded v-cores on 2.3 GHz Intel XEON ® E5-2673 v4 (Broadwell) processor family, 432 GB memory and 1600 GB of Azure Blob Cache capacity. On the client side, we placed the client in the same region and availability zone as the server. We used DS13-v2’s for all clients and scaled down TPC-E workload to 800,000 customers with data size at 800GB.

Azure Blob Cache is the key for the best price/perf

For the first test we used 2 of Premium SSD P30 disks for SQL Server data file and 1 Premium SSD P20 disk for SQL Server log file. We enabled ReadOnly cache for both P30 disks and created a drive with disk striping for SQL Server data file. Data drive had 10,000 IOPS and 400MB/sec throughput capacity with 2-P30 disks and log drive had 2300 IOPS and 150 MB/sec throughput capacity with 1 P20 disk. We used SQL Server 2017 Enterprise Edition with Windows Server 2016 Data Center with SQL Server defaults except that Lock Pages in memory is enabled and max server memory is set to 90%. Data and Log files are located on separate drives and Temp DB files are located on local SSD (D: drive).

In the first test, workload drove 18,403 IO per second and reached up to 490 TPS (Transaction Per Second) with 256 concurrent users (Table-1). That is a great price/performance optimization level as the paid IO capacity is used only for writes (9,668 write IOPS served by 2-P30 disks) and all reads (8,735 total read IOPS) are served from the blob cache. This is a huge benefit for ready heavy workloads as all the read IOPS are served directly from the host cache with very low response times and did not occur any additional charges.

Table 1- E64s_v3, 2-P30 for data and 1 P20 for Log

Scale up application throughput by simply by adding more storage capacity

The first test was bottlenecked on the writes as 2- P30 disks are throttled at 10,000 IOPS. We repeated this test with 4 P30 disks stripped on a single drive (all disks have ReadOnly cache enabled) to host SQL Server data file and 1 P20 disk for SQL Server log file. The same workload scaled up to 823 TPS with 67% increase on the application throughput. Running 823 TPS (14,672 batch requests/sec) required total 37K IO per second (Table -2). Again, 4 P30 disks are used only for the writes (18,978 writes IOPS) and all reads are served from the free blob cache (17,809 read IOPS).

To scale up the workload and increase the throughput start with adding additional storage capacity until CPU becomes the bottleneck. With 823 TPS on E64s_v3, the CPU was only consumed at 45%, there is still significant bandwidth on this VM to increase throughput by simply increasing storage capacity.

Table 2- E64s_v3, 4-P30 for data and 1 P20 for Log

Maximize throughput with lower storage response times

Premium Disks has great price/performance advantages with ReadOnly caching and monthly low-end storage cost. If your workload requires very low storage response times, then you can use ultra-disk to get microsecond level IO latencies at very high IOPS. Azure ultra-disks deliver high throughput, high IOPS, and consistent low latency disk storage for Azure VMs.

With the same test repeated with SQL server data and log files on Ultra Disk, we got up to 1,489 TPS and utilized all VM resources at peak (the CPU is consumed at 92%). Single ultra-disk provided 60,000 total IOPS (31,431 read IOPS and 29,250 write IOPS) and total 964 MB/sec throughput with microsecond level response times (Table-3).

Table 3- E64s_v3, Ultra Disk for Data and Log

Premium SSD delivers great performance for write heavy workloads

If you are running a write heavy workload with simple transactions, then you can validate performance with the TPC-C benchmark. Hammer DB TPC-C [2] is a tool that can be used to simplify TPC-C benchmarking. We run Hammer DB TPC-C with 1000 warehouses at 100GB total size. Test server is configured with SQL Server 2017 Enterprise edition on E64_v3; data files located on a storage pool stripped over 2-P30 disks (ReadOnly cache enabled), log file is located on a single P20 disk and Temp DB files are hosted on Local disk. With this test, workload reached to 936,896 TPM (Transaction Per Minute) as total IOPS are throttled at 10,000 with 2-P30 disks (Table - 4).

Table 4 – E64_v3, 2 P30 for Data, 1 P20 for Log

Note that, on this run CPU utilization was only at 12%. Can we double the throughput by simply adding more storage capacity? The answer is yes, simply by adding 2 more P30 drives, workload reached to 1.6 million TPM (Table - 5).

Table 5- E64s_v3, 4 P30 for Data and 1 P20 for log.

Write heavy workloads modify data in memory at a frequent rate. Checkpoint process writes the in-memory modified pages (known as dirty pages) and transaction log information from memory to disk. For write heavy workloads, SQL Server performs intensive IO activity to achieve the 1-minute default recovery time.

To show the impact of the checkpoint process on throughput, we repeated the same test with recovery interval set to the maximum value (checkpoint processes is disabled). When recovery interval is set to maximum, the HammerDB TPC-C workload doubled the throughput and reached up to 3,2 million TPM (Table - 6).

Table 6 – E64s_v3, Data on 4-P30, Recovery Time is set to max.

This setting was experimental only and cannot be used on production environments. But recovery time interval can be relaxed to increase the throughout for write heavy workloads modifying millions of pages per minute. If aiming for recovery in ~5 minutes or more is acceptable for the workload, then relaxing the recovery time interval setting will enable higher throughput at the same cost.

With a wide range of VM series and sizes, Premium SSD and Ultra Disks Azure offers many option to design for your workloads’ specific performance requirements. Leverage the performance best practices to boost throughput for your OLTP workload with SQL Server workloads with Azure VMs. Click here to start testing with free SQL Server Developer edition images on Azure VMs.

[1] Scaled down TPC-E model reduces the initial footprint of the database. It is primarily used for ease of benchmarking; we reduced the amount of history rows and then instruct the driver to not search too far back. For our tests, we reduced the number of trade days in the history table from 300 to 30. Please note that results from the scaled down model are non-comparable to the published results.

[2] HammerDB provides an implementation of the TPC-C benchmark. HammerDB TPC-C is derived from TPC-C and as such is not comparable to published TPC-C results, and the HammerDB TPC-C results do not fully comply with TPC-C.

Updated Oct 21, 2019

Version 9.0

sqlserverazurevm

Mine Tokus

Microsoft

Joined September 25, 2018

View Profile

SQL Server Blog

Follow this blog board to get notified when there's new activity

4 Comments

bms231
Copper Contributor
Oct 12, 2020
This isn't real world as the machine you selected has a 1200 MiB/s throughput. I mean this is all fine and well for a company who can afford a 64 core machine and SQL license for it, but getting down to real life the most commonly deployed server used is going to be an 8 to 16 core machine. Striping 2 P30s to get 10K PIOPs sounds great, but let's look at the machine you chose. The E64_v3 has Max uncached disk throughput IOPS - 80000 / MBps - 1200 for 64 cores. A more realistic DS13_V2 has 25600/384.

That 384 number is what I am concerned about. A single P30 disk can do 5000 PIOPs and has a provisioned throughput of 200MiB/s. So right there if you choose a server with lower cores such as the DS13_V2 and stripe 2 P30s your disk throughput already outperforms your VM selection. Since you had room to go you can add more P30s. Very interesting how "doubling" your P30s still only got you 200MiB/s throughput which means you had to have 4 P30s to get the published provisioned throughput of a single P30. Which again begs the question even though you had 4 P30s why did you cap just under the published throughput limit of a single P30 when you should have seen more throughput.
LaurenceSchwarz
Copper Contributor
Apr 02, 2020
Hi Blaine,

Thanks for the courtesy of the reply! Yep. I found out between the time that I posted this question and a few weeks ago that this used to be a big no-no, but like all things Azure it too has changed, and is now not only an acceptable or common practice, it's recommended.

-Laurence
sc_bperry
Copper Contributor
Apr 02, 2020
Hi Laurence,

This is safe since tempdb itself is ephemeral and is recreated upon restart anyway. We have been doing this in production for a year now. The only minor catch is that you need a way to create the tempdb folder upon the server start and grant the appropriate permissions to the MSSQLSERVER service. We handle that by a scheduled task that runs at startup. If you're curious, here's the two actions we run:

%ComSpec% /C IF NOT EXIST "D:\Databases\Data\" MKDIR "D:\Databases\Data\" icacls.exe "D:\Databases\Data" /grant *S-1-5-80-3880718306-3832830129-1677859214-2598158968-1052248003:(OI)(CI)(F)

Obviously, our tempdb files are configured to live in the D:\Database\Data folder.

Also, even though the 😧 drives are technically ephemeral, they survive across a guest OS reboot and I've rarely seen them actually be deprovisioned even when the VM has been deallocated. They will definitely disappear if you redeploy the VM.

Hope that helps,
Blaine
LaurenceSchwarz
Copper Contributor
Mar 03, 2020
Hi,

I am slightly confused. I thought that putting tempdb on an IaaS vm's local D$ drive was a big no-no due to its ephemeral nature. If the vm were to restart, no more tempdb?

Thanks,

Laurence M. Schwarz
Sr SQL Database Administrator
Redapt, Inc.

Blog Post

Optimize OLTP Performance with SQL Server on Azure VM

Azure Blob Cache is the key for the best price/perf

Scale up application throughput by simply by adding more storage capacity

Maximize throughput with lower storage response times

Premium SSD delivers great performance for write heavy workloads