Performance Benchmark - Azure Synapse Analytics (Data Warehouse)

Sudhir Rawat · ‎May 11 2020

Today, any organization who is ready to spin Azure service always curious of performance, load, design etc. The story is no different with data. With Azure Synapse DWH organization want to make sure they choose right skus and design to benefit customer and themselves. Any organization can come up with following questions

Are we choosing right sku for current and future workload?
Are we utilizing the resources well?
What about the latency?
Based on upcoming request, what will be the impact on allocated resources aka load testing?
How to make sure queries are performing well on DWH design?
How to build baseline matrix?

Well, one of the ways is to monitor (and setup alerts) the service on production environment and adjust the sku. However this may impact the user experience and become hard practice to follow. This blog post try to overcome such problems in advance to avoid any surprises on production environment. The focus will be on building a framework to simulate the load with multiple sql queries and get answer for above questions.

Reference Architecture

In this end to end architecture, we'll simulate the load from different regions with different queries. The outcome will help us understand if we choose right sku and setup good DWH design like table distribution, index, cache etc . Here is the reference architecture of the framework

Architecture diagram

All scripts and step by step guide can be found here. However, below are the steps to build the framework along with few pictures.

Setting up environment

Azure Synapse Analytics (Data warehouse)
Setup Data warehouse
Download and configure Apache Jmeter
Build Java Management Extension (aka jmx) file
Move scripts to Azure storage
Execute work load

JDBC Request Diagram

PowerShell Outcome Diagram

Analyse the Test Results

Investigate result over Azure portal dashboard
Understand the bottleneck and performance

1) Table Distribution

2) Partition strategy

3) Index

4) Cache
6) Views
Apache Jmeter Report

PBI Dashboard Diagram

Query Execution Diagram

MallikarjunaAvula · ‎May 29 2020

Hi @Sudhir Rawat ,

Could you please help how to troubleshoot performance issues in Azure Synapse step by step like SQL server (on-prem db) ?

Sudhir Rawat · ‎May 29 2020

Hi @MallikarjunaAvula ,

I am sure you will agree that document performance issues and how to troubleshoot requires more time and efforts. So I am interested to know what, where and how you need my help.

-Sudhir Rawat

shkumesh · ‎Oct 16 2020

@MallikarjunaAvula

Azure Synapse SQL Monitor gives you a comprehensive insight into your workloads and end-to-end visibility into the performance of your SQL data warehouse, helping you to dramatically reduce the time it takes to isolate and fix data warehouse issues. The monitor provides an intuitive interface on a single pane of glass that visualizes data warehouses and server activity so you can see at a glance where problem areas are.

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/creativesoftware.synapse_sql_monitor_s...

golden83 · ‎Dec 02 2021

Hi @Sudhir Rawat , We have a synapse implemented for marketing automation and want to understand if it can support below use case

We have a sales event happening and a microservice is fired for every transaction. Inside the service, we can to fire a select statement to pick the associated Customer from a table with 50M customers. The "where clause" of the query will use a column with index (Unique Key).

The pick volume we are expecting is ~60 per second (3500/min) queries. Do you think Synapse is designed to handle that kind of load or we should look outside (any suggestion from Azure stack? CosmosDB?)

So basically it will be kind of "Select Col1.. From Customers where UID='ABCD'" fired at a rate of 60 sqls per second at pick.

Products (50)

Special Topics (27)

Video Hub (462)

Most Active Hubs

Most Active Hubs

Video Hub

Performance Benchmark - Azure Synapse Analytics (Data Warehouse)