Performance Benchmark - Azure Synapse Analytics (Data Warehouse)
Published May 11 2020 08:18 PM 8,903 Views
Microsoft

Today, any organization who is ready to spin Azure service always curious of performance, load, design etc. The story is no different with data. With Azure Synapse DWH organization want to make sure they choose right skus and design to benefit customer and themselves. Any organization can come up with following questions 

 

  • Are we choosing right sku for current and future workload? 
  • Are we utilizing the resources well?
  • What about the latency?
  • Based on upcoming request, what will be the impact on allocated resources aka load testing?
  • How to make sure queries are performing well on DWH design?
  • How to build baseline matrix?
Well, one of the ways is to monitor (and setup alerts) the service on  production environment and adjust the sku. However this may impact the user experience  and become hard practice to follow. This blog post try to overcome such problems in advance to avoid any surprises on production environment. The focus will be on building a framework to simulate the load with multiple sql queries and get answer for above questions.

 

Reference Architecture
In this end to end architecture, we'll simulate the load from different regions with different queries. The outcome will help us understand if we choose right sku and setup good  DWH design like table distribution, index, cache etc . Here is the reference architecture of the framework
 
Architecture diagramArchitecture diagram

All scripts and step by step guide can be found here. However, below are the steps to build the framework along with few pictures.  

 
Setting up environment
  1. Azure Synapse Analytics (Data warehouse)
  2. Setup Data warehouse
  3. Download and configure Apache Jmeter
  4. Build Java Management Extension (aka jmx) file
  5. Move scripts to Azure storage
  6. Execute work load
JDBC Request DiagramJDBC Request Diagram

PowerShell Outcome DiagramPowerShell Outcome Diagram

Analyse the Test Results

  1. Investigate result over Azure portal dashboard
  2. Understand the bottleneck and performance
           1) Table Distribution
           2) Partition strategy
           3) Index
           4) Cache
           6) Views
  3. Apache Jmeter Report

PBI Dashboard DiagramPBI Dashboard Diagram

Query Execution DiagramQuery Execution Diagram

 

4 Comments
Copper Contributor

Hi @Sudhir Rawat ,

Could you please help how to troubleshoot performance issues in Azure Synapse step by step like SQL server (on-prem db) ?

Microsoft

Hi @MallikarjunaAvula ,

 

I am sure you will agree that document performance issues and how to troubleshoot requires more time and efforts. So I am interested to know what, where and how you need my help.   

 

-Sudhir Rawat 

Copper Contributor

@MallikarjunaAvula

 

Azure Synapse SQL Monitor gives you a comprehensive insight into your workloads and end-to-end visibility into the performance of your SQL data warehouse, helping you to dramatically reduce the time it takes to isolate and fix data warehouse issues. The monitor provides an intuitive interface on a single pane of glass that visualizes data warehouses and server activity so you can see at a glance where problem areas are.

 

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/creativesoftware.synapse_sql_monitor_s...

 

 

Copper Contributor

Hi @Sudhir Rawat , We have a synapse implemented for marketing automation and want to understand if it can support below use case

We have a sales event happening and a microservice is fired for every transaction. Inside the service, we can to fire a select statement to pick the associated Customer from a table with 50M customers. The "where clause" of the query will use a column with index (Unique Key). 

The pick volume we are expecting is ~60 per second (3500/min) queries. Do you think Synapse is designed to handle that kind of load or we should look outside (any suggestion from Azure stack? CosmosDB?)

So basically it will be kind of "Select Col1.. From Customers where UID='ABCD'" fired at a rate of 60 sqls per second at pick.

Version history
Last update:
‎May 11 2020 08:18 PM
Updated by: