Blog Post

Azure Synapse Analytics Blog
2 MIN READ

Serverless Architecture and Concepts. What is it?

Liliam_C_Leme's avatar
Liliam_C_Leme
Icon for Microsoft rankMicrosoft
Jan 22, 2021

I needed to go through this subject this week so I thought that would be a good opportunity to share SQL Serverless Architecture concepts.

 

1) What is the difference between SQLOD and the former SQLDW?

 

SQL Serverless or SQL On-demand

  • Grow or shrink compute power, within a dedicated SQL pool, without moving data.
  • Pause compute capacity while leaving data intact, so you only pay for storage.
  • Resume compute capacity during operational hours.

SQL DW you can actually pause and resume. You insert, update and delete. You have storage with your data which SQLDW is held responsible.

 

What is SQLOD?

SQLOD is a query service over the data in your data lake. You do not need to pause or resume. It is a service per comsumption or per demand. The service is resilient to failure and elastic.

Note: Serverless SQL pool has no local storage, only metadata objects are stored in databases. Basically is for reading, only.

 

What it means resilient to failures and elastic?

It means it auto-scale the node's resources if required by the engine while querying your file and it means also if there is any failure in any node it recovers without any user intervention.

 

 

What is supported on Serverless?

Supported T-SQL:

  • Full SELECT surface area is supported, including a majority of SQL functions
  • CETAS - CREATE EXTERNAL TABLE AS SELECT
  • DDL statements related to views and security only

How does it work?

  • DQP or Distributed Query Processing.
  • Compute Node

 

DQP is responsible per optimize and orchestrate distributed execution of user queries by splitting them into smaller queries that will be executed on Compute nodes.

The Compute Nodes will execute the tasks creates by the DQP. The tasks are pretty much the query logic break in chunks of data to be processed. Those chunks of data are the files organized in data cells.  How many data cells and tasks will be executed depends on the plan optimization.

 

Plan optimization also depends on the stats. The more serverless the SQL pool knows about your data, the faster it can execute queries against it and in the end, the plan chosen is based on cost, lowest cost.  Note: Automatic creation of statistics is turned on for Parquet files. For CSV files, you need to create statistics manually.

 

 

 

Ref: Synapse SQL architecture - Azure Synapse Analytics | Microsoft Docs

Serverless SQL pool - Azure Synapse Analytics | Microsoft Docs

https://www.microsoft.com/en-us/research/publication/polaris-the-distributed-sql-engine-in-azure-synapse/

Create and update statistics using Azure Synapse SQL resources - Azure Synapse Analytics | Microsoft Docs

 

That is it!

Liliam Leme

UK Engineer.

 

Updated Sep 15, 2021
Version 3.0
No CommentsBe the first to comment