Scenario
Today I would like to share a scenario that I was working on one of my serverless SQL Pool support cases. The customer asked for an advice on how to monitor serverless SQL requests by using log analytics.
The intention of this guide is to help you with choosing the configuration required to easily setup the Synapse Analytics Workspace monitoring and all other considerations about how to monitor serverless SQL requests with Azure Monitor. Spoiler: At the end of this article, I will share the latest version of the serverless workbook posted on the Azure_Synapse_Tool_Box. This includes a really cool way to see query execution information.
Before we start... here are some initial considerations.
Azure monitor provides base-level infrastructure metrics, alerts, and log for most Azure Services.
Metrics are numerical values that describe some aspects of a system at a particular point in time. It is the most important way to monitor data and gain insight into the performance counter.
Logs are events that occur in the system. They can contain different kinds of data and might be structured or freeform text with a timestamp. Logs in Azure Monitor are stored in a Log Analytics workspace that's based on Azure Data Explorer.
For Azure Synapse Analytics metrics, we have the following types:
- Workspace-level metrics
- Dedicated SQL pool metrics
- Apache Spark Pool metrics
To monitor some data from the Synapse Workspace operation, Azure Synapse Analytics writes the diagnostic logs in Azure Monitor.
For Azure Synapse Analytics logs, we have the following types:
- Workspace-level logs
- Dedicated SQL pool logs
- Apache Spark Pool log
You can find more information regarding Synapse metrics and logs in the following public reference: How to monitor Synapse Analytics using Azure Monitor - Azure Synapse Analytics | Microsoft Learn
At the moment when I was writing this article, the only way to read the logs emitted by Azure Synapse workspaces for serverless SQL requests is by querying the log analytics table name called SynapseBuiltinSqlPoolRequestsEnded.
On the table above, we can find the Synapse built-in serverless SQL Pool ended requests. It means that only sql requests succeeded or failed.
Now that you know a little bit about metric and logs, let's start creating the Synapse diagnostic settings to configure diagnostic logs for serverless sql requests workspace resource.
Environment
Pre-Requisites
Make sure you have at least one Log Analytics workspace to easily store, retain and query data collected from your monitored Synapse workspace resource.
Creating a Log Analytics workspace
Follow these steps below to set up a new Log Analytics workspace:
1. From Azure Portal, in the search bar on top of your portal, search for Log Analytics workspace.
2. From the Log Analytics workspaces page -> "+ Create".
3. Provide the project and instance details: Subscription, resource group, name and region. "Review and Create".
Creating a Log Analytics workspace
Follow these steps below to set up a new diagnostic setting and use the Log Analytics workspace to monitor Synapse logs:
1. From the Azure Portal, open the Synapse Workspace page that you would like to monitor. From the left-menu, "Monitoring" -> "Diagnostic settings" -> "Add diagnostic setting"
We can see in the image above, some collection of data that can be logged.
2. From the "logs categories" choose the built-in Sql Pool Requests Ended and on "Destination details" check the "Send to Log Analytics workspace" box and select the workspace already created. Type the "Diagnostic setting name" and "Save" the diagnostic settings.
3. After a few moments, the new setting appears in your list of settings for this resource.
Monitoring the serverless sql requests from Synapse Studio
Follow these steps below to query logged data from log analytics to monitor serverless sql requests:
1. Let's start to query a sample data from data lake storage by using OPENROWSET function in Synapse Studio page.
2. Let's check by using Synapse Studio to monitor the SQL requests to view the sql requests details about this specific query above.
Querying the serverless sql ended requests from Log Analytics
1. Go back to your Synapse workspace page, in the left-menu, "Monitoring" -> "Logs". Close the query recommendation if it pops-up on the workspace.
Select the "SynapseBuiltinSqlPoolRequestsEnded" table, type the table name and "Run" the simple KQL query below.
SynapseBuiltinSqlPoolRequestsEnded
As a result of my log analytics environment, I have the following query result.
We can see on the screenshot above a lot of interesting fields which can be used to gain insight from serverless sql requests. Here we have the complete list of columns, type and description of the data which can be used to find information regarding serverless SQL query.
Feel free to use these KQL scripts below to start looking at some interesting sql requests ended results. #Enjoy
// All metadata and query results
SynapseBuiltinSqlPoolRequestsEnded
| where TimeGenerated > ago(15m)
// Full query result
SynapseBuiltinSqlPoolRequestsEnded
| project clientRequestId=Properties.clientRequestId, command=Properties.command, dataProcessedBytes=Properties.dataProcessedBytes, distributedStatementId=Properties.distributedStatementId ,
startTime=Properties.startTime, endTime=Properties.endTime, error=Properties.error, queryHash=Properties.queryHash,
queryText=Properties.queryText
// Queries by completion type
SynapseBuiltinSqlPoolRequestsEnded
| summarize count() by ResultType
| render piechart
// Most Recent Query Runs
SynapseBuiltinSqlPoolRequestsEnded
| order by TimeGenerated
| project StartTime=Properties.startTime, EndTime=Properties.endTime, Identity, CommandType=Properties.command, Result=ResultType, MbProcessed=(Properties.dataProcessedBytes/1024/1024), QueryText=Properties.queryText, _ResourceId
| extend elapsedTime_sec =(todatetime(EndTime) - StartTime)/1s
| extend Approx_Cost=((MbProcessed)*0.000476837158203125)/100
// Data Processed By Query
SynapseBuiltinSqlPoolRequestsEnded
| project BytesProcessed=Properties.dataProcessedBytes,tostring(Identity), QueryHash=tostring(Properties.queryHash),QueryText=tostring(Properties.queryText)
| summarize MaxSingleQueryMBProcessed=max(toint(BytesProcessed)/1024/1024), TotalMBProcessed=sum(toint(BytesProcessed)/1024/1024), QueryCount=count(), QueryHash=any(QueryHash) by QueryText
| extend Approx_Accumulated_Cost=((TotalMBProcessed)*0.000476837158203125)/100 //multiply by price per mb ($5 per TB from Microsoft Docs: https://azure.microsoft.com/en-us/pricing/details/synapse-analytics/)
| extend Approx_Cost_Per_Execution=((TotalMBProcessed)*0.000476837158203125)/100/ QueryCount
| order by TotalMBProcessed
// Data Processed By User with $5 per TB cost
SynapseBuiltinSqlPoolRequestsEnded
| project BytesProcessed=Properties.dataProcessedBytes,tostring(Identity)
| summarize TotalMBProcessed=sum(toint(BytesProcessed)/1024/1024),QueryCount=count(),MaxSingleQueryMBProcessed=max(toint(BytesProcessed)/1024/1024) by Identity
| extend Approx_Cost=((TotalMBProcessed)*0.000476837158203125)/100 //multiply by price per mb ($5 per TB from Microsoft Docs: https://azure.microsoft.com/en-us/pricing/details/synapse-analytics/)
| order by TotalMBProcessed
Let's go to the icing on the cake:
The Azure Synapse Toolbox is a place for useful tools and scripts to use with Azure Synapse Analytics. There you can find a lot of rich workbook templates to monitor all your Synapse environment.
Now let's focus on the serverless workbook which will includes query execution information including costing.
As pre-requisite you must have 'diagnostic settings' enabled at the workspace level to send 'BuiltinSqlReqsEnded' to Log Analytics. I have already proceeded with these settings. If you do not have it enabled, please go back to the begin of this article and follow all steps.
Follow the steps below to setup the serverless workbook. More information here.
1. From Azure Portal, search for your log analytics workspace, from the log analytics page go to left-menu "General" -> "Workbook" -> "+ New".
2. Click on "</>" (advanced editor).
3. Overwrite the code shown on the gallery template with the content available here. Select all content of the git page and paste on the gallery template, and then "Apply".
4. Click on "Save As" and provide the Title for your workbook.
And finally, you are ready to enjoy of the rich metrics and logs monitoring capability by using workbooks.
Conclusion
When using the Serverless SQL Pool to run your queries, users can take advantage of the diagnostic logs to monitor sql ended requests data efficiently. The log analytics provides a really simple way to run KQL queries and gain useful insight about sql query execution.
References:
Diagnostic settings in Azure Monitor
Use Azure Monitor with your Azure Synapse Analytics workspace
Note: Thank you Silas Mendes for reviewing it.
If you are unable to complete the steps in the article, please do reach out to us by logging a Support Request.
SIDNEY CIRQUEIRA
Synapse Analytics Support Engineer - Americas