Internals
11 TopicsAzure metrics Dashboard for Azure Synapse Analytics - Part 1
This article is part 1 of a 3 part monitoring series to meet the increasing needs of customers to proactively monitor the Synapse pool resource consumption, workload patterns and other key performance metrics. In this post, we will cover the dashboards that are available today for us on the Azure portal. These don’t require heavy customization and are very easy to set up. To create your dashboard, navigate to the Azure Portal Home page -> Synapse Pool resource blade -> Monitoring -> Metrics Although numerous metrics are available for building dashboards, this segment will cover the 4 most important ones for monitoring your DataWarehouse in this article. We will be using these metrics in the upcoming posts in the series as well. 1. Resource Utilization - CPU, DWU, IO percentages From the drop down shown above choose max CPU percentage, add max DWU percentage metric and max Data IO Percentage as shown below. Click on the pencil symbol and edit the name of the chart to your preference. Please note that DWU Percentage and CPU percentage overlap and you may see only one of them. The DWU percentage is usually either the CPU or IO percentage, whichever is higher. Now save the chart to a dashboard by clicking the 'pin to dashboard' option on the top righthand corner. You will be asked to choose between pinning it to an existing dashboard vs a new one as shown below. Once the chart is saved/pinned to the dashboard, follow the same process to create the remaining charts as well, as shown below. 2. Active and Queued queries - Concurrency details Following the same process as above, create another chart on the same dashboard blade by adding active queries and queued queries aggregating on 'Sum' 3. Workload Group Allocation - Resource classes and their percentage allocation details For this chart, select workload group allocation by system percent aggregating on 'Max' and split by 'Workload group'. Please note that there is a limit on the number of workload groups you can monitor. 4. Tempdb Utilization - tempdb usage across all the nodes Add the below mentioned metric to your chart aggregating on 'Max'. It is important to note that the chart below is the minimum, average or maximum value over a 5 minute window of the average tempdb utilization across all the nodes. In general, tempdb is located on each of the nodes, however, Azure metrics do not show the individual node level tempdb utilization as of yet. This has been brought to the attention of the development teams. Once all the 4 charts are pinned to the dashboard, resize the charts so that they all fit on one screen like below. Now that you have the important dashboards setup, you can build additional custom dashboards to get into more granular details about what queries/workloads are affecting your resources. This is not done by graphical user interface entirely and the second part of this post will provide you the step by step process for setting up the same.6.7KViews8likes1CommentMSSparkUtils is the Swiss Army knife inside Synapse Spark
One of those questions was how to share results between notebooks. Every time anotebook starts a new Spark cluster will also start which means they would be using different sessions making it impossible to share results between executions of notebooks. However, MSSparkUtils offers a solution to handle this scenario.10KViews2likes12CommentsAlways use UTF-8 collations to read UTF-8 text in serverless SQL pool
Serverless SQL pool in Azure Synapse Analytics enables you to read UTF-8 encoded text from CSV and PARQUET files. In this article you will learn how to properly configure a serverless SQL database to get the UTF-8 data without unwanted conversions.62KViews5likes6CommentsAzure metrics Dashboard (Custom dashboards using Log Analytics) for Azure Synapse Analytics - Part 2
Prerequisites: Complete basic dashboards Setup (Part 1 in this series); Log Analytics Workspace configured (ref - Monitor workload - Azure portal - Azure Synapse Analytics | Microsoft Docs) As a first step, please download the log analytics queries in this GitHub repo that we will be using to set up the following custom dashboards. We are going to create the following charts in this article: Top 20 Long Running Queries by Execution time in a given interval Number of Rows processed per hour Most Rows Processed Per Query step Number of Rows processed per Query Navigate to the Log Analytics workspace you configured to receive Synapse diagnostic data as shown below. Close out the ‘queries’ popup and paste in the text from 'Top20LongrunningQueries.txt' file in the GitHub link provided above (image below for your reference). Replace the database name with your database name. You can save the query for future use with a name that corresponds to the query goal. For example, 'LongRunningQueriesbyexecutiontime' Press ctrl-A to select all the contents in the query window, including the commented out portions. Press Run to execute the Query. Once you get the results, select ‘Pin to dashboard’ then in the popup choose the dashboard we are working on and click ‘pin’ Navigate to your dashboard and find the chart that you just pinned. Click the pencil icon to rename the chart to fit the purpose. In the popup at the top of the window click ‘save’ – you will want to do this each time you make a change or all the changes you made will be discarded. Navigate back to the log analytics workspace and paste in the text from the file ‘Top20Stepsthatmovedmostnumberofrows.txt’ that you downloaded. Save the query, select all the contents in query window and run. Once you get the results, pin it to the dashboard. Without leaving the query window, click the ‘Chart’ button. The output will switch to a chart view. Set the 3 parameters as shown in the screenshot below – note that you will need to set ‘OperationType_s’ first to be able to set the first parameter to RequestID_s. Pin this chart to the dashboard as well. Navigate back to the dashboard and rename both the charts and save the dashboard. Repeat the process for query, RowsProcessedPerHour.txt as well and pin the chart to the dashboard. Now for all the above 4 charts, click the context menu option (ellipsis button to the top right corner) and select configure tile settings. Make sure to uncheck the "Override the dashboard time settings at the tile level" option. This will let you automatically refresh the chart for the interval selected on the dashboard. It is important to configure this setting to interpret the results which will be discussed in the 3rd part in this series.7.3KViews6likes2CommentsInvalid operation. The connection is closed or Query too complex
Scenario: When executed more than 325 columns in a select query the error bellow was thrown: [110813] Invalid operation. The connection is closed. The same query with fewer columns than that worked. So it seems it was hitting some kind of limit.8.8KViews0likes0Comments