Automatic pause all Synapse Pools and keeping your subscription costs under control
Published Jan 16 2021 11:21 AM 20.9K Views
Microsoft

* Update 2022-11-16: Updating scripts / procedures to use MSI, instead of Automation Run As Account

https://learn.microsoft.com/en-us/azure/automation/manage-run-as-account

Azure Automation Run As Account will retire on September 30, 2023 and will be replaced with Managed Identities. Before that date, you'll need to start migrating your runbooks to use managed identities. For more information, see migrating from an existing Run As accounts to managed identity to start migrating the runbooks from Run As account to managed identities before 30 September 2023.

 

 

As Synapse engineer or Synapse Support Engineer you may need to start and test some Pools, and you want this to be the most cost efficient possible. Leaving some Synapse with a lot of DWU left turned on during the weekend because you forget to pause the DW after you shutdown your computers is not a good approach and we can quickly resolve this by using Powershell + Automation accounts.

 

Before we get into the procedure on the automation, just want to show you some details.

 

We currently have two flavors of Synapse

  • Dedicated SQL pools (formerly SQL DW)
    • Old mode where the SQL DW lives in an Azure SQL DB Server that can be shared with regular Azure SQL databases.
    • Internally is a resource of type (Microsoft.SQL)
    • Sample: /subscriptions/xxxxxx/resourceGroups/yyyyyyy/providers/Microsoft.Sql/servers/yyyyyyyy/databases/olddwpool
  • Azure Synapse Analytics - Dedicated SQL pool
    • SQL DW database inside a workspace
    • Internally is a resource of type (Microsoft.Synapse)
    • Sample: /subscriptions/xxxxxx/resourceGroups/yyyyyyy/providers/Microsoft.Synapse/workspaces/yyyyyyyy/sqlPools/dwpool
    • NOTE. Still exists an hidden internal is a resource of type (Microsoft.SQL) under managed resource group. However you cannot send requests to it. Powershell commands tagging this managed resource should fail by design

 

 

This is important to know because we are going to use 2 different scripts to pause the pool

  • Suspend-AzSynapseSqlPool (Az.Synapse)
  • Suspend-AzSqlDatabase (Az.SQL)

 

 

There are 2 versions of the script

  1. Powershell to run from your machine
  2. Using Azure Automation Account

 

1. Powershell to run from your machine

This script you can run from your machine and you can get last version of script at Synapse - Pause all DWs.ps1 - GitHub 

(This code will not work on Automation account - Find procedure below at "2. Using Azure Automation Account")

 

You are going to need the modules installed

  • Az.Accounts
  • Az.Sql
  • Az.Synapse

 

 

For below script I'm considering that your user is same that you use to admin your Azure Subscription

 

Sample:

Context exists
Current credential is sefonsec@microsoft.com
Current subscription is SEFONSEC Microsoft Azure Internal Consumption
---------------------------------------------------------------------------------------------------
Get SQL / Synapse RESOURCES
---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
Loop through all Synapse Workspaces
---------------------------------------------------------------------------------------------------
Checking Azure Synapse Workspace [xxxxxxxxxxx_synapse] for Synapse SQL Pools
-> Synapse SQL Pool [dwpool] found with status [Online]
-> Pausing Synapse SQL Pool [dwpool]
-> Synapse SQL Pool [dwpool] paused in 0 hours, 2 minutes and 32 seconds. Current status [Paused]
---------------------------------------------------------------------------------------------------
Loop through all SQL Servers (former SQLDW)
---------------------------------------------------------------------------------------------------
Checking SQL Server [xxxxxxxxxxx-eastus] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-eastus2] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-northeu] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-southcentralus] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-uksouth] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-ukwest] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx] in Resource Group [CSSAzureDB] for Synapse SQL Pools
-> Synapse SQL Pool [SQLDW] found with status [Paused]
Checking SQL Server [xxxxxxxxxxx-byok] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-demo] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx_synapse] in Resource Group [synapseworkspace-managedrg-5da694c3-ae72-4f25-9cc6-626adcf858e6] for Synapse SQL Pools
-> This DB is part of Synapse Workspace - Ignore here Should be done above using Az.Synapse Module
Checking SQL Server [xxxxxxxxxxx-westeu] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools
Checking SQL Server [xxxxxxxxxxx-westus] in Resource Group [CSSAzureDB_OtherRegions] for Synapse SQL Pools

 

Sample with error:

Checking Azure Synapse Workspace [xxxxxxxxxxx_synapse] for Synapse SQL Pools
Write-Error: -> Checking Synapse SQL Pool [dwpool] found with status [Resuming]

 

 

2. Using Azure Automation Account

Now we want this to be automated, like as a sample to shutdown every day at 11PM and to send alerts if an error happens.

 

You can find last version at: Synapse - Pause all DWs - Automation Acount - MSI.ps1 - GitHub

 

2.1. Create the Automation Account

2021-01-16 17_03_41-Automation - Microsoft Azure and 7 more pages - Work - Microsoft​ Edge.png

2.2. Ensure that the Automation Account has a Managed Identity assigned:

https://learn.microsoft.com/en-us/azure/automation/enable-managed-identity-for-automation#enable-usi...

FonsecaSergio_0-1668613974247.png

Or during creation:

 

FonsecaSergio_2-1668614219350.png

 

 

 

2.3. Add the any required permissions. 

Make sure to use less privileged logic to avoid risks. For example, adding only permission on required resource groups to avoid security risks or pausing production workload.

 

FonsecaSergio_1-1668614110741.png

 

 

2.4. Automation account now comes with Az Modules. If you don't see the Az modules, please check and refer to the following resources:

2.5. Now go to runbooks and create a new Powershell runbook

2021-01-16 17_19_01-Create a runbook - Microsoft Azure and 7 more pages - Work - Microsoft​ Edge.png

 

2.6. Add code from Synapse - Pause all DWs - Automation Acount - MSI.ps1 - GitHub

 

2.7. Save and Publish. And click on Start to test it

2021-01-16 17_20_04-Edit PowerShell Runbook_ - Microsoft Azure and 7 more pages - Work - Microsoft​ .png

 

2021-01-16 17_20_50-PauseALLDW (SynapseAutomation_PauseALLDW) - Microsoft Azure and 7 more pages - W.png

 

2.8. Check the output tab

2021-01-16 17_43_05-PauseALLDW 1_16_2021, 5_42 PM - Microsoft Azure and 6 more pages - Work - Micros.png

 

2.9. Go to schedules and add a new schedule

2021-01-16 17_44_58-PauseALLDW (SynapseAutomation_PauseALLDW) - Microsoft Azure and 6 more pages - W.png

2021-01-16 17_45_54-New Schedule - Microsoft Azure and 6 more pages - Work - Microsoft​ Edge.png

 

2.10. To be alerted if a schedule fails, go to the automation account and click on Alerts

 

2021-01-16 17_46_48-SynapseAutomation - Microsoft Azure and 6 more pages - Work - Microsoft​ Edge.png

 

2.11. You need to add a condition when an alert executes and what actions will be taken. This action could be an email notification, run a process or script

2021-01-16 17_47_40-Create alert rule - Microsoft Azure and 6 more pages - Work - Microsoft​ Edge.png

 

2.12. Configure the condition. You are going to use the metric Total Jobs (Like total jobs failed)

2021-01-16 17_48_09-Configure signal logic - Microsoft Azure and 6 more pages - Work - Microsoft​ Ed.png

 

2.13. Define the following

  •  runbook name
  •  status (Add custom status "Failed")
  •  Threshold > 0
  •  Run every hour

 

2021-01-16 17_49_13-Configure signal logic - Microsoft Azure and 6 more pages - Work - Microsoft​ Ed.png

 

2.14. Add the action group. In this example, send an email to sefonsec@microsoft.

 

2021-01-16 17_50_59-Email_SMS message_Push_Voice - Microsoft Azure and 6 more pages - Work - Microso.png

2021-01-16 17_50_35-Create action group - Microsoft Azure and 6 more pages - Work - Microsoft​ Edge.png

 

You are now set!

 

Your DW will be stopped at 11PM if some error occurs and you will be alerted via email when a DW is in a state that could not be paused.

 

 

Another last tip is to use the Azure Budget control from @Gonçalo Ventura 

 

Create a budget and alert

The budget puts a maximum limit on the cost of the subscription, in case some service is left running the subscription will automatically suspend when the budget is reached.

 

To avoid reaching the max limit and let the subscription go into suspended mode, it is possible to create an alert when the cost reaches a percentage of the budget.

 

To configure a budget, go to “Subscriptions" or “Cost Management + Billing”, then click on Budgets and fill in the parameters for your budget:

 

FonsecaSergio_0-1610820243993.png

 

 

Click next and set an alert:

FonsecaSergio_1-1610820244008.png

 

Synapse may not be the only service in your subscription, you may have VMs, SQL DBs, etc that would impact the budget.

13 Comments

Thank you @FonsecaSergio for Sharing with the Community :cool:

Copper Contributor

This is awesome, thank you! I also made a couple small adjustments to the script so I can run one to resume the SQL Pools as well. Now I can have a script run to resume all pools, schedule all my pipelines and dataflows to run and then have another script run after the flows/pipelines are completed to pause the SQL Pools. Now they'll only be running for the precise hours that they're needed. Thanks again, @FonsecaSergio!

@JordanJohnson What did you add to make the SQL Pool resume?

Copper Contributor

@Lee_Schuenemeyer, I basically wrote the inverse of his script commands. So I used the command "$SynapseSqlPool | Resume-AzSynapseSqlPool" rather than the command to pause the pool. Here's the script for the foreach loop where I iterate through all the sql pools for the subscription.

foreach ($SynapseSqlPool in $SynapseSqlPools) {
        
        ##########################################################################################################################################################
        if ($SynapseSqlPool.Status -eq "Online") {
            Write-Output "  -> Synapse SQL Pool [$($SynapseSqlPool.SqlPoolName)] found with status [Online]"
        }
        ##########################################################################################################################################################
        elseif ($SynapseSqlPool.Status -eq "Paused") {
            Write-Output "  -> Synapse SQL Pool [$($SynapseSqlPool.SqlPoolName)] found with status [Paused]"
            # Start Synapse SQL Pool
            $startTimeResume = Get-Date
            Write-Output "  -> Resuming Synapse SQL Pool [$($SynapseSqlPool.SqlPoolName)]"
            
            if (!$debug) {
                $resultsynapseSqlPool = $SynapseSqlPool | Resume-AzSynapseSqlPool
            }
            
            # Show that the Synapse SQL Pool has been resumed and how long it took
            $endTimeResume = Get-Date
            $durationResume = NEW-TIMESPAN –Start $startTimeResume –End $endTimeResume

            if ($resultsynapseSqlPool.Status -eq "Online") {
                Write-Output "  -> Synapse SQL Pool [$($resultsynapseSqlPool.SqlPoolName)] resumed in $($durationResume.Hours) hours, $($durationResume.Minutes) minutes and $($durationResume.Seconds) seconds. Current status [$($resultsynapseSqlPool.Status)]"
            }
            else {
                if (!$debug) {
                    $iErrorCount += 1;
                    Write-Error "  -> (resultsynapseSqlPool.Status -ne ""Online"") - Synapse SQL Pool [$($resultsynapseSqlPool.SqlPoolName)] resumed in $($durationResume.Hours) hours, $($durationResume.Minutes) minutes and $($durationResume.Seconds) seconds. Current status [$($resultsynapseSqlPool.Status)]"
                }
                else {
                    Write-Host "This is a debug session - Nothing was done" -ForegroundColor Yellow
                }
            }           
        }
        ##########################################################################################################################################################
        else {
            $iErrorCount += 1;
            Write-Error "  -> (SynapseSqlPool.Status -eq ""Online"") Checking Synapse SQL Pool [$($SynapseSqlPool.SqlPoolName)] found with status [$($SynapseSqlPool.Status)]"
        }
        ##########################################################################################################################################################
    }    
}
Copper Contributor

Hi @FonsecaSergio ,

 

I was trying to configure automation such that a failed connection attempt to dedicated SQL pool will trigger a auto-resume script. It appears that the resource in old mode and new mode have some difference that impacts my ability to implement this.

 

The old mode (type Microsoft.Sql/servers/databases) has a metric type named "connection_failed", and it can be used to trigger an action to execute the runbook to resume the pool. The new mode (type Microsoft.Synapse/workspaces/sqlPools) however, is missing an equivalent metric type. With this difference, I will not be able to automatically resume pool triggered by failed connect, with a straightforward configuration. 

 

Can you advise if there's a way to alert on failed connection attempt for the new resource type? I assume switching back to old mode is not an elegant option.

 

 

Copper Contributor

@digihunch, did you try using the BuiltinSqlPoolLoginAttempts metric (type Microsoft.Synapse/workspaces)? It's under the workspaces section, just above the documentation referenced link you included.

Copper Contributor

@JordanJohnson Thanks. I'm not sure if that will work in my scenario. I have a dedicated SQL pool in addition to the default built in (serverless) sql pool. 

digihunch_0-1618421037977.png

 

1. The BuiltinSqlPoolLoginAttempts metric appears to point to that built in pool. and the other pool is what i need auto-pause and auto-resume for.

2. The metric counts successful login attempts as well. On the console, it comes off as simply "Login Attempts" with the description "count of login attempts that succeeded or failed".

 

I know the old mode (sql dw) can resume on failed connection based on this configuration, but it is probably not doable anymore in the new mode. Instead, I am evaluating an alternative (add webhook to runbooks and fire a REST request from web activity within Synapse pipeline

Copper Contributor

@digihunch I was thinking it was a long shot, but since it was the only thing relevant that I've seen and no one else has posted anything I figured it might be worth a shot. Sounds like you're headed in the right direction with it though. Good luck! Hope you're able to figure something out!

 

I was able to convince the powers-at-be to sign up for a 3-year reserved instance on the dedicated pool so we can have our dw running 24/7 and the price is discounted ~65% so I'm glad I don't have to deal with these workarounds any more and the price is just as cheap, if not less than it was when I was spinning the server down as often as we could afford. Highly recommend going that route. Willing to help if I'm able.

Copper Contributor

We do already have a script to pause and resume the dedicated pools, but sometimes a pool was already resumed/ paused by another user. How can I find out when (and by whom) a deicated pool was paused/resumed?

Copper Contributor

Will this work using private endpoint? I read this on the official MS doc - In the current implementation of Private Link, Automation account cloud jobs cannot access Azure resources that are secured using private endpoint. For example, Azure Key Vault, Azure SQL, Azure Storage account, etc. 

Microsoft

@ihaveeyes good point. I never tested this with Priv Endpoints. Only if we use VNET injection to Automation account... Never tested it but these links below should have some information about it
https://learn.microsoft.com/en-us/azure/automation/how-to/private-link-security

https://azure.microsoft.com/en-us/updates/azure-automation-private-link/

 

 

If still struggling with this, I recommend opening a support case to Azure Automation team to support you on that

 

Copper Contributor

I wonder way shutting the pool down by an inactivity timeout hasn't been implemented. The Automation account doesn't work because it can terminate long running queries so we stop using it but developers can forget to run the manual shutdown script and cost us $$$ over the weekends. Is there any plan for inactivity timeout?

Microsoft

@MattPho1265 I suggest you to send your feedback at http://aka.ms/SynapseIdeas and vote on other ideas. The Product Keep eye on that

Co-Authors
Version history
Last update:
‎Dec 04 2023 02:24 AM
Updated by: