As you manage your operations in Azure, your backup estate might expand to include a diverse new set of workloads or grow in volume. At small scale, actions such as identifying the right machines to back up, configuring backup, monitoring status, and extracting data can be performed manually with ease. At large scale, however, we understand that such actions become difficult, complex, and error-prone. Azure Backup therefore aims to assist you by providing a few automation options for management at scale. Based on what we heard from our customers, the three key areas that benefit from such automation are:
Configuring backup when infrastructure gets created or provisioned.
Exporting backup operational and audit data to your own monitoring systems or dashboards for your end customers.
Automating actions/triggers to identify failures and correct them so that backup is ‘healthy’.
A backup admin must deal with new infrastructure being added periodically and make sure that it is protected as per agreed requirements. The usual approach is to use an automation client, such as PowerShell or the Azure CLI, to list all the VMs, check the backup status of each, and take appropriate action for unprotected VMs. This automation must also perform well at scale, run on a periodic schedule, and have each run monitored. To ease these pains, Azure Backup leverages Azure Policy and provides built-in, backup-specific policies to govern the backup estate. Essentially, once you assign an Azure policy to a scope, all VMs that meet your criteria are backed up automatically, and newer VMs are scanned and protected periodically by Azure Policy itself. You can also view a compliance report that flags non-compliant resources.
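For comparison, the manual approach described above (list the VMs, check their backup status, act on the unprotected ones) can be sketched as follows. This is only a minimal Python illustration over in-memory sample data: the `Vm` type, the `find_unprotected` helper, and the VM names are hypothetical, and a real script would obtain the inventory and protection state from PowerShell, the Azure CLI, or an SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vm:
    """Hypothetical, simplified view of a virtual machine."""
    name: str
    resource_group: str


def find_unprotected(all_vms, protected_names):
    """Return the VMs whose names are not in the protected set.

    Names are compared case-insensitively. Matching on name alone is a
    simplifying assumption; a real script would match on full resource IDs.
    """
    protected = {name.lower() for name in protected_names}
    return [vm for vm in all_vms if vm.name.lower() not in protected]


# Made-up sample data for illustration only.
inventory = [
    Vm("web-01", "rg-prod"),
    Vm("web-02", "rg-prod"),
    Vm("sql-01", "rg-data"),
]
already_protected = ["WEB-01", "sql-01"]

unprotected = find_unprotected(inventory, already_protected)
for vm in unprotected:
    # A real script would configure backup here; this sketch only reports.
    print(f"{vm.name} in {vm.resource_group} is not protected")
```

Scheduling, scaling, and monitoring a script like this is the overhead that assigning a built-in policy is meant to remove.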
The video below elaborates on how Azure Policy works for backup.
In summary, Azure Backup and Azure Policy give you peace of mind that your most critical resources and data are protected.
Exporting backup operational data
Azure Backup users, especially partners, have long looked for ways to extract backup operational data for their entire estate and periodically feed it into their monitoring systems and dashboards. At large scale, the data must be retrieved quickly, even when querying thousands of records. You should be able to query across resources, subscriptions, and tenants, from any client (portal, PowerShell, CLI, any SDK, or REST API), with flexibility in output formatting (table or array). Azure Resource Graph (ARG) is built to meet such requirements and to query at scale. Azure Backup leverages ARG as an optimized way to fetch all related data with minimal queries (one query per scenario). For example, a single query can fetch all failed jobs across all vaults in all subscriptions and all tenants. A few ARG sample queries are documented here. Apart from the flexibility of calling the query from any client, the queries are RBAC compliant as well.
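As a rough illustration of the "one query per scenario" idea, the Python sketch below builds an ARG query string for failed backup jobs. The helper name `failed_backup_jobs_query` is hypothetical, and the table name, resource type, and property names used in the query are assumptions that should be checked against the documented sample queries before use.

```python
def failed_backup_jobs_query(days=7):
    """Build an ARG (Kusto) query string for recent failed backup jobs.

    The table, resource type, and property names below are assumptions
    to verify against the documented sample queries.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return "\n".join([
        "RecoveryServicesResources",
        "| where type =~ 'microsoft.recoveryservices/vaults/backupjobs'",
        "| where tostring(properties.operation) =~ 'Backup'",
        "| where tostring(properties.status) =~ 'Failed'",
        f"| where todatetime(properties.startTime) >= ago({days}d)",
        "| project id, subscriptionId, resourceGroup,",
        "          item = tostring(properties.entityFriendlyName),",
        "          startTime = todatetime(properties.startTime)",
    ])


query = failed_backup_jobs_query(days=3)
print(query)
```

The resulting string could then be passed to whichever ARG client you prefer, for example `Search-AzGraph -Query` in PowerShell or `az graph query -q` in the Azure CLI. Results are limited to the resources the caller has access to.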
Now you can export relevant backup data to your own monitoring systems and dashboards in a secure and performant way, even at large scale.
Automating responses or actions
Automatic troubleshooting is one of the key asks from our customers. If the recommended actions for failures can be automated, the time taken to recover from a failure is minimized. You could automate a specific, targeted recommended action, such as setting up the right permissions, or simply re-trigger failed jobs after outages or transient errors. You can achieve this by retrieving the relevant backup data via ARG and combining it with corrective PowerShell or CLI steps.
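The decision step in such automation, choosing between re-triggering a job and handing it off for investigation, might look like the Python sketch below. It is only an illustration: the record shape, the `plan_actions` helper, and the error codes are hypothetical, and in practice the failed-job records would come from an ARG query and the retry itself from PowerShell or CLI steps.

```python
# Hypothetical error codes treated as transient; a real list would come
# from your own analysis of job failures.
TRANSIENT_ERROR_CODES = {"ServiceOutage", "Timeout"}


def plan_actions(failed_jobs):
    """Split failed job records into items to retry and items to escalate.

    Each record is assumed to be a dict with 'item' and 'error_code' keys.
    Each backup item appears at most once in each returned list.
    """
    retry, escalate = [], []
    for job in failed_jobs:
        is_transient = job["error_code"] in TRANSIENT_ERROR_CODES
        target = retry if is_transient else escalate
        if job["item"] not in target:
            target.append(job["item"])
    return retry, escalate


# Made-up sample records for illustration only.
failed = [
    {"item": "web-01", "error_code": "Timeout"},
    {"item": "web-01", "error_code": "Timeout"},
    {"item": "sql-01", "error_code": "PermissionDenied"},
]
to_retry, to_escalate = plan_actions(failed)
print("retry:", to_retry)
print("escalate:", to_escalate)
```

Keeping the classification separate from the corrective action makes it easier to start with a small set of known transient errors and extend it over time.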
As an example, the video below shows how to re-trigger backup for all failed jobs (across vaults, subscriptions, and tenants) using ARG and PowerShell.
While transient errors can be corrected automatically, some persistent errors might require in-depth analysis. You may have your own monitoring or ticketing mechanisms to make sure such failures are properly tracked and fixed. Azure Backup now has first-class integration with Azure Monitor, which means critical alerts can be automatically routed to ITSM solutions such as ServiceNow.
In the video below, we talk about how to leverage Azure Monitor to configure various notification mechanisms for critical alerts.
With automation in place, you can automatically correct a few errors across large estates and leverage relevant notification mechanisms to track and fix the others, thereby making sure that your backup estate is always healthy.
Azure Backup offers various options for automation at scale, such as integration with Azure Policy, Azure Resource Graph, and Azure Monitor, to govern and monitor large backup estates across subscriptions and tenants.
Come chat with us to provide feedback on Azure Backup! We’re hosting a Product Roundtable during Build on Wednesday, May 26, 6:30-7:30 AM PT: "Azure Backup: automation and security considerations for databases and storage."