Hello, dear readers! Here is Hélder Pinto again, writing the last post of a series dedicated to the implementation of automated Continuous Optimization with Azure Advisor recommendations. For context on the solution described here, please read the introductory post for an overview, the second post for the details and deployment of the main solution components, and the third post to see how the Azure Optimization Engine generates recommendations and reports on them.
If you didn’t have time to read the full post series about the Azure Optimization Engine, let me quickly recap. The Azure Optimization Engine (AOE) is an extensible solution designed to generate custom optimization recommendations for your Azure environment. Think of it as a fully customizable Azure Advisor. It leverages Azure Resource Graph, Log Analytics, Automation, and, of course, Azure Advisor itself, to build a rich repository of custom optimization opportunities. The first recommendation use case covered by AOE was augmenting Azure Advisor Cost recommendations, particularly Virtual Machine right-sizing, with VM metrics and properties that enable better-informed right-size decisions. Other recommendations can easily be added or augmented with AOE, not only for cost optimization but also for security, high availability, and other Well-Architected Framework pillars.
In this last post, I will show you how we can use AOE to automate the remediation of optimization opportunities – the ultimate goal of the engine – and how to extend it with new custom recommendations.
The customer pain that sparked this series was all about remediating dozens or hundreds of VM right-size recommendations, for which it was hard to find the information needed to make well-informed decisions. An Azure administrator can easily spend many hours investigating and going back and forth with colleagues before deciding to downsize a VM. Often this becomes an unfeasible task, and Azure inefficiencies can last forever.
What if customers could simply automate those recommendation remediations? It may at first seem a reckless and naive option, but let’s face it: if we were highly confident that the recommendation was feasible and the environment we were touching was not critical, wouldn’t we prefer to automate?
With the help of AOE, we now have a database of recommendations that holds all the details we need to make an automated decision (see the previous post for the details).
Based on these details, we can write a remediation runbook that simply queries the recommendations database for VMs that have been recommended for right-sizing for the past X weeks with a fit score of at least Y. The T-SQL query could look like this:
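-- One row per VM, with the number of weekly right-size recommendations in the window.
-- Every non-aggregated column in the SELECT list must also appear in the GROUP BY.
SELECT InstanceId, InstanceName, Tags, COUNT(InstanceId) AS WeeksInARow
FROM [dbo].[Recommendations]
WHERE RecommendationSubTypeId = '$rightSizeRecommendationId' AND FitScore >= $minFitScore AND GeneratedDate >= GETDATE()-(7*$minWeeksInARow)
GROUP BY InstanceId, InstanceName, Tags
HAVING COUNT(InstanceId) >= $minWeeksInARow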
Additionally, we could filter the VMs to remediate to include only those that had a specific tag value. For a scenario where $minWeeksInARow=4 and $minFitScore=4.5 and tag environment=dev, these would be the automated remediation results:
| Recommendation | Fit Score | Weeks in a Row | Env. tag | Action |
| --- | --- | --- | --- | --- |
| we1-prd-dc01 | 4.6 | 6 | prod | None |
| we1-dev-app02 | 4.7 | 4 | dev | Downsize |
| we1-dev-sql03 | 4.3 | 5 | dev | None |
| we1-dev-app03 | 4.8 | 2 | dev | None |
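The decision shown for each of these four VMs can be reproduced with a few lines of code. The following is only a sketch of the filtering rule described above (fit score, weeks in a row, and environment tag must all pass), using the values from the example scenario; the `Candidate` class and the `decide` function are hypothetical names and do not come from the AOE runbook.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    fit_score: float
    weeks_in_a_row: int
    env_tag: str


def decide(candidate, min_fit_score=4.5, min_weeks=4, required_env="dev"):
    """Return 'Downsize' only when all three conditions are met."""
    eligible = (
        candidate.fit_score >= min_fit_score
        and candidate.weeks_in_a_row >= min_weeks
        and candidate.env_tag == required_env
    )
    return "Downsize" if eligible else "None"


# The four VMs from the example scenario.
CANDIDATES = [
    Candidate("we1-prd-dc01", 4.6, 6, "prod"),   # fails the tag filter
    Candidate("we1-dev-app02", 4.7, 4, "dev"),   # passes all three checks
    Candidate("we1-dev-sql03", 4.3, 5, "dev"),   # fit score too low
    Candidate("we1-dev-app03", 4.8, 2, "dev"),   # not enough weeks in a row
]

actions = {c.name: decide(c) for c in CANDIDATES}

if __name__ == "__main__":
    for name, action in actions.items():
        print(f"{name}: {action}")
```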
The AOE includes a Remediate-AdvisorRightSizeFiltered runbook that implements exactly the algorithm above. After having deployed the solution, you just have to define values for the following Automation variables and finally schedule the runbook for the desired time and frequency. Happy rightsizing!
OK, now you have customized right-size recommendations, but you probably want more. You want to identify other cost-saving opportunities that may be specific to the environments you manage and that Advisor does not cover yet, such as underutilized App Service Plans or SQL Databases, ever-growing Storage Accounts, VMs stopped but not deallocated, etc. In the previous post, you saw that writing a recommendation runbook for orphaned disks was really easy.
In this post, I want to show you that the AOE is not meant only for Cost optimization but can be used for other Well Architected Framework pillars – High Availability, Performance, Operational Excellence and Security. I’ve recently added to the AOE a recommendation for the High Availability pillar, identifying VMs with unmanaged disks. This new recommendation does not need additional data sources, as the Virtual Machine data already being exported from Azure Resource Graph is enough to identify VMs in this situation.
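To give an idea of why the exported Virtual Machine data is enough, here is a rough sketch of such a check. It assumes each VM record keeps the shape of the Azure Resource Manager `storageProfile` property, where a managed disk carries a `managedDisk` object and an unmanaged disk carries a `vhd` object with the blob URI. The function names, the sample records, and the output fields (`Category`, `Description`) are hypothetical; the actual AOE runbook is written for Azure Automation and queries Log Analytics, so treat this only as an illustration of the logic.

```python
def uses_unmanaged_disks(vm):
    """True if the OS disk or any data disk is VHD-based rather than managed."""
    profile = vm.get("properties", {}).get("storageProfile", {})
    disks = [profile.get("osDisk", {})] + list(profile.get("dataDisks", []))
    return any("vhd" in disk and "managedDisk" not in disk for disk in disks)


def recommend_unmanaged_disks(vms):
    """Build one (hypothetical) recommendation record per affected VM."""
    return [
        {
            "InstanceId": vm["id"],
            "Category": "HighAvailability",
            "Description": "VM uses unmanaged disks; consider migrating to managed disks",
        }
        for vm in vms
        if uses_unmanaged_disks(vm)
    ]


# Hypothetical sample records shaped like ARM virtual machine resources.
SAMPLE_VMS = [
    {
        "id": "/subscriptions/0000/resourceGroups/rg1/providers/Microsoft.Compute/virtualMachines/vm-managed",
        "properties": {"storageProfile": {
            "osDisk": {"managedDisk": {"id": "disk-id"}},
            "dataDisks": [],
        }},
    },
    {
        "id": "/subscriptions/0000/resourceGroups/rg1/providers/Microsoft.Compute/virtualMachines/vm-unmanaged",
        "properties": {"storageProfile": {
            "osDisk": {"vhd": {"uri": "https://example.blob.core.windows.net/vhds/os.vhd"}},
            "dataDisks": [],
        }},
    },
]

if __name__ == "__main__":
    for recommendation in recommend_unmanaged_disks(SAMPLE_VMS):
        print(recommendation["InstanceId"])
```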
If you want to generate your own custom recommendations, you just have to first make sure you are collecting the required data with the Data Collection runbooks – follow the pattern of the existing runbooks that dump the data as CSV into a Storage Account and then rely on the data source-agnostic Log Analytics ingestion runbook. Having the required data in Log Analytics, you can write a new recommendation runbook that runs a weekly query for optimization opportunities. Looking at the Recommend-VMsWithUnmanagedDisksToBlobStorage runbook, you’ll identify the recommendation generation pattern:
Don’t forget to link the runbook to the AzureOptimization_RecommendationsWeekly schedule and that’s all you must do. On the next scheduled recommendations generation run, you’ll have your new recommendations flowing into the Power BI report!
Thank you for following this series! 😉
Disclaimer
The Azure Optimization Engine is not supported under any Microsoft standard support program or service. The scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.