Brad Watts here to talk about a solution that I’ve implemented with several organizations in my role as a CSA. Azure Cost Management is a powerful way to visualize and report on your Azure spend, but it doesn’t currently give you the ability to detect cost anomalies. Below we will walk through how to load cost data into Log Analytics to detect anomalies.
This walkthrough uses a template to deploy a complete environment. If you want details on the template or would like a copy of the Azure Function being used, then you can visit App Service Template Repo.
This architecture is made of the following resources:
The below deployment will deploy these to the resource group that you define in Azure.
The meat of this solution is an Azure Function that makes an API call to the Azure Cost Management Query API. It then ingests those results into the Log Analytics workspace that is created by the template. The function is scheduled to run each afternoon at 2 PM EST.
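The core pattern here is: query the Cost Management API, then post the rows to the Log Analytics HTTP Data Collector API. A minimal sketch of the ingestion half in Python is below. The workspace ID, shared key, and log type are placeholders, and the actual Azure Function in the template may be structured differently; the signature construction follows the Data Collector API's documented SharedKey scheme.

```python
import base64
import datetime
import hashlib
import hmac
import json
import urllib.request

# Placeholders -- substitute your own workspace values.
WORKSPACE_ID = "<log-analytics-workspace-id>"
SHARED_KEY = base64.b64encode(b"example-key").decode()  # workspace primary key (base64)
LOG_TYPE = "AzureCostAnamolies"  # surfaces as AzureCostAnamolies_CL in the workspace


def build_signature(workspace_id, shared_key, date, content_length,
                    method="POST", content_type="application/json",
                    resource="/api/logs"):
    """Build the HMAC-SHA256 Authorization header required by the
    Log Analytics HTTP Data Collector API."""
    string_to_hash = (f"{method}\n{content_length}\n{content_type}\n"
                      f"x-ms-date:{date}\n{resource}")
    decoded_key = base64.b64decode(shared_key)
    hashed = hmac.new(decoded_key, string_to_hash.encode("utf-8"),
                      hashlib.sha256).digest()
    return f"SharedKey {workspace_id}:{base64.b64encode(hashed).decode()}"


def post_rows(rows):
    """POST a list of dicts (one per cost record) to the workspace."""
    body = json.dumps(rows).encode("utf-8")
    rfc1123 = datetime.datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S GMT")
    headers = {
        "Content-Type": "application/json",
        "Authorization": build_signature(WORKSPACE_ID, SHARED_KEY,
                                         rfc1123, len(body)),
        "Log-Type": LOG_TYPE,
        "x-ms-date": rfc1123,
    }
    url = (f"https://{WORKSPACE_ID}.ods.opinsights.azure.com"
           "/api/logs?api-version=2016-04-01")
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 200 indicates the rows were accepted
```

The cost side would pass the query results (date, resource group, resource ID, pre-tax cost) as the `rows` argument; the `_s`/`_d` suffixes you see in the queries later are added automatically by Log Analytics based on each field's type.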
**Note:** There are two parameters that you need to supply when deploying the solution:
Use the link below to deploy this solution to your Azure Subscription!
After deploying this solution, you must give the App Service System Assigned Managed Identity "Read" permissions at the scope or scopes that you are querying. The system assigned managed identity will have the same name as your function app.
If you want to load historical data into Log Analytics you can utilize the function named PreLoadLogAnalytics.
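Backfilling historical data generally means issuing one Cost Query API call per historical time window rather than one giant request. The helper below sketches how such monthly windows could be generated; this splitting logic is illustrative and not taken from the actual `PreLoadLogAnalytics` function.

```python
import datetime


def month_windows(months_back, today=None):
    """Return (start, end) date pairs, one per whole calendar month,
    going back `months_back` months from today, oldest first.
    Illustrative helper only -- not the template's actual code."""
    today = today or datetime.date.today()
    first_of_current = today.replace(day=1)
    windows = []
    # Last day of the previous month is the end of the newest window.
    end = first_of_current - datetime.timedelta(days=1)
    for _ in range(months_back):
        start = end.replace(day=1)
        windows.append((start, end))
        end = start - datetime.timedelta(days=1)
    return list(reversed(windows))
```

Each `(start, end)` pair would then become the `timePeriod` of a separate Cost Query API call, keeping individual responses small enough to ingest comfortably.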
**Note:** You could also use the “Code + Test” tab within the function to run it.
Azure Workbooks are a great option to get insights from the cost data in Log Analytics. We'll generate a Workbook that looks for anomalies in the cost per Resource Group. Once imported, you will be able to select a Resource Group in the workbook to see which resources are causing the anomalies.
1) Open Azure Monitor and open the Workbooks tab
2) In the main pane click on "New" at the top:
3) On the top toolbar click on Advanced Editor
4) In the editor paste the content of CostWorkbook.json in this repo
https://raw.githubusercontent.com/microsoft/CSACostAnomalies/main/CostWorkbook.json
5) Click on Apply to enter the editing window.
6) You can now click on Done Editing and start to utilize the workbook.
We can now work on setting up alerting on anomalies. In the below example we'll utilize Logic App to execute once a day and look for any Resource Group that had an anomaly.
Our workflow for alerting will follow this pattern:
Now let's walk through the steps to create the Logic App!
1) Create a new Logic App and select the "Blank Logic App" template
2) The Logic App Designer will open with the trigger selection available. Select Recurrence
3) For the Recurrence trigger, configure it to run every 1 day, and add the Start Time property
4) Below your trigger click on Add New Step and search for Azure Monitor. Select the Azure Monitor Logs connector
5) Select the Run Query and Visualize Results action
6) Fill in the Properties:
```kusto
let ids=AzureCostAnamolies_CL
| extend UsageDateTime = todatetime(Date_s)
| order by UsageDateTime
| where PreTaxCost_d >= 5
| make-series Cost=sum(PreTaxCost_d) on UsageDateTime in range(startofday(ago(90d)), endofday(ago(1d)), 1d) by ResourceGroup
| extend outliers=series_decompose_anomalies(Cost)
| mvexpand outliers, UsageDateTime
| summarize arg_max(todatetime(UsageDateTime), *) by ResourceGroup
| where outliers>=1
| distinct ResourceGroup;
AzureCostAnamolies_CL
| extend UsageDateTime = todatetime(Date_s)
| where ResourceGroup in (ids)
| where UsageDateTime >= ago(7d)
| summarize PreTaxCost=sum(PreTaxCost_d) by ResourceGroup, UsageDateTime
| order by ResourceGroup, UsageDateTime desc
```
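The anomaly logic above leans on Kusto's built-in `series_decompose_anomalies`. To make the idea concrete outside Log Analytics, the same intent can be approximated with a simple z-score over a daily cost series; this is a crude stand-in for illustration, not the decomposition algorithm Kusto actually uses.

```python
import statistics


def flag_anomalies(daily_costs, threshold=3.0):
    """Flag days whose cost deviates from the series mean by more than
    `threshold` population standard deviations. A rough stand-in for
    Kusto's series_decompose_anomalies, for illustration only."""
    mean = statistics.fmean(daily_costs)
    stdev = statistics.pstdev(daily_costs)
    if stdev == 0:
        return [0] * len(daily_costs)  # flat series: nothing to flag
    return [1 if abs(c - mean) / stdev > threshold else 0
            for c in daily_costs]


# A flat series with a single spike: only the spike day is flagged.
series = [10.0] * 29 + [100.0]
print(flag_anomalies(series))  # -> 29 zeros followed by a 1
```

The `PreTaxCost_d >= 5` filter in the query serves the same purpose as a minimum-spend floor would here: it keeps tiny resource groups from generating noisy flags on insignificant cost movements.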
7) Click on New Action below the Run Query and Visualize Results step. Search for Control and select it.
8) In the Control actions choose Condition
9) In the Condition use the following properties:
10) In the If true section click on Add an Action
11) Repeat steps 6-8 but this time use the below query
```kusto
let ids=AzureCostAnamolies_CL
| extend UsageDateTime = todatetime(Date_s)
| order by UsageDateTime
| where PreTaxCost_d >= 5
| make-series Cost=sum(PreTaxCost_d) on UsageDateTime in range(startofday(ago(90d)), endofday(ago(1d)), 1d) by ResourceGroup
| extend outliers=series_decompose_anomalies(Cost)
| mvexpand outliers, UsageDateTime
| summarize arg_max(todatetime(UsageDateTime), *) by ResourceGroup
| where outliers>=1
| distinct ResourceGroup;
AzureCostAnamolies_CL
| extend UsageDateTime = todatetime(Date_s)
| where ResourceGroup in (ids)
| where UsageDateTime >= ago(7d)
| summarize PreTaxCost=sum(PreTaxCost_d) by ResourceId, UsageDateTime
| order by ResourceId, UsageDateTime desc
```
12) Add a new action after the last step (but still in the if true section) and search for Outlook. Choose the Office 365 Outlook actions
13) In the actions window search for Send email and choose Send an email (V2). Note: This action will send an email from your email account. For production you would want to set up a shared mailbox and choose the action Send an email from a shared mailbox (V2)
14) The first time using this connector it asks you to login to Office 365 to make the connection. Once you've done this fill in the following properties:
Attachments Content - 1: From the Dynamic Content select the **Attachment Content** from **Run Query and Visualize Results**. It should be the second one in the list.
Attachments Name - 1: RGCost7Days.html
Attachments Content - 2: From the Dynamic Content select the **Attachment Content** from **Run Query and Visualize Results 2**. It should be the first one in the list.
Attachments Name - 2: ResourceIdCost7Days.html
15) Save the Logic App and click on Run. From now on the Logic App will run once a day and send an alert email if any Resource Group shows an anomaly.
Conclusion
The above solution takes advantage of the Cost Management API along with the anomaly detection algorithms built into Log Analytics to help you quickly discover cost spikes or dips in your environment. We focused on showing the anomalies based on Resource Group, but once the data is in Log Analytics we could group and show anomalies in different ways. For instance, we could show them based on individual resources or resource types. I believe this is a common need for organizations and hopefully this can help some of you fill that requirement!
Disclaimer
The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.