BrunoGabrielli
39 Topics
Azure Monitor: How To Create Overrides for Log Search Alerts
Hello blog readers 😊

How many times have you created Azure Monitor alerts only to find that the threshold you specified was not applicable to all the targeted resources? The side effects were:

- You were getting unnecessary alerts.
- You were getting what amounted to false positives.
- To fix the problem, you had to create different alerts for different resources with different thresholds.

I faced this problem a long time ago, and I blogged about a possible solution in the post called Azure Monitor: Use Dynamic Thresholds in Log Alerts. Although that approach was effective, it did not completely resolve the problem. Unfortunately, it was not possible to query Azure Resource Graph back in April 2023, but nowadays it is 😊😊😊.

Thanks to the ability to Create alerts with Azure Resource Graph and Log Analytics, it is now possible to query for the presence of resource tags and use their values to set a new threshold that applies to the tagged resources specifically. Every resource can have its own tag and value, and therefore its own threshold. Isn’t this an override?

NOTE: Some tables (like the InsightsMetrics table) include a Tags column that does not contain any resource tag. This column stores additional information about the collected data, such as the mountId tag that contains the logical volume letter.

Now that the theory should be clear, let’s look at the practice, taking a disk space alert as the example. Normally you have an alert with a query similar to the one in the picture below:

[Picture: the original disk space alert query]

That is going to create an alert for disks with free space below 10%.

Using my previous solution, you could have made the thresholds different based on the disk size but, as mentioned, that does not give you the ability to set a resource-specific threshold. Hence, following the theory explained above, you need to add Azure Resource Graph (ARG) to your query and join the above query with the ARG part before the threshold comparison.
Talking about the ARG part, you need to decide which tag name you would like to use. I would recommend something easy and mnemonic like DiskSpaceOverride. In this case, the ARG part of the query would be similar to:

let overridenResource = (arg("").resources
    | where type =~ "Microsoft.Compute/virtualMachines"
    | project _ResourceId = tolower(id), tags
    | where tags contains "DiskSpaceOverride");

That part should be added at the top of your query:

let overridenResource = (arg("").resources
    | where type =~ "Microsoft.Compute/virtualMachines"
    | project _ResourceId = tolower(id), tags
    | where tags contains "DiskSpaceOverride");
InsightsMetrics
| where _ResourceId has "Microsoft.Compute/virtualMachines"
| where Origin == "vm.azm.ms"
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| extend Disk=tostring(todynamic(Tags)["vm.azm.ms/mountId"])
| summarize AggregatedValue = avg(Val) by bin(TimeGenerated, 15m), Computer, _ResourceId, Disk
| where AggregatedValue < 10
| project TimeGenerated, Computer, _ResourceId, Disk, AggregatedValue

Now, you need to join the query results with the ARG part before the threshold comparison. This changes the query into the one below:

let overridenResource = (arg("").resources
    | where type =~ "Microsoft.Compute/virtualMachines"
    | project _ResourceId = tolower(id), tags
    | where tags contains "DiskSpaceOverride");
InsightsMetrics
| where _ResourceId has "Microsoft.Compute/virtualMachines"
| where Origin == "vm.azm.ms"
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| extend Disk=tostring(todynamic(Tags)["vm.azm.ms/mountId"])
| summarize AggregatedValue = avg(Val) by bin(TimeGenerated, 15m), Computer, _ResourceId, Disk
| join hint.remote=left kind=leftouter overridenResource on _ResourceId
| project-away _ResourceId1

At this point, you need to read the tag value, if it exists, and use it as the new specific threshold. If the tag does not exist, you need to default to the generic threshold.
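Before touching a real alert, the fallback behaviour just described can be tried in isolation. The query below is only a sketch under stated assumptions: the two datatable blocks are hypothetical stand-ins for the summarized InsightsMetrics rows and for the ARG lookup, the machine names and resource IDs are made up, and coalesce is used here as a compact alternative way to express "tag value if present, otherwise 10". It is not the form this post uses in the alert query.

```kusto
// Hypothetical rows standing in for the summarized InsightsMetrics output.
let metrics = datatable(Computer:string, _ResourceId:string, Disk:string, AggregatedValue:real)
[
    "vm-tagged",   "/hypothetical/virtualmachines/vm-tagged",   "C:", 50.0,
    "vm-untagged", "/hypothetical/virtualmachines/vm-untagged", "C:", 50.0,
    "vm-untagged", "/hypothetical/virtualmachines/vm-untagged", "D:", 5.0
];
// Hypothetical stand-in for the ARG part: only vm-tagged carries the override tag.
let overridenResource = datatable(_ResourceId:string, tags:dynamic)
[
    "/hypothetical/virtualmachines/vm-tagged", dynamic({"DiskSpaceOverride": "90"})
];
metrics
// Left outer join keeps every metric row; untagged machines get a null tags value.
| join kind=leftouter overridenResource on _ResourceId
| project-away _ResourceId1
// A missing tag converts to null, so coalesce falls back to the generic threshold of 10.
| extend appliedThreshold = coalesce(tolong(tostring(tags["DiskSpaceOverride"])), 10)
| where AggregatedValue < appliedThreshold
| project Computer, Disk, AggregatedValue, appliedThreshold
```

With these sample rows, vm-tagged should be returned because 50 is below its override of 90, while vm-untagged should be returned only for disk D:, the one that falls below the default of 10.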
The following lines, added to the query, will do the trick:

| extend appliedThresholdString = iif(tags contains "DiskSpaceOverride", tostring(tags.["DiskSpaceOverride"]), "10")
| extend appliedThreshold = toint(appliedThresholdString)
| where AggregatedValue < appliedThreshold

Last but not least, project the query results. I suggest you add the appliedThreshold field to the projection so you can see the real threshold applied to each resource:

let overridenResource = (arg("").resources
    | where type =~ "Microsoft.Compute/virtualMachines"
    | project _ResourceId = tolower(id), tags
    | where tags contains "DiskSpaceOverride");
InsightsMetrics
| where _ResourceId has "Microsoft.Compute/virtualMachines"
| where Origin == "vm.azm.ms"
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| extend Disk=tostring(todynamic(Tags)["vm.azm.ms/mountId"])
| summarize AggregatedValue = avg(Val) by bin(TimeGenerated, 15m), Computer, _ResourceId, Disk
| join hint.remote=left kind=leftouter overridenResource on _ResourceId
| project-away _ResourceId1
| extend appliedThresholdString = iif(tags contains "DiskSpaceOverride", tostring(tags.["DiskSpaceOverride"]), "10")
| extend appliedThreshold = toint(appliedThresholdString)
| where AggregatedValue < appliedThreshold
| project TimeGenerated, Computer, _ResourceId, Disk, AggregatedValue, appliedThreshold

How do you put that into action? Just tag the resources you need to override, run the query manually to check the results, and then update your alerts using the same approach. For instance, I tagged my vm-Win-Demos-01 with the suggested tag name (DiskSpaceOverride) and a tag value of 90.
This applies a threshold of 90% free disk space to this machine only. Running the query we just assembled, I get results that clearly show the threshold applied to vm-Win-Demos-01 was 90 instead of 10.

This method is much more flexible and gives more granular control over the resources than the one I previously blogged about. Let me know if you like it 😊

That’s all folks, thanks for reading through 😊

Disclaimer

The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without a warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.

Azure Monitor: Gain Observability Over Guest Users
Have you sent invitations to guest users and don't know their status? Not sure if guest users are still active or can be removed? Do you want to check if guest users have App roles and group membership assigned? In this post, I provide one possible solution to answer the aforementioned questions.

Azure Kubernetes Services - Start & Stop Your AKS Cluster on Schedule using Azure Automation
Hi everybody, here I am again to show you a possible way to start and stop your AKS cluster on schedule. Are you interested in spending your money diligently? Are you curious to find a way to align with the Cost Optimization recommendation about shutting down unused virtual machines during non-business hours or in dev/test environments on AKS? If your answer is 'Yes', then please read through.

Azure Monitor - Alert Notification via Teams
Hi there, Bruno Gabrielli here again to talk about how to get alert notifications in a Teams channel. Lots of customers use a Teams channel as the notification mechanism in their alert management process. They find it very helpful because Teams can be used on mobile devices and in browsers without relying on your company laptop.

Azure Monitor: Manage Data Access for Your Log Analytics Workspace
Struggling with giving very specific access to Log Analytics data, whether it is Security or Monitoring data? Is managing your access list through IAM on the workspace not enough? Read ahead to discover how to better assign specific permissions to specific data or, even more, to data coming from specific resource(s).