Howdy folks!
Michele Ferrari here from the Premier Field Engineer Azure Team in San Francisco, back today with a new release of the Mix & Match series.
Azure Application Gateway alert: which IP addresses or VMs in the backend are currently unhealthy?
If you have been approached with this question, you probably know by now that Application Gateway v1/v2 logs currently don’t provide this information as part of the alert.
The current alert only tells you which “Backend Name” is unhealthy, but a backend is made up of several VMs or IP addresses, and the alert does not tell you which of those specific resources are currently unhealthy.
So far, the only way to get that information is from the Azure Portal, the CLI, or PowerShell:
Portal: Backend health
CLI: az network application-gateway show-backend-health
PowerShell: Get-AzureRmApplicationGatewayBackendHealth

PS C:\MISTERMIK> az network application-gateway show-backend-health --resource-group xxx-APPGW-RSG --name APPGW1
{
  "backendAddressPools": [
    {
      "backendAddressPool": {
        "backendAddresses": null,
        "backendIpConfigurations": null,
        "etag": null,
        "id": "/subscriptions/XXXX/resourceGroups/xx-APPGW-RSG/providers/Microsoft.Network/applicationGateways/APPGW1/backendAddressPools/BackendPool",
        "name": null,
        "provisioningState": null,
        "resourceGroup": "xx-APPGW-RSG",
        "type": null
      },
      "backendHttpSettingsCollection": [
        {
          "backendHttpSettings": {
            "affinityCookieName": null,
            "authenticationCertificates": null,
            "connectionDraining": null,
            "cookieBasedAffinity": null,
            "etag": null,
            "hostName": null,
            "id": "/subscriptions/XXXX-/resourceGroups/xx-APPGW-RSG/providers/Microsoft.Network/applicationGateways/APPGW1/backendHttpSettingsCollection/HTTP80setting",
            "name": null,
            "path": null,
            "pickHostNameFromBackendAddress": null,
            "port": null,
            "probe": null,
            "probeEnabled": null,
            "protocol": null,
            "provisioningState": null,
            "requestTimeout": null,
            "resourceGroup": "xx-APPGW-RSG",
            "trustedRootCertificates": null,
            "type": null
          },
          "servers": [
            {
              "address": "10.0.0.1",
              "health": "Unhealthy",
              "healthProbeLog": "Cannot connect to server. Check whether any NSG/UDR/Firewall is blocking access to server. Check if application is running on correct port.",
              "ipConfiguration": null
            },
            {
              "address": "10.0.0.2",
              "health": "Unhealthy",
              "healthProbeLog": "Cannot connect to server. Check whether any NSG/UDR/Firewall is blocking access to server. Check if application is running on correct port.",
              "ipConfiguration": null
            }
          ]
        }
      ]
    }
  ]
}
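To make the shape of that backend health output concrete, here is a minimal Python sketch (not part of the original solution, which uses PowerShell) that walks the same JSON structure and lists every server that is not Healthy. The function name unhealthy_servers and the healthy 10.0.0.3 entry in the sample are assumptions added purely for illustration.

```python
import json

def unhealthy_servers(backend_health):
    """Return (pool name, address, probe log) for each server not reported as Healthy."""
    found = []
    for pool in backend_health.get("backendAddressPools", []):
        # The pool's display name is the last segment of its resource ID
        pool_name = pool["backendAddressPool"]["id"].split("/")[-1]
        for settings in pool.get("backendHttpSettingsCollection", []):
            for server in settings.get("servers", []):
                if server.get("health") != "Healthy":
                    found.append((pool_name, server["address"], server.get("healthProbeLog")))
    return found

# Trimmed sample in the same shape as the CLI output; 10.0.0.3 is a hypothetical healthy server
sample = json.loads("""
{
  "backendAddressPools": [
    {
      "backendAddressPool": {
        "id": "/subscriptions/XXXX/resourceGroups/xx-APPGW-RSG/providers/Microsoft.Network/applicationGateways/APPGW1/backendAddressPools/BackendPool"
      },
      "backendHttpSettingsCollection": [
        {
          "servers": [
            {"address": "10.0.0.1", "health": "Unhealthy", "healthProbeLog": "Cannot connect to server."},
            {"address": "10.0.0.2", "health": "Unhealthy", "healthProbeLog": "Cannot connect to server."},
            {"address": "10.0.0.3", "health": "Healthy", "healthProbeLog": null}
          ]
        }
      ]
    }
  ]
}
""")

for pool_name, address, probe_log in unhealthy_servers(sample):
    print(f"{pool_name}: {address} -> {probe_log}")
```

The nesting (pool, then HTTP settings, then servers) is the same one the runbook later in this post has to walk, which is why a single alert on the pool cannot name the failing server by itself.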
All right, this is one of my branded Mix & Match posts, and my manifesto requires me to combine different technologies into a solution that solves a real problem.
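Technologies used: Azure Application Gateway, Azure Monitor alerts with an Action Group, Azure Automation (PowerShell runbook and webhook), and Logic Apps with an Office 365 email connection.
Architectural Flow: the alert fires and its Action Group calls the Automation webhook; the runbook queries the backend health, resolves the unhealthy VMs, and posts them to the Logic App, which sends the email.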
Now that we have covered the solution, let’s move on to the practical how-to part of this post.
Create Logic App
1. Open the Azure Portal and search for Logic App.
2. Click on the Create button to continue with the creation of the Logic App.
3. Enter the properties for the Logic App and click on the Create button.
4. Open the Logic App Designer for the new Logic App
5. Click on When a HTTP request is received
6. Add a Parse JSON step and set Content to Body
7. Copy and paste the Schema below:
{
  "properties": {
    "ApplicationGatewayName": {
      "type": "string"
    },
    "ApplicationGatewayResourceGroup": {
      "type": "string"
    },
    "BackendPoolName": {
      "type": "string"
    },
    "vms": {
      "items": {
        "properties": {
          "Name": {
            "type": "string"
          },
          "ResourceGroupName": {
            "type": "string"
          }
        },
        "required": [
          "Name",
          "ResourceGroupName"
        ],
        "type": "object"
      },
      "type": "array"
    }
  },
  "type": "object"
}
8. Create HTML table step as below:
9. Add a Send an email step. You can personalize the email message using the tokens produced by the Parse JSON step.
The connection is important: it signs in to the Office 365 mailbox that this email will be sent from.
10. Save
11. Take note of the HTTP POST URL and save it in Notepad, as we will need it shortly for the Automation piece.
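If you want to sanity-check a request body before posting it to the Logic App, a rough structural check like the Python sketch below can help. It only mirrors the fields from the Parse JSON schema in step 7 and is not a full JSON Schema validator; the function name matches_schema and the VM name web-vm-01 are hypothetical.

```python
def matches_schema(payload):
    """Rough structural check mirroring the Parse JSON schema (not a full validator)."""
    if not isinstance(payload, dict):
        return False
    # Top-level fields are optional in the schema, but must be strings when present
    for field in ("ApplicationGatewayName", "ApplicationGatewayResourceGroup", "BackendPoolName"):
        if field in payload and not isinstance(payload[field], str):
            return False
    vms = payload.get("vms", [])
    if not isinstance(vms, list):
        return False
    for vm in vms:
        if not isinstance(vm, dict):
            return False
        # Name and ResourceGroupName are the two required properties of each item
        for required in ("Name", "ResourceGroupName"):
            if not isinstance(vm.get(required), str):
                return False
    return True

# Hypothetical payload in the shape the Logic App expects
sample_payload = {
    "ApplicationGatewayName": "APPGW1",
    "ApplicationGatewayResourceGroup": "xx-APPGW-RSG",
    "BackendPoolName": "BackendPool",
    "vms": [{"Name": "web-vm-01", "ResourceGroupName": "xx-APPGW-RSG"}],
}

print(matches_schema(sample_payload))
print(matches_schema({"vms": [{"Name": "web-vm-01"}]}))
```

The second call fails the check because the VM entry is missing ResourceGroupName, which the schema lists as required.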
Create Automation account
1. Click the Create a resource button found on the upper left-hand corner of Azure.
2. Select Management Tools, and then select Automation.
3. Enter the account information. For Create Azure Run As account, choose Yes so that the artifacts that simplify authentication to Azure are enabled automatically.
4. Create runbook
5. Click on Variables and then click on Add a variable
6. Call it SendAppGatewayAlertEmailWebhook
7. Paste the HTTP POST URL you noted from the Logic App into Value
8. Click on Runbooks and then on Create a runbook
9. Type a Name and select PowerShell as the type
10. Type in the script below
<#
Script to demo how to get data from a posted webhook

Notice: This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment. THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE. We grant You a nonexclusive, royalty-free right to use and modify the Sample Code and to reproduce and distribute the object code form of the Sample Code, provided that You agree: (i) to not use Our name, logo, or trademarks to market Your software product in which the Sample Code is embedded; (ii) to include a valid copyright notice on Your software product in which the Sample Code is embedded; and (iii) to indemnify, hold harmless, and defend Us and Our suppliers from and against any claims or lawsuits, including attorneys’ fees, that arise or result from the use or distribution of the Sample Code.
#>
Param (
    [object]$WebhookData
)

#====================START OF CONNECTION SETUP======================
$Conn = Get-AutomationConnection -Name "AzureRunAsConnection"
Add-AzureRMAccount -ServicePrincipal -Tenant $Conn.TenantID `
    -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint
Get-AzureRmSubscription
#====================END OF CONNECTION SETUP=======================

$actionWebhookURI = Get-AutomationVariable -Name 'SendAppGatewayAlertEmailWebhook'
$Active=$false

if ($WebhookData -ne $null) {
    # Collect properties of WebhookData.
    $WebhookName = $WebhookData.WebhookName
    $WebhookBody = $WebhookData.RequestBody
    $WebhookHeaders = $WebhookData.RequestHeader

    # Information on the webhook name that called This
    Write-Output "This runbook was started from webhook $WebhookName."

    # Obtain the WebhookBody containing the AlertContext
    $WebhookBody = (ConvertFrom-Json -InputObject $WebhookBody)
    Write-Output "`nWEBHOOK BODY"
    Write-Output "============="
    Write-Output $WebhookBody

    # This is the common Metric Alert schema (released March 2019)
    $Essentials = [object] ($WebhookBody.data).essentials
    # Get the first target only as this script doesn't handle multiple
    $alertTargetIdArray = (($Essentials.alertTargetIds)[0]).Split("/")
    $SubId = ($alertTargetIdArray)[2]
    $ResourceGroupName = ($alertTargetIdArray)[4]
    $ResourceType = ($alertTargetIdArray)[6] + "/" + ($alertTargetIdArray)[7]
    $ResourceName = ($alertTargetIdArray)[-1]
    $status = $Essentials.monitorCondition

    Write-Output "SubscriptionID: " $SubId
    Write-Output "ResourceGroupName: " $ResourceGroupName
    Write-Output "ResourceType: " $ResourceType
    Write-Output "ResourceName: " $ResourceName
    Write-Output "Status:" $status
    Write-Output "Essentials: " $Essentials

    #get the backend health status for the impacted app gw
    $backendHealth = Get-AzureRmApplicationGatewayBackendHealth -Name $ResourceName -ResourceGroupName $ResourceGroupName
    Write-Output $backendHealth

    ### FOREACH
    foreach($backendPool in $backendHealth.BackendAddressPools){
        $vmList = @()

        #Display name of the pool is last part of the ID
        $poolName = ($backendPool.backendaddresspool.id).Split("/")
        $poolName = $($poolName[$poolName.Length-1])
        Write-output "Pool Name: $($poolName)"
        foreach ($backendRefState in $backendPool.BackendHttpSettingsCollection.Servers){
            if($backendRefState.Health -ne "Healthy"){
                $Active=$true
                $nic = $null

                #VM added by NIC reference:
                if ($backendRefState.IpConfiguration -ne $null){
                    #parse out the NIC name and resource group from the long form ID
                    #Doing this because doesnt accept Resource ID as a param
                    $nicRGName = ($backendRefState.IpConfiguration.Id).split("/")[4]
                    $nicName = ($backendRefState.IpConfiguration.Id).split("/")[8]

                    $nic = Get-AzureRmNetworkInterface -Name $nicName -ResourceGroupName $nicRGName
                    if($nic -ne $null){
                        $vmref = Get-AzureRmResource -ResourceId $nic.VirtualMachine.Id
                        $vm = Get-AzureRmVM -ResourceGroupName $vmref.ResourceGroupName -Name $vmref.Name
                        Write-output "Unhealthy VM Name: $($vm.Name) in RG: $($vm.ResourceGroupName)"
                        $vmList += $vm
                    }
                #VM added by IP Address:
                } else {
                    #Find which NIC the IP address is attached to
                    #Potential issue: if there are multiple VNets in the Subscription with the same address space, this could return >1 NIC
                    $nic = Get-AzureRmNetworkInterface | ?{$_.IpConfigurations.PrivateIpAddress -eq $backendRefState.Address}
                    if($nic -ne $null){
                        foreach ($NicId in $nic){
                            Write-Output "NicId :" $NicId
                            $vmref = Get-AzureRmResource -ResourceId $nicId.VirtualMachine.Id
                            $vm = Get-AzureRmVM -ResourceGroupName $vmref.ResourceGroupName -Name $vmref.Name
                            Write-output "Unhealthy VM Name: $($vm.Name) in RG: $($vm.ResourceGroupName)"
                            $vmList += $vm
                        }
                    }
                }
            }
        }
        if($Active) {
            #add captured data to a json formatted list
            $listForWebHook = @{ApplicationGatewayName = $ResourceName;
                ApplicationGatewayResourceGroup = $ResourceGroupName;
                BackendPoolName = $poolName;
                vms = @($vmList | select Name, ResourceGroupName)}
            $listForWebHookjson = $listForWebHook | ConvertTo-Json
            Write-Output $listForWebHookjson -Verbose
            #trigger webhook
            Invoke-RestMethod -Method Post -Uri $actionWebhookURI -Body $listForWebHookjson -ContentType 'application/json'
        }
        Else {Write-Output "Alert is resolved, no need to send an Email"}
    }
    ### end FOREACH
}
Else {
    Write-Output 'No data received'
}
11. As you can see, the runbook runs as the Run As account that was created together with the Automation account. This account has Contributor access to all the resources in your subscription. I’m not going to cover it in this post, but you can find it in Azure AD under Applications, and under Subscription, Access control (IAM) you can see that the Contributor role is assigned at the subscription root level and inherited by all resource groups. Pay attention to this and, where appropriate, restrict the permissions of this Automation account.
As you might have noticed, the PowerShell script receives data from the alert:
"SubscriptionID: " $SubId
"ResourceGroupName: " $ResourceGroupName
"ResourceName: " $ResourceName //Name of the Application Gateway that raised the alert
The script then runs Get-AzureRmApplicationGatewayBackendHealth -Name $ResourceName -ResourceGroupName $ResourceGroupName
It then finds which Virtual Machine corresponds to each IP address reported as Unhealthy in the backend health data. The code creates a System.Collections.Hashtable variable containing:
$listForWebHook = @{ApplicationGatewayName = $ResourceName;
ApplicationGatewayResourceGroup = $ResourceGroupName;
BackendPoolName = $poolName;
vms = @($vmList | select Name, ResourceGroupName)}
$listForWebHookjson = $listForWebHook | ConvertTo-Json
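The matching and packaging logic can be sketched outside Azure as well. The Python below is only an illustration of the idea, assuming a hypothetical in-memory NIC inventory (the nics list, its key names, and the VM names are made up); in the runbook that data comes from Get-AzureRmNetworkInterface and Get-AzureRmVM.

```python
import json

def build_payload(gateway_name, gateway_rg, pool_name, unhealthy_ips, nics):
    """Match unhealthy IPs to NICs and build the body expected by the Logic App schema."""
    vms = []
    for ip in unhealthy_ips:
        for nic in nics:
            # An IP can match more than one NIC when VNets share an address space
            if ip in nic["private_ips"]:
                vms.append({"Name": nic["vm_name"], "ResourceGroupName": nic["vm_resource_group"]})
    return {
        "ApplicationGatewayName": gateway_name,
        "ApplicationGatewayResourceGroup": gateway_rg,
        "BackendPoolName": pool_name,
        "vms": vms,
    }

# Hypothetical NIC inventory used only for this sketch
nics = [
    {"private_ips": ["10.0.0.1"], "vm_name": "web-vm-01", "vm_resource_group": "xx-APPGW-RSG"},
    {"private_ips": ["10.0.0.2"], "vm_name": "web-vm-02", "vm_resource_group": "xx-APPGW-RSG"},
]

payload = build_payload("APPGW1", "xx-APPGW-RSG", "BackendPool", ["10.0.0.1"], nics)
print(json.dumps(payload, indent=2))
```

Only Name and ResourceGroupName are kept for each VM, matching the two properties the Parse JSON schema requires.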
This is then sent to the Logic App, which parses the data to generate the email we defined before:
Invoke-RestMethod -Method Post -Uri $actionWebhookURI -Body $listForWebHookjson -ContentType 'application/json'
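For readers who want to poke the Logic App trigger without going through the runbook, the same POST can be assembled with Python's standard library. This is a sketch, not the author's method: the URL is a placeholder for the HTTP POST URL you noted from the Logic App, the helper name build_logic_app_request is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

def build_logic_app_request(url, payload):
    """Build (but do not send) a JSON POST request for the Logic App HTTP trigger."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder URL and hypothetical VM name; replace with your own values
logic_app_url = "https://example.invalid/logic-app-http-trigger"
payload = {
    "ApplicationGatewayName": "APPGW1",
    "ApplicationGatewayResourceGroup": "xx-APPGW-RSG",
    "BackendPoolName": "BackendPool",
    "vms": [{"Name": "web-vm-01", "ResourceGroupName": "xx-APPGW-RSG"}],
}

request = build_logic_app_request(logic_app_url, payload)
print(request.get_method(), request.full_url)

# To actually send it, pass the request to urllib.request.urlopen(request)
```

Setting the content type to application/json matters because the Parse JSON step expects a JSON body.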
12. Click on Webhooks and then Add Webhook
A webhook allows you to start a particular runbook in Azure Automation through a single HTTP request.
13. Specify a Name and an expiration date, then copy the webhook URL into a local Notepad, as we will need it when defining the alert
Create the Application Gateway Alert and connect all together
1. Browse to the Application Gateway blade, under Monitoring click on Alerts
2. Click on New alert rule
3. Define the alert condition that should trigger the alert (in this example I’m using ≥ 1)
4. For the purpose of this test, I’m defining a really short evaluation period:
5. Let’s create a new Action Group:
6. Define the Action Group using the webhook URL you noted in the previous section when creating the runbook in Azure Automation
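Once the Action Group is wired to the webhook, the runbook only reads a handful of fields from the alert body: data.essentials.alertTargetIds and data.essentials.monitorCondition. The Python sketch below shows that extraction on a hand-made body; the sample values and the function name summarize_alert are assumptions, and a real alert body carries many more fields than shown here.

```python
import json

def summarize_alert(request_body):
    """Pull out the fields the runbook reads from the alert body."""
    essentials = json.loads(request_body)["data"]["essentials"]
    # Only the first target is used, as in the runbook
    parts = essentials["alertTargetIds"][0].split("/")
    return {
        "subscription_id": parts[2],
        "resource_group": parts[4],
        "resource_type": parts[6] + "/" + parts[7],
        "resource_name": parts[-1],
        "status": essentials["monitorCondition"],
    }

# Hand-made body containing only the fields used above
test_body = json.dumps({
    "data": {
        "essentials": {
            "alertTargetIds": [
                "/subscriptions/XXXX/resourceGroups/xx-APPGW-RSG/providers/Microsoft.Network/applicationGateways/APPGW1"
            ],
            "monitorCondition": "Fired",
        }
    }
})

print(summarize_alert(test_body))
```

Note that the resource name taken from the target ID is the Application Gateway itself; the unhealthy pool and servers only become known after the backend health query.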
Test
I have a CentOS Linux 7 VM running NGINX as a backend for my Application Gateway. I’ll show you what happens when I stop the NGINX service.
1. Stop NGINX
2. Alert is triggered
3. The Action Group sends the standard Email Alert:
4. The Action Group also calls the webhook, which requests the backend health information and passes it to the Logic App, which finally generates and sends this email listing which specific VM or VMs are not healthy.
5. Alert will show a Fired Sev3 alert
This concludes the post. I hope I gave you reasons to explore more of the components we have available in Azure, along with a real example of how things can work well together.
If you asked me, "Hey MisterMik, is this the only way to get this done?" Mmm, obviously not. You could also run an automation that injects data into Log Analytics and trigger the alert from Log Analytics itself... but that’s a completely different blog post!
Take care and see you soon!