azure event grid
4 TopicsHow to Fix Azure Event Grid Entra Authentication issue for ACS and Dynamics 365 integrated Webhooks
Introduction: Azure Event Grid is a powerful event routing service that enables event-driven architectures in Azure. When delivering events to webhook endpoints, security becomes paramount. Microsoft provides a secure webhook delivery mechanism using Microsoft Entra ID (formerly Azure Active Directory) authentication through the AzureEventGridSecureWebhookSubscriber role. Problem Statement: When integrating Azure Communication Services with Dynamics 365 Contact Center using Microsoft Entra ID-authenticated Event Grid webhooks, the Event Grid subscription deployment fails with an error: "HTTP POST request failed with unknown error code" with empty HTTP status and code. For example: Important Note: Before moving forward, please verify that you have the Owner role assigned on app to create event subscription. Refer to the Microsoft guidelines below to validate the required prerequisites before proceeding: Set up incoming calls, call recording, and SMS services | Microsoft Learn Why This Happens: This happens because AzureEventGridSecureWebhookSubscriber role is NOT properly configured on Microsoft EventGrid SP (Service Principal) and event subscription entra ID or application who is trying to create event grid subscription. What is AzureEventGridSecureWebhookSubscriber Role: The AzureEventGridSecureWebhookSubscriber is an Azure Entra application role that: Enables your application to verify the identity of event senders Allows specific users/applications to create event subscriptions Authorizes Event Grid to deliver events to your webhook How It Works: Role Creation: You create this app role in your destination webhook application's Azure Entra registration Role Assignment: You assign this role to: Microsoft Event Grid service principal (so it can deliver events) Either Entra ID / Entra User or Event subscription creator applications (so they can create event grid subscriptions) Token Validation: When Event Grid delivers events, it includes an Azure Entra token with this role claim Authorization Check: Your webhook validates the token and checks for the role Key Participants: Webhook Application (Your App) Purpose: Receives and processes events App Registration: Created in Azure Entra Contains: The AzureEventGridSecureWebhookSubscriber app role Validates: Incoming tokens from Event Grid Microsoft Event Grid Service Principal Purpose: Delivers events to webhooks App ID: Different per Azure cloud (Public, Government, etc.) Public Azure: 4962773b-9cdb-44cf-a8bf-237846a00ab7 Needs: AzureEventGridSecureWebhookSubscriber role assigned Event Subscription Creator Entra or Application Purpose: Creates event subscriptions Could be: You, Your deployment pipeline, admin tool, or another application Needs: AzureEventGridSecureWebhookSubscriber role assigned Although the full PowerShell script is documented in the below Event Grid documentation, it may be complex to interpret and troubleshoot. Azure PowerShell - Secure WebHook delivery with Microsoft Entra Application in Azure Event Grid - Azure Event Grid | Microsoft Learn To improve accessibility, the following section provides a simplified step-by-step tested solution along with verification steps suitable for all users including non-technical: Steps: STEP 1: Verify/Create Microsoft.EventGrid Service Principal Azure Portal → Microsoft Entra ID → Enterprise applications Change filter to Application type: Microsoft Applications Search for: Microsoft.EventGrid Ideally, your Azure subscription should include this application ID, which is common across all Azure subscriptions: 4962773b-9cdb-44cf-a8bf-237846a00ab7. If this application ID is not present, please contact your Azure Cloud Administrator. STEP 2: Create the App Role "AzureEventGridSecureWebhookSubscriber" Using Azure Portal: Navigate to your Webhook App Registration: Azure Portal → Microsoft Entra ID → App registrations Click All applications Find your app by searching OR use the Object ID you have Click on your app Create the App Role: Display name: AzureEventGridSecureWebhookSubscriber Allowed member types: Both (Users/Groups + Applications) Value: AzureEventGridSecureWebhookSubscriber Description: Azure Event Grid Role Do you want to enable this app role?: Yes In left menu, click App roles Click + Create app role Fill in the form: Click Apply STEP 3: Assign YOUR USER to the Role Using Azure Portal: Switch to Enterprise Application view: Azure Portal → Microsoft Entra ID → Enterprise applications Search for your webhook app (by name) Click on it Assign yourself: In left menu, click Users and groups Click + Add user/group Under Users, click None Selected Search for your user account (use your email) Select yourself Click Select Under Select a role, click None Selected Select AzureEventGridSecureWebhookSubscriber Click Select Click Assign STEP 4: Assign Microsoft.EventGrid Service Principal to the Role This step MUST be done via PowerShell or Azure CLI (Portal doesn't support this directly as we have seen) so PowerShell is recommended You will need to execute this step with the help of your Entra admin. # Connect to Microsoft Graph Connect-MgGraph -Scopes "AppRoleAssignment.ReadWrite.All" # Replace this with your webhook app's Application (client) ID $webhookAppId = "YOUR-WEBHOOK-APP-ID-HERE" #starting with c5 # Get your webhook app's service principal $webhookSP = Get-MgServicePrincipal -Filter "appId eq '$webhookAppId'" Write-Host " Found webhook app: $($webhookSP.DisplayName)" # Get Event Grid service principal $eventGridSP = Get-MgServicePrincipal -Filter "appId eq '4962773b-9cdb-44cf-a8bf-237846a00ab7'" Write-Host " Found Event Grid service principal" # Get the app role $appRole = $webhookSP.AppRoles | Where-Object {$_.Value -eq "AzureEventGridSecureWebhookSubscriber"} Write-Host " Found app role: $($appRole.DisplayName)" # Create the assignment New-MgServicePrincipalAppRoleAssignment ` -ServicePrincipalId $eventGridSP.Id ` -PrincipalId $eventGridSP.Id ` -ResourceId $webhookSP.Id ` -AppRoleId $appRole.Id Write-Host "Successfully assigned Event Grid to your webhook app!" Verification Steps: Verify the App Role was created: Your App Registration → App roles You should see: AzureEventGridSecureWebhookSubscriber Verify your user assignment: Enterprise application (your webhook app) → Users and groups You should see your user with role AzureEventGridSecureWebhookSubscriber Verify Event Grid assignment: Same location → Users and groups You should see Microsoft.EventGrid with role AzureEventGridSecureWebhookSubscriber Sample Flow: Analogy For Simplification: Lets think it similar to the construction site bulding where you are the owner of the building. Building = Azure Entra app (webhook app) Building (Azure Entra App Registration for Webhook) ├─ Building Name: "MyWebhook-App" ├─ Building Address: Application ID ├─ Building Owner: You ├─ Security System: App Roles (the security badges you create) └─ Security Team: Azure Entra and your actual webhook auth code (which validates tokens) like doorman Step 1: Creat the badge (App role) You (the building owner) create a special badge: - Badge name: "AzureEventGridSecureWebhookSubscriber" - Badge color: Let's say it's GOLD - Who can have it: Companies (Applications) and People (Users) This badge is stored in your building's system (Webhook App Registration) Step 2: Give badge to the Event Grid Service: Event Grid: "Hey, I need to deliver messages to your building" You: "Okay, here's a GOLD badge for your SP" Event Grid: *wears the badge* Now Event Grid can: - Show the badge to Azure Entra - Get tokens that say "I have the GOLD badge" - Deliver messages to your webhook Step 3: Give badge to yourself (or your deployment tool) You also need a GOLD badge because: - You want to create event grid event subscriptions - Entra checks: "Does this person have a GOLD badge?" - If yes: You can create subscriptions - If no: "Access denied" Your deployment pipeline also gets a GOLD badge: - So it can automatically set up event subscriptions during CI/CD deployments Disclaimer: The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information. Thanks for reading this blog! I hope you found it helpful and informative for this specific integration use case 😀219Views2likes0CommentsAzure Event Grid Domain Creation: Overcoming AZ CLI's TLS Parameter Limitations with Workaround
Introduction: The Intersection of Security Policies and DevOps Automation In the modern cloud landscape, organizations increasingly enforce strict security requirements through platform policies. One common requirement is mandating latest TLS versions for example TLS 1.2 across all deployed resources to protect data in transit. While this is an excellent security practice, it can sometimes conflict with the available configuration options in deployment tools, particularly in the Azure CLI. This blog explores a specific scenario that many Azure DevOps teams encounter: how to deploy an Azure Event Grid domain when your organization has a custom policy requiring latest version considering TLS 1.2, but the Azure CLI command doesn't provide a parameter to configure this setting. The Problem: Understanding the Gap Between Policy and Tooling What Is Azure Event Grid? Azure Event Grid is a serverless event routing service that enables event-driven architectures. It manages the routing of events from various sources (like Azure services, custom applications, or SaaS products) to different handlers such as Azure Functions, Logic Apps, or custom webhooks. An Event Grid domain provides a custom topic endpoint that can receive events from multiple sources, offering a way to organize and manage events at scale. The Policy Requirement: Many organizations implement Azure Policy to enforce security standards across their cloud infrastructure. A common policy might look like this: { "policyRule": { "if": { "allOf": [ { "field": "type", "equals": "Microsoft.EventGrid/domains" }, { "anyOf": [ { "field": "Microsoft.EventGrid/domains/minimumTlsVersion", "exists": false }, { "field": "Microsoft.EventGrid/domains/minimumTlsVersion", "notEquals": "1.2" } ] } ] }, "then": { "effect": "deny" } } } This policy blocks the creation of any Event Grid domain that doesn't explicitly set TLS 1.2 as the minimum TLS version. The CLI Limitation: Now, let's examine the Azure CLI command to create an Event Grid domain: az eventgrid domain | Microsoft Learn TLS property is unrecognized with the latest version of AZ CLI version. Current Status of This Limitation: It's worth noting that this limitation has been recognized by the Azure team. There is an official GitHub feature request tracking this issue, which you can find at => Please add TLS support while creation of Azure Event Grid domain through CLI · Issue #31278 · Azure/azure-cli Before implementing this workaround described in this article, I recommend checking the current status of this feature request. The Azure CLI is continuously evolving, and by the time you're reading this, the limitation might have been addressed. However, as of April 2025, this remains a known limitation in the Azure CLI, necessitating the alternative approach outlined below. Why This Matters: This limitation becomes particularly problematic in CI/CD pipelines or Infrastructure as Code (IaC) scenarios where you want to automate the deployment of Event Grid domain resources. Workaround: You can utilize below ARM template and deploy it through AZ CLI in your deployment pipeline as below: Working ARM template: { "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", "contentVersion": "1.0.0.0", "parameters": { "domainName": { "type": "string", "metadata": { "description": "Name of the Event Grid Domain" } }, "location": { "type": "string", "defaultValue": "[resourceGroup().location]", "metadata": { "description": "Azure region for the domain" } } }, "resources": [ { "type": "Microsoft.EventGrid/domains", "apiVersion": "2025-02-15", "name": "[parameters('domainName')]", "location": "[parameters('location')]", "properties": { "minimumTlsVersionAllowed": "1.2" } } ] } Please note I've used latest API version from below official Microsoft documentation : Microsoft.EventGrid/domains - Bicep, ARM template & Terraform AzAPI reference | Microsoft Learn Working AZ CLI command: az deployment group create --resource-group <rg> --template-file <armtemplate.json> --parameters domainName=<event grid domain name> You can store this ARM template in your configuration directory with replacement for Azure CLI command. It explicitly sets TLS 1.2 for Event Grid domains, ensuring security compliance where the CLI lacks this parameter. For example: az deployment group create --resource-group <rg> --template-file ./config/<armtemplate.json> --parameters domainName=<event grid domain name> Disclaimer: The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information. Thanks for reading this blog! I hope you've found this workaround valuable for addressing the Event Grid domain TLS parameter limitation in Azure CLI. 😊222Views4likes0CommentsAzure Event Grid CLI Identity Gaps & Workarounds with Python REST and ARM Templates
Azure Event Grid has become a cornerstone service for building event-driven architectures in the cloud. It provides a scalable event routing service that enables reactive programming patterns, connecting event sources to event handlers seamlessly. However, when working with Event Grid through the Azure CLI, developers often encounter a significant limitation: the inability to configure system-assigned managed identities using CLI commands. In this blog post, I'll explore this limitation and provide practical workarounds using Python REST API calls and ARM templates with CLI to ensure your Event Grid deployments can leverage the security benefits of managed identities without being blocked by tooling constraints. Problem Statement: Unlike many other Azure resources that support the --identity or ---assign-identity parameter for enabling system-assigned managed identities, Event Grid's CLI commands lack this capability while creating event subscription for system topic at the moment. This means that while the Azure Portal and other methods support managed identities for Event Grid, you can't configure them directly through the CLI in case of system topic event subscriptions For example you can add managed identity for delivery through portal but not through AZ CLI: If you try to use the following CLI command: az eventgrid system-topic event-subscription create \ --name my-sub \ --system-topic-name my-topic \ --resource-group my-rg \ --endpoint <EH resource id> --endpoint-type eventhub \ --identity systemassigned You'll run into a limitation: The --identity flag is not supported or unrecognized for system topic subscriptions in Azure CLI. Also, --delivery-identity is in preview and under development Current Status of This Limitation: It's worth noting that this limitation has been recognized by the Azure team. There is an official GitHub feature request tracking this issue, which you can find at Use managed identity to command creates an event subscription for an event grid system topic · Issue #26910 · Azure/azure-cli. Before implementing any of the workarounds described in this article, I recommend checking the current status of this feature request. The Azure CLI is continuously evolving, and by the time you're reading this, the limitation might have been addressed. However, as of April 2025, this remains a known limitation in the Azure CLI, necessitating the alternative approaches outlined below. Why This Matters: This limitation becomes particularly problematic in CI/CD pipelines or Infrastructure as Code (IaC) scenarios where you want to automate the deployment of Event Grid resources with managed identities. Solution 1: Using Azure REST API with Python request library: The first approach to overcome this limitation is to use the Azure REST API with Python. This provides the most granular control over your Event Grid resources and allows you to enable system-assigned managed identities programmatically. System Topic Event Subscriptions - Create Or Update - REST API (Azure Event Grid) | Microsoft Learn You can retrieve Azure Entra token using below CLI command: az account get-access-token Sample working code & payload: import requests import json subscription_id = <> resource_group = <> system_topic_name = <> event_subscription_name = <> event_hub_resource_id = <> access_token = <> url = f"https://management.azure.com/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.EventGrid/systemTopics/{system_topic_name}/eventSubscriptions/{event_subscription_name}?api-version=2024-12-15-preview" payload = { "identity": { "type": "SystemAssigned" }, "properties": { "topic": "/subscriptions/<>/resourceGroups/<>/providers/Microsoft.EventGrid/systemTopics/<>", "filter": { "includedEventTypes": [ "Microsoft.Storage.BlobCreated", "Microsoft.Storage.BlobDeleted" ], "advancedFilters": [], "enableAdvancedFilteringOnArrays": True }, "labels": [], "eventDeliverySchema": "EventGridSchema", "deliveryWithResourceIdentity": { "identity": { "type": "SystemAssigned" }, "destination": { "endpointType": "EventHub", "properties": { "resourceId": "/subscriptions/<>/resourceGroups/rg-sch/providers/Microsoft.EventHub/namespaces/<>/eventhubs/<>", "deliveryAttributeMappings": [ { "name": "test", "type": "Static", "properties": { "value": "test", "isSecret": False, "sourceField": "" } }, { "name": "id", "type": "Dynamic", "properties": { "value": "abc", "isSecret": False, "sourceField": "data.key" } } ] } } } } } headers = { "Authorization": f"Bearer {access_token}", "Content-Type": "application/json" } response = requests.put(url, headers=headers, data=json.dumps(payload)) if response.status_code in [200, 201]: print("Event subscription created successfully!") Remember that these tokens are sensitive security credentials, so handle them with appropriate care. They should never be exposed in logs, shared repositories, or other insecure locations. Solution 2: Using ARM Templates & deploying it through CLI Another solution is to use Azure Resource Manager (ARM) templates, which fully support system-assigned managed identities for Event Grid. This approach works well in existing IaC workflows. Microsoft.EventGrid/systemTopics/eventSubscriptions - Bicep, ARM template & Terraform AzAPI reference | Microsoft Learn Here's a sample ARM template that creates an Event Grid topic with a system-assigned managed identity: { "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#", "contentVersion": "1.0.0.0", "parameters": { "systemTopicName": { "type": "string", "metadata": { "description": "Name of the existing system topic" } }, "eventSubscriptionName": { "type": "string", "metadata": { "description": "Name of the event subscription to create" } }, "eventHubResourceId": { "type": "string", "metadata": { "description": "Resource ID of the Event Hub to send events to" } }, "includedEventType": { "type": "string", "defaultValue": "Microsoft.Storage.BlobCreated", "metadata": { "description": "Event type to filter on" } } }, "resources": [ { "type": "Microsoft.EventGrid/systemTopics/eventSubscriptions", "apiVersion": "2024-06-01-preview", "name": "[format('{0}/{1}', parameters('systemTopicName'), parameters('eventSubscriptionName'))]", "identity": { "type": "SystemAssigned" }, "properties": { "deliveryWithResourceIdentity": { "destination": { "endpointType": "EventHub", "properties": { "resourceId": "[parameters('eventHubResourceId')]" } }, "identity": { "type": "SystemAssigned" } }, "eventDeliverySchema": "EventGridSchema", "filter": { "includedEventTypes": [ "[parameters('includedEventType')]" ] } } } ] } How to deploy via Azure CLI: az deployment group create \ --resource-group <your-resource-group> \ --template-file eventgridarmtemplate.json \ --parameters \ systemTopicName=<system-topic-name> \ eventSubscriptionName=<event-subscription-name> \ eventHubResourceId="/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.EventHub/namespaces/<namespace>/eventhubs/<hub>" Disclaimer The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information. Thanks for reading this blog! I hope you've found these workarounds valuable for addressing the Event Grid identity parameter limitation in Azure CLI. 😊197Views3likes0CommentsCreating a Reliable Notification System for Azure Spot VM Evictions (preempt) events
Introduction Azure Spot VMs offer significant cost savings but come with a trade-off: they can be evicted with minimal notice when Azure needs the capacity back or price change. Building a reliable notification system for these evictions is critical for applications that need to respond gracefully to these events. What are Azure Spot VMs? Azure Spot VMs are virtual machines that use spare capacity in Azure data centers, available at significantly discounted prices compared to regular pay-as-you-go VMs. Microsoft offers this unused capacity at discounts of up to 90% off the standard prices, making Spot VMs an extremely cost-effective option for many workloads. However, there's an important caveat: when Azure needs this capacity back for regular pay-as-you-go customers, your Spot VMs can be evicted (reclaimed) with minimal notice - typically just 30 seconds. This eviction mechanism is what allows Microsoft to offer such deep discounts, as we maintain the flexibility to reclaim these resources when needed. https://azure.microsoft.com/en-gb/products/virtual-machines/spot Benefits of Spot VMs Significant cost savings: The most obvious benefit is the substantial discount, which can be up to 90% off standard VM prices. Same VM types and features: Spot VMs provide the same performance, features, and capabilities as regular VMs - the only difference is the eviction possibility. Ideal for interruptible workloads: For workloads that can handle interruptions, such as batch processing jobs, dev/test environments, or stateless applications, Spot VMs offer enormous value. Flexible sizing options: Spot VMs are available in most VM series and regions, giving you access to a wide range of computing options. Scaling opportunities: The cost savings enable you to run larger clusters or more powerful VMs than might be financially feasible with regular VMs. Effective for burst capacity: When you need additional capacity for temporary workloads, Spot VMs can provide it at minimal cost. Great for fault-tolerant applications: Modern cloud-native applications designed with redundancy and resilience can leverage Spot VMs excellently since they're built to handle node failures. Why Not Just Use Azure Resource Events? A common question is: "Why not simply listen for Azure Resource events like ResourceActionSuccess for VM evictions?" While Azure does emit platform events when resources change state through resource group as source for Azure Event Grid topic subscription, there are several critical limitations when relying on these for Spot VM evictions: Timing issues: By the time a ResourceActionSuccess event is generated for a VM eviction, it is possible that the VM is already being evicted. This gives you no time to perform graceful shutdown procedures. Reliability concerns: These events pass through multiple Azure systems before reaching your event handlers, adding potential points of failure and latency. Ambiguous events: Resource action events don't clearly distinguish between a normal VM shutdown and a Spot VM eviction, making it difficult to trigger the right response. For example: I initially attempted to capture Azure Spot VM eviction events by setting up event notifications on an Azure resource group and publishing them to Service Bus. While this configuration successfully captured some Azure Resource events, it ultimately proved unreliable for eviction monitoring. The solution missed several critical eviction events and, more problematically, could not reliably distinguish between intentional VM shutdowns and actual eviction events. This lack of differentiation made automated response handling impossible, as the system couldn't determine whether a VM was being evicted by Azure or simply stopped through normal administrative actions. Azure resource group as an Event Grid source - Azure Event Grid | Microsoft Learn For these reasons, the most reliable approach is to detect eviction events directly from within the VM using the Azure Instance Metadata Service (IMDS) Scheduled Events API, which is specifically designed to provide advance notice of impending VM state changes. This blog post will guide you through implementing a solution that: Detects Spot VM eviction events from within the VM Formats these events properly Sends them to an Azure Event Grid custom topic Sets up proper event handling downstream Understanding Spot VM Eviction Notices Spot VMs receive eviction notifications approximately 30 seconds before being reclaimed. These notifications are delivered through the Azure Instance Metadata Service (IMDS) Scheduled Events API - an endpoint available from within the VM at http://169.254.169.254/metadata/scheduledevents. When a Spot VM is about to be evicted, a "Preempt" event appears in the Scheduled Events data. Your application needs to poll this endpoint regularly to detect these events in time to take action. https://learn.microsoft.com/en-us/azure/virtual-machines/windows/scheduled-events Solution overview Our solution consists of below components: A custom Event Grid topic to receive and distribute the events - optional if you wish to handle on own from VM A monitoring script running inside the Spot VM - actual script to poll events running on VM Logic to format and send events from the VM to Event Grid Event subscribers that take action when evictions occur A) Setting Up the Event Grid Custom Topic First, create an Event Grid custom topic that will serve as the distribution mechanism for your eviction events - this can be optional if you plan to take actions from VM only like gracefully shutting down any existing processes. You can use below documentation to create custom event grid topic: Custom topics in Azure Event Grid - Azure Event Grid | Microsoft Learn B) Creating a Windows-Based Eviction Monitor For Windows Spot VMs, we'll use below PowerShell to poll preempt events & send it to custom event grid. Create a script file named SpotMonitor.ps1: Powershell script : SpotMonitor.ps1 # Configuration variables - replace with your values $EventGridTopicEndpoint = "https://<EG topic name>.westeurope-1.eventgrid.azure.net/api/events" $EventGridKey = "<EG key>" $CheckInterval = 3 # seconds between checks - feel free to modify as per your requirement $LogFile = "C:\Logs\spot-monitor.log" # Create log directory if it doesn't exist if (-not (Test-Path (Split-Path $LogFile))) { New-Item -ItemType Directory -Path (Split-Path $LogFile) -Force } function Write-Log { param ([string]$Message) $timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss" "$timestamp - $Message" | Out-File -FilePath $LogFile -Append } Write-Log "Starting Spot VM eviction monitor..." while ($true) { try { # Get the VM's metadata including scheduled events $headers = @{"Metadata" = "true"} $scheduledEvents = Invoke-RestMethod -Uri "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01" -Headers $headers # Check if there are any events if ($scheduledEvents.Events -and $scheduledEvents.Events.Count -gt 0) { Write-Log "Found $($scheduledEvents.Events.Count) scheduled events" # Get VM metadata for context $vmName = Invoke-RestMethod -Uri "http://169.254.169.254/metadata/instance/compute/name?api-version=2020-09-01&format=text" -Headers $headers $resourceGroup = Invoke-RestMethod -Uri "http://169.254.169.254/metadata/instance/compute/resourceGroupName?api-version=2020-09-01&format=text" -Headers $headers $subscription = Invoke-RestMethod -Uri "http://169.254.169.254/metadata/instance/compute/subscriptionId?api-version=2020-09-01&format=text" -Headers $headers # Process each event foreach ($event in $scheduledEvents.Events) { if ($event.EventType -eq "Preempt") { Write-Log "ALERT: Spot VM preemption detected!" # Extract event details $eventId = $event.EventId $notBefore = $event.NotBefore Write-Log "VM $vmName will be preempted not before $notBefore" # Create Event Grid event as an array (critical for EventGrid schema) $eventGridEvent = @( @{ subject = "/subscriptions/$subscription/resourceGroups/$resourceGroup/providers/Microsoft.Compute/virtualMachines/$vmName" eventType = "SpotVM.Preemption" eventTime = (Get-Date).ToUniversalTime().ToString("o") id = [Guid]::NewGuid().ToString() data = @{ vmName = $vmName resourceGroup = $resourceGroup subscription = $subscription preemptionTime = $notBefore eventId = $eventId eventType = $event.EventType } dataVersion = "1.0" } ) # Convert to JSON - ensuring it stays as an array $eventGridPayload = ConvertTo-Json -InputObject $eventGridEvent -Depth 10 # Send to Event Grid $eventGridHeaders = @{ "Content-Type" = "application/json" "aeg-sas-key" = $EventGridKey } try { $response = Invoke-RestMethod -Uri $EventGridTopicEndpoint -Method Post -Body $eventGridPayload -Headers $eventGridHeaders Write-Log "Successfully sent event to Event Grid" # Take actions to prepare for shutdown Write-Log "Taking actions to prepare for shutdown..." # Example: Stop services gracefully # Stop-Service -Name "YourServiceName" -Force } catch { Write-Log "Failed to send to Event Grid: $_" } } } } } catch { Write-Log "Error checking for events: $_" } # Wait before checking again Start-Sleep -Seconds $CheckInterval } The script above checks for eviction events every 3 seconds by default. You can adjust this polling frequency by changing the "Check_Interval" variable in the script to better match your specific system requirements and performance considerations. More frequent polling provides faster detection but increases resource usage, while less frequent polling reduces overhead but might slightly delay event detection. B) Running monitor script as a scheduler or service For Windows Spot VMs, we'll use PowerShell to create a monitoring service. Run a script file named SpotMonitor.ps1 created in last step: You can use Windows Task Scheduler to run the script at startup or to run as a service and the logs will looks like this: Logs: 2025-03-19 18:48:27 - Starting Spot VM eviction monitor... 2025-03-19 20:04:33 - Found 1 scheduled events 2025-03-19 20:04:33 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:33 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:33 - Sending payload: [ { "eventTime": "2025-03-19T20:04:33.4655660Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "5d3e6430-dff5-45da-ae90-992e3e342d37", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:33 - Event Grid response: 2025-03-19 20:04:33 - Taking actions to prepare for shutdown... 2025-03-19 20:04:36 - Found 1 scheduled events 2025-03-19 20:04:36 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:36 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:36 - Sending payload: [ { "eventTime": "2025-03-19T20:04:36.6382480Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "b6152429-f4cb-43b9-8c53-b6ceb08946e5", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:36 - Event Grid response: 2025-03-19 20:04:36 - Taking actions to prepare for shutdown... 2025-03-19 20:04:39 - Found 1 scheduled events 2025-03-19 20:04:39 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:39 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:39 - Sending payload: [ { "eventTime": "2025-03-19T20:04:39.7567285Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "e0bde6d0-ae27-4c01-8e69-621e57d70f8d", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:39 - Event Grid response: 2025-03-19 20:04:39 - Taking actions to prepare for shutdown... 2025-03-19 20:04:42 - Found 1 scheduled events 2025-03-19 20:04:42 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:42 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:42 - Sending payload: [ { "eventTime": "2025-03-19T20:04:42.8339675Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "ab7a3b84-bcd8-4651-829e-c57043c54b92", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:42 - Event Grid response: 2025-03-19 20:04:42 - Taking actions to prepare for shutdown... 2025-03-19 20:04:45 - Found 1 scheduled events 2025-03-19 20:04:45 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:45 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:45 - Sending payload: [ { "eventTime": "2025-03-19T20:04:45.9317109Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "eacfae6b-4ea5-426d-8bc2-659320a7baf0", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:45 - Event Grid response: 2025-03-19 20:04:45 - Taking actions to prepare for shutdown... 2025-03-19 20:04:48 - Found 1 scheduled events 2025-03-19 20:04:49 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:49 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:49 - Sending payload: [ { "eventTime": "2025-03-19T20:04:49.0666732Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "b2142ee8-9ecf-441d-846e-c8ed663a949e", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:49 - Event Grid response: 2025-03-19 20:04:49 - Taking actions to prepare for shutdown... 2025-03-19 20:04:52 - Found 1 scheduled events 2025-03-19 20:04:52 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:52 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:52 - Sending payload: [ { "eventTime": "2025-03-19T20:04:52.1310990Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "d9eba318-9773-4e73-a694-dd1c1bf89c10", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:52 - Event Grid response: 2025-03-19 20:04:52 - Taking actions to prepare for shutdown... 2025-03-19 20:04:55 - Found 1 scheduled events 2025-03-19 20:04:55 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:55 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:55 - Sending payload: [ { "eventTime": "2025-03-19T20:04:55.2171546Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "c358c433-50f5-496d-8823-c2ffddd03390", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:55 - Event Grid response: 2025-03-19 20:04:55 - Taking actions to prepare for shutdown... 2025-03-19 20:04:58 - Found 1 scheduled events 2025-03-19 20:04:58 - ALERT: Spot VM preemption detected! 2025-03-19 20:04:58 - VM anivmnew will be preempted not before Wed, 19 Mar 2025 20:04:47 GMT 2025-03-19 20:04:58 - Sending payload: [ { "eventTime": "2025-03-19T20:04:58.3040422Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "Wed, 19 Mar 2025 20:04:47 GMT", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "3eacba95-e05f-41dc-b9e7-1593fe2a71e2", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:04:58 - Event Grid response: 2025-03-19 20:04:58 - Taking actions to prepare for shutdown... 2025-03-19 20:05:01 - Found 1 scheduled events 2025-03-19 20:05:01 - ALERT: Spot VM preemption detected! 2025-03-19 20:05:01 - VM anivmnew will be preempted not before 2025-03-19 20:05:01 - Sending payload: [ { "eventTime": "2025-03-19T20:05:01.3842973Z", "data": { "eventId": "DE2EC5FA-AF0A-4D59-85D2-677C66A6BC12", "preemptionTime": "", "eventType": "Preempt", "resourceGroup": "RG-TEST", "subscription": "azure-sub-id", "vmName": "anivmnew" }, "id": "85c058fc-4f2e-49ec-a027-6fcca60f7935", "subject": "/subscriptions/azure-sub-id/resourceGroups/RG-TEST/providers/Microsoft.Compute/virtualMachines/anivmnew", "eventType": "SpotVM.Preemption", "dataVersion": "1.0" } ] 2025-03-19 20:05:01 - Event Grid response: 2025-03-19 20:05:01 - Taking actions to prepare for shutdown... C) Configuring event subscribers Now that your Spot VMs are sending eviction events to Event Grid, set up subscribers to take action when these events occur. For example sending event to service bus queue: Conclusion By implementing this solution, you've created a reliable way to detect and respond to Spot VM evictions. This approach gives your applications precious time to react to evictions, significantly improving reliability while still benefiting from the cost savings of Spot VMs. While Azure does provide resource-level events through system topics, they simply don't provide the reliability, timing, and clarity needed for mission-critical workloads running on Spot VMs. The combination of the Azure Instance Metadata Service Scheduled Events API and custom Event Grid topics creates a powerful pattern for building resilient, event-driven architectures. This approach ensures you're getting the most accurate and timely notifications possible, giving your applications the best chance to gracefully handle Spot VM evictions while enjoying the substantial cost benefits that Spot VMs offer. Disclaimer The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information. Thanks for reading this blog! I hope you've found this approach to handling Spot VM evictions helpful1.1KViews2likes0Comments