Azure Databricks & Fabric Disaster Recovery: The Better Together Story
Authors: Amudha Palani, Oscar Alvarado, Eric Kwashie, Peter Lo, and Rafia Aqil

Disaster recovery (DR) is a critical component of any cloud-native data analytics platform, ensuring business continuity even during rare regional outages caused by natural disasters, infrastructure failures, or other disruptions.

Identify Business-Critical Workloads

Before designing any disaster recovery strategy, organizations must first identify which workloads are truly business-critical and require regional redundancy. Not all Databricks or Fabric processes need full DR protection; instead, customers should evaluate the operational impact of downtime, data freshness requirements, regulatory obligations, SLAs, and dependencies across upstream and downstream systems. By classifying workloads into tiers and aligning DR investments accordingly, customers ensure they protect what matters most without over-engineering the platform.

Azure Databricks

Azure Databricks requires a customer-driven approach to disaster recovery, where organizations are responsible for replicating workspaces, data, infrastructure components, and security configurations across regions.

Full System Failover (Active-Passive) Strategy

A comprehensive approach that replicates all dependent services to the secondary region. Implementation requirements include:

Infrastructure Components:
- Replicate Azure services (ADLS, Key Vault, SQL databases) using Terraform
- Deploy network infrastructure (subnets) in the secondary region
- Establish data synchronization mechanisms

Data Replication Strategy:
- Use Deep Clone for Delta tables rather than geo-redundant storage
- Implement periodic synchronization jobs using Delta's incremental replication
- Verify data transfer results using Delta time travel syntax

Workspace Asset Synchronization:
- Co-deploy cluster configurations, notebooks, jobs, and permissions using CI/CD
- Use Terraform and SCIM for identity and access management
- Keep job concurrency at zero in the secondary region to prevent execution

Fully Redundant (Active-Active) Strategy

The most sophisticated approach, in which all transactions are processed in multiple regions simultaneously. While providing maximum resilience, this strategy:
- Requires complex data synchronization between regions
- Incurs the highest operational costs due to duplicate processing
- Is typically needed only for mission-critical workloads with zero tolerance for downtime
- Can be implemented as partial active-active, processing most of the workload in the primary region with a subset in the secondary

Enabling Disaster Recovery

Create a secondary workspace in a paired region, and use CI/CD to keep workspace assets synchronized continuously:

| Requirement | Approach | Tools |
|---|---|---|
| Cluster configurations | Co-deploy to both regions as code | Terraform |
| Code (notebooks, libraries, SQL) | Co-deploy with CI/CD pipelines | Git, Azure DevOps, GitHub Actions |
| Jobs | Co-deploy with CI/CD; set concurrency to zero in secondary | Databricks Asset Bundles, Terraform |
| Permissions (users, groups, ACLs) | Use IdP/SCIM and infrastructure as code | Terraform, SCIM |
| Secrets | Co-deploy using secret management | Terraform, Azure Key Vault |
| Table metadata | Co-deploy with CI/CD workflows | Git, Terraform |
| Cloud services (ADLS, network) | Co-deploy infrastructure | Terraform |

Update your orchestrator (ADF, Fabric pipelines, etc.) to include a simple region toggle to reroute job execution, and replicate all dependent services (Key Vault, storage accounts, SQL DB).
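To make the Deep Clone replication strategy concrete, here is a minimal sketch of a scheduled synchronization cell as it might run in a Databricks notebook. The catalog and table names (prod_catalog, dr_catalog) are hypothetical placeholders, and the ambient `spark` session of a notebook is assumed:

```python
# Minimal sketch of a periodic Delta Deep Clone DR sync (hypothetical names).
critical_tables = ["sales.orders", "sales.customers", "finance.transactions"]

for table in critical_tables:
    source = f"prod_catalog.{table}"   # primary-region table
    target = f"dr_catalog.{table}"     # secondary-region replica

    # Re-running DEEP CLONE is incremental: only data changed since the
    # previous clone is copied, so scheduled syncs stay inexpensive.
    spark.sql(f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}")

    # Spot-check the transfer using Delta history (time travel metadata).
    last_op = spark.sql(f"DESCRIBE HISTORY {target} LIMIT 1").collect()[0]
    print(f"{target}: version {last_op['version']} via {last_op['operation']}")
```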
Implement Delta Deep Clone synchronization jobs to keep datasets continuously aligned between regions. For active-active configurations:
- Introduce an application-level "sync tool" that redirects data ingestion and compute execution
- Enable parallel processing in both regions for selected or all workloads
- Use bi-directional synchronization for Delta data to maintain consistency across regions
- For performance and cost control, run most workloads in the primary region and only a subset in the secondary to keep it warm

Implement a Three-Pillar DR Design

Primary Workspace: Your production Databricks environment running normal operations.
Secondary Workspace: A standby Databricks workspace in a different (paired) Azure region that remains ready to take over if the primary fails.

This architecture ensures business continuity while optimizing costs by keeping the secondary workspace dormant until needed. The DR solution is built on three fundamental pillars that work together to provide comprehensive protection:

1. Infrastructure Provisioning (Terraform)

The infrastructure layer creates and manages all Azure resources required for disaster recovery using Infrastructure as Code (Terraform).

What It Creates:
- Secondary Resource Group: A dedicated resource group in your paired DR region (e.g., if the primary is in East US, the secondary might be in West US 2)
- Secondary Databricks Workspace: A standby Databricks workspace with the same SKU as your primary, ready to receive failover traffic
- DR Storage Account: An ADLS Gen2 storage account that serves as the backup destination for your critical data
- Monitoring Infrastructure: An Azure Monitor Log Analytics workspace and alert action groups to track DR health
- Protection Locks: Management locks to prevent accidental deletion of critical DR resources

Key Design Principle: The Terraform configuration references your existing primary workspace without modifying it. It only creates new resources in the secondary region, ensuring your production environment remains untouched during setup.

2. Data Synchronization (Delta Notebooks)

The data synchronization layer ensures your critical data is continuously backed up to the secondary region.

How It Works: The solution uses a Databricks notebook that runs in your primary workspace on a scheduled basis. This notebook:
- Connects to Backup Storage: Uses Unity Catalog with an Azure managed identity for secure, credential-free authentication to the secondary storage account
- Identifies Critical Tables: Reads from a configuration list you define (sales data, customer data, inventory, financial transactions, etc.)
- Performs Deep Clone: Uses Delta Lake's native CLONE functionality to create exact copies of your tables in the backup storage
- Tracks Sync Status: Logs each synchronization operation, tracks row counts, and reports on data freshness

Authentication Flow: The synchronization process leverages Unity Catalog's managed identity capabilities:
- An existing Access Connector for Unity Catalog is granted "Storage Blob Data Contributor" permissions on the backup storage.
- Storage credentials are created in Databricks that reference this Access Connector.
- The notebook uses these credentials transparently; no storage keys or secrets are required.

What Gets Synced: You define which tables are critical to your business operations. The notebook creates backup copies including:
- Full table data and schema
- Table partitioning structure
- Delta transaction logs for point-in-time recovery
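The "Tracks Sync Status" step above can be as simple as appending an audit row after each clone. A minimal sketch, assuming the clone loop shown earlier and a hypothetical audit table dr_catalog.ops.sync_audit:

```python
# Sketch of sync-status tracking: record table name, row count, and timestamp
# after each clone. The audit table name is a hypothetical placeholder.
from datetime import datetime, timezone

def log_sync_status(target_table: str) -> None:
    row_count = spark.table(target_table).count()
    audit_row = [(target_table, row_count, datetime.now(timezone.utc))]
    (spark.createDataFrame(
            audit_row,
            "table_name string, row_count long, synced_at timestamp")
         .write.mode("append")
         .saveAsTable("dr_catalog.ops.sync_audit"))
```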
3. Failover Automation (Python Scripts)

The failover automation layer orchestrates the switch from the primary to the secondary workspace when disaster strikes.

Microsoft Fabric

Microsoft Fabric provides built-in disaster recovery capabilities designed to keep analytics and Power BI experiences available during regional outages. Fabric simplifies continuity for reporting workloads, while still requiring customer planning for deeper data and workload replication.

Power BI Business Continuity

Power BI, now integrated into Fabric, provides automatic disaster recovery as a default offering:
- No opt-in required: DR capabilities are automatically included.
- Azure storage geo-redundant replication: Ensures backup instances exist in other regions.
- Read-only access during disasters: Semantic models, reports, and dashboards remain accessible.
- Always supported: BCDR for Power BI remains active regardless of the OneLake DR setting.

Fabric's cross-region DR uses a shared responsibility model between Microsoft and customers:

Microsoft's Responsibilities:
- Ensure baseline infrastructure and platform services availability
- Maintain Azure regional pairings for geo-redundancy
- Provide DR capabilities for Power BI by default

Customer Responsibilities:
- Enable disaster recovery settings for capacities
- Set up secondary capacity and workspaces in paired regions
- Replicate data and configurations

Enabling Disaster Recovery

Organizations can enable BCDR through the Admin portal under Capacity settings:
1. Navigate to Admin portal → Capacity settings
2. Select the appropriate Fabric capacity
3. Access the Disaster Recovery configuration
4. Enable the disaster recovery toggle

Critical Timing Considerations:
- 30-day minimum activation period: Once enabled, the setting remains active for at least 30 days and cannot be reverted.
- 72-hour activation window: Initial enablement can take up to 72 hours to become fully effective.

Azure Databricks & Microsoft Fabric DR Considerations

Building a resilient analytics platform requires understanding how disaster recovery responsibilities differ between Azure Databricks and Microsoft Fabric. While both platforms operate within Azure's regional architecture, their DR models, failover behaviors, and customer responsibilities are fundamentally different.

Recovery Procedures

| Procedure | Databricks | Fabric |
|---|---|---|
| Failover | Stop workloads, update routing, resume in secondary region. | Microsoft initiates failover; customers restore services in DR capacity. |
| Restore to Primary | Stop secondary workloads, replicate data/code back, test, resume production. | Recreate workspaces and items in new capacity; restore Lakehouse and Warehouse data. |
| Asset Syncing | Use CI/CD and Terraform to sync clusters, jobs, notebooks, permissions. | Use Git integration and pipelines to sync notebooks and pipelines; manually restore Lakehouses. |

Business Considerations

| Consideration | Databricks | Fabric |
|---|---|---|
| Control | Customers manage DR strategy, failover timing, and asset replication. | Microsoft manages failover; customers restore services post-failover. |
| Regional Dependencies | Must ensure the secondary region has sufficient capacity and services. | DR only available in Azure regions with Fabric support and paired regions. |
| Power BI Continuity | Not applicable. | Power BI offers built-in BCDR with read-only access to semantic models and reports. |
| Activation Timeline | Immediate upon configuration. | DR setting takes up to 72 hours to activate; 30-day wait before changes allowed. |
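Returning to the Databricks failover-automation pillar: the scripts themselves are left abstract above, so the snippet below is only a rough sketch of one piece of that automation. It unpauses job schedules in the secondary workspace via the Databricks Jobs 2.1 API; the workspace URL, token handling, and job IDs are placeholders, and a real failover script would also flip the orchestrator's region toggle and verify data freshness first.

```python
# Hypothetical failover sketch: activate the dormant secondary workspace by
# unpausing its job schedules via the Databricks Jobs API (2.1).
import os
import requests

SECONDARY_HOST = "https://adb-secondary.azuredatabricks.net"  # placeholder URL
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def unpause_job(job_id: int) -> None:
    # Read the current schedule, flip its pause status, and write it back,
    # so the cron expression and timezone are preserved.
    job = requests.get(f"{SECONDARY_HOST}/api/2.1/jobs/get",
                       headers=HEADERS, params={"job_id": job_id}).json()
    schedule = job["settings"].get("schedule")
    if schedule is None:
        return  # job has no schedule to unpause
    schedule["pause_status"] = "UNPAUSED"
    resp = requests.post(f"{SECONDARY_HOST}/api/2.1/jobs/update",
                         headers=HEADERS,
                         json={"job_id": job_id,
                               "new_settings": {"schedule": schedule}})
    resp.raise_for_status()

for job_id in [101, 102, 103]:  # placeholder IDs of the DR copies of critical jobs
    unpause_job(job_id)
```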
Restricting Survey Access by Section

Hello, is there a way to create multiple sections in a survey so that managers can view only one section and administrators can view another? The documentation I found only mentions response thresholds, not how to mask section results. Any guidance would be appreciated.
Azure Databricks Lakebase is now generally available

Modern applications are real-time, intelligent, and increasingly powered by AI agents that need fast, reliable access to operational data, without sacrificing governance, scale, or simplicity. To solve for this, Azure Databricks Lakebase introduces a serverless Postgres database architecture that separates compute from storage and integrates natively with the Databricks Data Intelligence Platform on Azure. Lakebase is now generally available in Azure Databricks, enabling you and your team to start building and validating real-time and AI-driven applications directly on your lakehouse foundation.

Why Azure Databricks Lakebase?

Lakebase was created to support modern workloads and reduce silos. By decoupling compute from storage, Lakebase treats infrastructure as an on-demand service, scaling automatically with workload needs and scaling to zero when idle. Key capabilities include:

- Serverless Postgres for Production Workloads: Lakebase delivers a managed Postgres experience with predictable performance and built-in reliability features suitable for production applications, while abstracting away infrastructure management.
- Instant Branching and Point-in-Time Recovery: Teams can create zero-copy branches of production data in seconds for testing, debugging, or experimentation, and restore databases to precise points in time to recover from errors or incidents.
- Unified Governance with Unity Catalog: Operational data in Lakebase can be governed using the same Unity Catalog policies that secure analytics and AI workloads, enabling consistent access control, auditing, and compliance across the platform.
- Built for AI and Real-Time Applications: Lakebase is designed to support AI-native patterns such as real-time feature serving, agent memory, and low-latency application state, while keeping data directly connected to the lakehouse for analytics and learning workflows.

Lakebase allows applications to operate directly on governed, lake-backed data, reducing the complexity of pipeline synchronization and storage duplication. On Azure Databricks, this unlocks new scenarios such as:

- Real-time applications built on lakehouse data
- AI agents with persistent, governed memory
- Faster release cycles with safe, isolated database branches
- Simplified architectures with fewer moving parts

All while using familiar Postgres interfaces and tools.

Get Started with Azure Databricks Lakebase

Lakebase is integrated into the Azure Databricks experience and can be provisioned directly within Azure Databricks workspaces. For Azure Databricks customers building intelligent, real-time applications, it offers a new foundation, one designed for the pace and complexity of modern data-driven systems. We're excited to see what you build. Get started today!
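Because Lakebase exposes familiar Postgres interfaces, standard Postgres drivers should be able to connect to it. A minimal connectivity sketch follows; the endpoint, database name, and credential scheme are illustrative placeholders, so check the Lakebase documentation for the exact connection details in your workspace:

```python
# Minimal connectivity sketch using a standard Postgres driver (psycopg2).
# Host, dbname, user, and token handling below are illustrative placeholders.
import os
import psycopg2

conn = psycopg2.connect(
    host="my-lakebase-instance.database.azuredatabricks.net",  # placeholder
    dbname="databricks_postgres",                              # placeholder
    user="my-user@example.com",
    password=os.environ["LAKEBASE_TOKEN"],  # e.g., a workspace-issued token
    sslmode="require",
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
```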
McasShadowItReporting / Cloud Discovery in Azure Sentinel

Hi! I'm trying to query the McasShadowItReporting table for Cloud App Discovery events. The table is empty at the moment, and the connector warns me that the workspace is onboarded to the Unified Security Operations Platform, so I can't activate it there. I can't manage it via https://security.microsoft.com/ either. The documentation (https://learn.microsoft.com/en-us/defender-cloud-apps/siem-sentinel#integrating-with-microsoft-sentinel) leads me to the SIEM integration, which has been configured for a while. I wonder if something is misconfigured here, why there is no log ingress, and how I can query the logs.
Understand New Sentinel Pricing Model with Sentinel Data Lake Tier

Introduction to Sentinel and Its New Pricing Model

Microsoft Sentinel is a cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform that collects, analyzes, and correlates security data from across your environment to detect threats and automate response. Traditionally, Sentinel stored all ingested data in the Analytics tier (Log Analytics workspace), which is powerful but expensive for high-volume logs. To reduce cost and enable customers to retain all security data without compromise, Microsoft introduced a new dual-tier pricing model consisting of the Analytics tier and the Data Lake tier. The Analytics tier continues to support fast, real-time querying and analytics for core security scenarios, while the new Data Lake tier provides very low-cost storage for long-term retention and high-volume datasets. Customers can now choose where each data type lands: analytics for high-value detections and investigations, and data lake for large or archival types, allowing organizations to significantly lower cost while still retaining all their security data for analytics, compliance, and hunting.

The following flow diagram depicts the new Sentinel pricing model. (diagram)

Now let's walk through this new pricing model with the scenarios below:
- Scenario 1A (Pay-As-You-Go)
- Scenario 1B (Usage Commitment)
- Scenario 2 (Data Lake Tier Only)

Scenario 1A (Pay-As-You-Go)

Requirement: Suppose you need to ingest 10 GB of data per day, and you must retain that data for 2 years. However, you will only frequently use, query, and analyze the data for the first 6 months.

Solution: To optimize cost, you can ingest the data into the Analytics tier and retain it there for the first 6 months, where active querying and investigation happen. After that period, the remaining 18 months of retention can be shifted to the Data Lake tier, which provides low-cost storage for compliance and auditing needs. Note that you will be charged separately for Data Lake tier querying and analytics, depicted as Compute (D) in the pricing flow diagram.

Pricing Flow / Notes:
- The first 10 GB/day ingested into the Analytics tier is free for 31 days under the Analytics logs plan.
- All data ingested into the Analytics tier is automatically mirrored to the Data Lake tier at no additional ingestion or retention cost.
- For the first 6 months, you pay only for Analytics tier ingestion and retention, excluding any free capacity.
- For the next 18 months, you pay only for Data Lake tier retention, which is significantly cheaper.

Azure Pricing Calculator Equivalent: Assuming no data is queried or analyzed during the 18-month Data Lake tier retention period: although the Analytics tier retention is set to 6 months, the first 3 months of retention fall under the free retention limit, so retention charges apply only for the remaining 3 months of the analytics retention window. The Azure pricing calculator will adjust accordingly.

Scenario 1B (Usage Commitment)

Now, suppose you are ingesting 100 GB per day. If you follow the same pay-as-you-go pricing model described above, your estimated cost would be approximately $15,204 per month. However, you can reduce this cost by choosing a Commitment Tier, where Analytics tier ingestion is billed at a discounted rate. Note that the discount applies only to Analytics tier ingestion; it does not apply to Analytics tier retention costs or to any Data Lake tier-related charges. Please refer to the pricing flow and the equivalent pricing calculator results shown below.
Monthly cost savings: $15,204 – $11,184 = $4,020 per month.

Now the question is: what happens if your usage reaches 150 GB per day? Will the additional 50 GB be billed at the pay-as-you-go rate? No. The entire 150 GB/day will still be billed at the discounted rate associated with the 100 GB/day commitment tier bucket.

Azure Pricing Calculator Equivalent (100 GB/day)

Azure Pricing Calculator Equivalent (150 GB/day)

Scenario 2 (Data Lake Tier Only)

Requirement: Suppose you need to store certain audit or compliance logs amounting to 10 GB per day. These logs are not used for querying, analytics, or investigations on a regular basis, but must be retained for 2 years as per your organization's compliance or forensic policies.

Solution: Since these logs are not actively analyzed, you should avoid ingesting them into the Analytics tier, which is more expensive and optimized for active querying. Instead, send them directly to the Data Lake tier, where they can be retained cost-effectively for future audit, compliance, or forensic needs.

Pricing Flow:
- Because the data is ingested directly into the Data Lake tier, you pay both ingestion and retention costs there for the entire 2-year period.
- If, at any point in the future, you need to perform advanced analytics, querying, or search, you will incur additional compute charges based on actual usage.
- Even with occasional compute charges, the cost remains significantly lower than storing the same data in the Analytics tier.

Realized Savings

| Scenario | Cost per Month |
|---|---|
| Scenario 1: 10 GB/day in Analytics tier | $1,520.40 |
| Scenario 2: 10 GB/day directly into Data Lake tier | $202.20 (without compute), $257.20 (with sample compute price) |

Savings with no compute activity: $1,520.40 – $202.20 = $1,318.20 per month
Savings with some compute activity (sample value): $1,520.40 – $257.20 = $1,263.20 per month

Azure calculator equivalent (without compute)

Azure calculator equivalent (with sample compute)

Conclusion

The combination of the Analytics tier and the Data Lake tier in Microsoft Sentinel enables organizations to optimize cost based on how their security data is used. High-value logs that require frequent querying, real-time analytics, and investigation can be stored in the Analytics tier, which provides powerful search performance and built-in detection capabilities. At the same time, large-volume or infrequently accessed logs, such as audit, compliance, or long-term retention data, can be directed to the Data Lake tier, which offers dramatically lower storage and ingestion costs. Because all Analytics tier data is automatically mirrored to the Data Lake tier at no extra cost, customers can use the Analytics tier only for the period they actively query data, and rely on the Data Lake tier for the remaining retention. This tiered model allows different scenarios, whether active investigation, archival storage, compliance retention, or large-scale telemetry ingestion, to be handled at the most cost-effective layer, ultimately delivering substantial savings without sacrificing visibility, retention, or future analytical capabilities.
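As a closing sanity check on the arithmetic in the scenarios above, the small sketch below reproduces the Scenario 1 versus Scenario 2 comparison. The per-GB rates are illustrative placeholders back-derived from the article's monthly totals, not official list prices; always confirm current rates in the Azure pricing calculator.

```python
# Illustrative cost comparison for the scenarios above. Rates are placeholders
# back-derived from the article's monthly totals, not official list prices.
DAYS_PER_MONTH = 30.4  # Azure pricing calculator convention

analytics_rate = 1520.40 / (10 * DAYS_PER_MONTH)  # ~$5.00 per GB ingested
datalake_rate  = 202.20  / (10 * DAYS_PER_MONTH)  # ~$0.67 per GB ingested+retained

def monthly_cost(gb_per_day: float, rate_per_gb: float) -> float:
    return gb_per_day * DAYS_PER_MONTH * rate_per_gb

analytics = monthly_cost(10, analytics_rate)
datalake  = monthly_cost(10, datalake_rate)
print(f"Analytics tier: ${analytics:,.2f}/month")
print(f"Data Lake tier: ${datalake:,.2f}/month")
print(f"Savings:        ${analytics - datalake:,.2f}/month")  # ≈ $1,318.20
```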
How to Include Custom Details from an Alert in Email Generated by a Playbook

I have created an analytics rule that queries Sentinel for security events pertaining to group membership additions and triggers an alert for each event found. The rule does not create an incident. Within the rule logic, I have created three custom details for specific fields within the event (TargetAccount, MemberName, SubjectAccount). I have also created a corresponding playbook for the purpose of sending an email to me when an alert is triggered. The associated automation rule has been configured and is triggered by the analytics rule. All of this is working as expected: when a member is added to a security group, I receive an email.

The one remaining piece is to populate the email message with the custom details that I've identified in the rule. However, I'm not sure how to do this. Essentially, I would like the values of the three custom details shown in the first screenshot below to show up in the body of the email, shown in the second screenshot, next to their corresponding names.

So, for example, say Joe Smith is added to the group "Admin" by Tom Jones. These are the fields and values in the event that I want to pull out:

TargetAccount = Admin
MemberName = Joe Smith
SubjectAccount = Tom Jones

The custom details would then be populated as such:

Security_Group = Admin
Member_Added = Joe Smith
Added_By = Tom Jones

and then, the body of the email would contain:

Group: Admin
Member Added: Joe Smith
Added By: Tom Jones
Empowering multi-modal analytics with the medical imaging capability in Microsoft Fabric

This blog is part of a series that explores the recent announcement of the public preview of healthcare data solutions in Microsoft Fabric. The DICOM® (Digital Imaging and Communications in Medicine) data ingestion capability within the healthcare data solutions in Microsoft Fabric enables the storage, management, and analysis of imaging metadata from various modalities, including X-rays, CT scans, and MRIs, directly within Microsoft Fabric. It fosters collaboration, R&D, and AI innovation for healthcare and life science use cases. Our customers and partners can now integrate DICOM® imaging datasets with clinical data stored in FHIR® (Fast Healthcare Interoperability Resources) format. By making imaging pixels and metadata accessible alongside clinical history and laboratory data, it enables clinicians and researchers to interpret imaging findings in the appropriate clinical context. This leads to enhanced diagnostic accuracy, informed clinical decision-making, and ultimately, improved patient outcomes.
Update: Changing the Account Name Entity Mapping in Microsoft Sentinel

The upcoming update introduces more consistent and predictable entity data across analytics, incidents, and automation by standardizing how the Account Name property is populated when using UPN-based mappings in analytic rules. Going forward, the Account Name property will consistently contain only the UPN prefix, with new dedicated fields added for the full UPN and the UPN suffix. While this improves consistency and enables more granular automation, customers who rely on specific Account Name values in automation rules or Logic App playbooks may need to take action.

Timeline

Effective date: July 1, 2026. The change will apply automatically; no opt-in is required.

Scope of impact

Analytics rules that map the User Principal Name (UPN) to the Account Name entity field, where the resulting alerts are processed by automation rules or Logic App playbooks that reference the AccountName property.

What's changing

Currently: When an analytic rule maps a full UPN (for example, 'user@domain.com') to the Account Name field, the resulting input value for the automation rule and/or Logic App playbook is inconsistent. In some cases it contains only the UPN prefix ('user'), and in other cases the full UPN ('user@domain.com').

After July 1, 2026:
- The Account Name property will consistently contain only the UPN prefix.
- The following new fields will be added to the entity object: AccountName (UPN prefix), UPNSuffix, and UserPrincipalName (full UPN).

This change enables enhanced filtering and automation logic based on these new fields.

Example

Before:
- Analytics rule maps: 'user@domain.com'
- Automation rule receives: Account Name: 'user' or 'user@domain.com' (inconsistent)

After:
- Analytics rule maps: 'user@domain.com'
- Automation rule receives: Account Name: 'user', UPNSuffix: 'domain.com'
- Logic App playbook and SecurityAlert table receive: AccountName: 'user', UPNSuffix: 'domain.com', UserPrincipalName: 'user@domain.com'

| Feature / Location | Before | After |
|---|---|---|
| SecurityAlert table | (screenshot) | (screenshot) |
| Logic App Playbook Entity | (screenshot) | (screenshot) |

Why does it matter?

If your automation logic relies on exact string comparisons against the full UPN stored in Account Name, those conditions may no longer match after the update. This most commonly affects:
- Automation rules using an "Equals" condition on Account Name
- Logic App playbooks comparing the entity field 'accountName' to a full UPN value

Call to action

- Avoid strict equality checks against Account Name.
- Use flexible operators such as Contains or Starts with.
- Leverage the new UPNSuffix field for clearer intent.

Example update

Before: Account Name will show as 'user' or 'user@domain.com'
After: Account Name will show as 'user'

Recommended changes:
- Account Name Contains/Startswith 'user'
- UPNSuffix Equals/Startswith/Contains 'domain.com'

This approach ensures compatibility both before and after the change takes effect.

Where to update

Review any filters, conditions, or branching logic that depend on Account Name values.
- Automation Rules: Use the 'Account name' field.
- Logic App Playbooks: Update conditions referencing the entity field 'accountName'.

For example:
- Automation rule before the change: (screenshot)
- Automation rule after the change: (screenshot)

Summary

- A consistency improvement to Account Name mapping is coming on July 1, 2026.
- The change affects automation rules and Logic App playbooks that rely on UPN to Account Name mappings.
- New UPN-related fields provide better structure and control.
- Customers should follow the recommendations above before the effective change date.
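For playbooks that hand entities to code (for example, an Azure Function called from a Logic App), one way to stay compatible on both sides of the change is to normalize the account fields yourself. A minimal sketch, assuming the entity arrives as a JSON object; field-name casing varies by payload, so adjust the keys to what your playbook actually receives:

```python
# Sketch: derive a consistent (prefix, suffix, full UPN) triple from an
# account entity, whether it arrives in the pre- or post-change shape.
def normalize_account(entity: dict) -> dict:
    name = entity.get("accountName") or entity.get("AccountName") or ""
    suffix = entity.get("UPNSuffix") or entity.get("upnSuffix") or ""
    full = entity.get("UserPrincipalName") or entity.get("userPrincipalName") or ""

    if "@" in name:              # pre-change shape: full UPN in Account Name
        prefix, _, suffix = name.partition("@")
    else:                        # post-change shape: prefix only
        prefix = name
    if not full and prefix and suffix:
        full = f"{prefix}@{suffix}"
    return {"prefix": prefix, "suffix": suffix, "upn": full}

# Both shapes normalize to the same result:
print(normalize_account({"accountName": "user@domain.com"}))
print(normalize_account({"accountName": "user", "UPNSuffix": "domain.com"}))
```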
How Should a Fresher Learn Microsoft Sentinel Properly?

Hello everyone, I am a fresher interested in learning Microsoft Sentinel and preparing for SOC roles. Since Sentinel is a cloud-native enterprise tool usually used inside organizations, I am unsure how individuals without company access are expected to gain real hands-on experience. I would like to hear from professionals who actively use Sentinel:

- How do freshers typically learn and practice Sentinel?
- What learning resources or environments are commonly used by beginners?
- What level of hands-on experience is realistically expected at entry level?

I am looking for guidance based on real industry practice. Thank you for your time.
How to Fix Azure Event Grid Entra Authentication issue for ACS and Dynamics 365 integrated Webhooks

Introduction

Azure Event Grid is a powerful event routing service that enables event-driven architectures in Azure. When delivering events to webhook endpoints, security becomes paramount. Microsoft provides a secure webhook delivery mechanism using Microsoft Entra ID (formerly Azure Active Directory) authentication through the AzureEventGridSecureWebhookSubscriber role.

Problem Statement

When integrating Azure Communication Services with Dynamics 365 Contact Center using Microsoft Entra ID-authenticated Event Grid webhooks, the Event Grid subscription deployment fails with the error "HTTP POST request failed with unknown error code", with an empty HTTP status and code. For example: (screenshot)

Important Note: Before moving forward, please verify that you have the Owner role assigned on the app to create the event subscription. Refer to the Microsoft guidelines below to validate the required prerequisites before proceeding:
Set up incoming calls, call recording, and SMS services | Microsoft Learn

Why This Happens

This happens because the AzureEventGridSecureWebhookSubscriber role is not properly configured on the Microsoft.EventGrid service principal and on the Entra user or application that is trying to create the Event Grid subscription.

What Is the AzureEventGridSecureWebhookSubscriber Role?

AzureEventGridSecureWebhookSubscriber is an Azure Entra application role that:
- Enables your application to verify the identity of event senders
- Allows specific users/applications to create event subscriptions
- Authorizes Event Grid to deliver events to your webhook

How It Works:
1. Role Creation: You create this app role in your destination webhook application's Azure Entra registration.
2. Role Assignment: You assign this role to the Microsoft Event Grid service principal (so it can deliver events) and to the Entra users or applications that create subscriptions (so they can create Event Grid subscriptions).
3. Token Validation: When Event Grid delivers events, it includes an Azure Entra token with this role claim.
4. Authorization Check: Your webhook validates the token and checks for the role.

Key Participants:

Webhook Application (Your App)
- Purpose: Receives and processes events
- App Registration: Created in Azure Entra
- Contains: The AzureEventGridSecureWebhookSubscriber app role
- Validates: Incoming tokens from Event Grid

Microsoft Event Grid Service Principal
- Purpose: Delivers events to webhooks
- App ID: Different per Azure cloud (Public, Government, etc.); Public Azure: 4962773b-9cdb-44cf-a8bf-237846a00ab7
- Needs: AzureEventGridSecureWebhookSubscriber role assigned

Event Subscription Creator (Entra User or Application)
- Purpose: Creates event subscriptions
- Could be: You, your deployment pipeline, an admin tool, or another application
- Needs: AzureEventGridSecureWebhookSubscriber role assigned

Although the full PowerShell script is documented in the Event Grid documentation below, it may be complex to interpret and troubleshoot.
Azure PowerShell - Secure WebHook delivery with Microsoft Entra Application in Azure Event Grid - Azure Event Grid | Microsoft Learn

To improve accessibility, the following section provides a simplified, step-by-step, tested solution along with verification steps suitable for all users, including non-technical ones.

STEP 1: Verify/Create the Microsoft.EventGrid Service Principal

1. Azure Portal → Microsoft Entra ID → Enterprise applications
2. Change the filter to Application type: Microsoft Applications
3. Search for: Microsoft.EventGrid

Ideally, your Azure subscription should include this application ID, which is common across all Azure subscriptions: 4962773b-9cdb-44cf-a8bf-237846a00ab7. If this application ID is not present, please contact your Azure cloud administrator.

STEP 2: Create the App Role "AzureEventGridSecureWebhookSubscriber"

Using the Azure Portal, navigate to your webhook app registration:
1. Azure Portal → Microsoft Entra ID → App registrations
2. Click All applications
3. Find your app by searching, or use the Object ID you have, and click on it
4. In the left menu, click App roles
5. Click + Create app role and fill in the form:
   - Display name: AzureEventGridSecureWebhookSubscriber
   - Allowed member types: Both (Users/Groups + Applications)
   - Value: AzureEventGridSecureWebhookSubscriber
   - Description: Azure Event Grid Role
   - Do you want to enable this app role?: Yes
6. Click Apply

STEP 3: Assign YOUR USER to the Role

Using the Azure Portal, switch to the Enterprise application view:
1. Azure Portal → Microsoft Entra ID → Enterprise applications
2. Search for your webhook app (by name) and click on it
3. In the left menu, click Users and groups
4. Click + Add user/group
5. Under Users, click None Selected
6. Search for your user account (use your email), select yourself, and click Select
7. Under Select a role, click None Selected
8. Select AzureEventGridSecureWebhookSubscriber and click Select
9. Click Assign

STEP 4: Assign the Microsoft.EventGrid Service Principal to the Role

This step MUST be done via PowerShell or the Azure CLI (the Portal doesn't support this directly, as we have seen), so PowerShell is recommended. You may need to execute this step with the help of your Entra admin.

```powershell
# Connect to Microsoft Graph
Connect-MgGraph -Scopes "AppRoleAssignment.ReadWrite.All"

# Replace this with your webhook app's Application (client) ID
$webhookAppId = "YOUR-WEBHOOK-APP-ID-HERE"  # starting with c5

# Get your webhook app's service principal
$webhookSP = Get-MgServicePrincipal -Filter "appId eq '$webhookAppId'"
Write-Host "Found webhook app: $($webhookSP.DisplayName)"

# Get the Event Grid service principal
$eventGridSP = Get-MgServicePrincipal -Filter "appId eq '4962773b-9cdb-44cf-a8bf-237846a00ab7'"
Write-Host "Found Event Grid service principal"

# Get the app role
$appRole = $webhookSP.AppRoles | Where-Object {$_.Value -eq "AzureEventGridSecureWebhookSubscriber"}
Write-Host "Found app role: $($appRole.DisplayName)"

# Create the assignment
New-MgServicePrincipalAppRoleAssignment `
    -ServicePrincipalId $eventGridSP.Id `
    -PrincipalId $eventGridSP.Id `
    -ResourceId $webhookSP.Id `
    -AppRoleId $appRole.Id

Write-Host "Successfully assigned Event Grid to your webhook app!"
```
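On the webhook side, the authorization check from step 4 of "How It Works" amounts to validating the incoming bearer token and confirming its roles claim. The article does not include webhook code, so the following is only a sketch of that check in Python using PyJWT; the tenant ID and audience are placeholders, and the exact issuer/audience formats depend on whether your app receives v1.0 or v2.0 Entra tokens.

```python
# Hypothetical sketch of the webhook-side token check: validate the Entra token
# and confirm it carries the AzureEventGridSecureWebhookSubscriber role.
import jwt  # PyJWT, with the 'cryptography' extra installed

TENANT_ID = "YOUR-TENANT-ID"             # placeholder
AUDIENCE = "api://YOUR-WEBHOOK-APP-ID"   # placeholder; must match your app's audience
JWKS_URL = f"https://login.microsoftonline.com/{TENANT_ID}/discovery/v2.0/keys"

jwks_client = jwt.PyJWKClient(JWKS_URL)

def is_authorized(bearer_token: str) -> bool:
    # Fetch the matching signing key from the tenant's JWKS endpoint,
    # validate signature, audience, and issuer, then inspect the roles claim.
    signing_key = jwks_client.get_signing_key_from_jwt(bearer_token)
    claims = jwt.decode(
        bearer_token,
        signing_key.key,
        algorithms=["RS256"],
        audience=AUDIENCE,
        issuer=f"https://login.microsoftonline.com/{TENANT_ID}/v2.0",
    )
    return "AzureEventGridSecureWebhookSubscriber" in claims.get("roles", [])
```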
Verification Steps

1. Verify the app role was created: Your App Registration → App roles. You should see AzureEventGridSecureWebhookSubscriber.
2. Verify your user assignment: Enterprise application (your webhook app) → Users and groups. You should see your user with the role AzureEventGridSecureWebhookSubscriber.
3. Verify the Event Grid assignment: Same location → Users and groups. You should see Microsoft.EventGrid with the role AzureEventGridSecureWebhookSubscriber.

Sample Flow: Analogy for Simplification

Think of it like a building under construction where you are the owner.

Building = Azure Entra app (webhook app)

Building (Azure Entra App Registration for Webhook)
├─ Building Name: "MyWebhook-App"
├─ Building Address: Application ID
├─ Building Owner: You
├─ Security System: App roles (the security badges you create)
└─ Security Team: Azure Entra and your actual webhook auth code (which validates tokens), like a doorman

Step 1: Create the badge (app role)
You (the building owner) create a special badge:
- Badge name: "AzureEventGridSecureWebhookSubscriber"
- Badge color: let's say it's GOLD
- Who can have it: companies (applications) and people (users)
This badge is stored in your building's system (the webhook app registration).

Step 2: Give the badge to the Event Grid service
Event Grid: "Hey, I need to deliver messages to your building."
You: "Okay, here's a GOLD badge for your service principal."
Event Grid: *wears the badge*
Now Event Grid can:
- Show the badge to Azure Entra
- Get tokens that say "I have the GOLD badge"
- Deliver messages to your webhook

Step 3: Give a badge to yourself (or your deployment tool)
You also need a GOLD badge because:
- You want to create Event Grid event subscriptions
- Entra checks: "Does this person have a GOLD badge?"
- If yes: you can create subscriptions
- If no: "Access denied"
Your deployment pipeline also gets a GOLD badge so it can automatically set up event subscriptions during CI/CD deployments.

Disclaimer: The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information.

Thanks for reading this blog! I hope you found it helpful and informative for this specific integration use case 😀