security & compliance
How to Fix Azure Event Grid Entra Authentication issue for ACS and Dynamics 365 integrated Webhooks
Introduction:
Azure Event Grid is an event routing service that enables event-driven architectures in Azure. When events are delivered to webhook endpoints, the endpoint must be protected. Microsoft provides a secure webhook delivery mechanism that uses Microsoft Entra ID (formerly Azure Active Directory) authentication through the AzureEventGridSecureWebhookSubscriber role.

Problem Statement:
When integrating Azure Communication Services with Dynamics 365 Contact Center using Microsoft Entra ID-authenticated Event Grid webhooks, the Event Grid subscription deployment fails with the error "HTTP POST request failed with unknown error code", with an empty HTTP status and code.

Important Note:
Before moving forward, verify that you have the Owner role assigned on the app so that you can create the event subscription. Validate the required prerequisites against the Microsoft guidelines before proceeding: Set up incoming calls, call recording, and SMS services | Microsoft Learn

Why This Happens:
The AzureEventGridSecureWebhookSubscriber role is not properly configured for the Microsoft Event Grid service principal (SP) and for the Entra user or application that is trying to create the Event Grid subscription.
What is AzureEventGridSecureWebhookSubscriber Role:
The AzureEventGridSecureWebhookSubscriber is a Microsoft Entra application role that:
- Enables your application to verify the identity of event senders
- Allows specific users/applications to create event subscriptions
- Authorizes Event Grid to deliver events to your webhook

How It Works:
- Role Creation: You create this app role in your destination webhook application's Microsoft Entra registration
- Role Assignment: You assign this role to:
  - The Microsoft Event Grid service principal (so it can deliver events)
  - The Entra user or application that creates event subscriptions (so it can create Event Grid subscriptions)
- Token Validation: When Event Grid delivers events, it includes a Microsoft Entra token with this role claim
- Authorization Check: Your webhook validates the token and checks for the role

Key Participants:

Webhook Application (Your App)
- Purpose: Receives and processes events
- App Registration: Created in Microsoft Entra
- Contains: The AzureEventGridSecureWebhookSubscriber app role
- Validates: Incoming tokens from Event Grid

Microsoft Event Grid Service Principal
- Purpose: Delivers events to webhooks
- App ID: Different per Azure cloud (Public, Government, etc.)
- Public Azure: 4962773b-9cdb-44cf-a8bf-237846a00ab7
- Needs: AzureEventGridSecureWebhookSubscriber role assigned

Event Subscription Creator (Entra User or Application)
- Purpose: Creates event subscriptions
- Could be: You, your deployment pipeline, an admin tool, or another application
- Needs: AzureEventGridSecureWebhookSubscriber role assigned

Although the full PowerShell script is documented in the Event Grid documentation below, it may be complex to interpret and troubleshoot.
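Before turning to that script, the token validation and authorization check listed under How It Works can be illustrated with a minimal Python sketch. It assumes the token's signature, issuer, and expiry have already been verified by a JWT library, and it only inspects the decoded claims. The function name and the audience value are hypothetical; `aud`, `appid`/`azp`, and `roles` are standard Microsoft Entra access token claims.

```python
# Public Azure app ID of the Microsoft Event Grid service principal (from the text above).
EVENT_GRID_APP_ID = "4962773b-9cdb-44cf-a8bf-237846a00ab7"
REQUIRED_ROLE = "AzureEventGridSecureWebhookSubscriber"


def is_authorized_delivery(claims: dict, expected_audience: str) -> bool:
    """Return True if already-verified token claims authorize an Event Grid delivery.

    `claims` is assumed to be the decoded payload of a token whose signature,
    issuer, and expiry were checked beforehand; this function does not do that.
    """
    # The token must have been issued for this webhook application.
    if claims.get("aud") != expected_audience:
        return False
    # v1.0 tokens carry the calling app in `appid`, v2.0 tokens in `azp`.
    caller = claims.get("appid") or claims.get("azp")
    if caller != EVENT_GRID_APP_ID:
        return False
    # The app role assigned to the Event Grid service principal shows up in `roles`.
    roles = claims.get("roles") or []
    return REQUIRED_ROLE in roles


# Hypothetical audience value, for illustration only.
sample = {"aud": "api://my-webhook", "appid": EVENT_GRID_APP_ID, "roles": [REQUIRED_ROLE]}
print(is_authorized_delivery(sample, "api://my-webhook"))
```

If the role was never assigned to the Event Grid service principal, the `roles` claim is missing from the token and a check like this rejects the delivery, which is the failure mode this article addresses.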
Azure PowerShell - Secure WebHook delivery with Microsoft Entra Application in Azure Event Grid - Azure Event Grid | Microsoft Learn

To improve accessibility, the following section provides a simplified, tested, step-by-step solution along with verification steps suitable for all users, including non-technical ones.

Steps:

STEP 1: Verify/Create Microsoft.EventGrid Service Principal
- Azure Portal → Microsoft Entra ID → Enterprise applications
- Change filter to Application type: Microsoft Applications
- Search for: Microsoft.EventGrid

Your tenant should include this application with the ID 4962773b-9cdb-44cf-a8bf-237846a00ab7, which is common across all Azure subscriptions in the public Azure cloud. If this application ID is not present, contact your Azure Cloud Administrator.

STEP 2: Create the App Role "AzureEventGridSecureWebhookSubscriber"
Using Azure Portal:

Navigate to your Webhook App Registration:
- Azure Portal → Microsoft Entra ID → App registrations
- Click All applications
- Find your app by searching OR use the Object ID you have
- Click on your app

Create the App Role:
- In left menu, click App roles
- Click + Create app role
- Fill in the form:
  - Display name: AzureEventGridSecureWebhookSubscriber
  - Allowed member types: Both (Users/Groups + Applications)
  - Value: AzureEventGridSecureWebhookSubscriber
  - Description: Azure Event Grid Role
  - Do you want to enable this app role?: Yes
- Click Apply

STEP 3: Assign YOUR USER to the Role
Using Azure Portal:

Switch to Enterprise Application view:
- Azure Portal → Microsoft Entra ID → Enterprise applications
- Search for your webhook app (by name)
- Click on it

Assign yourself:
- In left menu, click Users and groups
- Click + Add user/group
- Under Users, click None Selected
- Search for your user account (use your email)
- Select yourself
- Click Select
- Under Select a role, click None Selected
- Select AzureEventGridSecureWebhookSubscriber
- Click Select
- Click Assign

STEP 4: Assign Microsoft.EventGrid Service Principal to the Role
This step must be done via PowerShell or Azure CLI, because in our experience the Portal does not support it directly. PowerShell is recommended. You will need to execute this step with the help of your Entra admin.

    # Connect to Microsoft Graph
    Connect-MgGraph -Scopes "AppRoleAssignment.ReadWrite.All"

    # Replace this with your webhook app's Application (client) ID
    $webhookAppId = "YOUR-WEBHOOK-APP-ID-HERE" #starting with c5

    # Get your webhook app's service principal
    $webhookSP = Get-MgServicePrincipal -Filter "appId eq '$webhookAppId'"
    Write-Host " Found webhook app: $($webhookSP.DisplayName)"

    # Get Event Grid service principal
    $eventGridSP = Get-MgServicePrincipal -Filter "appId eq '4962773b-9cdb-44cf-a8bf-237846a00ab7'"
    Write-Host " Found Event Grid service principal"

    # Get the app role
    $appRole = $webhookSP.AppRoles | Where-Object {$_.Value -eq "AzureEventGridSecureWebhookSubscriber"}
    Write-Host " Found app role: $($appRole.DisplayName)"

    # Create the assignment
    New-MgServicePrincipalAppRoleAssignment `
        -ServicePrincipalId $eventGridSP.Id `
        -PrincipalId $eventGridSP.Id `
        -ResourceId $webhookSP.Id `
        -AppRoleId $appRole.Id
    Write-Host "Successfully assigned Event Grid to your webhook app!"

Verification Steps:

Verify the App Role was created:
- Your App Registration → App roles
- You should see: AzureEventGridSecureWebhookSubscriber

Verify your user assignment:
- Enterprise application (your webhook app) → Users and groups
- You should see your user with role AzureEventGridSecureWebhookSubscriber

Verify Event Grid assignment:
- Same location → Users and groups
- You should see Microsoft.EventGrid with role AzureEventGridSecureWebhookSubscriber

Analogy For Simplification:
Think of it as a building on a construction site, where you are the owner of the building.
Building = Microsoft Entra app (webhook app)

Building (Microsoft Entra App Registration for Webhook)
├─ Building Name: "MyWebhook-App"
├─ Building Address: Application ID
├─ Building Owner: You
├─ Security System: App Roles (the security badges you create)
└─ Security Team: Microsoft Entra and your actual webhook auth code (which validates tokens), acting like a doorman

Step 1: Create the badge (App role)
You (the building owner) create a special badge:
- Badge name: "AzureEventGridSecureWebhookSubscriber"
- Badge color: Let's say it's GOLD
- Who can have it: Companies (Applications) and People (Users)
This badge is stored in your building's system (Webhook App Registration)

Step 2: Give badge to the Event Grid Service:
Event Grid: "Hey, I need to deliver messages to your building"
You: "Okay, here's a GOLD badge for your SP"
Event Grid: *wears the badge*
Now Event Grid can:
- Show the badge to Microsoft Entra
- Get tokens that say "I have the GOLD badge"
- Deliver messages to your webhook

Step 3: Give badge to yourself (or your deployment tool)
You also need a GOLD badge because:
- You want to create Event Grid event subscriptions
- Entra checks: "Does this person have a GOLD badge?"
- If yes: You can create subscriptions
- If no: "Access denied"
Your deployment pipeline also gets a GOLD badge:
- So it can automatically set up event subscriptions during CI/CD deployments

Disclaimer: The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information.

Thanks for reading this blog!
I hope you found it helpful and informative for this specific integration use case 😀

Conditional Access for Canvas Apps with Entra
In today's Power Platform landscape, administrators have a tough task securing the ever-increasing inventory of Canvas Apps across their tenant. Canvas apps often connect to sensitive data, run on a variety of devices, and serve diverse groups of users. That is why Conditional Access has become one of the most powerful tools in an admin's toolkit, giving you fine-grained control over how, where, and under what conditions users can access your apps.

In this post, I will walk through what Conditional Access means for canvas apps, how it empowers admins to maintain strong security without adding friction for legitimate users, and example steps to apply your own conditional access policies to an app with PowerShell.

What Conditional Access Brings to Canvas Apps
Conditional Access brings granular, app-level security controls from Microsoft Entra ID directly into Power Apps. Instead of applying blanket restrictions across the entire tenant, you can enforce requirements (like MFA, compliant devices, or trusted networks) only on the apps that need them. This lets you match security to the sensitivity of each individual app.

Key Benefits for Admins

Tailored Protection for Sensitive Apps
Not every app requires strict controls. Conditional Access allows you to tighten security only for apps that handle sensitive or regulated data, without over-restricting everything else.

Control Access by Device Type
Admins can easily block or allow specific device categories, like preventing mobile access to a high-risk app or requiring managed devices for apps that contain confidential information.

Alignment With Zero Trust
Conditional Access enforces identity, device, and session checks in real time, supporting a Zero Trust approach without adding unnecessary friction for legitimate users.

Environment-Specific Flexibility
You can apply stricter policies in production and lighter ones in development or testing, helping teams build efficiently while keeping sensitive environments locked down.
A Stronger Security Model
Conditional Access does not replace existing app or data permissions; it complements them. App-level security roles control what users can do inside an app, while Conditional Access governs whether they can get into the app at all. Together, they create a much more robust security posture.

How to enable conditional access for a Canvas App example
In this example, I will detail steps to set up conditional access for a Canvas App to ensure tenant guest users are not able to access the app.

Step 1: Create an Authentication Context in Entra ID
- Go to the Microsoft Entra Admin Center.
- Navigate to Protection → Conditional Access → Authentication context.
- Click + New authentication context.
- Name it (e.g., BlockGuests_PowerAppX)
- Enable Publish to apps
- Save and note the Authentication Context ID

Step 2: Create a Conditional Access Policy
- Go to Conditional Access → Policies → + New policy.
- Name the policy (e.g., Block Guests from Power App X).
- Assignments:
  - Users or workload identities: Include: Guest or external users
  - Target resources: Choose Authentication context and select the one you created earlier
- Access controls:
  - Grant: Select Block access
- Enable the policy and click Create.

Step 3: Assign the Authentication Context to the Power App
Use PowerShell to bind the Authentication Context to the specific Power App:
- Open PowerShell as Administrator.
- Connect to Power Apps:

      Add-PowerAppsAccount

- Run the command to attach the context to your canvas app:

      Set-AdminPowerAppConditionalAccessAuthenticationContextIds -EnvironmentName "<your-environment-name>" `
          -AppName "<your-app-id>" `
          -AuthenticationContextIds "<your-auth-context-id>"

This binding tells Power Apps: "When this app opens, trigger the Conditional Access policy tied to this context."

Step 4: Test the Policy
Try accessing the app as a guest user. You should see access blocked based on the Conditional Access policy.
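To make the relationship between the app, the authentication context, and the policy easier to picture, here is a small conceptual model in Python. It is only a toy illustration of the logic described in Steps 1 to 4, not how Entra ID evaluates policies internally; the `Policy` class, the `evaluate_access` function, and the context ID "c1" are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Toy stand-in for a Conditional Access policy (not the Entra data model)."""
    name: str
    auth_context_id: str           # authentication context the policy targets
    blocked_user_types: frozenset  # user types the policy blocks, e.g. {"Guest"}


def evaluate_access(app_context_ids, user_type, policies):
    """Return "block" if a policy tied to one of the app's contexts blocks the user."""
    for policy in policies:
        targets_app = policy.auth_context_id in app_context_ids
        if targets_app and user_type in policy.blocked_user_types:
            return "block"
    return "allow"


policies = [Policy("Block Guests from Power App X", "c1", frozenset({"Guest"}))]

print(evaluate_access({"c1"}, "Guest", policies))   # guest opening the app bound to c1
print(evaluate_access({"c1"}, "Member", policies))  # internal user opening the same app
print(evaluate_access(set(), "Guest", policies))    # guest opening an app with no context bound
```

The third call shows why Step 3 matters: without the binding, the policy exists but never applies to the app.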
Wrap Up

Bottom Line
Conditional Access gives admins the flexibility to apply the right security to the right app. Whether you are enforcing MFA, restricting device types, or securing production environments, it helps you protect sensitive data without slowing down the organization.

Documentation for further reading: Manage Power Apps - Power Platform | Microsoft Learn
Demo from Power CAT: Conditional Access Policies for Canvas Apps - Power CAT Live

Microsoft Sentinel and Dataverse Integration
Integrating Microsoft Sentinel with Microsoft Dataverse brings advanced, unified security monitoring to your Power Platform environments. Microsoft Sentinel is a cloud-native SIEM (Security Information and Event Management) that collects and correlates logs from across Azure, Microsoft 365, and more to detect and respond to threats in real time. By streaming Dataverse audit logs into Sentinel, organizations gain centralized visibility into Power Platform activity and can leverage Sentinel's analytics and automation to rapidly detect suspicious behavior and enhance governance.

Benefits of Connecting Dataverse to Sentinel

1. Unified Visibility and Threat Detection: All Dataverse audit events (e.g. record access, changes, logins) are ingested into Sentinel, where they can be correlated with signals from identities, devices, and other applications. This holistic view enables detection of suspicious patterns that might be missed in isolation. For example, security analysts can spot anomalies like:
- mass data exports
- unusual access locations
- sudden policy changes in Dataverse operations
Sentinel provides out-of-the-box analytics rules to flag many of these risky behaviors (such as a departing user downloading large datasets or deleting records) without requiring custom queries.

2. Faster Investigation and Response: With Dataverse logs in Sentinel, analysts can use Kusto Query Language (KQL) to quickly search and correlate Dataverse activity with other security logs (identity sign-ins, Office 365 events, etc.). This speeds up root-cause analysis during incidents. Moreover, Sentinel's SOAR capabilities mean you can trigger automated playbooks in response to Dataverse threats. For instance, if Sentinel detects an anomalous privilege escalation in Dataverse, it could automatically disable the user's account or alert an admin via Teams. This rapid, automated response helps contain threats immediately, reducing the time to mitigate incidents.
3. Improved Governance and Compliance: Integrating Dataverse with Sentinel strengthens oversight of Power Platform usage. All audit logs are stored in Sentinel's scalable data lake, allowing long-term retention for compliance at a lower cost than storing in Dataverse. By correlating Dataverse activity with other enterprise logs, organizations can ensure that Power Platform apps adhere to security policies and can prove compliance.

Prerequisites
Before deploying the integration, ensure the following prerequisites are in place:

Microsoft Sentinel workspace:
1. Microsoft Sentinel must be enabled in the workspace (see Onboard to a Microsoft Sentinel Workspace).
2. You must have permission to create Data Collection Rules (DCR) and Data Collection Endpoints in that workspace. See:
   - Roles and permissions in the Microsoft Sentinel platform | Microsoft Learn
   - Data collection rules in Azure Monitor - Azure Monitor | Microsoft Learn

Dataverse environment:
3. The Dataverse environment must be a production environment (Dataverse logging for Sentinel is only supported for production, not sandbox).
4. Your organization should be using Dynamics 365 Customer Engagement and/or Power Platform apps (since those rely on Dataverse).
5. Audit logging enabled: Auditing must be turned on for the Dataverse environment in the Power Platform admin settings (which also routes audit logs to Microsoft Purview).

Setup
With prerequisites met, follow these high-level steps to set up the Sentinel integration:

1. Enable Dataverse Auditing (if not already enabled). In the Power Platform admin center, ensure tenant-level and environment-level auditing is on. Dataverse auditing is not enabled by default, so this is critical. You'll also need to enable auditing on the entity (table) level for all relevant Dataverse tables.
Microsoft provides a managed solution to simplify this:
- Import the Audit Settings solution: If your environment uses Dynamics 365 CE apps, import the solution from aka.ms/AuditSettings/Dynamics; otherwise use aka.ms/AuditSettings/DataverseOnly. This managed solution will turn on detailed auditing for all standard tables (entities) in Dataverse.
- For any custom tables, manually enable auditing in their settings (toggle on Auditing for the entity). In each entity's settings, under Auditing, enable the options for "Single record auditing" and "Multiple record auditing" to capture detailed create/update/delete events. Then save and publish these changes. Enabling these ensures you capture the granular audit logs needed for Sentinel.

2. Install the Sentinel Solution and Connect Data Sources. In Microsoft Sentinel in the Azure portal or in the Microsoft Defender portal, navigate to the Content hub (in the Defender portal, the Content hub for Sentinel) and install the "Microsoft Sentinel Solution for Microsoft Business Applications." This deploys all the components (analytics rules, workbooks, connectors, etc.) for Power Platform integration. Once installed, go to Configuration > Data connectors in Sentinel. You will see new connectors available for:
- Microsoft Dataverse
- Microsoft Power Platform Admin Activity
- Microsoft Power Automate
(A connector for Dynamics 365 Finance & Operations is also included for organizations using Dynamics F&O. For more details on the F&O connector, see: Connect Microsoft Dynamics 365 Finance and Operations to Microsoft Sentinel | Microsoft Learn)

For each relevant data connector (Dataverse, Power Platform Admin, Power Automate), open its page and select Connect. This step links your Dataverse environment's audit stream to Sentinel via the Azure Monitor DCR infrastructure. After connecting, Sentinel will start listening for the audit logs from Purview.

3. Validate Data Ingestion.
With auditing enabled and connectors in place, perform some sample activities in your Dataverse environment to generate logs (for example, create or update a row in a Dataverse table, change a Power Platform environment setting, run a Power Automate flow). In general, Dataverse and Power Automate activity logs should begin flowing into Sentinel within a few minutes, whereas Power Platform admin logs (e.g. environment or D365 admin actions) may take up to an hour the first time. Allow for this delay, then verify ingestion by querying the logs in Sentinel. You can run KQL queries in the Logs blade of Microsoft Sentinel in the Azure portal (or the Hunting section of the Defender portal) to confirm data is arriving. For example, run a query on the PowerPlatformAdminActivity table to see if recent records exist.

After a successful setup, Microsoft Sentinel will be populating three main Log Analytics tables with your Dataverse-related logs:

| Log Analytics Table | Data Collected |
|---|---|
| PowerPlatformAdminActivity | Power Platform administrative logs (e.g. environment settings changes, user role assignments) |
| PowerAutomateActivity | Power Automate (Flow) activity logs (creation, runs, etc.) |
| DataverseActivity | Dataverse and model-driven app business data activity logs (create, update, delete events on records, etc.) |

These tables contain the audit data that Sentinel will use for analysis. For instance, an update to a row in a Dataverse table will generate an event in the DataverseActivity table, whereas an administrative action like changing a Data Loss Prevention (DLP) policy or adding a user to an environment appears in PowerPlatformAdminActivity.

Analytics and Threat Detection Capabilities
Once the data is flowing, you can leverage Microsoft Sentinel's analytics and automation on your Dataverse logs. The Microsoft Business Apps Sentinel solution you installed comes with prebuilt analytics rules tailored to Dataverse and Power Platform scenarios.
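To make the idea of an analytics rule concrete, the following Python sketch shows the kind of threshold logic behind a "mass deletion" detection. In Sentinel such a rule would be written in KQL against the ingested tables; this is only an illustration of the technique. The event keys ("user", "operation", "time"), the function name, and the default threshold and window are hypothetical and do not reflect the DataverseActivity schema or the prebuilt rules.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_mass_deletions(events, threshold=3, window=timedelta(minutes=5)):
    """Return the users with at least `threshold` delete events inside any `window`.

    Each event is a dict with hypothetical keys "user", "operation", and "time";
    these names are illustrative only.
    """
    deletes = defaultdict(list)
    for event in events:
        if event["operation"] == "Delete":
            deletes[event["user"]].append(event["time"])

    flagged = set()
    for user, times in deletes.items():
        times.sort()
        start = 0
        # Slide a window over the sorted timestamps of this user's deletions.
        for end, current in enumerate(times):
            while current - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return flagged


base = datetime(2024, 1, 1, 9, 0)
sample_events = [
    {"user": "alice", "operation": "Delete", "time": base},
    {"user": "alice", "operation": "Delete", "time": base + timedelta(minutes=1)},
    {"user": "alice", "operation": "Delete", "time": base + timedelta(minutes=2)},
    {"user": "bob", "operation": "Delete", "time": base},
    {"user": "bob", "operation": "Update", "time": base + timedelta(minutes=1)},
    {"user": "bob", "operation": "Delete", "time": base + timedelta(hours=2)},
]
print(find_mass_deletions(sample_events))
```

In this sample only alice is flagged, because her three deletions fall inside one five-minute window while bob's two deletions are hours apart.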
The prebuilt rules automatically generate incidents for suspicious patterns, such as:
- Mass record downloads or deletions (which could indicate potential data theft or misuse)
- Anomalous access, like a user accessing Dataverse from an unusual location or at an odd time
- Privilege escalations or policy changes, e.g. a user suddenly gaining a system admin role, or a DLP policy being turned off

Security teams can also create custom detection rules using KQL to address organization-specific threats. All the Dataverse audit logs in Sentinel are fully queryable. For example, you could write a query to find when a particular record was modified and by whom, or to detect an unusual spike in Power Automate flow failures. These queries can be turned into new alert rules as needed. Sentinel provides interactive workbooks and a hunting interface to help visualize and drill into this data for proactive threat hunting.

In addition to detection, Sentinel enables automated response for Power Platform incidents. Using Playbooks (Logic Apps), you might automate actions like disabling a user's Power Platform account, alerting the IT team via Microsoft Teams, or creating a ticket in ServiceNow whenever a high-severity Dataverse incident is detected. For example, if multiple deletion events are detected in a short span on a sensitive table, a playbook could immediately notify a security channel and suspend the user's access pending investigation. This kind of end-to-end automation improves your security posture by reducing response times and ensuring consistent actions.

In summary, connecting Dataverse auditing to Microsoft Sentinel equips organizations with a unified view of business application activity and the advanced tools to detect, investigate, and respond to potential security issues in their Power Platform environments.
It marries the rich audit data from Dataverse with Sentinel's SIEM capabilities, enabling proactive monitoring of user actions, quick identification of anomalies, and automated defense measures to protect your low-code applications. With the integration set up, Power Platform admins and security analysts can rest easier knowing that unusual or malicious activity in Dataverse will surface in Sentinel and be handled swiftly.

For detailed step-by-step guidance, refer to Microsoft's documentation on connecting Power Platform (Dataverse) to Sentinel and Connect Microsoft Dynamics 365 Finance and Operations to Microsoft Sentinel | Microsoft Learn. These resources provide deeper instructions and additional tips for a successful integration.

Syncing Security Groups with team membership
In this post, I present a PowerShell script to synchronize the membership between security groups and Office 365 groups, a long-standing request from Microsoft Teams admins and team owners. Source code on GitHub at https://github.com/danspot/Danspot-Scripts-and-Samples-Emporium.

Preparing for Azure PostgreSQL Certificate Authority Rotation: A Comprehensive Operational Guide
The Challenge
It started with a standard notification in the Azure Portal: Tracking-ID YK3N-7RZ. A routine Certificate Authority (CA) rotation for Azure Database for PostgreSQL.

As Cloud Solution Architects, we've seen this scenario play out many times. The moment "certificate rotation" is mentioned, a wave of unease ripples through engineering teams. Let's be honest: for many of us (ourselves included), certificates represent the edge of our technical "comfort zone." We know they are critical for security, but the complexity of PKI chains, trust stores, and SSL handshakes can be intimidating. There is a silent fear: "If we touch this, will we break production?"

We realized we had a choice: we could treat this as an opportunity and leave that comfort zone. We approached our customer with a proactive proposal: Let's use this event to stop fearing certificates and start mastering them. Instead of just patching the immediate issue, we used this rotation as a catalyst to review and upgrade the security posture of their database connections. We wanted to move from "hoping it works" to "knowing it's secure."

The response was overwhelmingly positive. The teams didn't just want a quick fix; they wanted "help for self-help." They wanted to understand the mechanics behind sslmode and build the confidence to manage trust stores proactively.

This guide is the result of that journey. It is designed to help you navigate the upcoming rotation not with anxiety but with competence, turning a mandatory maintenance window into a permanent security improvement.

Two Levels of Analysis
A certificate rotation affects your environment on two distinct levels, requiring different expertise and actions:

| Level | Responsibility | Key Questions | Actions |
|---|---|---|---|
| Platform Level | Cloud/Platform Teams | Which clusters, services, and namespaces are affected? How do we detect at scale? | Azure Service Health monitoring, AKS scanning, infrastructure-wide assessment |
| Application Level | Application/Dev Teams | What SSL mode? Which trust store? How to update connection strings? | Code changes, dependency updates, trust store management |

This article addresses both levels, providing platform-wide detection strategies (Section 5) and application-specific remediation guidance (Section 4, Platform-Specific Remediation).

Business Impact: In production environments, certificate validation failures cause complete database connection outages. A single missed certificate rotation has caused hours of downtime for enterprise customers, impacting revenue and customer trust.

Who's Affected: DevOps engineers, SREs, database administrators, and platform engineers managing Azure PostgreSQL instances, especially those using:
- Java applications with custom JRE cacerts
- Containerized workloads with baked-in trust stores
- Strict SSL modes (sslmode=verify-full, verify-ca)

The Solution
What we'll cover:
- 🛡️ Reliability: How to prevent database connection outages through proactive certificate management
- 🔄 Resiliency: Automation strategies that ensure your trust stores stay current
- 🔒 Security: Maintaining TLS security posture while rotating certificates safely

Key Takeaway: This rotation is a client trust topic, not a server change. Applications trusting root CAs (DigiCert Global Root G2, Microsoft RSA Root CA 2017) without intermediate pinning are unaffected. Risk concentrates where strict validation meets custom trust stores.

📦 Platform-Specific Implementation: Detailed remediation guides for Java, .NET, Python, Node.js, and Kubernetes are available in our GitHub Repository.

Note: The GitHub Repository contains community-contributed content provided as-is. Test all scripts in non-production environments before use.

1. Understanding Certificate Authority Rotation

What Changes During CA Rotation?
Azure Database for PostgreSQL uses TLS/SSL to encrypt client-server connections.
The database server presents a certificate chain during the TLS handshake:

Certificate Chain Structure:
Figure: Certificate chain structure showing the rotation from old intermediate (red, deprecated) to new intermediate (blue, active after rotation). Client applications must trust the root certificates (green) to validate the chain.
📝 Diagram Source: The Mermaid source code for this diagram is available in certificate-chain-diagram.mmd.

Why Root Trust Matters
Key Principle: If your application trusts the root certificate and allows the chain to be validated dynamically, you are not affected.

The risk occurs when:
- Custom trust stores contain only the old intermediate certificate (not the root)
- Certificate pinning is implemented at the intermediate level
- Strict validation is enabled (sslmode=verify-full in PostgreSQL connection strings)

2. Who Is Affected and Why

Risk Assessment Matrix

| Application Type | Trust Store | SSL Mode | Risk Level | Action Required |
|---|---|---|---|---|
| Cloud-native app (Azure SDK) | OS Trust Store | require | 🟢 Low | None - Azure SDK handles automatically |
| Java app (default JRE) | System cacerts | verify-ca | 🟡 Medium | Verify JRE version (11.0.16+, 17.0.4+, 8u381+) |
| Java app (custom cacerts) | Custom JKS file | verify-full | 🔴 High | Update custom trust store with new intermediate |
| .NET app (Windows) | Windows Cert Store | require | 🟢 Low | None - automatic via Windows Update |
| Python app (certifi) | certifi bundle | verify-ca | 🟡 Medium | Update certifi package (pip install --upgrade certifi) |
| Node.js app (default) | Built-in CAs | verify-ca | 🟢 Low | None - Node.js 16+, 18+, 20+ auto-updated |
| Container (Alpine) | /etc/ssl/certs | verify-full | 🔴 High | Update base image or install ca-certificates-bundle |
| Container (custom) | Baked-in certs | verify-full | 🔴 High | Rebuild image with updated trust store |

How to Read This Matrix
Use the matrix above to quickly assess whether your applications are affected by CA rotation. Each column is read as follows:

| Column | Meaning |
|---|---|
| Application Type | What kind of application do you have? (e.g., Java, .NET, Container) |
| Trust Store | Where does the application store its trusted certificates? |
| SSL Mode | How strictly does the application validate the server certificate? |
| Risk Level | 🟢 Low / 🟡 Medium / 🔴 High - How likely is a connection failure? |
| Action Required | What specific action do you need to take? |

Risk Level Logic:

| Risk Level | Why? |
|---|---|
| 🟢 Low | Automatic updates (OS/Azure SDK) or no certificate validation |
| 🟡 Medium | Manual update required but straightforward (e.g., pip install --upgrade certifi) |
| 🔴 High | Custom trust store must be manually updated - highest outage risk |

SSL Mode Security Posture
Understanding SSL modes is critical because they determine both security posture AND rotation impact. This creates a dual consideration:

| SSL Mode | Certificate Validation | Rotation Impact | Security Level | Recommendation |
|---|---|---|---|---|
| disable | ❌ None | ✅ No impact | 🔴 INSECURE | Never use in production |
| allow | ❌ None | ✅ No impact | 🟠 WEAK | Not recommended |
| prefer | ❌ Optional | ✅ Minimal | 🟡 WEAK | Not recommended |
| require | ❌ No (Npgsql 6.0+) | ✅ No impact | 🟡 WEAK | Upgrade to verify-full |
| verify-ca | ✅ Chain only | 🔴 Critical | 🔵 MODERATE | Update trust stores |
| verify-full | ✅ Chain + hostname | 🔴 Critical | 🟢 SECURE | Recommended - Update trust stores |

Key Insight: Applications using weak SSL modes (everything below verify-ca) are technically unaffected by CA rotation but represent security vulnerabilities. The safest path is verify-full with current trust stores.

⚖️ The Security vs. Resilience Trade-off
The Paradox: Secure applications (verify-full) have the highest rotation risk 🔴, while insecure applications (require) are unaffected but have security gaps.
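The SSL mode table can be turned into a small lookup for triaging connection strings during an assessment. The following Python sketch is only an illustration: the function name, the returned keys, and the example host are hypothetical, and it handles only the keyword/value connection string form (not URIs). The fallback to "prefer" when sslmode is absent reflects the libpq default; other drivers may use a different default.

```python
import shlex

# Rows transcribed from the SSL mode table: mode -> (validates chain, rotation impact)
SSL_MODES = {
    "disable": (False, "none"),
    "allow": (False, "none"),
    "prefer": (False, "minimal"),
    "require": (False, "none"),
    "verify-ca": (True, "critical"),
    "verify-full": (True, "critical"),
}


def assess_sslmode(conninfo: str) -> dict:
    """Classify the sslmode of a keyword/value PostgreSQL connection string."""
    # shlex handles single-quoted values such as password='a b'.
    params = dict(part.split("=", 1) for part in shlex.split(conninfo) if "=" in part)
    mode = params.get("sslmode", "prefer")  # libpq uses "prefer" when sslmode is unset
    if mode not in SSL_MODES:
        raise ValueError(f"unknown sslmode: {mode!r}")
    validates_chain, rotation_impact = SSL_MODES[mode]
    return {
        "sslmode": mode,
        "validates_chain": validates_chain,
        "rotation_impact": rotation_impact,
        # Only modes that validate the chain depend on the client trust store.
        "needs_trust_store_check": validates_chain,
    }


# Hypothetical host name, for illustration only.
print(assess_sslmode("host=example.postgres.database.azure.com dbname=app sslmode=verify-full"))
print(assess_sslmode("host=example.postgres.database.azure.com dbname=app"))
```

A result with `needs_trust_store_check` set to True points at the applications that need the trust store review described in this guide; a False result means the rotation will not break the connection, but the mode itself is a security gap.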
Teams discovering weak SSL modes during rotation preparation face a critical decision: Option Approach Rotation Impact Security Impact Recommended For 🚀 Quick Fix Keep weak SSL mode (require) ✅ No action needed ⚠️ Security debt remains Emergency situations only 🛡️ Proper Fix Upgrade to verify-full 🔴 Requires trust store updates ✅ Improved security posture All production systems Our Recommendation: Use CA rotation events as an opportunity to improve your security posture. The effort to update trust stores is a one-time investment that pays off in long-term security. Common Scenarios Scenario 1: Enterprise Java Application Problem: Custom trust store created 2+ years ago for PCI compliance Risk: High - contains only old intermediate certificates Solution: Export new intermediate from Azure, import to custom cacerts Scenario 2: Kubernetes Microservices Problem: Init container copies trust store from ConfigMap at startup Risk: High - ConfigMap never updated since initial deployment Solution: Update ConfigMap, redeploy pods with new trust store Scenario 3: Legacy .NET Application Problem: .NET Framework 4.6 on Windows Server 2016 (no Windows Update) Risk: Medium - depends on manual certificate store updates Solution: Import new intermediate to Windows Certificate Store manually 3. Trust Store Overview A trust store is the collection of root and intermediate CA certificates that your application uses to validate server certificates during TLS handshakes. Understanding where your application’s trust store is located determines how you’ll update it for CA rotations. Trust Store Locations by Platform Category Platform Trust Store Location Update Method Auto-Updated? 
OS Level Windows Cert:\LocalMachine\Root Windows Update ✅ Yes Debian/Ubuntu /etc/ssl/certs/ca-certificates.crt apt upgrade ca-certificates ✅ Yes (with updates) Red Hat/CentOS /etc/pki/tls/certs/ca-bundle.crt yum update ca-certificates ✅ Yes (with updates) Runtime Level Java JRE $JAVA_HOME/lib/security/cacerts Java security updates ✅ With JRE updates Python (certifi) site-packages/certifi/cacert.pem pip install --upgrade certifi ❌ Manual Node.js Bundled with runtime Node.js version upgrade ✅ With Node.js updates Custom Custom JKS Application-specific path keytool -importcert ❌ Manual Container image /etc/ssl/certs (baked-in) Rebuild container image ❌ Manual ConfigMap mount Kubernetes ConfigMap Update ConfigMap, redeploy ❌ Manual Why This Matters for CA Rotation Applications using auto-updated trust stores (OS-managed, current runtime versions) generally handle CA rotations automatically. The risk concentrates in: Custom trust stores created for compliance requirements (PCI-DSS, SOC 2) that are rarely updated Baked-in container certificates from images built months or years ago Outdated runtimes (old JRE versions, frozen Python environments) that haven’t received security updates Air-gapped environments where automatic updates are disabled When planning for CA rotation, focus your assessment efforts on applications in the “Manual” update category. 4. 
Platform-Specific Remediation 📦 Detailed implementation guides are available in our GitHub repository: azure-certificate-rotation-guide Quick Reference: Remediation by Platform Platform Trust Store Location Update Method Guide Java $JAVA_HOME/lib/security/cacerts Update JRE or manual keytool import java-cacerts.md .NET (Windows) Windows Certificate Store Windows Update (automatic) dotnet-windows.md Python certifi package pip install --upgrade certifi python-certifi.md Node.js Built-in CA bundle Update Node.js version nodejs.md Containers Base image /etc/ssl/certs Rebuild image or ConfigMap containers-kubernetes.md Scripts & Automation Script Purpose Download State Scan-AKS-TrustStores.ps1 Scan all pods in AKS for trust store configurations PowerShell tested validate-connection.sh Test PostgreSQL connection with SSL validation Bash not tested update-cacerts.sh Update Java cacerts with new intermediate Bash not tested 5. Proactive Detection Strategies Database-Level Discovery: Identifying Connected Clients One starting point for impact assessment is querying the PostgreSQL database itself to identify which applications are connecting. We developed a SQL query that joins pg_stat_ssl with pg_stat_activity to reveal active TLS connections, their SSL version, and cipher suites. 
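To make the idea concrete, here is a rough sketch of what such a join and a follow-up inventory step might look like. It is not the authors' detect-clients.sql: the query text, the column aliases, and the `summarize_clients` helper are assumptions for illustration, and the sample rows in the usage note are made up. The query uses only standard columns of the `pg_stat_ssl` and `pg_stat_activity` views.

```python
from collections import Counter

# Hypothetical query (assumption, not the repository script): one row per
# backend that is currently connected over TLS.
TLS_CLIENTS_QUERY = """
SELECT a.pid,
       a.usename,
       a.application_name,
       a.client_addr,
       s.version AS tls_version,
       s.cipher
FROM pg_stat_ssl AS s
JOIN pg_stat_activity AS a ON a.pid = s.pid
WHERE s.ssl
"""


def summarize_clients(rows):
    """Count connections per (application_name, client_addr, tls_version).

    `rows` is any iterable of dicts keyed by the column aliases used in
    TLS_CLIENTS_QUERY, e.g. the result of a dict-style database cursor.
    """
    counts = Counter()
    for row in rows:
        key = (
            row["application_name"] or "<unset>",  # empty name: likely a pooler or default
            str(row["client_addr"]),
            row["tls_version"],
        )
        counts[key] += 1
    return counts
```

For example, feeding it two made-up rows from the same application and address would yield a single inventory entry with a count of 2. Running the query repeatedly and merging the counters helps offset the point-in-time limitation.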
🔍 Get the SQL Query: Download the complete detection script from our GitHub repository: detect-clients.sql Important Limitations This query has significant constraints that you must understand before relying on it for CA rotation planning: Limitation Impact Mitigation Point-in-time snapshot Only shows currently connected clients Run query repeatedly over days/weeks to capture periodic jobs and batch processes No certificate details Cannot identify which CA certificate the client is using Requires client-side investigation (trust store analysis) Connection pooling May show pooler instead of actual application Use application_name in connection strings to identify true source Idle connections Long-running connections may be dormant Cross-reference with application activity logs Recommended approach: Use this query to create an initial inventory, then investigate each unique application_name and client_addr combination to determine their trust store configuration and SSL mode. Proactive Monitoring with Azure Monitor To detect certificate-related issues before and after CA rotation, configure Azure Monitor alerts. This enables early warning when SSL handshakes start failing. Why this matters: After CA rotation, applications with outdated trust stores will fail to connect. An alert allows you to detect affected applications quickly rather than waiting for user reports. Official Documentation: For complete guidance on creating and managing alerts, see Azure Monitor Alerts Overview and Create a Log Search Alert. Here is a short example of an Azure Monitor Alert definition as a starting point. 
{ "alertRule": { "name": "PostgreSQL SSL Connection Failures", "severity": 2, "condition": { "query": "AzureDiagnostics | where ResourceType == 'SERVERS' and Category == 'PostgreSQLLogs' and Message contains 'SSL error' | summarize count() by bin(TimeGenerated, 5m)", "threshold": 5, "timeAggregation": "Total", "windowSize": "PT5M" } } } Alert Configuration Notes: Setting Recommended Value Rationale Severity 2 (Warning) Allows investigation without triggering critical incident response Threshold 5 failures/5min Filters noise while catching genuine issues Evaluation Period 5 minutes Balances responsiveness with alert fatigue Action Group Platform Team Ensures quick triage and coordination 6. Production Validation Pre-Rotation Validation Checklist Inventory all applications connecting to Azure PostgreSQL Identify trust store locations for each application Verify root certificate presence in trust stores Test connection with new intermediate in non-production environment Update monitoring alerts for SSL connection failures Prepare rollback plan if issues occur Schedule maintenance window (if required) Notify stakeholders of potential impact Testing Procedure We established a systematic 3-step validation process to ensure zero downtime. This approach moves from isolated testing to gradual production rollout. 🧪 Technical Validation Guide: For the complete list of psql commands, connection string examples for Windows/Linux, and automated testing scripts, please refer to our Validation Guide in the GitHub repository. Connection Testing Strategy The core of our validation strategy was testing connections with explicit sslmode settings. We used the psql command-line tool to simulate different client behaviors. 
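As a rough sketch of how test connections with explicit sslmode settings could be assembled for psql, the helper below builds libpq keyword/value connection strings. The function names, the example host, user, and certificate path are all hypothetical placeholders; only the libpq keywords (`host`, `port`, `dbname`, `user`, `sslmode`, `sslrootcert`) are standard.

```python
def _quote(value: str) -> str:
    """Quote a libpq conninfo value if it is empty or contains spaces, quotes, or backslashes."""
    if value and not any(c in value for c in " '\\"):
        return value
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_conninfo(host, dbname, user, sslmode, sslrootcert=None, port="5432"):
    """Build a libpq keyword/value connection string with an explicit sslmode."""
    parts = {
        "host": host,
        "port": port,
        "dbname": dbname,
        "user": user,
        "sslmode": sslmode,
    }
    if sslrootcert:
        # Explicit root CA file, useful for verify-ca / verify-full tests.
        parts["sslrootcert"] = sslrootcert
    return " ".join(f"{key}={_quote(value)}" for key, value in parts.items())


def psql_command(conninfo):
    """Return an argument list that runs a trivial query through psql."""
    return ["psql", conninfo, "-c", "SELECT 1"]
```

Calling `build_conninfo` once per sslmode (require, verify-ca, verify-full) and passing each result to `psql_command` gives one test invocation per scenario; the argument list can be handed to `subprocess.run` on a machine where psql is installed.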
Test Scenario Purpose Expected Result Encryption only (sslmode=require) Verify basic connectivity Connection succeeds even with unknown CA CA validation (sslmode=verify-ca) Verify trust store integrity Connection succeeds only if CA chain is valid Full validation (sslmode=verify-full) Verify strict security compliance Connection succeeds only if CA chain AND hostname match Pro Tip: Test with verify-full and an explicit root CA file containing the new Microsoft/DigiCert root certificates before the rotation date. This validates that your trust stores will work after the intermediate certificate changes. Step 1: Test in Non-Production Validate connections against a test server using the new intermediate certificate (Azure provides test endpoints during the rotation window). Step 2: Canary Deployment Deploy the updated trust store to a single “canary” instance or pod. Monitor: - Connection success rate - Error logs - Response times Step 3: Gradual Rollout Once the canary is stable, proceed with a phased rollout: 1. Update 10% of pods 2. Monitor for 1 hour 3. Update 50% of pods 4. Monitor for 1 hour 5. Complete rollout 7. Best Practices and Lessons Learned Certificate Management Best Practices Practice Guidance Example Trust Root CAs, Not Intermediates Configure trust stores with root CA certificates only. This provides resilience against intermediate certificate rotations. Trust Microsoft TLS RSA Root G2 and DigiCert Global Root G2 instead of specific intermediates Automate Trust Store Updates Use OS-provided trust stores when possible (automatically updated). For custom trust stores, implement CI/CD pipelines. Schedule bi-annual trust store audits Use SSL Mode Appropriately Choose SSL mode based on security requirements. verify-ca is recommended for most scenarios. See Security Posture Matrix in Section 2 Maintain Container Images Rebuild container images monthly to include latest CA certificates. Use init containers for runtime updates. 
Multi-stage builds with CA certificate update step Avoid Certificate Pinning Never pin intermediate certificates. If pinning is required for compliance, implement automated update processes. Pin only root CA certificates if absolutely necessary SSL Mode Decision Guide SSL Mode Security Level Resilience When to Use require Medium High Encrypted traffic without certificate validation. Use when CA rotation resilience is more important than MITM protection. verify-ca High Medium Validates certificate chain. Recommended for most production scenarios. verify-full Highest Low Strictest validation with hostname matching. Use only when compliance requires it. Organizational Communication Model Effective certificate rotation requires structured communication across multiple layers: Layer Responsibility Key Action Azure Service Health Microsoft publishes announcements to affected subscriptions Monitor Azure Service Health proactively Platform/Cloud Team Receives Azure announcements, triages criticality Follow ITSM processes, assess impact Application Teams Execute application-level changes Update trust stores, validate connections Security Teams Define certificate validation policies Set compliance requirements Ownership and Responsibility Matrix Team Responsibility Deliverable Platform/Cloud Team Monitor Azure Service Health, coordinate response Impact assessment, team notifications Application Teams Application-level changes (connection strings, trust stores) Updated configurations, validation results Security Teams Define certificate policies, compliance requirements Policy documentation, audit reports All Teams (Shared) Certificate lifecycle collaboration Playbooks, escalation paths, training Certificate Rotation Playbook Components Organizations should establish documented playbooks including: Component Recommended Frequency Purpose Trust Store Audits Bi-annual (every 6 months) Ensure certificates are current Certificate Inventory Quarterly review Know what certificates 
exist where Playbook Updates Annual or after incidents Keep procedures current Team Training Annual Build knowledge and confidence Field Observations: Common Configuration Patterns Pattern Observation Risk Implicit SSL Mode Teams don’t explicitly set sslmode, relying on framework defaults Unexpected behavior during CA rotation Copy-Paste Configurations Connection strings copied without understanding options Works until certificate changes expose gaps Framework-Specific Defaults Java uses JRE trust store, .NET uses Windows Certificate Store, Python depends on certifi package Some require manual updates, some are automatic Framework Trust Store Defaults Framework Default Trust Store Update Method Risk Level Java/Quarkus JRE cacerts Manual or JRE update Medium - requires awareness .NET Windows Certificate Store Windows Update Low - automatic Node.js Bundled certificates Node.js version update Low - automatic Python certifi package pip install --upgrade certifi High - manual intervention required Knowledge and Confidence Challenges Challenge Impact Mitigation Limited certificate knowledge Creates uncertainty and risk-averse behavior Proactive education, hands-on workshops Topic intimidation “Certificates” can seem complex, leading to avoidance Reality: Implementation is straightforward once understood Previous negative experiences Leadership concerns based on past incidents Document successes, share lessons learned Visibility gaps Lack of visibility into application dependencies Maintain certificate inventory, use discovery tools Monitoring Strategy (Recommended for Post-Rotation): While pre-rotation monitoring focuses on inventory, post-rotation monitoring should track: Key Metrics: - Connection failure rates (group by application, SSL error types) - SSL handshake duration (detect performance degradation) - Certificate validation errors (track which certificates fail) - Application error logs (filter for “SSL”, “certificate”, “trust”) Recommended Alerts: - Threshold: 
>5 SSL connection failures in 5 minutes - Anomaly detection: Connection failure rate increases >50% - Certificate expiry warnings: 30, 14, 7 days before expiration Dashboard Components: - Connection success rate by application - SSL error distribution (validation failures, expired certificates, etc.) - Certificate inventory with expiry dates - Trust store update status across infrastructure These metrics, alerts and thresholds are only starting points and need to be adjusted based on your environment and needs. Post-Rotation Validation and Telemetry Note: This article focuses on preparation for upcoming certificate rotations. Post-rotation metrics and incident data will be collected after the rotation completes and can inform future iterations of this guidance. Recommended Post-Rotation Activities: Here are some thoughts on post-rotation activities that could yield more insight into the effectiveness of the preparation. Incident Tracking: After rotation completes, organizations should track: - Production incidents related to SSL/TLS connection failures - Services affected and their business criticality - Mean Time to Detection (MTTD) for certificate-related issues - Mean Time to Resolution (MTTR) from detection to fix Success Metrics to Measure Pre-Rotation Validation: - Number of services inventoried and assessed - Percentage of services requiring trust store updates - Testing coverage (dev, staging, production) Post-Rotation Outcomes: - Zero-downtime success rate (percentage of services with no impact) - Applications requiring emergency patching - Time from rotation to full validation Impact Assessment Telemetry to Collect: - Total connection attempts vs. failures (before and after rotation) - Duration of any service degradation or outages - Customer-facing impact (user-reported issues, support tickets) - Geographic or subscription-specific patterns Continuous Improvement Post-Rotation Review: - What worked well in the preparation phase?
- Which teams or applications were unprepared? - What gaps exist in monitoring or alerting? - How can communication be improved for future rotations? Documentation Updates: - Update playbooks with lessons learned - Refine monitoring queries based on observed patterns - Enhance team training materials - Share anonymized case studies across the organization 8. Engagement & Next Steps Discussion Questions We’d love to hear from the community: What’s your experience with certificate rotations? Have you encountered unexpected connection failures during CA rotation events? Which trust store update method works best for your environment? OS-managed, runtime-bundled, or custom trust stores? How do you handle certificate management in air-gapped environments? What strategies have worked for your organization? Share Your Experience If you’ve implemented proactive certificate management strategies or have lessons learned from CA rotation incidents, we encourage you to: Comment below with your experiences and tips Contribute to the GitHub repository with additional platform guides or scripts Connect with us on LinkedIn to continue the conversation Call to Action Take these steps now to prepare for the CA rotation: Assess your applications - Use the Risk Assessment Matrix (Section 2) to identify which applications use sslmode=verify-ca or verify-full with custom trust stores Import root CA certificates - Add DigiCert Global Root G2 and Microsoft RSA Root CA 2017 to your trust stores Upgrade SSL mode - Change your connection strings to at least sslmode=verify-ca (recommended: verify-full) for improved security Document your changes - Record which applications were updated, what trust stores were modified, and the validation results Automate for the future - Implement proactive certificate management so future CA rotations are handled automatically (OS-managed trust stores, CI/CD pipelines for container images, scheduled trust store audits) 9. 
Resources Official Documentation Azure PostgreSQL: Azure PostgreSQL SSL/TLS Concepts Azure PostgreSQL - Connect with TLS/SSL PostgreSQL & libpq: PostgreSQL libpq SSL Support - SSL mode options and environment variables PostgreSQL psql Reference - Command-line tool documentation PostgreSQL Server SSL/TLS Configuration Certificate Authorities: DigiCert Root Certificates Microsoft PKI Repository Microsoft Trusted Root Program Community Resources Let’s Encrypt Root Expiration (2021 Incident) NIST SP 800-57: Key Management Guidelines OWASP Certificate Pinning Cheat Sheet Neon Blog: PostgreSQL Connection Security Defaults Tools and Scripts PowerShell AKS Trust Store Scanner (see Platform-Specific Remediation) PostgreSQL Interactive Terminal (psql) PostgreSQL JDBC SSL Documentation Industry Context Certificate rotation challenges are not unique to Azure PostgreSQL. Similar incidents have occurred across the industry: Historical Incidents: - Let’s Encrypt Root Expiration (2021): Widespread impact when DST Root CA X3 expired, affecting older Android devices and legacy systems - DigiCert Root Transitions: Multiple cloud providers experienced customer impact during CA changes - Internal PKI Rotations: Enterprises face similar challenges when rotating internally-issued certificates Relevant Standards: - NIST SP 800-57: Key Management Guidelines (certificate lifecycle best practices) - OWASP Certificate Pinning: Guidance on balancing security and operational resilience - CIS Benchmarks: Recommendations for TLS/SSL configuration in cloud environments Authors Author Role Contact Andreas Semmelmann Cloud Solution Architect, Microsoft LinkedIn Mpho Muthige Cloud Solution Architect, Microsoft LinkedIn Disclaimers Disclaimer: The information in this blog post is provided for general informational purposes only and does not constitute legal, financial, or professional advice. 
While every effort has been made to ensure the accuracy of the information at the time of publication, Microsoft makes no warranties or representations as to its completeness or accuracy. Product features, availability, and timelines are subject to change without notice. For specific guidance, please consult your legal or compliance advisor. Microsoft Support Statement: This article represents field experiences and community best practices. For official Microsoft support and SLA-backed guidance: Azure Support: https://azure.microsoft.com/support/ Official Documentation: https://learn.microsoft.com/azure/ Microsoft Q&A: https://learn.microsoft.com/answers/ Production Issues: Always open official support tickets for production-impacting problems. Customer Privacy Notice: This article describes real-world scenarios from customer engagements. All customer-specific information has been anonymized. No NDAs or customer confidentiality agreements were violated in creating this content. AI-generated content disclaimer: This content was generated in whole or in part with the assistance of AI tools. AI-generated content may be incorrect or incomplete. Please review and verify before relying on it for critical decisions. See terms Community Contribution: The GitHub repository referenced in this article contains community-contributed scripts and guides. These are provided as-is for educational purposes and should be tested in non-production environments before use. Tags: #AzurePostgreSQL #CertificateRotation #TLS #SSL #TrustStores #Operations #DevOps #SRE #CloudSecurity #AzureDatabase
Azure password protection
We have a hybrid Azure infrastructure with an AD Connector installed on-prem and configured for PTA. We installed the password protection server and registered it with the Azure tenant, then deployed the DC agent on all domain controllers. Both the proxy and agents are operational. We published a few banned words to block in case anyone uses them. For testing, I changed my password to include one of the banned words. To my surprise, I was able to change the password. I checked the corresponding logon server, and the DC event viewer showed that the password was validated, but the banned word was in the password list that Azure set to enforce. Why is it not blocking the change?
Securing Data with Microsoft Purview IRM + Defender: A Hands-On Lab
Hi everyone, I recently explored how Microsoft Purview Insider Risk Management (IRM) integrates with Microsoft Defender to secure sensitive data. This lab demonstrates how these tools work together to identify, investigate, and mitigate insider risks. What I covered in this lab: Set up Insider Risk Management policies in Microsoft Purview Connected Microsoft Defender to monitor risky activities Walkthrough of alerts triggered → triaged → escalated into cases Key governance and compliance insights Key learnings from the lab: Purview IRM policies detect both accidental risks (like data spillage) and malicious ones (IP theft, fraud, insider trading) IRM principles include transparency (balancing privacy vs. protection), configurable policies, integrations across Microsoft 365 apps, and actionable alerts IRM workflow follows: Define policies → Trigger alerts → Triage by severity → Investigate cases (dashboards, Content Explorer, Activity Explorer) → Take action (training, legal escalation, or SIEM integration) Defender + Purview together provide unified coverage: Defender detects and responds to threats, while Purview governs compliance and insider risk This was part of my ongoing series of security labs. Curious to hear from others: how are you approaching Insider Risk Management in your organizations or labs?
Can't set up MFA on Azure personal account
I'm unable to use the az cli with a personal account because of MFA requirements. I have a free trial of Azure which I'm using for some basic testing. I'd like to deploy some Bicep from the az cli, but when I do I get: AADSTS50076: Due to a configuration change made by your administrator, or because you moved to a new location, you must use multi-factor authentication to access 'subscription GUID'. The docs say to go to Per-user multifactor authentication and enable MFA. I did this, but I can't set it up. Trying to log in to https://aka.ms/MFASetup gives: You can't sign in here with a personal account. Use your work or school account instead. My personal account is the only user on the tenant. There is no other account I can use and I'm prevented from setting up MFA on it.
Oath hardware token
Hi All, I just received my hardware tokens to set up for a few users in our organization who do not have access to company mobile devices. I uploaded the .csv files with the required information in our Azure portal, and the upload succeeded. However, I am not able to activate the token: activation keeps failing, and I don't get a clear reason why. Is there a clearer way to set this up, or do I need to enable something first? I would like this set up before the end of the week; any help is appreciated. Thanks,
🔒 Strengthening Azure DNS Zone Security with RBAC and Resource Locks
🔎 DNS security is more than just configuration; it’s about protecting critical assets against unauthorized changes and accidental deletions. 🔎 Managing DNS zones effectively requires a layered security approach. 🔎 Azure offers two powerful mechanisms for this: Role-Based Access Control (RBAC) and Resource Locks. 🚀 Role-Based Access Control (RBAC) 🚀 * Granular DNS Access Control * RBAC ensures controlled access management at both the DNS zone and record set levels. * Instead of assigning broad permissions, RBAC enables precise delegation using built-in roles such as: 🔹 Owner – Full control over the DNS zone, including configurations and deletions. 🔹 Contributor – Can modify DNS settings but cannot change access permissions. 🔹 Network Contributor – Can manage networking configurations related to DNS, but not modify records. 🔹 DNS Zone Contributor – Dedicated role for managing DNS zones without broader networking privileges. ✅ Key Advantages of RBAC in DNS Security: ✔ Prevent unauthorized modifications by restricting access to only necessary roles. ✔ Ensure operational integrity by limiting exposure to critical configurations. ✔ Improve governance by aligning roles with organizational security policies. 🔐 Resource Locks 🔐 * Guardrails for DNS Protection * Even with well-defined RBAC settings, accidental deletions can still occur. * Azure Resource Locks add an additional safeguard by preventing changes to a DNS zone or specific record sets. 🔹 Zone Lock ----> Protects an entire DNS zone from being deleted, preserving all associated record sets. 🔹 SOA Lock ----> Prevents unintentional zone deletions while allowing record modifications within the zone. ✅ How Resource Locks Enhance Security: ✔ Shields DNS zones from accidental or malicious deletions. ✔ Maintains continuity by ensuring record sets remain intact. ✔ Strengthens compliance controls for critical infrastructure.
🛠 Best Practices for Securing DNS with RBAC & Resource Locks 🔸 Assign least-privilege roles; never give unnecessary access. 🔸 Implement locks on essential zones to prevent configuration errors. 🔸 Regularly audit access permissions using Azure Policy & Activity Logs. 🔸 Use Automation & Alerts to track modifications for enhanced security. 🔹 Implementing RBAC & Resource Locks ensures your cloud environment remains secure, operational, and fault-tolerant.