automation
466 Topics
Azure automation feature, improvements and bugs
This is by no means meant as criticism, as I love the Azure Automation Account product and its current features, but these are things that I would love to see offered or fixed in the future.

Source Control (I can only speak for GitHub, as that is what I use):

Bugs:
- Tags are overwritten or removed by source control, both on full syncs and on incremental syncs (already reported in case #2508010040002105).

Features:
- Runbooks are not deleted in the Automation account when they have been deleted in source control.
- Support for sync types other than PowerShell 5.1. (Personally, we will not consider upgrading to a newer version before source control is implemented for it.)
- Support for syncing the full repository instead of only a specific folder, that is, recursive source control for easier organisation in repositories. I know we can set up multiple source controls in Azure Automation, but that seems a bit redundant and means more maintenance, as the source control integration expires after 1 year no matter whether your PAT is set to never expire.
- Support for syncing the synopsis/description, at least for PowerShell scripts, so it is grabbed directly from the given script and put into the description field. Just the output of Get-Help .\ScriptName.ps1.

Logging:

Bugs:
- From time to time we see log entries displayed twice in a row. Say you get the first 10 entries on the All Logs page and scroll down further: the same 10 entries are repeated again and again, which can also be seen from the timestamps of the log entries. (No new network requests for logs are being made, so I believe this might be a bug in the JavaScript, without being 100% certain.) We most often see this bug when a runbook is still running, so it might be the log output stream that messes this up.
And just to provide a picture for reference without exposing anything sensitive, the bug can be seen from the timestamps here: [screenshot]
- PowerShell 7 and above log output seems to contain some unescaped ASCII characters, which makes the logs harder to read and also causes a log object to be split into multiple log entries in the Azure Automation log output. This seems to have been fixed since I last tested.

Features:
- Searching for a specific job ID in the general job list. Currently there is a workaround: go into a specific runbook, go to Jobs, and press "Find job". You can then look up a job ID globally, but the UI is not updated correctly, as displayed here: [screenshot] I would love to see a button here, or to be able to search for a job ID.
- Formatting log output so you can have multi-line output in a single log entry, e.g. Write-Output "New`r`nLine", so the entry contains multiple lines for more human-readable logs.

Runbook page:

Bugs:
- Searching for runbook names seems a bit buggy. As far as I have seen, there are 3 different results for the end user (base image, initially looking at all runbooks: [screenshot]):
  - One is that it is not able to find a runbook with that name. I have not been able to replicate it to get a picture of it.
  - Another is that it displays a list of runbooks, none of which match what you searched for.
  - The third is that when you have searched for something and then clear your search, it does not return to the original view.

Features:
- Ability to go to a previous job and re-run/restart it with the same parameters. Think of the way you can restart a GitHub Actions run.

Scheduling:

Features:
- More of a feature request, but adding the schedule for a runbook directly in the code would be awesome.
(This is something we currently do by adding a parameter that contains the scheduling information. We then have a runbook that goes over all our runbooks every hour, looks for this parameter, constructs a schedule if it does not exist, and links the runbook to the schedule. Finally, we also add a tag stating whether the schedule is enabled or not, which brings us *back to the issue of source control removing the tag*.)

Hybrid workers:

Features:
- I personally would love the ability to pause a hybrid worker in a hybrid worker group. Why? We currently have 4 hybrid workers, all running Windows, with monthly patch windows. If a job hits a hybrid worker that is being patched, the job goes into a suspended state and is not picked up again. We could remove the hybrid worker from the group, but that would also remove the extension, which would be reinstalled when the worker is added again, and then we would hit this: https://learn.microsoft.com/en-us/azure/automation/troubleshoot/extension-based-hybrid-runbook-worker#scenario-runbooks-go-into-a-suspended-state-on-a-hybrid-runbook-worker-when-using-a-custom-account-on-a-server-with-user-account-control-uac-enabled This is an issue we originally started experiencing when we migrated from agent-based to extension-based hybrid workers due to the discontinuation of agent-based workers.
- Another great reason is troubleshooting something on a specific hybrid worker, or updating modules on a specific hybrid worker. This cannot be done while the hybrid worker is still running jobs unless you use force, hit a time when it is not running anything, or manually stop the service, and then you again end up with suspended jobs that are not picked up again.
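The schedule-in-code workaround described under Scheduling above is essentially a reconciliation loop. Below is a minimal sketch of that logic only, written in Python for illustration. It does not call any Azure API. The `Runbook` type, the `sched-` naming convention, and the action tuples are all hypothetical and not taken from the author's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Runbook:
    name: str
    # Hypothetical: the value of the schedule parameter declared in the runbook code.
    schedule_spec: Optional[str]


def plan_schedule_actions(runbooks, existing_schedules, existing_links):
    """List the actions needed so that schedules match what the code declares.

    existing_schedules: set of schedule names already in the account.
    existing_links: set of (runbook_name, schedule_name) pairs already linked.
    """
    actions = []
    known = set(existing_schedules)
    for rb in runbooks:
        if not rb.schedule_spec:
            continue  # this runbook declares no schedule in code
        schedule_name = f"sched-{rb.schedule_spec}"  # hypothetical naming convention
        if schedule_name not in known:
            actions.append(("create_schedule", schedule_name))
            known.add(schedule_name)  # avoid creating the same schedule twice
        if (rb.name, schedule_name) not in existing_links:
            actions.append(("link", rb.name, schedule_name))
        # The tag records the schedule state; source control sync may remove it.
        actions.append(("tag", rb.name, schedule_name, "enabled"))
    return actions


runbooks = [
    Runbook("Cleanup", "daily-0600"),
    Runbook("Adhoc", None),
    Runbook("Report", "daily-0600"),
]
for action in plan_schedule_actions(runbooks, set(), set()):
    print(action)
```

Because the tag is re-applied on every pass, a tag removed by a source control sync would be restored on the next hourly run, which is one way to live with the tag bug until it is fixed.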
Additional features that I personally would love to see offered:
- A front end for Azure Automation for end users (think self-service portal) as some kind of add-on feature, allowing a specific group of people to start a given runbook through a more user-friendly front end, while also including some more limitations for end-user groupings. I know there are already third-party solutions for this, and to be honest I almost created one myself on my last maternity leave, but my company chose not to pursue it further, the position being that we have one self-service platform, ServiceNow. It can be viewed here, just to give some inspiration if needed: https://github.com/Mynster9361/Self-Service-Frontend-Azure-Automation
- RBAC permissions for individual runbooks (as far as I remember, this can already be done through the CLI).
- A general overview management blade for managing webhooks and their associated runbooks. Currently there is no way to know which runbooks have an active or inactive webhook assigned to them, as the only way to see this is to go to a runbook, open the Webhooks blade, and look whether there is one or not. Personally, I would love to see a blade on the general overview called "Webhooks" that looks similar to this table, maybe:

| Runbook | Name | Expiration | Last triggered | Status |
|---|---|---|---|---|
| Runbook1 (clickable to get directly to the runbook) | Custom_name_for_this webhook | 02/01/2022 16:00 | | Enabled |
| Runbook2 | webhook2 | 11/11/2026 16:00 | Today | Disabled |
| Runbook3 | webhook3 | 11/11/2027 16:00 | Today | Enabled |

Instead of webhooks being a gentleman's agreement on when you can and shouldn't enable them, how to name them and so on, you would have one general overview of all webhooks, which would add value for security and make webhook management easier.

The things I see as most critical or highest on my wish list (to list 2 things I would like to see sooner rather than later):
- Source control definitely needs to be updated/revamped so it both supports other languages/versions and does not remove tags.
Another thing that would be nice to have is forcing the account to follow source control, so that if I delete something in source control it is also deleted in Azure Automation.
- Hybrid workers in maintenance mode, so that a worker completes its running jobs and you are then able to work on it, whether for bugs or just regular updates.

169 Views, 2 likes, 1 Comment

Patterns for low-code Azure config state snapshot + recovery solution for resource groups
I’m looking for patterns that capture resource configuration changes over time and support best-effort recovery (redeployment) of resource config state. I understand that authoritative IaC (Bicep) would be the most mature option; however, I am wondering if anyone has implemented a solution similar to what I have described above. Ideally this would be a low-code, Azure-native solution.

69 Views, 0 likes, 2 Comments

Migrate Sentinel to Defender - Why It Is a Security Architecture Decision, Not Just a Portal Change
Microsoft will retire the Sentinel experience in Azure on March 31, 2027. Most of the conversation around this transition focuses on cost optimization and portal consolidation. That framing undersells what is actually happening. The unified Defender portal is not a new interface for the same capabilities. It is the platform foundation for a fundamentally different SOC operating model, one built on a 2-tier data architecture, graph-based investigation, and AI agents that can hunt, enrich, and respond at machine speed. Partners who understand this will help customers build security programs that match how attackers actually operate.

This document covers four things:
- What the unified experience delivers: the security capabilities that do not exist in standalone Sentinel and why they matter against today’s threats.
- What the transition really involves: not a data migration, but a data architecture project that changes how telemetry flows, where it lives, and who queries it.
- Where the partner opportunity lives: a structured progression from professional services (transactional, transition execution, and advisory) to ongoing managed security services.
- Why the unified experience wins competitively: factual capability advantages that give partners a defensible position against third-party SIEM alternatives.

The Bigger Picture: Preparing for the Agentic SOC

Before getting into transition mechanics, partners need to understand where the industry is headed, because the platform decisions made during this transition will determine whether a customer’s SOC is ready for what comes next. The security industry is moving from human-driven, alert-centric workflows to an operating model built on three pillars:
- Intellectual Property: the detection logic, hunting hypotheses, response playbooks, and domain expertise that differentiate one security team from another.
- Human Orchestration: the judgment, context, and decision-making that humans bring to complex incidents.
Humans set strategy, validate findings, and make containment decisions. They do not manually triage every alert.
- AI Agents: agents built to execute repeatable work, such as enriching incidents, hunting across months of telemetry, validating security posture, drafting response actions, and flagging anomalies for human review.

The SOC of 2027 will not be scaled by hiring more analysts. It will be scaled by deploying agents that encode institutional knowledge into automated workflows, orchestrated by humans who focus on the decisions that require judgment.

This transformation requires a platform that provides three things:
- Deep telemetry: agents need months of queryable data to analyze behavioral patterns, build baselines, and detect slow-moving threats. The Sentinel data lake provides this at a cost point that makes long retention feasible.
- Relationship context: agents need to understand how entities connect. Which accounts share credentials? What is the blast radius of a compromised service principal? What is the attack path from a phished user to domain admin? Sentinel Graph provides this.
- Extensibility: partners and customers need to build and deploy their own agents without waiting for Microsoft to ship them. The MCP framework and Copilot agent architecture provide this.

None of these exist in the Azure experience for Sentinel. All three ship with the Defender experience.

The urgency goes beyond the March 2027 deadline. Organizations are deploying AI agents, copilots, and autonomous workflows across their businesses, and every one of those creates a new attack surface. Prompt injection, data poisoning, agent hijacking, cross-plugin exploitation: these are not theoretical risks. They are in the wild today. Defending against AI-powered attacks requires a security platform that is itself AI Agent-ready. The new experience in Defender unlocks this capability.
What Unified SIEM and XDR Actually Delivers

The original framing, “single pane of glass for SIEM and XDR,” is accurate but insufficient. Here is what the unified platform delivers that standalone Sentinel does not.

Cross-Domain Incident Correlation

The Defender correlation engine does not just group alerts by time proximity. It automatically builds multi-stage incident graphs that link identity compromise to lateral movement to data exfiltration across SIEM and XDR telemetry. Consider a token theft chain: an infostealer harvests browser session cookies (endpoint telemetry), the attacker replays the token from a foreign IP (Entra ID sign-in logs), creates a mailbox forwarding rule (Exchange audit logs), and begins exfiltrating data (DLP alerts). In standalone Sentinel, these are four separate alerts in four different tables. In the unified platform, they are one correlated incident with a visual attack timeline.

2-Tier Data Architecture

The Sentinel data lake introduces a second storage tier that changes the economics and capabilities of security telemetry:

| | Analytics Tier | Data Lake |
|---|---|---|
| Purpose | Real-time detection rules, SOAR, alerting | Hunting, forensics, behavioral analysis, AI agent queries |
| Latency | Sub-5-minute query and alerting | Minutes to hours acceptable |
| Cost | ~$4.30/GB PAYG ingestion (~$2.96 at 100 GB/day commitment) | ~$0.05/GB ingestion + $0.10/GB data processing (at least 20x cheaper) |
| Retention | 90 days default (expensive to extend) | Up to 12 years at low cost |
| Best for | High-signal, low-volume sources | High-volume, investigation-critical sources |

The architecture decision is not “which tier is cheaper.” It is “which tier gives me the right detection capability for each data source.”
- Analytics tier candidates: Entra ID sign-in logs, Azure activity, audit logs, EDR alerts, PAM events, Defender for Identity alerts, email threat detections. These need sub-5-minute alerting.
- Data lake candidates: Raw firewall session logs, full DNS query streams, proxy request logs, Sysmon process events, NSG flow logs. These drive hunting and forensic analysis over weeks or months.
- Dual-ingest sources: Some sources need both tiers. Entra ID sign-in logs are the canonical example: analytics tier for real-time password spray detection, Data Lake for graph-based blast radius analysis across months of authentication history.

Implementation is straightforward: a single Data Collection Rule (DCR) transformation handles the split. One collection point, two routing destinations.

The right framing: “Right data in the right tier = better detections AND lower cost.” Cost savings are a side effect of good security architecture, not the goal.

Sentinel Graph

Sentinel Graph enables SOC teams and AI agents to answer questions that flat log queries cannot:
- What is the blast radius of this compromised account?
- Which service principals share credentials with the breached identity?
- What is the attack path from this phished user to domain admin?
- Which entities are connected to this suspicious IP across all telemetry sources?

Graph-based investigation turns isolated alerts into context-rich intelligence. It is the difference between knowing “this account was compromised” and understanding “this account has access to 47 service principals, 3 of which have write access to production Key Vault.”

Security Copilot Integration

Security Copilot embedded in the Defender portal helps analysts summarize incidents, generate hunting queries, explain attacker behavior, and draft response actions. For complex multi-stage incidents, it reduces the time from “I see an alert” to “I understand the full scope” from hours to minutes. With free SCUs available with Microsoft 365 E5, teams can apply AI to the highest-effort investigation work without adding incremental cost.
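To make the tiering economics from the 2-Tier Data Architecture section concrete, here is a small arithmetic sketch in Python. It uses only the approximate per-GB figures quoted in the tier comparison and simply adds the data lake ingestion and processing rates together, as the comparison does. The daily volumes are hypothetical, and actual billing may differ, so treat this as illustrative rather than a pricing calculator.

```python
# Approximate per-GB prices quoted in the tier comparison above (USD).
ANALYTICS_PAYG = 4.30
ANALYTICS_COMMIT_100GB = 2.96
LAKE_INGESTION = 0.05
LAKE_PROCESSING = 0.10


def lake_rate() -> float:
    """Combined data lake rate: ingestion plus data processing."""
    return LAKE_INGESTION + LAKE_PROCESSING


def daily_cost(gb_per_day: float, tier: str,
               analytics_rate: float = ANALYTICS_PAYG) -> float:
    """Daily ingestion cost for one source placed in one tier."""
    if tier == "analytics":
        return gb_per_day * analytics_rate
    if tier == "lake":
        return gb_per_day * lake_rate()
    raise ValueError(f"unknown tier: {tier!r}")


# Hypothetical daily volumes, for illustration only.
sources = {
    "Entra ID sign-in logs": (5, "analytics"),
    "Raw firewall session logs": (200, "lake"),
}

tiered_total = sum(daily_cost(gb, tier) for gb, tier in sources.values())
all_analytics_total = sum(daily_cost(gb, "analytics") for gb, _ in sources.values())

payg_ratio = ANALYTICS_PAYG / lake_rate()
commit_ratio = ANALYTICS_COMMIT_100GB / lake_rate()

print(f"PAYG vs lake:       {payg_ratio:.1f}x")
print(f"Commitment vs lake: {commit_ratio:.1f}x")
print(f"Tiered:        ${tiered_total:.2f}/day")
print(f"All analytics: ${all_analytics_total:.2f}/day")
```

With these rounded figures the pay-as-you-go ratio comes out near 28.7x, while the commitment-tier ratio is just under 20x (about 19.7x), so the "at least 20x" claim is best read against pay-as-you-go pricing.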
MCP and the Agent Framework

The Model Context Protocol (MCP) and Copilot agent architecture let partners and customers build purpose-built security agents. A concrete example: an MCP-enabled agent can automatically enrich a phishing incident by querying email metadata, checking the sender against threat intelligence, pulling the user’s recent sign-in patterns, correlating with Sentinel Graph for lateral risk, and drafting a containment recommendation, all in under 60 seconds. This is where partner intellectual property becomes competitive advantage. The agent framework is the mechanism for encoding proprietary detection logic, response playbooks, and domain expertise into automated workflows that run at machine speed.

Security Store

Security Store allows partners to evolve from one-time transition projects into repeatable, scalable offerings, supporting professional services, managed services, and agent-based IP that align with the customer’s unified SecOps operating model. As part of the transition, the Microsoft Security Store becomes the extension layer for Defender, allowing partners to deliver differentiated agents, SaaS, and security services natively within Defender and Sentinel instead of building and integrating in isolation.

The 4 Investigation Surfaces: A Customer Maturity Ladder

The Sentinel Data Lake exposes four distinct investigation surfaces, each representing a step toward the Agentic SOC and a partner service opportunity:

| Surface | Capability | Maturity Level | Partner Opportunity |
|---|---|---|---|
| KQL Query | Ad-hoc hunting, forensic investigation | Basic: “we can query” | Hunting query libraries; KQL training |
| Graph Analytics | Blast radius, attack paths, entity relationships | Intermediate: “we understand relationships” | Graph investigation training; attack path workshops |
| Notebooks (PySpark) | Statistical analysis, behavioral baselines, ML models | Advanced: “we predict behaviors” | Custom notebook development; anomaly scoring |
| Agent/MCP Access | Autonomous hunting, triage, response at machine speed | Agentic SOC: “we automate” | Custom agent development; MCP integration |

The customer who starts with “help us hunt better” ends up at “build us agents that hunt autonomously.” That is the progression from professional services to managed services.

What the Transition Actually Involves

It is not a data migration: customers’ underlying log data and analytics remain in their existing Log Analytics workspaces. That is important for partners to communicate clearly. But partners should not set the expectation that nothing changes except the URL. Microsoft’s official transition guide documents significant operational changes, including changes to automation rules, playbooks, and analytics rules, RBAC restructuring to the new unified model (URBAC), API schema changes that break ServiceNow and Jira integrations, analytics rule transitions where the Fusion engine is replaced by the Defender XDR correlation engine, and data policy shifts for regulated industries. Most customers cannot navigate this complexity without professional help.

Important: Transitioning to the Defender portal has no extra cost. Estimate the billing with the new Sentinel Cost Estimator.

Optimizing the unified platform means making deliberate changes:
- Adding dual-ingest for critical sources that need both real-time detection and long-horizon hunting.
- Moving high-volume telemetry to the Data Lake, enabling hunting at scale that was previously cost-prohibitive.
- Retiring redundant data copies where Defender XDR already provides the investigation capability.
- Updating RBAC, automation, and integrations for the unified portal’s consolidated schema and permission structure.
- Training analysts on new investigation workflows, Sentinel Graph navigation, and Copilot-assisted triage.

Threat Coverage: The Detection Gap Most Organizations Do Not Know They Have

This transition is an opportunity to quantify detection maturity, and most organizations will not like what they find.
Based on real-world breach analysis — infostealers, business email compromise, human-operated ransomware, cloud identity abuse, vulnerability exploitation, nation-state espionage, and other prevalent threat categories — organizations running standalone Sentinel with default configurations typically have significant detection gaps. Those gaps cluster in three areas: Cross-domain correlation gaps — attacks that span identity, endpoint, email, and cloud workloads. These require the Defender correlation engine because no single log source tells the complete story. Long-retention hunting gaps — threats like command-and-control beaconing and slow data exfiltration that unfold over weeks or months. Analytics-tier retention at 90 days is too expensive to extend and too short for historical pattern analysis. Graph-based analysis gaps — lateral movement, blast radius assessment, and attack path analysis that require understanding entity relationships rather than flat log queries. The unified platform with proper log source coverage across Microsoft-native sources can materially close these gaps — but only if the transition includes a detection coverage assessment, not just a portal cutover. Partners should use MITRE ATT&CK as the common framework for measuring detection maturity. Map existing detections to ATT&CK tactics and techniques before and after transition — a measurable, defensible improvement that justifies advisory fees and ongoing managed services. Partner Opportunity: Professional Services to Managed Services This transition creates a structured progression for all partner types — from professional services that build trust and surface findings, to managed security services that deliver ongoing value. The key insight most partners miss: do not jump from “transition assessment” to “managed services pitch.” Customers are not ready for that conversation until they have experienced the value of professional services. 
The bridge engagement, whether transactional, transition execution, or advisory, builds trust, demonstrates expertise, and surfaces the findings that make the managed services conversation a logical next step.

Professional Services (transactional + transition execution + advisory) → Managed Security Services (MSSP)

The USX transition is the ideal professional services entry point because it combines a mandatory deadline (March 2027) with genuine technical complexity (analytics rule and automation behavioral changes, RBAC restructuring, API schema shifts) that most customers cannot navigate alone. Every engagement produces findings (detection gaps, automation fragility, staffing shortfalls) that are the most credible possible evidence for managed services.

Professional Services

Transactional Partners

| Offer | Customer Value | Key Deliverables |
|---|---|---|
| Transition Readiness Assessment | Risk-mitigated transition with clear scope | Sentinel deployment inventory; Defender portal compatibility check; transition roadmap with timeline; MITRE ATT&CK detection coverage baseline |
| Transition Execution and Enablement | Accelerated time-to-value, minimal disruption | Workspace onboarding; RBAC and automation updates; Dual-portal testing and validation; SOC team training on unified workflows |
| Security Posture and Detection Optimization | Better detections and lower cost | Data ingestion and tiering strategy; Dual-ingest implementation for critical sources; Detection coverage gap analysis; Automation and Copilot/MCP recommendations |

Advisory Partners

| Offer | Customer Value | Key Deliverables |
|---|---|---|
| Executive and Strategy Advisory | Leadership alignment on why this transition matters | Unified SecOps vision and business case; Zero Trust and SOC modernization alignment; Stakeholder alignment across security, IT, and leadership |
| Architecture and Design Advisory | Future-ready architecture optimized for the Agentic SOC | Target-state 2-tier data architecture; Dual-ingest routing decisions mapped to MITRE tactics; RBAC, retention, and access model design |
| Detection Coverage and Gap Analysis | Measurable detection maturity improvement | Current-state MITRE ATT&CK coverage mapping; Gap analysis against 24 threat patterns; Detection improvement roadmap with priority recommendations |
| SOC Operating Model Advisory | Smooth analyst adoption with clear ownership | Redesigned SOC workflows for unified portal; Incident triage and investigation playbooks; RACI for detection engineering, hunting, and platform ops |
| Agentic SOC Readiness | Preparation for AI-driven security operations | MCP and agent architecture assessment; Custom agent development roadmap; IP + Human Orchestration + Agent operating model design |
| Cost, Licensing and Value Advisory | Transparent cost impact with strong business case | Current vs. future cost analysis; Data tiering optimization recommendations; TCO and ROI modeling for leadership |

The conversion to managed services is evidence-based. Every professional services engagement produces findings: detection gaps, automation fragility, staffing shortfalls. Those findings are the most credible possible case for ongoing managed services.

Managed Security Services

The unified platform changes the managed security conversation. Partners are no longer selling “we watch your alerts 24/7.” They are selling an operating model where proprietary AI agents handle the repeatable work (enrichment, hunting, posture validation, response drafting) and human experts focus on the decisions that require judgment. This is where the competitive moat forms.

The formula: IP + Human Orchestration + AI Agents = differentiated managed security.

The unified platform enables this through:
- Multi-tenancy: the built-in multitenant portal eliminates the need for third-party management layers.
- Sentinel Data Lake: agents can query months of customer telemetry for behavioral analysis without cost constraints.
- Sentinel Graph: agents can traverse entity relationships to assess blast radius and map attack paths.
- MCP extensibility: partners can build agents that integrate with proprietary tools and customer-specific systems.

Partners who build proprietary agents encoding their detection logic into the MCP framework will differentiate from partners who rely on out-of-box capabilities.

The Securing AI Opportunity

Organizations are deploying AI agents, copilots, and autonomous workflows across their businesses at an accelerating pace. Every AI deployment creates a new attack surface: prompt injection, data poisoning, agent hijacking, cross-plugin exploitation, unauthorized data access through agentic workflows. These are not theoretical risks. They are in the wild today.

Partners who can help customers secure their AI deployments while also using AI to strengthen their SOC will command premium positioning. This requires a security platform that is itself AI Agent-ready, one that can deploy defensive agents at the same pace organizations deploy business AI. The unified Defender portal is that platform. Partners who position USX as “preparing your SOC for AI-driven security operations” will differentiate from partners who position it as “moving to a new portal.”

Cost and Operational Benefits

Better security architecture also costs less. This is not a contradiction; it is the natural result of putting the right data in the right tier.

| Benefit | How It Works |
|---|---|
| Eliminate low-value ingestion | Identify and remove log sources that are never used for detections, investigations, or hunting. Immediately lowers analytics-tier costs without impacting security outcomes. |
| Right-size analytics rules | Disable unused rules, consolidate overlapping detections, and remove automation that does not reduce SOC effort. Pay only for processing that delivers measurable security value. |
| Avoid SIEM/XDR duplication | Many threats can be investigated directly in Defender XDR without duplicating telemetry into Sentinel. Stop re-ingesting data that Defender already provides. |
| Tier data by detection need | Store high-volume, hunt-oriented telemetry in the Data Lake at a cost at least 20x lower. Promote only high-signal sources to the analytics tier. Full data fidelity preserved in both tiers. |
| Reduce operational overhead | Unified SIEM+XDR workflows in a single portal reduce tool switching, accelerate investigations, simplify analyst onboarding, and enable SOC teams to scale without proportional headcount increases. |
| Improve detection quality | The Defender correlation engine produces higher-fidelity incidents with fewer false positives. SOC teams spend less time triaging noise and more time on real threats. |

Competitive Positioning

Partners need defensible talking points when customers evaluate third-party SIEM alternatives. The following advantages are factual, sourced from Microsoft’s transition documentation and platform capabilities, not marketing claims.
- No extra cost for transitioning, even for non-E5 customers. Third-party SIEM migrations involve licensing, data migration, detection rewrite, and integration rebuild costs.
- Native cross-domain correlation across Sentinel + Defender products into multi-stage incident graphs. Third-party SIEMs receive Microsoft logs as flat events; they lack the internal signal context, entity resolution, and product-specific intelligence that powers cross-domain correlation.
- Custom detections across SIEM + XDR: query both Sentinel and Defender XDR tables without ingesting Defender data into Sentinel. Eliminates redundant ingestion cost.
- Alert tuning extends to Sentinel: previously a Defender-only capability, now applicable to Sentinel analytics rules. Net-new noise reduction.
- Unified entity pages: consolidated user, device, and IP address pages with data from both Sentinel and Defender XDR, plus global search across SIEM and XDR. Third-party SIEMs provide entity views from ingested data only.
- Built-in multi-tenancy for MSSPs: the multitenant portal manages incidents, alerts, and hunting across tenants without third-party management layers. Try out the new GDAP capabilities in the Defender portal.

Industry validation: Microsoft’s SIEM+XDR platform has been recognized as a Leader by both Forrester (Security Analytics Platforms, 2025) and Gartner (SIEM Magic Quadrant, 2025).

Summary: What Partners Should Take Away

| Topic | Key Message |
|---|---|
| Framing | USX is a security architecture transformation, not a portal transition. Lead with detection capability, not cost savings. |
| Platform foundation | Sentinel Data Lake + Sentinel Graph + MCP/Agent Framework = the platform for the Agentic SOC. |
| 4 investigation surfaces | KQL → Graph → Notebooks → Agent/MCP. A maturity ladder from “we can query” to “we automate at machine speed.” |
| Architecture | 2-tier data model (analytics + Data Lake) with dual-ingest for critical sources. Cost savings are a side effect of good architecture. |
| Transition complexity | Analytics rules and automation rules. API schema changes. RBAC restructuring. Most customers need professional help. |
| Partner engagement model | Professional Services (transactional + transition execution + advisory) → Managed Services (MSSP). |
| Competitive positioning | No extra cost. Native correlation. Cross-domain detections. Built-in multi-tenancy. Capabilities third-party SIEMs cannot replicate. |
| Partner differentiation | IP + Human Orchestration + AI Agents. Partners who build proprietary agents on MCP have competitive advantage. |
| Timeline | March 31, 2027. Start now: phased transition with one telemetry domain first, then scale. |

2.2K Views, 4 likes, 4 Comments

Automating Daily MDE Compliance Monitoring Across Azure VMs
The Problem We’re Solving
Most security teams have no automated way to know when a VM silently falls out of MDE coverage, whether because the agent stopped, the VM was newly provisioned without onboarding, or the device stopped reporting. This Logic App closes that gap and puts the right information in front of the right people every day.

Disclaimer: This solution is designed for Azure Virtual Machines only. For non-Azure VMs onboarded to Microsoft Defender for Endpoint through Azure Arc, a separate companion blog will be published soon to cover that scenario.

What changes once you deploy this

Challenge Without This Logic App | How This Logic App Helps
Security gaps go undetected for days or weeks | Any VM that is not onboarded or has stopped reporting is caught within 24 hours of the daily run
No automated owner notification | The VM's ServerOwner tag is read automatically, and the owner is emailed directly with full compliance details
VMs with no owner fall through the cracks | Flagged explicitly in the IT summary report with instructions for how to assign the tag
Manual compliance reporting is time-consuming | Full CSV report auto-attached to every daily IT summary; no manual extraction needed
Agents silently stop reporting after onboarding | Detects "Onboarded, Not Reporting" as a distinct status, separate from "Not Onboarded"
Large multi-subscription environments are hard to cover | Paginated queries across all enabled subscriptions; every running VM is checked

Compliance States Detected

Compliance Status | Priority | What It Means
Not Onboarded | P2, High | The VM is running in Azure but has never appeared in MDE. There is zero security telemetry for this machine.
Onboarded, Not Reporting | P3, Medium | The VM was previously enrolled but has not checked in within the configured window. The MDE agent may be stopped or the VM may have lost network connectivity to MDE.
Compliant | No alert | The VM is onboarded and checked in within the required time window. It is excluded from all notifications.

Running VMs Only: This workflow queries Azure Resource Graph with a filter of powerState == "VM running". Deallocated, stopped, and powered-off VMs are intentionally excluded because they are not expected to report to MDE while offline. Only machines that are turned on are evaluated.

Workflow Architecture
The workflow runs as a sequential daily pipeline. All Azure VM data and MDE device data are collected into memory first, then each VM is evaluated in a single For Each loop.

Execution Pipeline
1. Recurrence trigger fires daily at 08:00 IST.
2. CONFIG compose action reads MDE_LASTSEEN_HOURS (default 24). This defines the compliance window: how recently a VM must have reported to MDE to be considered Compliant.
3. Init-varITTeamEmail and Init-varSenderEmail load the configurable email addresses used for sending and receiving notifications.
4. Get-AllSubscriptions calls the Azure Management API to discover all subscriptions in the tenant.
5. ForEach-Subscription runs a paginated Azure Resource Graph query per enabled subscription, collecting all running VMs along with Private IP, OS Type, Location, ServerOwner tag, and VM UUID.
6. Init-MDEVariables then Paginate-MDEDevices call the MDE Security Center API in pages of 10,000 to load every enrolled device into the AllMDEDevices array.
7. ForEach-AzureVM looks each Azure VM up in AllMDEDevices and determines compliance status and priority.
8. Non-compliant handling builds HTML and CSV rows. If the VM has a ServerOwner tag, a compliance alert email goes to the owner with the IT Team CC'd. If there's no owner, the VM is appended to NoOwnerList.
9. IT Summary email is sent once all VMs are processed. If any non-compliant VMs were found, the consolidated IT report is sent with the CSV attachment. Otherwise an All Clear email is sent.

How Azure VM Data is Matched to MDE Data
Each Azure VM is matched against the MDE device list using a two-level strategy. Both checks run for every VM on every run.
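As a rough illustration of the matching rules tabulated next and the compliance states listed earlier, the logic can be sketched in Python. This is only a sketch under assumptions: the function names (parse_utc, find_mde_device, compliance_status) are hypothetical, the field names are taken from the workflow JSON in Appendix A, and timestamps are compared as parsed datetimes rather than as strings the way the workflow expressions do.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts):
    """Parse an ISO 8601 UTC timestamp such as '2024-01-02T01:00:00.1234567Z'."""
    ts = ts.rstrip("Z")
    if "." in ts:
        head, frac = ts.split(".", 1)
        # datetime stores microseconds, so normalise the fraction to 6 digits.
        ts = f"{head}.{frac[:6].ljust(6, '0')}"
    return datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)


def find_mde_device(vm, mde_devices):
    """Return the first MDE record that matches the Azure VM, or None."""
    vm_id = (vm.get("VmId") or "").lower()
    vm_name = (vm.get("VMName") or "").lower()
    for device in mde_devices:
        azure_vm_id = (device.get("azureVmId") or "").lower()
        if azure_vm_id:
            # Primary match: Azure VM ID, compared in lowercase.
            if azure_vm_id == vm_id:
                return device
            continue  # The fallback applies only when azureVmId is empty.
        # Fallback match: hostname prefix AND private IP must both agree.
        dns_name = (device.get("computerDnsName") or "").lower()
        if dns_name.startswith(vm_name) and device.get("lastIpAddress") == vm.get("PrivateIP"):
            return device
    return None


def compliance_status(device, now, lastseen_hours=24):
    """Classify a VM from its matched MDE record (None means no match)."""
    if device is None or device.get("onboardingStatus") != "Onboarded":
        return "Not Onboarded"
    if parse_utc(device["lastSeen"]) > now - timedelta(hours=lastseen_hours):
        return "Compliant"
    return "Onboarded - Not Reporting"


# Priority labels as used by the workflow's Get-Priority action.
PRIORITY = {"Not Onboarded": "P2 - High", "Onboarded - Not Reporting": "P3 - Medium"}
```

Calling compliance_status(find_mde_device(vm, devices), datetime.now(timezone.utc)) for each VM mirrors what the ForEach-AzureVM loop does; raising or lowering lastseen_hours has the same effect as changing MDE_LASTSEEN_HOURS.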
Match Method | How It Works
Primary: Azure VM ID | Compares azureVmId from the MDE device record (lowercase) against the VmId captured from Azure Resource Graph (lowercase). Immune to hostname changes; this is the preferred match.
Fallback: Hostname + IP | Checks that MDE computerDnsName starts with the Azure VM name (case-insensitive) AND lastIpAddress matches the Azure Private IP. Both conditions must be true. In the workflow filter, this check applies to MDE records whose azureVmId is empty.
Not Found | A synthetic MDE record with onboardingStatus: "NotFound" is created. The VM is treated as Not Onboarded and a P2 High alert is raised.

Pagination Design
The workflow handles large environments through two independent pagination mechanisms that run before any compliance evaluation begins.

Data Source | Page Size | Mechanism
Azure Resource Graph | 1,000 VMs per page | Uses $skipToken from the response. The Until loop re-queries with the token until no token is returned (last page). Variables VMSkipToken and VMFetchComplete manage loop state per subscription. Supports up to 50,000 VMs (50 pages).
MDE Security Center API | 10,000 devices per page | Uses the $skip offset parameter. MDESkip is incremented by 10,000 each iteration. The loop stops when a page returns fewer than 10,000 records. Supports up to 500,000 MDE devices (50 pages × 10,000).

Prerequisites

Azure Resources

Resource | Requirement | Notes
Azure Logic App | Standard plan, Stateful workflow | Consumption plan also supported
Managed Identity | System-assigned on the Logic App | Enable under Logic App > Identity
Sender mailbox (varSenderEmail) | Licensed Microsoft 365 account | Emails are sent FROM this address
IT Team email (varITTeamEmail) | Valid email address or distribution list | Receives all reports; CC'd on owner alerts
Azure VMs | Running, with ServerOwner tag (recommended) | Tag value must be a valid email address
MDE licensing | Microsoft Defender for Endpoint P1 or P2 | Tenant must be enrolled in MDE

The ServerOwner Tag
Server owner notifications rely on a VM-level Azure tag.
Without it, the VM is included in the IT summary, but no individual alert is sent to an owner.

Tag Name | Expected Value | Effect
ServerOwner | Valid email, e.g. john@yourcompany.com | Compliance alert sent TO this address; IT Team CC'd

If the tag is missing or empty, the VM is flagged in the Action Required: No Owner Tag Found section of the IT summary email, with step-by-step instructions for tagging it in the Azure Portal.

Required Permissions & Why
The Logic App's Managed Identity must be granted three API permissions. These are Application permissions that cannot be assigned through the Azure Portal UI, so the PowerShell script in the Permission Assignment Script section must be used. Admin consent is required.

Permission Summary

Permission | API / Service | AppId | Why It Is Required
user_impersonation | Azure Management | 797f4846-ba00-4fd7-ba43-dac1f8f63013 | Allows the Managed Identity to call the Azure Resource Graph API to query VM inventory across all subscriptions. Without this, the workflow cannot discover VMs.
WindowsDefenderATP.Read.All | MDE Security Center | fc780465-2017-40d4-a0c5-307022471b92 | Allows reading all device records from the MDE API (/api/machines). This returns onboarding status, last seen time, and health status, which are the core compliance data.
Mail.Send | Microsoft Graph | 00000003-0000-0000-c000-000000000000 | Allows sending emails via the Graph /sendMail endpoint on behalf of the varSenderEmail mailbox. Without this, no alerts or reports can be sent.

Important: The Azure Management and MDE permissions belong to separate service principals; they are NOT part of Microsoft Graph. Each permission must be assigned to its own service principal using the AppId shown above. The script in the Permission Assignment Script section handles this correctly.
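Looking back at the Pagination Design section, the two loop patterns used to collect inventory can be sketched in Python. This is a minimal sketch, not the workflow itself: the function names and the fetch_page callables are hypothetical stand-ins for the HTTP actions, and the 50-page cap mirrors the count limit on the workflow's Until loops.

```python
def fetch_all_by_skip_token(fetch_page, max_pages=50):
    """Token-based paging, as used for Azure Resource Graph.

    fetch_page(token) is a hypothetical callable that returns
    (rows, next_token); next_token is empty or None on the last page.
    """
    rows, token = [], None
    for _ in range(max_pages):
        page, token = fetch_page(token)
        rows.extend(page)
        if not token:  # No token returned means this was the last page.
            break
    return rows


def fetch_all_by_offset(fetch_page, page_size=10_000, max_pages=50):
    """Offset-based paging, as used for the MDE machines API.

    fetch_page(skip, top) is a hypothetical callable that returns a list
    of at most `top` records starting at offset `skip`.
    """
    rows, skip = [], 0
    for _ in range(max_pages):
        page = fetch_page(skip, page_size)
        rows.extend(page)
        if len(page) < page_size:  # A short page signals the end of the data.
            break
        skip += page_size
    return rows
```

Both loops stop silently at max_pages, which is why the stated ceilings are 50,000 VMs per subscription and 500,000 MDE devices; an environment larger than that would need a higher limit.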
Where to find the required values

Parameter | Where to find it in Azure Portal
$tenantID | Azure Portal > Microsoft Entra ID > Overview > Tenant ID
$managedIdentityObjectId | Logic App > Settings > Identity > System assigned tab > Object (principal) ID

Permission Assignment Script
Run this in Azure Cloud Shell or any terminal with the Microsoft.Graph PowerShell module installed. Update $tenantID and $managedIdentityObjectId before running.

    # PowerShell
    # Update these two values before running
    $tenantID = "<tenantID>"                 # Your Tenant ID
    $managedIdentityObjectId = "<objectID>"  # MI Object ID

    # Install Microsoft.Graph if not already present
    if (-not (Get-Module -ListAvailable -Name Microsoft.Graph)) {
        Install-Module -Name Microsoft.Graph -Scope CurrentUser -Force
    }

    # Connect to Microsoft Graph
    Connect-MgGraph -TenantId $tenantID `
        -Scopes "AppRoleAssignment.ReadWrite.All","Application.Read.All"

    # MDE Compliance Logic App needs 3 permissions across 3 different service principals
    $permissions = @(
        @{ Permission="user_impersonation";          AppId="797f4846-ba00-4fd7-ba43-dac1f8f63013" },
        @{ Permission="WindowsDefenderATP.Read.All"; AppId="fc780465-2017-40d4-a0c5-307022471b92" },
        @{ Permission="Mail.Send";                   AppId="00000003-0000-0000-c000-000000000000" }
    )

    foreach ($entry in $permissions) {
        # Look up the resource service principal and the matching app role
        $sp = Get-MgServicePrincipal -Filter "AppId eq '$($entry.AppId)'"
        $appRole = $sp.AppRoles | Where-Object { $_.Value -eq $entry.Permission }
        if ($null -ne $appRole) {
            New-MgServicePrincipalAppRoleAssignment `
                -ServicePrincipalId $sp.Id `
                -PrincipalId $managedIdentityObjectId `
                -ResourceId $sp.Id `
                -AppRoleId $appRole.Id
            Write-Host "Assigned: $($entry.Permission)" -ForegroundColor Green
        } else {
            Write-Host "Not found: $($entry.Permission)" -ForegroundColor Yellow
        }
    }
    Write-Host "Permission assignment finished. Review any 'Not found' lines above." -ForegroundColor Green

Verify Permissions Assigned

    # PowerShell
    # Run after the assignment script to verify all 3 permissions are present
    Get-MgServicePrincipalAppRoleAssignment `
        -ServicePrincipalId $managedIdentityObjectId |
        Select-Object AppRoleId, PrincipalDisplayName |
        Format-Table -AutoSize

Note: You should see three assignment rows in the output, one for each permission. If any are missing, re-run the assignment script. An error saying the assignment already exists is normal and can be safely ignored.

Creating the Logic App

Create the resource
1. Azure Portal > search Logic Apps > + Create.
2. Select your Subscription and Resource Group.
3. Logic App name: la-mde-compliance-monitor.
4. Plan type: Standard > Windows > select or create a Hosting Plan > Review + Create > Create.
5. Once deployed, click Go to resource.

Enable System-assigned Managed Identity
1. Open the Logic App > left menu: Settings > Identity.
2. On the System assigned tab, toggle Status to On.
3. Click Save > Yes on the confirmation dialog.
4. The Object (principal) ID appears. Copy this value for the PowerShell script.
5. Run the Permission Assignment script to assign all three permissions to this identity.

Why Managed Identity: A System-assigned Managed Identity is automatically scoped to this Logic App and deleted when the Logic App is deleted. It authenticates to Azure Management API, MDE API, and Microsoft Graph without any stored passwords or client secrets.

Create the workflow and import the JSON
1. Logic App > left menu: Workflows > + Add.
2. Workflow name: MDEComplianceMonitor. State type: Stateful. Click Create.
3. Click the workflow name > left menu: Code.
4. Press Ctrl + A > Delete to clear the editor completely.
5. Paste the complete workflow JSON from the companion file (see Appendix A).
6. Click Save. It should succeed with no validation errors.

Important: Always use Stateful.
Stateless workflows do not support run history, have a 5-minute timeout, and do not retain intermediate state, and this workflow's pagination loops depend on all three capabilities.

Configuration: What You Can Change
After importing the JSON, update only the values described below. Everything else runs automatically.

Email Address Variables

Variable | Description | Where to Update
varITTeamEmail | The IT Team email address. All IT Summary reports are sent TO this address. All per-VM owner emails CC this address. | The value field of the Init-varITTeamEmail action
varSenderEmail | The Microsoft 365 licensed account that emails are sent FROM via Graph API. The Managed Identity must hold the Mail.Send permission to send from this mailbox. | The value field of the Init-varSenderEmail action

Compliance look-up window: MDE_LASTSEEN_HOURS
This setting in the CONFIG compose action defines how recently a VM must have reported to MDE to count as Compliant. Default is 24 hours.

Value | Behaviour
24 (default) | Compliant if the VM checked in with MDE within the last 24 hours. Recommended starting point.
12 | Stricter check; suitable for high-security environments requiring near-real-time coverage.
48 | More relaxed; suitable for environments with scheduled maintenance windows or intermittent connectivity.

Running VMs Only
The Azure Resource Graph query includes a filter for powerState == "VM running". This means:
- Deallocated VMs are excluded (not expected to report to MDE while offline).
- Stopped (allocated) VMs are excluded.
- Newly started VMs are included and checked on the next daily run.

To Change the Filter: Locate the "query" string inside the Build-VMQuery-Paged action and modify the | where powerState == clause. For example, removing the filter entirely will check all VMs regardless of state.

Sample Email Notifications
The screenshots below show actual emails generated by this workflow. All sensitive data (email addresses, VM names, subscription IDs, IP addresses) has been redacted.
Per-VM owner alert
Sent to the server owner (ServerOwner tag) when their VM is non-compliant. The IT Team is CC'd. The email contains full server details, compliance status, priority, last MDE check-in time, and resolution SLA.

Note: If no ServerOwner tag is set, the VM is skipped here and included in the "No Owner Tag Found" section of the IT summary instead.

IT Team Daily Summary Report
Sent once per day to the IT Team after all owner emails are dispatched. Shows up to 20 VMs inline with a full CSV attachment containing the complete list, plus a dedicated section for VMs with no owner tag.

Note: The CSV attachment always contains the complete list of all non-compliant VMs regardless of count. The inline HTML table is limited to 20 rows to keep the email size manageable.

All Compliant VMs: If all VMs are compliant, you’ll see an email like this:

Post-Deployment Checklist
Before you leave the workflow running unattended, walk through this checklist once.

1. Logic App resource created (Standard plan, Stateful workflow)
2. System-assigned Managed Identity enabled; Object ID copied
3. PowerShell script run; user_impersonation, WindowsDefenderATP.Read.All, and Mail.Send assigned
4. Permissions verified using Get-MgServicePrincipalAppRoleAssignment (3 rows expected)
5. Workflow JSON pasted into Code view; saved without validation errors
6. varITTeamEmail updated to your IT security team or distribution list address
7. varSenderEmail updated to a licensed Microsoft 365 mailbox
8. MDE_LASTSEEN_HOURS reviewed (default 24, adjust if needed)
9. At least one Azure VM has the ServerOwner tag set with a valid email
10. Manual run triggered: Logic App > Overview > Run Trigger > Run
11. Run history shows Succeeded; no 401 or 403 errors on any HTTP action
12. IT Team received the daily summary email with CSV attachment
13. Server owner received a per-VM alert with the IT Team CC'd
14. Recurrence trigger confirmed running daily at 08:00 IST

Wrapping Up
What I love about this is how much it
accomplishes with so little: a Logic App, a Managed Identity, and three permissions. No connectors, no secrets to rotate, no third-party services. Yet every morning, your security team starts the day knowing exactly which VMs are out of MDE coverage and which owners have already been notified.

If you adopt this pattern, here are a few natural next steps to consider:
- Hook into Microsoft Sentinel by writing non-compliant VMs to a custom table for trend analysis.
- Auto-create ServiceNow or Jira tickets for VMs that remain non-compliant for more than 48 hours.
- Extend the match logic to include Arc-enabled servers, not just Azure VMs.
- Add a Teams adaptive card notification alongside email for faster response.

I'd love to hear how you're solving MDE coverage gaps in your environment.

Appendix A: Workflow JSON
The complete Logic App workflow definition is provided below. To import it: open the Logic App in Azure Portal, navigate to the workflow, click Code view, press Ctrl + A and then Delete to clear the existing content, paste the entire JSON, then click Save.
{ "definition": { "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#", "contentVersion": "1.0.0.0", "triggers": { "Recurrence": { "recurrence": { "frequency": "Day", "interval": 1, "schedule": { "hours": [ "8" ], "minutes": [ 0 ] }, "timeZone": "India Standard Time" }, "evaluatedRecurrence": { "frequency": "Day", "interval": 1, "schedule": { "hours": [ "8" ], "minutes": [ 0 ] }, "timeZone": "India Standard Time" }, "type": "Recurrence" } }, "actions": { "CONFIG": { "runAfter": {}, "type": "Compose", "inputs": { "MDE_LASTSEEN_HOURS": 24 } }, "Set-ExcludedSubscriptions": { "runAfter": { "CONFIG": [ "Succeeded" ] }, "type": "Compose", "inputs": [] }, "Init-varITTeamEmail": { "runAfter": { "Set-ExcludedSubscriptions": [ "Succeeded" ] }, "type": "InitializeVariable", "inputs": { "variables": [ { "name": "varITTeamEmail", "type": "string", "value": "admin@contoso.onmicrosoft.com" } ] } }, "Init-varSenderEmail": { "runAfter": { "Init-varITTeamEmail": [ "Succeeded" ] }, "type": "InitializeVariable", "inputs": { "variables": [ { "name": "varSenderEmail", "type": "string", "value": "admin@contoso.onmicrosoft.com" } ] } }, "Get-AllSubscriptions": { "runAfter": { "Init-varSenderEmail": [ "Succeeded" ] }, "type": "Http", "inputs": { "uri": "https://management.azure.com/subscriptions?api-version=2022-12-01", "method": "GET", "headers": { "Content-Type": "application/json" }, "authentication": { "type": "ManagedServiceIdentity", "audience": "https://management.azure.com" }, "retryPolicy": { "type": "fixed", "count": 3, "interval": "PT60S" } } }, "Parse-AllSubscriptions": { "runAfter": { "Get-AllSubscriptions": [ "Succeeded" ] }, "type": "ParseJson", "inputs": { "content": "@body('Get-AllSubscriptions')", "schema": { "type": "object", "properties": { "value": { "type": "array", "items": { "type": "object", "properties": { "subscriptionId": { "type": "string" }, "displayName": { "type": "string" }, "state": { 
"type": "string" } } } } } } } }, "Init-AllVMs": { "runAfter": { "Parse-AllSubscriptions": [ "Succeeded" ] }, "type": "InitializeVariable", "inputs": { "variables": [ { "name": "AllVMs", "type": "array", "value": [] }, { "name": "VMSkipToken", "type": "string", "value": "INIT" }, { "name": "VMFetchComplete", "type": "boolean", "value": false } ] } }, "ForEach-Subscription": { "foreach": "@body('Parse-AllSubscriptions')?['value']", "actions": { "Check-SubscriptionEnabled": { "actions": { "Reset-VMSkipToken": { "type": "SetVariable", "inputs": { "name": "VMSkipToken", "value": "INIT" } }, "Reset-VMFetchComplete": { "runAfter": { "Reset-VMSkipToken": [ "Succeeded" ] }, "type": "SetVariable", "inputs": { "name": "VMFetchComplete", "value": false } }, "Until": { "actions": { "Build-VMQuery-Paged": { "type": "Compose", "inputs": { "subscriptions": [ "@{items('ForEach-Subscription')?['subscriptionId']}" ], "query": "Resources | where type == 'microsoft.compute/virtualmachines' | extend VMName = tostring(name), ResourceGroup = tostring(resourceGroup), Location = tostring(location), OSType = tostring(properties.storageProfile.osDisk.osType), VMSize = tostring(properties.hardwareProfile.vmSize), ServerOwner = tostring(tags.ServerOwner), Environment = tostring(tags.Environment), SubscriptionId = tostring(subscriptionId), nicId = tolower(tostring(properties.networkProfile.networkInterfaces[0].id)), VmId = tolower(tostring(properties.vmId)) | join kind=leftouter (Resources | where type == 'microsoft.network/networkinterfaces' | extend privateIP = tostring(properties.ipConfigurations[0].properties.privateIPAddress) | project nicId = tolower(id), privateIP) on nicId | join kind=leftouter (Resources | where type == 'microsoft.compute/virtualmachines' | extend powerState = tostring(properties.extended.instanceView.powerState.displayStatus) | project id, powerState) on id | where powerState == 'VM running' | project VMName, ResourceGroup, Location, OSType, VMSize, ServerOwner, 
Environment = 'Azure', SubscriptionId, PrivateIP = privateIP, VmId, CloudEnvironment = 'Azure'", "options": { "$skipToken": "@if(equals(variables('VMSkipToken'), 'INIT'), '', variables('VMSkipToken'))" }, "$top": 1000 } }, "Get-VMs-Paged": { "runAfter": { "Build-VMQuery-Paged": [ "Succeeded" ] }, "type": "Http", "inputs": { "uri": "https://management.azure.com/providers/Microsoft.ResourceGraph/resources?api-version=2021-03-01", "method": "POST", "headers": { "Content-Type": "application/json" }, "body": "@outputs('Build-VMQuery-Paged')", "authentication": { "type": "ManagedServiceIdentity", "audience": "https://management.azure.com" } }, "runtimeConfiguration": { "contentTransfer": { "transferMode": "Chunked" } } }, "ForEach-VM-Result-Paged": { "foreach": "@body('Get-VMs-Paged')?['data']", "actions": { "Append-SingleVM-Paged": { "type": "AppendToArrayVariable", "inputs": { "name": "AllVMs", "value": "@items('ForEach-VM-Result-Paged')" } } }, "runAfter": { "Get-VMs-Paged": [ "Succeeded" ] }, "type": "Foreach" }, "Check-VMSkipToken": { "actions": { "Set-VMFetchComplete": { "type": "SetVariable", "inputs": { "name": "VMFetchComplete", "value": true } } }, "runAfter": { "ForEach-VM-Result-Paged": [ "Succeeded" ] }, "else": { "actions": { "Set-VMSkipToken": { "type": "SetVariable", "inputs": { "name": "VMSkipToken", "value": "@body('Get-VMs-Paged')?['$skipToken']" } } } }, "expression": { "or": [ { "equals": [ "@string(body('Get-VMs-Paged')?['$skipToken'])", "" ] } ] }, "type": "If" } }, "runAfter": { "Reset-VMFetchComplete": [ "Succeeded" ] }, "expression": "@equals(variables('VMFetchComplete'), true)", "limit": { "count": 50, "timeout": "PT1H" }, "type": "Until" } }, "else": { "actions": {} }, "expression": { "and": [ { "equals": [ "@items('ForEach-Subscription')?['state']", "Enabled" ] } ] }, "type": "If" } }, "runAfter": { "Init-AllVMs": [ "Succeeded" ] }, "type": "Foreach" }, "Init-MDEVariables": { "runAfter": { "ForEach-Subscription": [ "Succeeded" ] }, "type": 
"InitializeVariable", "inputs": { "variables": [ { "name": "AllMDEDevices", "type": "array" }, { "name": "MDESkip", "type": "integer", "value": 0 }, { "name": "MDEFetchComplete", "type": "boolean", "value": false } ] } }, "Paginate-MDEDevices": { "actions": { "Get-MDEDevices-Page": { "type": "Http", "inputs": { "uri": "https://api.securitycenter.microsoft.com/api/machines?$select=computerDnsName,id,osPlatform,lastSeen,onboardingStatus,healthStatus,lastIpAddress&$top=10000&$skip=@{variables('MDESkip')}", "method": "GET", "headers": { "Content-Type": "application/json" }, "authentication": { "type": "ManagedServiceIdentity", "audience": "https://api.securitycenter.microsoft.com" }, "retryPolicy": { "type": "fixed", "count": 3, "interval": "PT60S" } }, "runtimeConfiguration": { "contentTransfer": { "transferMode": "Chunked" } } }, "Parse-MDEPage": { "runAfter": { "Get-MDEDevices-Page": [ "Succeeded" ] }, "type": "ParseJson", "inputs": { "content": "@body('Get-MDEDevices-Page')", "schema": { "type": "object", "properties": { "value": { "type": "array", "items": { "type": "object", "properties": { "computerDnsName": { "type": [ "string", "null" ] }, "id": { "type": [ "string", "null" ] }, "osPlatform": { "type": [ "string", "null" ] }, "lastSeen": { "type": [ "string", "null" ] }, "onboardingStatus": { "type": [ "string", "null" ] }, "healthStatus": { "type": [ "string", "null" ] }, "lastIpAddress": { "type": [ "string", "null" ] }, "azureVmId": { "type": [ "string", "null" ] } } } } } } } }, "Append-MDEPage-ToArray": { "foreach": "@body('Parse-MDEPage')?['value']", "actions": { "Append-SingleMDEDevice": { "type": "AppendToArrayVariable", "inputs": { "name": "AllMDEDevices", "value": "@items('Append-MDEPage-ToArray')" } } }, "runAfter": { "Parse-MDEPage": [ "Succeeded" ] }, "type": "Foreach" }, "Check-PageSize": { "actions": { "Set-FetchComplete-True": { "type": "SetVariable", "inputs": { "name": "MDEFetchComplete", "value": true } } }, "runAfter": { 
"Append-MDEPage-ToArray": [ "Succeeded" ] }, "else": { "actions": { "Increment-MDESkip": { "type": "IncrementVariable", "inputs": { "name": "MDESkip", "value": 10000 } } } }, "expression": { "and": [ { "less": [ "@length(body('Parse-MDEPage')?['value'])", 10000 ] } ] }, "type": "If" } }, "runAfter": { "Init-MDEVariables": [ "Succeeded" ] }, "expression": "@equals(variables('MDEFetchComplete'), true)", "limit": { "count": 50, "timeout": "PT1H" }, "type": "Until" }, "Init-Variables": { "runAfter": { "Paginate-MDEDevices": [ "Succeeded" ] }, "type": "InitializeVariable", "inputs": { "variables": [ { "name": "EmailsSent", "type": "array", "value": [] }, { "name": "NoOwnerList", "type": "array", "value": [] }, { "name": "NonCompliantList", "type": "array", "value": [] }, { "name": "SummaryStats", "type": "object", "value": { "TotalNonCompliant": 0, "P1Critical": 0, "P2High": 0, "P3Medium": 0, "P4Low": 0, "EmailsSent": 0, "NoOwnerFound": 0 } }, { "name": "HTMLRows", "type": "string" }, { "name": "NonCompliantCount", "type": "integer", "value": 0 }, { "name": "CSVRows", "type": "string", "value": "@{concat('\"VM Name\",\"Private IP\",\"OS Type\",\"Location\",\"Server Owner\",\"MDE Status\",\"Last Seen\",\"Priority\",\"Action Taken\",\"Subscription ID\"', decodeUriComponent('%0A'))}" }, { "name": "HTMLRowCount", "type": "integer", "value": 0 } ] } }, "ForEach-AzureVM": { "foreach": "@variables('AllVMs')", "actions": { "Find-VMInMDE-Filter": { "type": "Query", "inputs": { "from": "@variables('AllMDEDevices')", "where": "@or(and(not(equals(item()?['azureVmId'], null)), not(equals(item()?['azureVmId'], '')), equals(toLower(item()?['azureVmId']), toLower(items('ForEach-AzureVM')?['VmId']))), and(or(equals(item()?['azureVmId'], null), equals(item()?['azureVmId'], '')), startsWith(toLower(item()?['computerDnsName']), toLower(items('ForEach-AzureVM')?['VMName'])), equals(item()?['lastIpAddress'], items('ForEach-AzureVM')?['PrivateIP'])))" } }, "Find-VMInMDE": { "runAfter": { 
"Find-VMInMDE-Filter": [ "Succeeded" ] }, "type": "Compose", "inputs": "@if(greater(length(body('Find-VMInMDE-Filter')), 0), first(body('Find-VMInMDE-Filter')), json('{\"computerDnsName\":\"NOT_FOUND\",\"onboardingStatus\":\"NotFound\",\"lastSeen\":\"1900-01-01T00:00:00Z\",\"lastIpAddress\":\"N/A\",\"healthStatus\":\"Unknown\"}'))" }, "Get-ComplianceStatus": { "runAfter": { "Find-VMInMDE": [ "Succeeded" ] }, "type": "Compose", "inputs": "@if(equals(outputs('Find-VMInMDE')?['computerDnsName'], 'NOT_FOUND'), 'Not Onboarded', if(equals(outputs('Find-VMInMDE')?['onboardingStatus'], 'Onboarded'), if(greater(outputs('Find-VMInMDE')?['lastSeen'], addHours(utcNow(), mul(-1, outputs('CONFIG')?['MDE_LASTSEEN_HOURS']))), 'Compliant', 'Onboarded - Not Reporting'), 'Not Onboarded'))" }, "Get-Priority": { "runAfter": { "Get-ComplianceStatus": [ "Succeeded" ] }, "type": "Compose", "inputs": "@if(equals(outputs('Get-ComplianceStatus'), 'Not Onboarded'), 'P2 - High', if(equals(outputs('Get-ComplianceStatus'), 'Onboarded - Not Reporting'), 'P3 - Medium', if(equals(outputs('Get-ComplianceStatus'), 'Compliant'), 'Compliant', 'P4 - Low')))" }, "Is-NonCompliant": { "actions": { "Append-CSVRows": { "type": "AppendToStringVariable", "inputs": { "name": "CSVRows", "value": "\"@{items('ForEach-AzureVM')?['VMName']}\",\"@{if(equals(items('ForEach-AzureVM')?['PrivateIP'], ''), 'N/A', items('ForEach-AzureVM')?['PrivateIP'])}\",\"@{items('ForEach-AzureVM')?['OSType']}\",\"@{items('ForEach-AzureVM')?['Location']}\",\"@{if(equals(items('ForEach-AzureVM')?['ServerOwner'], ''), 'No Owner Tag', items('ForEach-AzureVM')?['ServerOwner'])}\",\"@{outputs('Get-ComplianceStatus')}\",\"@{if(equals(outputs('Find-VMInMDE')?['onboardingStatus'], 'Onboarded'), if(equals(outputs('Find-VMInMDE')?['lastSeen'], '1900-01-01T00:00:00Z'), 'Never', concat(convertTimeZone(outputs('Find-VMInMDE')?['lastSeen'], 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss'), ' (', string(div(sub(ticks(utcNow()), 
ticks(outputs('Find-VMInMDE')?['lastSeen'])), 864000000000)), ' days ago)')), concat(outputs('Find-VMInMDE')?['onboardingStatus'], ' - Last Seen: ', convertTimeZone(outputs('Find-VMInMDE')?['lastSeen'], 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss')))}\",\"@{outputs('Get-Priority')}\",\"@{if(equals(items('ForEach-AzureVM')?['ServerOwner'], ''), 'IT Team Notified', 'Email sent to Server Owner')}\",\"@{items('ForEach-AzureVM')?['SubscriptionId']}\"@{decodeUriComponent('%0A')}" } }, "Check-HTMLRowCount": { "actions": { "Append-HTMLRows": { "type": "AppendToStringVariable", "inputs": { "name": "HTMLRows", "value": "<tr><td style=\"padding:8px 10px;border:1px solid #ddd;font-weight:600;\">@{items('ForEach-AzureVM')?['VMName']}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(items('ForEach-AzureVM')?['PrivateIP'], ''), 'N/A', items('ForEach-AzureVM')?['PrivateIP'])}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['OSType']}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['Location']}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(items('ForEach-AzureVM')?['ServerOwner'], ''), 'No Owner Tag', items('ForEach-AzureVM')?['ServerOwner'])}</td><td style=\"padding:8px 10px;border:1px solid #ddd;color:#c80000;\">@{outputs('Get-ComplianceStatus')}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(outputs('Find-VMInMDE')?['onboardingStatus'], 'Onboarded'), if(equals(outputs('Find-VMInMDE')?['lastSeen'], '1900-01-01T00:00:00Z'), 'Never', concat(convertTimeZone(outputs('Find-VMInMDE')?['lastSeen'], 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss'), ' (', string(div(sub(ticks(utcNow()), ticks(outputs('Find-VMInMDE')?['lastSeen'])), 864000000000)), ' days ago)')), concat(outputs('Find-VMInMDE')?['onboardingStatus'], ' - Last Seen: ', 
convertTimeZone(outputs('Find-VMInMDE')?['lastSeen'], 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss')))}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{outputs('Get-Priority')}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(items('ForEach-AzureVM')?['ServerOwner'], ''), 'IT Team Notified', 'Email sent to Server Owner')}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-break:break-all;\">@{items('ForEach-AzureVM')?['SubscriptionId']}</td></tr>" } }, "Increment-HTMLRowCount": { "runAfter": { "Append-HTMLRows": [ "Succeeded" ] }, "type": "IncrementVariable", "inputs": { "name": "HTMLRowCount", "value": 1 } } }, "runAfter": { "Append-CSVRows": [ "Succeeded" ] }, "else": { "actions": {} }, "expression": { "and": [ { "less": [ "@variables('HTMLRowCount')", 20 ] } ] }, "type": "If" }, "Increment-NonCompliantCount": { "runAfter": { "Check-HTMLRowCount": [ "Succeeded" ] }, "type": "IncrementVariable", "inputs": { "name": "NonCompliantCount", "value": 1 } }, "Check-ServerOwner": { "actions": { "Send-OwnerEmail": { "type": "Http", "inputs": { "uri": "@{concat('https://graph.microsoft.com/v1.0/users/', encodeURIComponent(variables('varSenderEmail')), '/sendMail')}", "method": "POST", "headers": { "Content-Type": "application/json" }, "body": { "message": { "subject": "[@{outputs('Get-Priority')}] MDE Compliance Alert - @{items('ForEach-AzureVM')?['VMName']}", "body": { "contentType": "HTML", "content": "<html><body style=\"font-family:Segoe UI,Arial,sans-serif;color:#1a1a1a;\"><div style=\"max-width:680px;margin:24px auto;border:1px solid #e0e0e0;border-radius:8px;overflow:hidden;\"><div style=\"background:#c80000;padding:20px 28px;\"><h2 style=\"color:#fff;margin:0;\">MDE Compliance Alert</h2><p style=\"color:#ffcccc;margin:6px 0 0;font-size:13px;\">Priority: @{outputs('Get-Priority')}</p></div><div style=\"padding:28px;\"><p style=\"margin-top:0;font-size:14px;\">Your server 
<strong>@{items('ForEach-AzureVM')?['VMName']}</strong> has a Microsoft Defender for Endpoint compliance issue requiring immediate attention.</p><table style=\"width:100%;border-collapse:collapse;font-size:14px;\"><thead><tr style=\"background:#f5f5f5;\"><th style=\"text-align:left;padding:10px 14px;border:1px solid #ddd;width:38%;\">Field</th><th style=\"text-align:left;padding:10px 14px;border:1px solid #ddd;word-wrap:break-word;\">Value</th></tr></thead><tbody><tr><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Server Name</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['VMName']}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Private IP</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(items('ForEach-AzureVM')?['PrivateIP'], ''), 'N/A', items('ForEach-AzureVM')?['PrivateIP'])}</td></tr><tr><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">OS Type</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['OSType']}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Location</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['Location']}</td></tr><tr><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Compliance Status</td><td style=\"padding:9px 14px;border:1px solid #ddd;color:#c80000;font-weight:700;\">@{outputs('Get-ComplianceStatus')}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Priority</td><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:700;\">@{outputs('Get-Priority')}</td></tr><tr><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">MDE Onboarding Status</td><td 
style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{outputs('Find-VMInMDE')?['onboardingStatus']}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Last Seen in MDE (IST)</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(outputs('Find-VMInMDE')?['lastSeen'], '1900-01-01T00:00:00Z'), 'Never', concat(convertTimeZone(outputs('Find-VMInMDE')?['lastSeen'], 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss'), ' (', string(div(sub(ticks(utcNow()), ticks(outputs('Find-VMInMDE')?['lastSeen'])), 864000000000)), ' days ago)'))}</td></tr><tr><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Resource Group</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-wrap:break-word;\">@{items('ForEach-AzureVM')?['ResourceGroup']}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 14px;border:1px solid #ddd;font-weight:600;\">Subscription ID</td><td style=\"padding:9px 14px;border:1px solid #ddd;word-break:break-all;\">@{items('ForEach-AzureVM')?['SubscriptionId']}</td></tr></tbody></table><br/><table style=\"width:100%;border-collapse:collapse;\"><tr style=\"background:#fff8e1;\"><td style=\"padding:10px 14px;border:1px solid #ffe082;font-size:13px;\"><strong>Resolution SLA:</strong> P1 Critical - 24hrs | P2 High - 48hrs | P3 Medium - 72hrs</td></tr></table><br/><p style=\"font-size:13px;color:#555;\">For assistance contact IT Security: <a href=\"mailto:@{variables('varITTeamEmail')}\">@{variables('varITTeamEmail')}</a></p></div></div></body></html>" }, "toRecipients": [ { "emailAddress": { "address": "@{items('ForEach-AzureVM')?['ServerOwner']}" } } ], "ccRecipients": [ { "emailAddress": { "address": "@variables('varITTeamEmail')" } } ] }, "saveToSentItems": "true" }, "authentication": { "type": "ManagedServiceIdentity", "audience": "https://graph.microsoft.com" }, "retryPolicy": { "type": "fixed", "count": 2, 
"interval": "PT60S" } }, "runtimeConfiguration": { "contentTransfer": { "transferMode": "Chunked" } } }, "Append-EmailsSent": { "runAfter": { "Send-OwnerEmail": [ "Succeeded" ] }, "type": "AppendToArrayVariable", "inputs": { "name": "EmailsSent", "value": "@{items('ForEach-AzureVM')?['VMName']} → @{items('ForEach-AzureVM')?['ServerOwner']}" } } }, "runAfter": { "Increment-NonCompliantCount": [ "Succeeded" ] }, "else": { "actions": { "Append-NoOwnerList": { "type": "AppendToArrayVariable", "inputs": { "name": "NoOwnerList", "value": "<tr><td style=\"padding:8px 10px;border:1px solid #ddd;font-weight:600;\">@{items('ForEach-AzureVM')?['VMName']}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{if(equals(items('ForEach-AzureVM')?['PrivateIP'], ''), 'N/A', items('ForEach-AzureVM')?['PrivateIP'])}</td><td style=\"padding:8px 10px;border:1px solid #ddd;word-wrap:break-word;\">@{outputs('Get-ComplianceStatus')}</td><td style=\"padding:8px 10px;border:1px solid #ddd;font-weight:700;\">@{outputs('Get-Priority')}</td></tr>" } } } }, "expression": { "and": [ { "not": { "equals": [ "@items('ForEach-AzureVM')?['ServerOwner']", "" ] } } ] }, "type": "If" } }, "runAfter": { "Get-Priority": [ "Succeeded" ] }, "else": { "actions": {} }, "expression": { "and": [ { "not": { "equals": [ "@outputs('Get-ComplianceStatus')", "Compliant" ] } } ] }, "type": "If" } }, "runAfter": { "Init-Variables": [ "Succeeded" ] }, "type": "Foreach", "runtimeConfiguration": { "concurrency": { "repetitions": 1 } } }, "Check-AnyNonCompliant": { "actions": { "Send-ITSummaryEmail": { "type": "Http", "inputs": { "uri": "@{concat('https://graph.microsoft.com/v1.0/users/', encodeURIComponent(variables('varSenderEmail')), '/sendMail')}", "method": "POST", "headers": { "Content-Type": "application/json" }, "body": { "message": { "subject": "MDE Compliance Report (Azure Workloads) - @{variables('NonCompliantCount')} Non-Compliant VMs Found", "body": { "contentType": "HTML", 
"content": "<html><body style=\"font-family:Segoe UI,Arial,sans-serif;color:#1a1a1a;\"><div style=\"max-width:1400px;margin:24px auto;border:1px solid #e0e0e0;border-radius:8px;\"><div style=\"background:#0078d4;padding:20px 28px;\"><h2 style=\"color:#fff;margin:0;\">MDE Compliance Daily Report</h2><p style=\"color:#cce4ff;margin:6px 0 0;font-size:13px;\">Generated: @{convertTimeZone(utcNow(), 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss')} IST</p></div><div style=\"padding:28px;\"><table style=\"border-collapse:collapse;font-size:14px;margin-bottom:28px;\"><thead><tr style=\"background:#f0f0f0;\"><th style=\"padding:10px 18px;border:1px solid #ddd;word-wrap:break-word;\">Metric</th><th style=\"padding:10px 18px;border:1px solid #ddd;word-wrap:break-word;\">Value</th></tr></thead><tbody><tr><td style=\"padding:9px 18px;border:1px solid #ddd;word-wrap:break-word;\">Total Non-Compliant VMs</td><td style=\"padding:9px 18px;border:1px solid #ddd;font-weight:700;color:#c80000;\">@{variables('NonCompliantCount')}</td></tr><tr style=\"background:#fafafa;\"><td style=\"padding:9px 18px;border:1px solid #ddd;word-wrap:break-word;\">Server Owners Notified</td><td style=\"padding:9px 18px;border:1px solid #ddd;color:#107c10;font-weight:600;\">@{length(variables('EmailsSent'))}</td></tr><tr><td style=\"padding:9px 18px;border:1px solid #ddd;word-wrap:break-word;\">No Owner Tag</td><td style=\"padding:9px 18px;border:1px solid #ddd;color:#e65100;font-weight:600;\">@{length(variables('NoOwnerList'))}</td></tr></tbody></table><p style=\"background:#fff3cd;border:1px solid #ffc107;padding:10px 14px;border-radius:4px;font-size:13px;margin-bottom:16px;\">This report shows the first <strong>20 non-compliant VMs</strong> only. 
<strong>Please check the attached CSV file</strong> for the complete list.</p><table style=\"width:100%;table-layout:fixed;border-collapse:collapse;font-size:13px;\"><colgroup><col style=\"width:120px\"><col style=\"width:90px\"><col style=\"width:70px\"><col style=\"width:100px\"><col style=\"width:160px\"><col style=\"width:110px\"><col style=\"width:165px\"><col style=\"width:80px\"><col style=\"width:90px\"><col style=\"width:195px\"></colgroup><thead><tr style=\"background:#0078d4;color:#fff;\"><th style=\"padding:10px 12px;border:1px solid #005a9e;\">VM Name</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Private IP</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">OS Type</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Location</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Server Owner</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">MDE Status</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Last Seen (IST)</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Priority</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Action Taken</th><th style=\"padding:10px 12px;border:1px solid #005a9e;\">Subscription ID</th></tr></thead><tbody>@{variables('HTMLRows')}</tbody></table><br/><h3 style=\"border-bottom:2px solid #e65100;padding-bottom:8px;\">Action Required - No Owner Tag Found</h3><div style=\"background:#fff8f0;border:1px solid #ffccbc;padding:16px;border-radius:4px;font-size:13px;margin-bottom:16px;\"><p style=\"margin:0 0 8px 0;\">The following <strong>@{length(variables('NoOwnerList'))}</strong> server(s) have no <strong>ServerOwner</strong> tag assigned.</p><ol style=\"margin:0;padding-left:20px;\"><li style=\"margin-bottom:6px;\">Identify the owner of each server below</li><li style=\"margin-bottom:6px;\">Go to the VM in Azure Portal → Tags → Add tag</li><li style=\"margin-bottom:6px;\"><strong>Tag Name:</strong> ServerOwner | <strong>Tag 
Value:</strong> owner email address</li><li>Once tagged, the next daily report will automatically notify the owner</li></ol></div><table style=\"width:100%;table-layout:fixed;border-collapse:collapse;font-size:13px;\"><thead><tr style=\"background:#e65100;color:#fff;\"><th style=\"padding:10px 12px;border:1px solid #bf360c;text-align:left;\">VM Name</th><th style=\"padding:10px 12px;border:1px solid #bf360c;text-align:left;\">Private IP</th><th style=\"padding:10px 12px;border:1px solid #bf360c;text-align:left;\">MDE Status</th><th style=\"padding:10px 12px;border:1px solid #bf360c;text-align:left;\">Priority</th></tr></thead><tbody>@{if(equals(length(variables('NoOwnerList')), 0), '<tr><td colspan=\"4\" style=\"padding:12px;text-align:center;\">None - All servers have owner tags assigned</td></tr>', join(variables('NoOwnerList'), ''))}</tbody></table></div></div></body></html>" }, "toRecipients": [ { "emailAddress": { "address": "@variables('varITTeamEmail')" } } ], "attachments": [ { "@@odata.type": "#microsoft.graph.fileAttachment", "name": "@{concat('MDE-Compliance-Report-', convertTimeZone(utcNow(), 'UTC', 'India Standard Time', 'dd-MM-yyyy'), '.csv')}", "contentType": "text/csv", "contentBytes": "@{base64(variables('CSVRows'))}" } ] }, "saveToSentItems": "true" }, "authentication": { "type": "ManagedServiceIdentity", "audience": "https://graph.microsoft.com" } }, "runtimeConfiguration": { "contentTransfer": { "transferMode": "Chunked" } } } }, "runAfter": { "ForEach-AzureVM": [ "Succeeded" ] }, "else": { "actions": { "Send-AllClearEmail": { "type": "Http", "inputs": { "uri": "@{concat('https://graph.microsoft.com/v1.0/users/', encodeURIComponent(variables('varSenderEmail')), '/sendMail')}", "method": "POST", "headers": { "Content-Type": "application/json" }, "body": { "message": { "subject": "[@{convertTimeZone(utcNow(), 'UTC', 'India Standard Time', 'dd-MM-yyyy')}] MDE Compliance Report - All VMs Compliant", "body": { "contentType": "HTML", "content": 
"<html><body style=\"font-family:Segoe UI,Arial,sans-serif;color:#1a1a1a;\"><div style=\"max-width:600px;margin:24px auto;border:1px solid #e0e0e0;border-radius:8px;overflow:hidden;\"><div style=\"background:#107c10;padding:20px 28px;\"><h2 style=\"color:#fff;margin:0;\">MDE Compliance Report</h2><p style=\"color:#c8e6c9;margin:6px 0 0;font-size:13px;\">Generated: @{convertTimeZone(utcNow(), 'UTC', 'India Standard Time', 'dd-MM-yyyy HH:mm:ss')} IST</p></div><div style=\"padding:28px;text-align:center;\"><h2 style=\"color:#107c10;\">All VMs Compliant</h2><p style=\"font-size:15px;color:#555;\">All Azure Virtual Machines are onboarded to Microsoft Defender for Endpoint and reporting within the required 24-hour window.</p><p style=\"font-size:13px;color:#888;\">No action required. The next report will be sent tomorrow at 08:00 IST.</p></div></div></body></html>" }, "toRecipients": [ { "emailAddress": { "address": "@variables('varITTeamEmail')" } } ] }, "saveToSentItems": "true" }, "authentication": { "type": "ManagedServiceIdentity", "audience": "https://graph.microsoft.com" } }, "runtimeConfiguration": { "contentTransfer": { "transferMode": "Chunked" } } } } }, "expression": { "and": [ { "greater": [ "@variables('NonCompliantCount')", 0 ] } ] }, "type": "If" } }, "parameters": { "$connections": { "type": "Object", "defaultValue": {} } } }, "parameters": { "$connections": { "type": "Object", "value": {} } } }At-Scale Failure Reporting for Azure Update Manager
Introduction

Azure Update Manager simplifies patching across Azure virtual machines and Azure Arc-enabled servers by providing a centralized platform for patch assessment and installation. However, as environments scale, a key challenge emerges: efficiently identifying and troubleshooting patch failures across large fleets of machines.

While Azure Update Manager surfaces detailed error messages in the Azure portal, this information is typically available only at an individual machine level. In enterprise environments managing hundreds or thousands of systems, drilling into each VM to find error details quickly becomes impractical.

In this article, we walk through a real-world use case and demonstrate how to leverage Azure Resource Graph (ARG) to extract failed machines along with their error details for a specific maintenance run, using a single query.

The Challenge: Scaling Patch Failure Visibility

In a large enterprise deployment, Azure Update Manager was configured to manage patching across:

- Windows and Linux virtual machines
- Azure cloud VMs and Arc-enabled on‑premises servers
- Multiple regions and subscriptions

While patching operations were largely successful, a subset of machines experienced failures. The key challenges faced by the operations team were:

- Error messages were visible only by drilling into each failed VM in the portal
- No built‑in way to aggregate failures across all machines
- Lack of a simple mechanism to export:
  - Failed VMs
  - Error codes
  - Error messages

The team needed a scalable, query‑driven approach to analyze failures across an entire maintenance run.

Key Insight: Where Azure Update Manager Stores Data

Azure Update Manager does not rely on Log Analytics to store operational results.
Instead:

- Patch assessment and installation results are stored in Azure Resource Graph
- Azure Resource Graph acts as a centralized, queryable store for update operations

This design enables powerful querying without requiring additional ingestion, configuration, or cost overhead.

Understanding Maintenance Runs and Correlation IDs

Each Azure Update Manager maintenance run generates a unique identifier:

- properties.correlationId represents the maintenance (schedule) run ID
- All machines involved in the same patch cycle share this ID

This allows all machines within a single patch execution to be correlated and queried collectively.

The Solution: Query Failed VMs with Error Messages

Azure Resource Graph allows querying failures at scale using the maintenanceresources dataset.

Core Query (Kusto Query Language)

    maintenanceresources
    | where type =~ "microsoft.maintenance/applyupdates"
    | where tostring(properties.correlationId) contains "<YourMaintenanceRunID>"
    | where tostring(properties.status) =~ "Failed"
    | project properties.resourceId, properties.errorCode, properties.errorMessage

What This Query Delivers

- All machines that failed in a specific maintenance run
- Error codes for troubleshooting
- Full error messages that are otherwise visible only in the Azure portal

Note: Property names for error information can vary by environment. Validate available fields using Azure Resource Graph Explorer and adjust the project clause if required.

Sample Output (Conceptual)

    Resource ID | Error Code | Error Message
    vm-01       | 0x80244007 | Windows Update API failed
    vm-02       | 0x80072f8f | Connectivity issue
    vm-03       | 1C         | WSUS configuration issue

Advanced Scenario: Automatically Detecting the Latest Failed Maintenance Run

In real-world scenarios, you may not always know the maintenance run ID. The following query dynamically identifies the most recent maintenance run that had failures, and then retrieves all failed machines from that run.
    // Step 1: Identify the latest maintenance run ID with failures
    let lastFailedRun = toscalar(
        maintenanceresources
        | where type =~ "microsoft.maintenance/applyupdates"
        | where tostring(properties.status) =~ "Failed"
        // extract() expects a string, so cast the dynamic property first
        | extend runId = extract(@"applyupdates/(\d+)$", 1, tostring(properties.correlationId))
        | order by tostring(properties.startDateTime) desc
        | take 1
        | project runId
    );
    // Step 2: Query all failed VMs from that run
    maintenanceresources
    | where type =~ "microsoft.maintenance/applyupdates"
    | where tostring(properties.correlationId) contains lastFailedRun
    | where tostring(properties.status) =~ "Failed"
    | project properties.resourceId, properties.errorCode, properties.errorMessage

This approach is ideal for automation, scheduled reporting, and dashboard scenarios.

Why This Approach Matters

Operational Efficiency
- Eliminates manual portal navigation
- Provides consolidated failure insights in seconds

Scalability
- Works across large, distributed environments
- Supports both Azure and hybrid (Arc‑enabled) machines

Automation Ready
- Can be integrated into scripts, dashboards, and reporting pipelines
- Enables proactive monitoring and alerting scenarios

Best Practices for Enterprise Patch Reporting

To maximize the value of this approach:

- Capture and track maintenance run IDs
- Use Azure Resource Graph as the primary reporting layer
- Build reusable queries for different patch scenarios
- Export reports for compliance and auditing
- Correlate failures with root‑cause trends over time

Conclusion

As organizations scale patching operations with Azure Update Manager, visibility, speed, and automation become essential. While the Azure portal is effective for per‑machine troubleshooting, it is not optimized for fleet‑level analysis. Azure Resource Graph fills this gap by enabling a shift from manual troubleshooting to automated, query‑driven failure analysis at scale.
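To make the "automation ready" point more concrete, here is a minimal Python sketch of how the core query could be parameterised and its results summarised inside a script. It is an illustration under stated assumptions, not part of Azure Update Manager: the names build_failed_vm_query and summarize_failures are hypothetical, and the result rows are assumed to arrive as dictionaries keyed by properties_errorCode (check the actual column names your Resource Graph client returns). Fetching the rows is deliberately left out.

```python
from collections import Counter

# Same shape as the core query in the article; only the run ID is substituted.
QUERY_TEMPLATE = """maintenanceresources
| where type =~ "microsoft.maintenance/applyupdates"
| where tostring(properties.correlationId) contains "{run_id}"
| where tostring(properties.status) =~ "Failed"
| project properties.resourceId, properties.errorCode, properties.errorMessage"""


def build_failed_vm_query(run_id: str) -> str:
    """Return the failed-VM query for one maintenance run ID (hypothetical helper)."""
    # Reject characters that would break out of the quoted KQL string.
    if not run_id or any(ch in run_id for ch in '"\\\n\r'):
        raise ValueError("run_id must be non-empty, without quotes, backslashes, or newlines")
    return QUERY_TEMPLATE.format(run_id=run_id)


def summarize_failures(rows):
    """Count failed machines per error code, most frequent first.

    Assumes each row is a dict with a 'properties_errorCode' key (an assumption
    about the result shape, not a documented contract).
    """
    counts = Counter(str(row.get("properties_errorCode", "unknown")) for row in rows)
    return counts.most_common()
```

Grouping by error code first tends to show whether a run failed for one shared reason (for example a connectivity problem) or for many unrelated ones, which is usually the first question in fleet-level triage.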
By adopting this approach, teams can significantly improve operational efficiency, reduce mean time to resolution, and build a more mature patch management strategy.

Final takeaway:

- Don’t rely only on the portal
- Leverage Azure Resource Graph to operationalize patch insights at enterprise scale

References

- Azure Update Manager – Query resources with Azure Resource Graph: https://learn.microsoft.com/azure/update-manager/query-logs
- Azure Update Manager – Troubleshooting guide: https://learn.microsoft.com/azure/update-manager/troubleshoot
- Sample Azure Resource Graph queries for Azure Update Manager: https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/update-manager/sample-query-logs.md

Pending Approval/Provisioning for Microsoft Defender XDR Lab/Trial Environment
Hello Microsoft Community Team,

On June 26, 2026, our organization applied for a Microsoft 365 Developer Environment / Free Trial to support evaluation of the Microsoft Defender XDR Lab environment. To date, the environment has not been provisioned, and we have not received any status updates or confirmation.

Impact:

- Current Status: We are currently utilizing our production environment to test project capabilities, which poses risks and limitations.
- Future Intent: Our organization plans to transition to a full, paid Business/Enterprise purchase immediately upon proving the platform’s benefits.
- Urgency: This delay is stalling our evaluation phase. We urgently need this environment onboarded and activated so we can proceed with deployment tests and subsequent procurement.

Request: Please review the status of our registration and expedite the onboarding/provisioning of this developer environment.

Thank you for your prompt assistance.

Sentinel Foundry - MCP Server (Preview) (Github Community Release)
I’ve been cooking up something for a problem that a lot of people in SOC teams have been struggling with, especially on the engineering side of Microsoft Sentinel. Thanks to the Microsoft Security team for continuing to improve Sentinel's capabilities with Sentinel Data Lake and Modern SecOps. Today’s the day I can finally share it.

Note: This is not an official Microsoft product. It is designed to complement Sentinel and make building on it better, with much more intelligence.

🚀 Sentinel Foundry is now in public preview with 43 tools. (Sentinel Foundry - MCP Server)

It’s an MCP server built to act like the brain of a strong Sentinel engineer, helping make building, improving, and operating Sentinel far more practical, faster, and honestly more enjoyable.

For a lot of teams, the challenge is not understanding what Sentinel can do. The hard part is the engineering work around it:

-> Deciding what data should actually be ingested
-> Building a clean, scalable Sentinel foundation
-> Writing useful detections instead of noisy ones
-> Balancing security value with cost
-> Turning ideas into deployable engineering outputs

That is exactly why I built Sentinel Foundry: to help the community grow stronger. It helps with the real engineering tasks behind Sentinel, from architecture thinking to detection design, deployment planning, ingestion strategy, automation ideas, and many of the workflows outlined in the GitHub project.

How does it work? Here’s one of the flagship prompts I ran with it:

“Give me a complete security posture report for our workspace. Score each pillar and tell me what to prioritise.”

And within seconds, it produced a structured engineering blueprint that would normally take a lot longer to pull together manually.
You can see example prompts showing what it can do here: https://github.com/prabhukiranveesam/Sentinel-Foundry#what-can-it-do

I want building Sentinel to feel less like repetitive engineering overhead and more like real security engineering that is fast, creative, and enjoyable. If you work with Sentinel as a SOC L2 analyst, engineer, detection engineer, consultant, or architect, I’d genuinely love for you to try it and tell me what you think.

🔗 Public Preview: https://github.com/prabhukiranveesam/Sentinel-Foundry

This is just the start of an AI era, and I’m excited to keep shaping it with more powerful features over the coming days. It is easy to set up and is available to all of you at no cost this month as part of the public preview. Your feedback is extremely valuable in shaping it into a powerful solution.

Is Power Automate Becoming the New Technical Debt in Dynamics 365 Projects?
Power Automate has transformed how organisations build automation within Dynamics 365 and the Power Platform. Teams can automate processes quickly, reduce manual effort, and deliver business value without extensive custom development.

At the same time, I have noticed an interesting challenge in some organisations as Power Platform adoption matures. Over time, hundreds of flows can be created by different teams, often with varying levels of governance, documentation, and ownership. Business logic may become distributed across multiple automations, making troubleshooting, maintenance, and long-term support more complex.

On the other hand, many organisations have successfully scaled Power Automate by implementing strong governance practices and automation standards.

I'm interested in hearing different perspectives from the community. Have you seen Power Automate become difficult to manage at scale, or has it reduced technical debt in your organisation? What governance, architecture, or operational practices have worked best for balancing innovation with maintainability?

Sentinel SOAR migration to Unified portal: what broke? Anyone evaluated the AI playbook generator?
I want to open a conversation specifically focused on the automation and SOAR side of the migration, because this is the area where problems most commonly surface after onboarding rather than during it.

A quick orientation: the Unified portal introduces a specific constraint that catches teams by surprise. Alert-triggered automation for alerts created by Microsoft Defender XDR is not available in the Defender portal. The main use case for alert-triggered automation in this context is responding to alerts from analytics rules where incident creation is disabled. If you had alert-triggered playbooks firing on Defender XDR signals, those need to be re-evaluated against the incident trigger model. This is documented by Microsoft, but it is easy to miss in the volume of migration guidance.

The automation failure mode I have seen most consistently is automation rules built around incident title conditions. The Defender XDR correlation engine assigns its own incident names, so any condition keyed to "if incident title contains X" stops matching without throwing an error. The rule is still active, the automation is still enabled, and everything looks fine until someone notices a class of enrichment or response has gone quiet. Microsoft's recommendation is to use Analytic rule name as the condition instead.

There is also a firm near-term deadline separate from the March 2027 portal retirement: queries and automation need to be updated by July 1, 2026 for standardised account entity naming. The Name field will consistently hold only the UPN prefix from that date. Any automation comparing AccountName against a full UPN will break.

A few specific questions for practitioners:

- When you onboarded or reviewed your automation post-onboarding, what broke silently versus what produced a visible error? Silent failures are the dangerous ones, and sharing specific patterns would be genuinely useful for the community.
- Has anyone evaluated the new AI playbook generator in the Defender portal? It requires Security Copilot with SCUs available and generates Python-based automation coauthored with Cline in an embedded VS Code environment. I am interested in real-world comparisons against existing Logic Apps workflows for the same use case.
- For those who have migrated alert-triggered playbooks to automation rule invocation: did you find edge cases in the migration, particularly around playbooks used by multiple analytics rules simultaneously?

I am writing this up as Part 4 of the migration series and will share the article link once it is live for anyone who wants the full detail.
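For the account entity naming change described above, the comparison problem can be sketched in a few lines of Python. This is a minimal illustration rather than anything from Microsoft's guidance: the helper names upn_prefix and same_account are hypothetical, and it assumes the entity's Name field holds only the UPN prefix while the value your automation compares against may still be a full UPN.

```python
def upn_prefix(value: str) -> str:
    """Return the part before '@', lower-cased; values without '@' pass through."""
    return value.strip().split("@", 1)[0].lower()


def same_account(entity_name: str, expected: str) -> bool:
    """Compare an account entity name with an expected account on the UPN prefix only.

    Hypothetical helper: both sides are reduced to the prefix, so it works
    whether either value is a bare prefix or a full UPN.
    """
    return upn_prefix(entity_name) == upn_prefix(expected)
```

With these helpers, same_account("jdoe", "JDoe@contoso.com") is true, whereas a direct string comparison of the two values is not. Note that comparing on the prefix alone discards the domain, so two accounts with the same prefix in different domains would match; whether that is acceptable depends on your tenant.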