azure monitor
Azure Monitor Health Model (Preview): What's New!
Azure Monitor Health Model is a modern observability capability that brings together the telemetry, architecture, and business context of your workloads to generate health insights. It continuously aggregates signals across dependencies into a single, actionable health state, which reduces alert noise and shifts teams toward proactive operations with a cohesive system view, clearer insights, and faster troubleshooting. It addresses two common operational questions: 'Is my system/service/app healthy?' and 'Which underlying unit or component is impacting health?' This refresh introduces flexible, workload-centric discovery (use Application Insights topology and Azure Resource Graph queries in addition to designing user and system flows) and smarter, faster health signal creation (use recommended signals, import existing alert rules, set dynamic thresholds). Expanded Discovery Scope As customers began modeling increasingly complex applications, we identified an opportunity to make discovery more flexible and intuitive. Teams naturally reason about their systems differently: some at the application level, others through infrastructure fleets or telemetry views. By expanding discovery options, we enable customers to build health models using the constructs they already use, making it easier to evolve health models as applications and architectures change. Azure Monitor health models now support multiple discovery mechanisms: Application Insights-based discovery for application-centric modelling; Azure Resource Graph (ARG) discovery for scalable, query-based resource selection; and continued support for Service Groups, now including nested Service Groups, as part of a broader set of discovery options. This evolution reflects a shift toward loosely coupled modelling, enabling customers to define health based on application architecture rather than infrastructure-centric grouping.
Learn more about Discovery Extended Health Signals Our goal has been to help customers achieve meaningful health insights faster with less manual effort. By introducing platform defaults and surfacing recommended signals, we make it easier to align health models with proven Azure best practices from day one. At the same time, we preserve support for existing alerting strategies and investments, ensuring customers can extend rather than replace what they already have. These enhancements balance simplicity, guidance, and flexibility as environments scale. Health Models now supports the following health signal capabilities: Resource Health as a default signal, ensuring every model starts with a reliable platform-provided baseline Recommended signals, automatically surfaced based on Azure service best practices and enhanced through Azure Monitor Baseline Alerts (AMBA) integration Reuse of existing signals, enabled by importing Azure Monitor alert rules as health signals Learn more about Signals Introducing Health Aggregation Rules Modern cloud applications are built for resiliency, redundancy, and tolerance of partial failure. Health Models are designed to reflect this reality by enabling customers to define what “healthy” means for their architecture. Flexible aggregation rules allow teams to model intent rather than individual component states, producing health views that better align with operational priorities and business impact. Health Models now supports advanced aggregation logic, enabling the following types of scenarios: Regional resiliency aggregation using numeric thresholds (e.g., 2 out of 4 regions must remain healthy) Cluster and fleet health aggregation using percentage thresholds (e.g., 60% of VMs in a cluster must be healthy) This enables modelling resiliency patterns, partial failures, and graceful degradation, providing a more accurate view of real business impact. 
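To make the two aggregation examples above concrete (2 out of 4 regions, 60% of VMs), here is a minimal Python sketch of threshold-based health aggregation. It is an illustration only, not the Health Models API: the `AggregationRule` class, its field names, and the helper functions are all hypothetical, and the real service may evaluate rules differently (for example, with a separate degraded state).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AggregationRule:
    """Hypothetical rule: the parent is healthy when enough children are healthy."""
    min_healthy_count: Optional[int] = None    # numeric threshold, e.g. 2 (of 4 regions)
    min_healthy_percent: Optional[int] = None  # percentage threshold, e.g. 60 (% of VMs)


def required_healthy(rule: AggregationRule, total: int) -> int:
    """Return how many children must be healthy for the parent to be healthy."""
    if rule.min_healthy_count is not None:
        return rule.min_healthy_count
    if rule.min_healthy_percent is not None:
        # Integer ceiling of total * percent / 100, avoiding float rounding.
        return (total * rule.min_healthy_percent + 99) // 100
    return total  # assumed default: every child must be healthy


def parent_is_healthy(rule: AggregationRule, child_states: Sequence[bool]) -> bool:
    """child_states holds True for each healthy child and False otherwise."""
    healthy = sum(1 for state in child_states if state)
    return healthy >= required_healthy(rule, len(child_states))


# Regional resiliency: at least 2 of 4 regions must remain healthy.
regions = AggregationRule(min_healthy_count=2)
print(parent_is_healthy(regions, [True, True, False, False]))

# Fleet health: at least 60% of the VMs in a cluster must be healthy.
fleet = AggregationRule(min_healthy_percent=60)
print(parent_is_healthy(fleet, [True] * 5 + [False] * 5))
```

The first call reports a healthy parent even though two regions are down, which is the point of modelling intent rather than individual component states; the second reports unhealthy because only 50% of the fleet is up.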
Import Custom Signal Health is most valuable when it reflects both system behavior and application context. By enabling custom health inputs, customers can incorporate signals that are closest to their business logic and application state. Contextual annotations further enrich analysis, making health timelines easier to interpret and correlate with change events. To support this, Health Models now provides: custom health report ingestion for external application and system health signals, and data annotations to overlay deployments, incidents, and configuration changes on health state. Alert Experience To learn proactively about health state changes, health models let you create alert rules with associated action groups that trigger automated responses, such as notifying a user. It is now possible to view all the alerts on a Health Model and start troubleshooting from there. Figure: Alerts in Health Model. Note: To use these new capabilities, upgrade your health models to the new API version using the built-in migration wizard in the Azure portal for a simple, guided experience.
New Capabilities to Observe Agents in Azure Monitor
Over the last six months, we have been listening to you and building new capabilities to help you observe your agents. You’ve been sharing with us that quality issues are tricky and evaluation is critical, that agent reasoning needs to be understood, that humans must be in the loop to review select agent interactions, and that security and privacy are essential. To address these concerns, we’re announcing several new capabilities that make agents a first-class artifact in Azure Monitor, so you can debug them in the context of your broader distributed application alongside non-agentic components. Microsoft Foundry remains the surface for building and evaluating agents within the context of your project, while Azure Monitor provides the full-stack observability platform and underlying data foundation that powers those experiences. Today, we’re announcing new capabilities in Azure Monitor across ingestion, performance, evaluation workflows, agent debugging, and instrumentation updates to help teams get telemetry faster, inspect agent behavior more deeply, and standardize observability across hosting environments and frameworks. What’s new Reducing pipeline latency from more than 60 seconds to 7.5 seconds at P90. This makes telemetry available faster for teams troubleshooting agents at scale. Emitting events up to 1MB and up to 256kB per attribute. Prompts and responses can get large, and this helps avoid data truncation. Introducing a new view that shows a list of all agents being monitored. Whether you use Microsoft Agent Framework, LangChain, Microsoft Copilot Studio, Foundry Hosting, AKS Hosting, or something else, they all show up here. Improving drill-in from Evaluations to underlying prompts/responses. Evaluations in Azure Monitor are powered by Foundry, and we continue to improve visuals. Showing conversation context in end-to-end transaction view. In chat agents, conversations have become critical glue that connects traces and eases debugging. 
Searching by text and showing prompt previews in end-to-end transaction view. Prompts and responses are essential to understanding agent logic, and now you can search by keyword in the Search and End-to-end transaction details views. Showing evaluation scores in end-to-end transaction details and sorting by evaluation score in Search. Evaluation is emerging as a “4th pillar” of telemetry, and you’ll see it surface more prominently across Azure Monitor Application Insights. Accessing the entire JSON blob of prompt/response text. This makes it easier to get to your underlying data and copy it out of Azure Monitor for custom analysis or evaluation. Adding a “trace tree” to make it easier to traverse the agent’s reasoning logic. This new addition to end-to-end transaction view makes traversing long traces much easier. Enabling builders to annotate (i.e., manual evaluations) from transaction details. Get rid of spreadsheets on the side and annotate from within Azure Monitor. Enabling capture of end-user feedback (i.e., thumbs up/down). This brings end-user feedback alongside other telemetry for more powerful troubleshooting. Extending AI-powered troubleshooting to agents. The Observability agent offers full-stack, AI-powered troubleshooting and surfaces its findings in an issue. Learn More. Observability of Coding Agents. Get end-to-end visibility into agent and model usage, performance, and cost with Azure Monitor Application Insights and built-in Grafana dashboards. Learn More. A unified “Microsoft OpenTelemetry Distro” to observe agents hosted anywhere. The distro gives teams a single starting point across Foundry, Azure Monitor, and A365, reducing fragmentation and simplifying onboarding (GH Repos: Python, .NET, JavaScript). Skills-based enablement. Getting started is easier: just point your agent to a skill for AI-assisted instrumentation. We also plan to upgrade tools for instrumentation in Azure MCP.
What’s next We’re continuing to invest in this area, with upcoming work focused on stronger security controls for prompts and responses, better cost transparency for agents, and clearer ways to measure ROI across your agent fleet. These updates make it possible to observe agents without adopting a separate toolchain. Explore the new capabilities, and if you see gaps, let us know so we can continue shaping the roadmap based on your feedback. Learn More.
Is 94% of your syslog just noise? Now you can filter it out before ingestion.
At Microsoft Build 2026, we are announcing the public preview of multi-stage transformations for Azure Monitor Data Collection Rules (DCRs). Multi-stage transformations let you filter, aggregate, parse, and map your logs at the point of collection, before data is ingested into your workspace. Processing happens in a defined sequence of steps called processors, and you can chain them together to build precise data pipelines that reduce ingestion volume, improve data quality, and lower monitoring costs. Processors in orange run on the agent (client-side). The KQL transform in green runs in the ingestion pipeline. Data volume shrinks at each stage. What are multi-stage transformations? A Data Collection Rule defines how Azure Monitor collects, transforms, and routes telemetry data. Until now, DCRs supported a single KQL transformation step on the ingestion side. Multi-stage transformations extend this model by introducing a processor pipeline: an ordered sequence of processing steps that run on the agent (client-side) or at the ingestion endpoint (ingestion-side), or both. Each processor performs one operation: filtering records, parsing structured fields from raw text, renaming or dropping columns, aggregating metrics, or running a KQL expression. Processors execute in order, and the output of one becomes the input to the next. This composable design replaces what previously required complex, monolithic KQL queries or external pre-processing scripts. Client-side processors run on the Azure Monitor Agent before data leaves the source machine. This means filtered and aggregated data never crosses the network, reducing both egress and ingestion costs. Ingestion-side processors run in the Log Analytics ingestion pipeline and support KQL-based transformations for more complex logic. Key applications The most immediate use case is cost reduction. When you can filter records on the agent before they leave the machine, you stop paying for data you never query. 
Syslog is the classic example: in many environments, informational and debug messages make up the vast majority of volume, and none of it gets looked at unless something breaks. A single filter processor can cut that stream by 90% or more. Aggregation is equally powerful for high-frequency telemetry. Performance counters sampled every 15 seconds produce millions of records per hour across a large fleet, but most dashboards and alert rules only need 5-minute granularity. Rolling up those samples on the agent, before they cross the network, dramatically reduces ingestion without losing the operational signal your team actually relies on. Beyond cost, multi-stage transformations improve the quality of the data that does reach your workspace. Parsing structured fields out of raw text (JSON payloads, XML event data, CEF security logs) at collection time means downstream queries are simpler and faster. And because each processor handles one step in a readable sequence, maintaining the pipeline is far easier than debugging a single monolithic KQL expression that tries to do everything at once. To make this concrete, let’s walk through the two highest-impact patterns we see with preview customers: filtering noisy syslog data and aggregating performance counters. Filter data before ingestion The filter processor evaluates each record against conditions you define and drops anything that does not match. Because filtering runs on the agent, dropped records are never serialized, transmitted, or ingested. This makes it the highest-impact processor for cost reduction. You configure filters using simple field-level conditions: specify a column name, an operator (equals, not equals, greater than, contains, etc.), and a value. Conditions can be combined with AND/OR logic for precise control. Scenario: Keep only warning-and-above syslog messages A typical syslog stream generates thousands of informational and debug messages for every actionable warning or error. 
With a filter processor, you set a severity threshold, and the agent drops everything below it before transmission. In this example, the filter keeps records where SeverityNumber >= 4 (Warning). The 57,000 debug and informational records per hour are dropped on the machine. Only the 3,250 actionable records are transmitted and ingested, a 94% reduction in syslog volume. Filters also support compound conditions. For example, you can keep auth-facility errors OR any critical message regardless of facility, all in a single processor step. This kind of targeted filtering is especially useful for security teams that need specific event categories without paying for the full syslog firehose. Aggregate logs before ingestion The aggregate processor rolls up high-frequency records into time-windowed summaries on the agent. This is especially valuable for performance counters, heartbeat signals, and any telemetry where per-second granularity is not needed for operational decisions. You configure the processor with a time window (for example, 5 minutes), the aggregation operators to apply (average, sum, min, max, count), and the dimension columns to group by (such as host name and counter name). The agent collects records within each window, computes the aggregates, and emits one summary record per group. Scenario: Roll up performance counters into 5-minute summaries A fleet of 500 VMs, each reporting 10 performance counters every 15 seconds, generates 1.2 million raw records per hour. Most operational dashboards and alert rules use 5-minute granularity, making the per-sample detail redundant. With the aggregate processor, each agent rolls up its local counter stream into 5-minute windows, grouped by counter name. Each summary record contains the average, maximum, and sample count for that window.
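The roll-up described in this scenario can be sketched in a few lines of Python. This is a conceptual illustration of time-windowed aggregation only, not the aggregate processor's implementation or its DCR configuration syntax; the `roll_up` function, the tuple layout of a sample, and the output field names are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute windows, as in the scenario above


def roll_up(samples, window_seconds=WINDOW_SECONDS):
    """Group (timestamp_seconds, counter_name, value) samples into fixed windows.

    Returns one summary dict per (window_start, counter_name) group, holding
    the average, maximum, and sample count for that window.
    """
    groups = defaultdict(list)
    for timestamp, counter, value in samples:
        # Align each sample to the start of the window it falls into.
        window_start = int(timestamp // window_seconds) * window_seconds
        groups[(window_start, counter)].append(value)
    return [
        {
            "window_start": window_start,
            "counter": counter,
            "avg": sum(values) / len(values),
            "max": max(values),
            "count": len(values),
        }
        for (window_start, counter), values in sorted(groups.items())
    ]


# One VM for one hour: 10 counters, each sampled every 15 seconds.
samples = [
    (t, f"counter{i}", float(t % 100))
    for t in range(0, 3600, 15)
    for i in range(10)
]
summaries = roll_up(samples)
print(len(samples), len(summaries))  # prints "2400 120"
```

Going from 2,400 raw records to 120 summaries per VM per hour is the 95% reduction the scenario describes, and each summary still carries the sample count (20 per window here), so averages can be re-weighted downstream if needed.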
| | Raw data | After aggregation (5-min windows) |
| --- | --- | --- |
| Records per VM per hour | 2,400 (10 counters x 4/min x 60 min) | 120 (10 counters x 12 windows) |
| Records across 500 VMs per hour | 1,200,000 | 60,000 |
| Volume reduction | | 95% |
| Operational fidelity | Per-sample (15s) | Avg, max, and count per 5 min |

Because the aggregation runs on the agent, the reduced data set is what gets transmitted and ingested. Dashboards and alerts that rely on 5-minute granularity work identically, but ingestion volume drops by 95%. Route the output to a custom table with columns that match the aggregate output (average, max, count, and your dimension columns). Chain processors for complete pipelines Processors are composable. A common pattern chains a header processor (to convert raw data into tabular format), a filter (to drop irrelevant records), a parse step (to extract fields from structured payloads), and a column drop (to remove fields not needed downstream). Scenario: Parse, filter, and slim down Windows Event logs Consider a security team that needs logon success and failure events (Event IDs 4624 and 4625) from the Windows Security log. The raw event stream contains hundreds of event types, each carrying a large XML payload. A four-step pipeline handles this: (1) the Header processor converts the raw event stream into tabular rows; (2) the Parse processor extracts EventID and TargetUser from the XML payload into typed columns; (3) the Filter processor keeps only logon success (4624) and failure (4625) events, dropping everything else; (4) the Drop processor removes the bulky RawXml and RenderingInfo columns that are no longer needed. The result is a lean, security-focused data set containing only the events and fields the team actually queries. Each step is independent and can be modified without affecting the others. Authoring multi-stage DCRs Multi-stage transformations are available through the Azure portal and through the REST API (version 2025-05-11).
The portal provides a visual editor for building processor pipelines, previewing the schema at each stage, and validating the configuration before deployment. The Transform tab in the DCR data source configuration lets you add processors at each stage and preview the resulting schema. For infrastructure-as-code workflows, the full DCR JSON can be authored and deployed via ARM templates, Bicep, or direct REST API calls. To get started: Open Azure Monitor in the Azure portal and navigate to Data Collection Rules Create a new DCR or edit an existing one In the data source configuration, select Edit transformation Author your transformation logic across client and ingestion stages using the set of available processors Preview the schema output at each stage to verify the pipeline produces the expected result Save and associate the DCR with your target resources Preview notes: Multi-stage transformations are available in public preview starting June 3, 2026 Client-side processors require Azure Monitor Agent version 1.35 or later Aggregation output must be routed to custom tables (standard table schemas do not match aggregate output) Data collection, workspace ingestion, and alert rules may incur costs based on the settings you enable. Preview pricing may differ from general availability pricing. See Azure Monitor pricing for current rates To learn more, see: Data Collection Rules overview Looking ahead Multi-stage transformations are part of our continued investment in giving teams control over their data before it reaches the workspace. During the preview period, we plan to expand processor coverage, add support for additional data source types, and incorporate user feedback into the authoring and validation experience. We are also exploring how multi-stage transformations can serve as the foundation for advanced scenarios such as data scrubbing, inline enrichment from external reference data, and AI-assisted pipeline authoring. 
These capabilities will build on the same processor model, so pipelines you create today will extend naturally as new processors become available. We welcome your feedback as you try multi-stage transformations. Use the feedback options in the Azure portal, or reach out through your Microsoft account team. This feature is currently in preview. Previews are provided "as-is," "with all faults," and "as available," and are excluded from the service level agreements and limited warranty. For more information, see Supplemental Terms of Use for Microsoft Azure Previews. Statements in this post about future plans and capabilities represent our current intentions and are subject to change. They should not be relied upon when making purchasing decisions.
What’s new in Observability at Build 2026
At Build 2026, Azure Monitor introduces major advancements in end-to-end observability, extending across AI agents, applications, and infrastructure with OpenTelemetry at its core. New capabilities with Azure Copilot Observability agent, SLI/SLO support, and smarter alerting help teams move faster from detection to root cause while reducing noise and manual effort. Together, these innovations enable developers and SREs to operate modern, AI-driven systems with greater insight, efficiency, and alignment to customer experience.
When Telemetry Volume Gets Real: Azure Monitor pipeline’s Performance Story!
What is Azure Monitor pipeline? Azure Monitor pipeline provides centralized governance and a single point of control that runs close to your data sources, so you can filter, transform, aggregate, and route telemetry before it's sent to Azure Monitor. This approach helps you reduce ingestion volume, improve reliability in disconnected environments, and apply consistent data processing across hybrid and multi-cloud deployments. Built on OpenTelemetry technology, the pipeline supports standard ingestion protocols including Syslog and OTLP, enabling it to receive telemetry from a wide range of clients and environments. Read more about Azure Monitor pipeline here: Azure Monitor pipeline GA: Centralized, Secure Telemetry Ingestion. Azure Monitor pipeline Performance A single replica on a stock 8-core node sustains ~200,000 Syslog messages per second end-to-end into Log Analytics (roughly 17 billion events, or ~20 TB, per day) using only ~2.8 GB of working-set memory. That's ~2.5 TB/day of throughput per vCPU, on commodity hardware, with no special tuning. (Measured on pipeline v1.1.1, May 2026.) More detailed performance information is in the table below.

| vCPUs | Example node | Syslog Basic* | Syslog Fully Formed* | CEF Fully Formed* |
| --- | --- | --- | --- | --- |
| 2 | Standard_D2as_v6 | ~50,000/sec | ~35,000/sec | ~17,000/sec |
| 4 | Standard_D4as_v6 | ~100,000/sec | ~70,000/sec | ~35,000/sec |
| 8 | Standard_D8as_v6 | ~200,000/sec | ~150,000/sec | ~65,000/sec |
| 16 | Standard_D16as_v6 | ~400,000/sec | ~300,000/sec | ~130,000/sec |

Syslog Basic*: Azure Monitor pipeline ingesting raw syslog data into an Azure Monitor custom table.
Syslog Fully Formed*: Azure Monitor pipeline ingesting syslog data into the Azure Monitor standard syslog table.
CEF Fully Formed*: Azure Monitor pipeline ingesting CEF data into the Azure Monitor standard CEF table.

Further, adding replicas scales throughput linearly.
Linear scaling is what makes the rest of the performance story credible in practice: if one 4-core node handles about 100,000 Syslog logs per second, eight replicas scale that to roughly 800,000 logs per second without changing the architecture. In other words, you do not hit an arbitrary throughput wall as volume grows—you add cores or replicas and get predictable capacity growth. We are continuously improving these numbers, and the latest guidance is documented here -- Azure Monitor pipeline performance and sizing - Azure Monitor | Microsoft Learn Why this Performance Story Matters? Zero-config core usage. The pipeline automatically uses every available CPU core. Move to a bigger node and it just goes faster — no tuning, no config. Backpressure, not data loss. When you exceed capacity, the pipeline applies TCP backpressure to senders instead of dropping messages. Rising send latency is your scale-up signal. Predictable sizing math. Pick your per-vCPU rate, divide your peak logs/sec, add 30% headroom, round up. Done. Efficient memory usage. ~2.8 GB working-set to push 200,000 logs/sec means you're paying for throughput, not overhead. One sizing tip worth knowing: make sure senders open at least as many concurrent TCP connections as there are cores on the pipeline node. The pipeline distributes traffic across cores by source connection, so too few connections leave cores idle. How this Stacks Up? Telemetry pipelines are usually sized per CPU core, making per-core throughput a practical way to reason about capacity and scaling. Against that backdrop, ~2.5 TB/day per vCPU for Syslog Basic — and ~65,000–150,000 logs/sec, on 8 cores for fully formed records — highlights the per-core efficiency of Azure Monitor pipeline for edge log collection. Exact numbers will vary based on event size and processing applied, but the key point is consistency: you get substantial throughput per core, and it scales linearly as you add capacity. 
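The sizing rule quoted above (pick a per-vCPU rate, divide your peak logs/sec by it, add 30% headroom, round up) can be written out as a short Python sketch. The per-vCPU rates here are an assumption: they are simply the 2-vCPU row of the table divided by two, not officially published per-core figures, and the `vcpus_needed` function and workload keys are hypothetical names for illustration.

```python
# Assumed per-vCPU rates in logs/sec, derived by halving the 2-vCPU table row.
PER_VCPU_RATE = {
    "syslog_basic": 25_000,
    "syslog_fully_formed": 17_500,
    "cef_fully_formed": 8_500,
}


def vcpus_needed(peak_logs_per_sec: int, workload: str, headroom_percent: int = 30) -> int:
    """Return the vCPU count for a peak rate, including headroom, rounded up."""
    rate = PER_VCPU_RATE[workload]
    # Integer arithmetic keeps the ceiling exact:
    # ceil(peak * (100 + headroom) / (rate * 100)).
    required = peak_logs_per_sec * (100 + headroom_percent)
    return -(-required // (rate * 100))


print(vcpus_needed(120_000, "syslog_basic"))     # prints 7
print(vcpus_needed(60_000, "cef_fully_formed"))  # prints 10
```

In practice the result would then be rounded up again to an available node size or split across replicas, and, per the connection tip above, senders should open at least as many concurrent TCP connections as the node has cores.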
Less hardware to move the same volume, efficient memory usage, backpressure instead of loss, and linear growth: that's the performance case for Azure Monitor pipeline. Get started Spin up a pipeline group on your Arc-enabled cluster, point your Syslog/CEF senders at it, and watch the throughput numbers above hold up in your own environment! Read more about getting started here: What is Azure Monitor pipeline? - Azure Monitor | Microsoft Learn
Any source. Any destination. Ready for AI-era.
Telemetry is exploding: every new app, edge node, and AI agent is a new firehose, and AI has raised the bar on what that telemetry must be: governed, on open standards, observable at agent scale. Today, most teams answer that by stitching together a stack of disconnected tools: one for each set of data sources, another for transforms, different ones for routing to each destination, and wrappers on top for some semblance of much-needed enterprise governance, all barely held together by glue code and tribal knowledge. This is the gap we're closing at Build 2026, with every announcement lining up with what modern, AI-shaped workloads need most:
An AI-native standard, ready for enterprises: OpenTelemetry direct ingest, GA
Headroom for bursty AI-agent traffic: Azure Monitor pipeline scaling to billions of events per day
One governance plane for AI and Azure platform telemetry (via DCRs)
AI-noise controlled at the right point in the journey: Multi-stage transforms
Coverage AI can trust: Monitoring Coverage so AI can reason on complete signals instead of blind spots.
All organized around the journey your data takes: 1 · Discover Most teams think they're monitoring everything, until an incident proves they aren't! Monitoring Coverage turns hope into evidence by answering 3 questions at fleet scale: is monitoring configured, are the right alerts in place, is telemetry actually flowing? Go from “I think we’re covered” to “I know we are”: Is Your Monitoring Actually Working? What's New in Monitoring Coverage | Microsoft Community Hub 2 · Collect Whatever your source, Azure-native or open standard, you shouldn't need a different platform, agent, or governance model to bring it in. At Build, two big shifts close that gap: Govern Azure platform telemetry like the rest of your data: No more per-resource diagnostic settings or separate tooling for platform metrics and logs.
They now ride the same policy-based control plane you already use for the rest of Azure Monitor, with one model and one audit story, scoped at scale. Platform metrics support - GA. Platform logs support - Public preview coming soon! Bring OpenTelemetry straight in - GA: Send OTLP logs, metrics, and traces directly to Azure Monitor and land them in Application Insights, Log Analytics, Azure Monitor Workspace (Prometheus), and Grafana, with no shim and no detour! Direct OpenTelemetry ingestion into Azure Monitor is now generally available. Have additional OTel collection needs? Tell us more by filling out this quick survey! 3 · Shape Observability and storage budgets are dying a death by a thousand low-value log lines. The question today is no longer whether to shape your telemetry, it's where. Multi-stage transformations (public preview) now let you control telemetry where it matters: at the source, in-pipeline, or post-ingest, all before data lands at its destination. Drop noise early, enrich centrally, and optimize cost without losing signal: Is 94% of your syslog just noise? Now you can filter it out before ingestion. | Microsoft Community Hub 4 · Ingest at scale When telemetry volume spikes, you need a pipeline that doesn't blink. 17 billion events, per day, per replica. That's what Azure Monitor pipeline now sustains, generally available since April ’26, as the living proof of ‘any source, any destination’. This is the high-scale, multi-cloud, edge-resilient engine already trusted in regulated banks, industrial OT networks, and globally distributed SOCs. That's the kind of headroom you want when AI agents start emitting in bursts you didn't plan for: When Telemetry Volume Gets Real: Azure Monitor pipeline’s Performance Story! | Microsoft Community Hub Get Started TODAY! Explore the links above, try the new experiences in Azure Monitor, and tell us in comments below what to build next. The next era of enterprise telemetry is here.
We can't wait to see what you'll build on it.
Your Azure Monitor team
Is Your Monitoring Actually Working? What's New in Monitoring Coverage
Monitoring is only useful when the right signals are collected, the right alerts are in place, and the data is actually flowing when teams need it. In large Azure environments, confirming all three across every VM and AKS cluster can still take too much manual work. At Microsoft Ignite, we introduced Monitoring Coverage in Azure Monitor, a centralized preview experience for finding coverage gaps and enabling recommended VM and container monitoring at scale. At Microsoft Build, we are expanding that experience with two new capabilities that make monitoring easier to operationalize: data flow status and at-scale recommended alert enablement for virtual machines and Azure Kubernetes Service (AKS). With these updates, teams can move beyond asking whether monitoring was configured. They can see whether recommended monitoring is enabled, whether important alert coverage is missing, and whether configuration issues may prevent monitoring data from reaching its destination. Monitoring Coverage overview with recommendations and data flow status. What is Monitoring Coverage? Monitoring Coverage in Azure Monitor gives you a single place to review recommended monitoring across supported Azure resources. The Overview page summarizes coverage across your selected scope, shows Azure Advisor observability recommendations, and provides quick actions to enable recommended monitoring settings. Coverage is grouped into basic, partial, and enhanced monitoring so you can quickly understand whether a resource is using only default monitoring or has the Microsoft-recommended configuration enabled. From there, you can drill into the Monitoring Details tab to review individual resources and take action. New: data flow status The most important question after enabling monitoring is simple: is the data flowing? Data flow status helps answer that question directly from Monitoring Coverage. 
The new data flow status summary shows how many resources need attention, passed initial checks, or are not configured for validation. It also highlights top resources that need attention so operators can start with the most important issues first. When you open data flow status for a resource, Azure Monitor shows validation checks across areas such as: Resource configuration Data collection rule associations Network connectivity Data flows to the configured destination Detected issues are prioritized at the top of the details pane, and each validation check includes a recommended action. After making a fix, you can run validation again to confirm that data flow issues are resolved. Data flow status details with validation checks and recommended actions. Alternatively, you can visualize your data flows and identify problems from there. New: enable recommended alerts at scale Monitoring Coverage now also helps close alerting gaps. From the Overview page, you can see recommendations such as Enable VM Recommended Alerts and Enable AKS Recommended Alerts, then select Apply to configure recommended alert rules from a centralized flow. For virtual machines, you can enable alerts across an entire subscription or choose selected resources. Subscription scope is useful when you want recommended alerts to apply broadly, including to future VMs in the selected subscription. Selected resource scope gives you more granular control when you want to enable alert rules for a specific set of VMs. The enablement flow lets you review recommended alert rules, adjust thresholds, and configure notification options such as email, Azure Resource Manager role notifications, Azure mobile app notifications, or an existing action group. Some VMs may already have alerts configured, and new rules are designed not to duplicate existing alerts. 
For AKS, Monitoring Coverage can surface recommended alert gaps and start the same guided pattern: review impacted resources, configure recommended alert settings, and use Review + Enable to create the alert rules. A resource-centric view for follow-up The Monitoring Details tab brings coverage and data flow into the same resource list. Two columns are especially useful for triage: Monitoring coverage and Data flow status. Select either value to open resource-level details. Monitoring coverage details show what is configured for the resource, including VM Insights, recommended alerts, data collection rules, data sources, destinations, and agent version when available. Data flow details show validation results and recommended remediation steps. This makes it easier to move from a high-level gap to the specific resource and configuration that needs attention. Getting started Monitoring Coverage is available in preview from the Azure portal. Open Monitor, select Monitoring Coverage (preview), and choose the subscriptions and resources you want to review. From the Overview page, you can: Review coverage across VMs and AKS resources. Apply recommendations to enable VM Insights, container monitoring, and recommended alerts. Use data flow status to find resources whose monitoring data needs attention. Open Monitoring Details for resource-level coverage and validation results. A few preview notes: enablement operations include up to 100 resources at a time, and enabling monitoring or alert rules may create data collection rules, deploy Azure Monitor Agent, configure destinations, or create alert rules. Data collection, workspace ingestion, and alert rules may incur costs based on the settings you enable. To learn more, see Monitoring coverage in Azure Monitor (preview). Looking ahead Monitoring Coverage is part of our continued work to make Azure Monitor easier to operationalize at scale. 
We want teams to spend less time hunting for monitoring gaps and more time acting on reliable, validated signals. We would love your feedback as you try these new Build updates and we look to expand support beyond this set of resource types. Use the Azure portal feedback options or share feedback through your Microsoft account team.
Azure Monitor Copilot Observability Agent: What’s new at Build
The Observability agent in Azure Copilot is an AI-powered assistant built into Azure Monitor that helps engineers investigate issues and explore their systems using natural language. By grounding its analysis in telemetry data such as metrics, logs, and traces, it supports both open-ended exploration and guided troubleshooting. For more details, see the documentation. Since our initial public preview, the Observability agent in Azure Copilot has continued to evolve with new capabilities and expanded coverage (you can read more about the initial release in our previous blog). At Build 2026, we’re introducing updates that expand the Observability agent’s capabilities and the range of scenarios it can support. These updates provide deeper analysis and more detailed responses for both exploration and investigation. Expanded Investigation Scenarios The Observability agent now supports a broader set of scenarios across applications and infrastructure. These can be accessed directly from relevant product experiences, without requiring a prior alert, allowing teams to explore data conversationally and initiate deeper investigations as signals emerge. Integration with Microsoft Foundry AI Agents The Observability agent integrates with Microsoft Foundry AI Agents, enabling correlation of signals across key generative AI and agent observability scenarios such as latency spikes, error patterns, and tool invocation failures. Teams can interact with the Observability agent either from alerts, including alerts based on Foundry telemetry, or directly within Application Insights, where the Agents details experience serves as the primary entry point. From there, users can use the Observability agent to diagnose errors, analyze trends, and explore their data across one or multiple agents.
Application Insights integration The Observability agent enables investigation of failure scenarios directly from the Application Insights Failures blade, allowing teams to analyze application-level issues and move from symptom to root cause. Azure Kubernetes Service (AKS) integration The Observability agent enables deep investigation of issues in Azure Kubernetes Service (AKS) clusters. AKS investigations correlate signals from Azure Monitor with Kubernetes logs and events, and (coming soon) Prometheus metrics stored in an Azure Monitor Workspace. Together, these signals enable full-stack analysis of applications running on AKS. The Observability agent helps teams determine whether an issue originates from the application or from the underlying Kubernetes platform, reducing time to diagnosis and resolution. Activity Logs integration Investigations can be initiated based on Azure Resource Health events surfaced in Activity Logs, enabling analysis of service-impacting signals related to the Azure platform. Deeper Insights across systems Multiple Application Insights resources (coming soon) The Observability agent will support investigations that span multiple Application Insights resources, enabling scenarios that involve multiple services within distributed applications. The agent can guide users to expand the investigation scope when cross-service issues are detected. Integration with Azure Service Health The Observability agent correlates investigation context with Azure Service Health events, helping teams understand potential platform impact as part of their investigation. This helps distinguish application-level issues from broader Azure platform conditions and prioritize active impacts.
Issue management enhancements Viewing issues Issues can now be viewed in multiple places, depending on the required scope: in Azure Monitor, which shows issues across all Azure Monitor Workspaces (AMWs) under the selected subscriptions, and in an Azure Monitor Workspace, which shows issues stored within that specific AMW. Issue actions & notifications Issue actions trigger notifications when issues are created or updated, enabling integration with workflows such as email, webhooks, and automation. Sharing and follow-up You can now download investigation results as a PDF, including supported data, enabling teams to capture and share investigation context for incident reviews and reporting. Coming Soon Billing for the Observability agent starts on July 1, 2026. The agent uses a consumption-based pricing model, so customers pay only for the AI work the agent performs. Agent consumption is measured in Azure Agent Credit (AAC) units, which reflect how many LLM tokens the agent used. For more details, see the documentation. Stay connected Follow this blog for ongoing updates and deeper dives into new capabilities. Join our upcoming webinar for real-world scenarios, best practices, and a look at what’s coming next 👉 Register here. We’d love your feedback The Observability agent continues to evolve based on real-world usage and customer feedback. Share feedback through the Give Feedback option in the product or contact us at azureobsagent@microsoft.com. Want to learn more? Read our previous blog posts: "Public Preview Update: Azure Copilot Observability Agent" and "The Azure Copilot Observability Agent Chat - Stop Writing Queries, Start Asking Questions." (Microsoft Community Hub). Explore our documentation: "Azure Copilot observability agent (preview) - Azure Monitor" (Microsoft Learn).
Monitor AI coding agents with OpenTelemetry in Azure Monitor
AI coding agents are quickly becoming part of the everyday developer workflow. As teams adopt tools such as GitHub Copilot, Claude Code, and Codex, they need a better way to understand usage, troubleshoot performance, and keep an eye on token consumption and cost. With Azure Monitor’s OpenTelemetry support, you can collect OpenTelemetry Protocol (OTLP) signals from AI coding agents and route them into Azure Monitor for end-to-end visibility. Ingested OTLP data is stored with OpenTelemetry semantics for logs and traces, and Application Insights provides curated agent views for troubleshooting, detailed trace visualizations, and end-to-end transaction views. Image: Application Insights end-to-end transaction view for agents. Azure Monitor also includes ready-to-use Grafana dashboards that deliver streamlined, out-of-the-box visualizations with the flexibility to customize further. This gives platform teams, engineering leaders, and developers a consistent way to monitor using open-source standards. The key takeaway is that these dedicated coding agent dashboards surface agent-specific details such as feature usage, commit counts, code change acceptance rates, and user details when these are included in the ingested telemetry. That creates immediate value for developer teams and organizations that want to understand adoption rates and the value being returned by coding agents. Image: Azure Monitor dashboards with Grafana for GitHub Copilot. How it works Coding agents or IDEs can be configured to export OTLP signals by using organization-wide environment variables, project settings, or shared repository configurations. Note: These settings determine whether content and conversation details are captured and exported. Ensure that your configuration matches your organization's privacy and data handling policies. An OpenTelemetry Collector can receive OTLP and forward it to Azure Monitor OTLP ingestion endpoints.
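As a rough illustration of the environment-variable route, the sketch below assembles the standard OpenTelemetry exporter variables (`OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_PROTOCOL`, `OTEL_SERVICE_NAME`, `OTEL_RESOURCE_ATTRIBUTES`) that could be rolled out to developer machines. The collector address, service name, and attribute values are hypothetical, and whether a given coding agent honors these standard variables or needs its own settings is an assumption to check against that agent's documentation.

```python
import os


def otlp_export_env(collector_endpoint, service_name, extra_attributes=None):
    """Build standard OpenTelemetry exporter environment variables.

    collector_endpoint is the OTLP/HTTP base URL of a collector you operate
    (hypothetical here); the variable names come from the OpenTelemetry
    specification, not from any specific coding agent.
    """
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": collector_endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_SERVICE_NAME": service_name,
    }
    if extra_attributes:
        # Resource attributes are passed as comma-separated key=value pairs.
        env["OTEL_RESOURCE_ATTRIBUTES"] = ",".join(
            f"{key}={value}" for key, value in sorted(extra_attributes.items())
        )
    return env


def agent_process_env(base_env, otlp_env):
    """Return a copy of base_env with the OTLP settings layered on top."""
    merged = dict(base_env)
    merged.update(otlp_env)
    return merged


if __name__ == "__main__":
    # Hypothetical values: a collector on the default OTLP/HTTP port 4318.
    settings = otlp_export_env(
        "http://localhost:4318", "coding-agent", {"team": "platform"}
    )
    env = agent_process_env(os.environ, settings)
    for name in sorted(settings):
        print(f"{name}={env[name]}")
```

Keeping the settings in one function makes it easier to review exactly what is exported before distributing it organization-wide, which matters given the privacy note above.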
The OTLP ingestion pipeline uses Microsoft Entra authentication and stores logs and traces with OpenTelemetry semantics. Once the data is in Azure Monitor, teams can investigate usage and adoption patterns in Application Insights agent-specific views and visualize trends with pre-built coding agent dashboards in Azure Monitor dashboards with Grafana or Azure Managed Grafana. Image: OTLP ingestion path from coding agent to Azure Monitor. Why it matters This approach helps central IT and engineering management teams understand rollout, adoption, and cost across their organization, while giving developers a better view of agent interactions and productivity signals. With OpenTelemetry and Azure Monitor, teams can standardize once, reduce pipeline complexity, and access useful insights faster for these coding agents: GitHub Copilot, Claude Code, Codex, OpenClaw, Gemini CLI, and OpenCode. Get started AI coding agents are accelerating software development, and observability needs to keep up. Azure Monitor brings together OpenTelemetry and Grafana so you can monitor agent usage and performance with a flexible, standards-based approach. To learn more, explore: OpenTelemetry export from Visual Studio Code; OTLP ingestion into Azure Monitor; Coding agent dashboards in Azure Managed Grafana; Monitor AI agents with Application Insights.
Direct OpenTelemetry ingestion into Azure Monitor is now generally available
OpenTelemetry is powering a new era of observability: built on open standards, designed for portability, and made for developers who want flexibility without compromise. You can now send OpenTelemetry logs, metrics, and traces straight into Azure Monitor OTLP endpoints and data storage. This capability is generally available, production-ready, and built to scale from day one. With direct OTLP ingestion, you can keep your existing OpenTelemetry instrumentation and OpenTelemetry Collector pipelines while sending telemetry to Azure Monitor for investigation in Application Insights, analysis in Log Analytics, Prometheus metric storage, and visualization in Grafana. What’s now generally available Direct OTLP ingestion into Azure Monitor for logs, metrics, and traces. Production-ready onboarding for deploying data collection rules and endpoints. Application Insights experiences for distributed tracing, performance investigation, and troubleshooting powered by OTLP data. Ready-to-use Grafana dashboards for visualizing OpenTelemetry signals. Prometheus data storage and query language for metrics. OpenTelemetry semantic conventions for logs and traces, so data lands in a familiar standards-based schema. How to send OTLP to Azure Monitor Instrument your application with OpenTelemetry using the open-source SDKs and configure OTLP export to an OpenTelemetry Collector. Configure Azure Monitor OTLP ingestion either by using an Application Insights resource with OTLP support, which sets up the required Azure Monitor resources and investigation experiences, or by manually creating the required resources. Export traces, metrics, and logs directly to Azure Monitor from the OpenTelemetry Collector using the built-in OTLP over HTTP exporter. Where your telemetry lands Azure Monitor brings these signals together so your teams can triage and troubleshoot root cause faster without modifying code or instrumentation.
Metrics are stored in an Azure Monitor Workspace, a Prometheus metrics store. Logs and traces are stored in a Log Analytics workspace using a schema based on OpenTelemetry semantic conventions. Application Insights lights up distributed tracing and end-to-end performance investigations. Pre-built Grafana dashboards for OpenTelemetry metrics are available directly in the Azure portal alongside Application Insights. Why it matters Standardize once: Instrument with OpenTelemetry to keep your instrumentation vendor-neutral and your telemetry portable. Reduce overhead: Maintain fewer bespoke exporters and pipelines by sticking to OTLP for all cases. Debug faster: Correlate metrics, logs, and traces to get from reported issues to root cause with less guesswork. Observe with confidence: Use dashboards and tracing views that are ready on day one. Next step: Try OTLP export from your environment to Azure Monitor, then validate end-to-end signal flow with Application Insights and Grafana dashboards.
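To make the first step under "How to send OTLP to Azure Monitor" more concrete, here is a minimal sketch of what an OTLP over HTTP export to a collector looks like on the wire, using only the Python standard library. It builds a single log record in the OTLP JSON encoding and prepares a POST to the collector's `/v1/logs` path. In practice the OpenTelemetry SDK exporters do this for you; the collector URL, service name, scope name, and message are hypothetical, and the sketch targets a local OpenTelemetry Collector, not the Azure Monitor endpoint itself.

```python
import json
import time
import urllib.request


def build_otlp_log_payload(service_name, message, severity_text="INFO",
                           time_unix_nano=None):
    """Build a one-record OTLP logs payload in the OTLP JSON encoding."""
    if time_unix_nano is None:
        time_unix_nano = time.time_ns()
    return {
        "resourceLogs": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name",
                         "value": {"stringValue": service_name}}
                    ]
                },
                "scopeLogs": [
                    {
                        # Hypothetical instrumentation scope name.
                        "scope": {"name": "example.manual"},
                        "logRecords": [
                            {
                                # 64-bit integers are strings in OTLP JSON.
                                "timeUnixNano": str(time_unix_nano),
                                "severityText": severity_text,
                                "body": {"stringValue": message},
                            }
                        ],
                    }
                ],
            }
        ]
    }


def build_otlp_log_request(collector_base_url, payload):
    """Prepare (but do not send) an OTLP/HTTP logs request to a collector."""
    url = collector_base_url.rstrip("/") + "/v1/logs"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    payload = build_otlp_log_payload("checkout", "order placed")
    # Assumes a collector listening on the default OTLP/HTTP port 4318.
    request = build_otlp_log_request("http://localhost:4318", payload)
    print(request.get_method(), request.full_url)
    # To actually send: urllib.request.urlopen(request, timeout=5)
```

Because the application only ever speaks OTLP to the collector, forwarding to Azure Monitor is a collector configuration concern, so application code stays unchanged.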