Observability for the Age of Generative AI

Microsoft

Nov 26, 2025

Every generation of computing brings new challenges in how we monitor and trust our systems.

With the rise of Generative AI, applications are no longer static code—they’re living systems that plan, reason, call tools, and make choices dynamically.

Traditional observability, built for servers and microservices, simply can’t tell you when an AI agent is correct, safe, or cost-efficient.

We’re reimagining observability for this new world.

At Ignite, we introduced the next wave of Azure Monitor and AI Foundry integration—purpose-built for GenAI apps and agents.

End-to-End GenAI Observability Across the AI Stack

Customers can see not just whether their systems are up or fast, but also whether their agent responses are accurate.

Azure Monitor, in partnership with Foundry, unifies agent telemetry with infrastructure, application, network, and hardware signals—creating a true end-to-end view that spans AI agents, the services they call, and the compute they run on.

New capabilities include:

Agent Overview Dashboard in Grafana and Azure – Gain a unified view of one or more GenAI agents, including success rate, grounding quality, safety violations, latency, and cost per outcome. Customize dashboards in Grafana or Azure Monitor Workbooks to detect regressions instantly after a model or prompt change—and understand how those changes affect user experience and spend.

AI-Tailored Trace View – Follow every AI decision as a readable story: plan → reasoning → tool calls → guardrail checks. Identify slow or unsafe steps in seconds, without sifting through thousands of spans.

AI-Aware Trace Search by Attributes – Search, sort, and filter across millions of runs using GenAI-specific attributes like model ID, grounding score, or cost. Find the “needle” in your GenAI haystack in a single query.

Foundry Low-Code Agent Monitoring – Agents created through Foundry’s visual, low-code interface are now automatically observable. Without writing a single line of code, you can track reliability, safety, and cost metrics from day one.

Full-Stack Visibility Across the AI Stack – All evaluations, traces, and red-teaming results are now published to Azure Monitor, where agent signals correlate seamlessly with infrastructure KPIs and application telemetry to deliver a unified operational view.
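To make the attribute search and dashboard metrics above more concrete, here is a minimal sketch in plain Python. It does not use any Azure Monitor or Foundry API. The `AgentRun` record, its field names, the 0 to 1 grounding scale, and the sample values are all hypothetical, chosen only to illustrate filtering runs by GenAI-specific attributes and deriving a success rate and cost per successful outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    """Hypothetical record of one agent run; not an Azure Monitor schema."""
    run_id: str
    model_id: str
    succeeded: bool
    grounding_score: float  # assumed scale: 0.0 (ungrounded) to 1.0 (grounded)
    cost_usd: float


def search_runs(runs, *, model_id=None, max_grounding=None, min_cost_usd=None):
    """Return runs matching every supplied filter, most expensive first."""
    matches = [
        r for r in runs
        if (model_id is None or r.model_id == model_id)
        and (max_grounding is None or r.grounding_score <= max_grounding)
        and (min_cost_usd is None or r.cost_usd >= min_cost_usd)
    ]
    return sorted(matches, key=lambda r: r.cost_usd, reverse=True)


def summarize(runs):
    """Compute two dashboard-style metrics over a set of runs."""
    total = len(runs)
    successes = sum(1 for r in runs if r.succeeded)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "success_rate": successes / total if total else 0.0,
        "cost_per_success_usd": total_cost / successes if successes else None,
    }


# Made-up sample data for illustration only.
RUNS = [
    AgentRun("run-1", "model-a", True, 0.92, 0.010),
    AgentRun("run-2", "model-a", False, 0.41, 0.030),
    AgentRun("run-3", "model-b", True, 0.88, 0.020),
    AgentRun("run-4", "model-a", True, 0.55, 0.050),
]

# Find poorly grounded runs for one model, costliest first.
suspect = search_runs(RUNS, model_id="model-a", max_grounding=0.6)
print([r.run_id for r in suspect])  # ['run-4', 'run-2']
print(summarize(RUNS))
```

Comparing `summarize` output for runs before and after a model or prompt change is one simple way to spot a regression in this kind of data.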

Here’s a demo video showing some of the new capabilities:

2025_IgniteAct3Video.mp4

Check out our getting-started documentation.

Powered by OpenTelemetry Innovation

This work builds directly on the new OpenTelemetry extensions announced in our recent Azure AI Foundry blog post.

Microsoft is helping define the OpenTelemetry agent specification, extending it to capture multi-agent orchestration traces, LLM reasoning context, and evaluation signals—enabling interoperability across Azure Monitor, AI Foundry, and partner tools such as Datadog, Arize, and Weights & Biases.

By building on open standards, customers gain consistent visibility across multi-cloud and hybrid AI environments—without vendor lock-in.
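As a rough illustration of what an agent trace might carry, the sketch below models a small span tree in plain Python. It deliberately avoids the OpenTelemetry SDK so it runs with the standard library alone. The `Span` class and the helper functions are hypothetical. The `gen_ai.*` attribute names follow the OpenTelemetry GenAI semantic conventions as I understand them; those conventions are still evolving, so check the current specification before relying on any name. The `example.guardrail.passed` attribute, the agent and tool names, and all values are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for a trace span; not the OpenTelemetry SDK type."""
    name: str
    attributes: dict
    duration_ms: int
    children: list = field(default_factory=list)


def walk(span, depth=0):
    """Yield (depth, span) pairs in depth-first order."""
    yield depth, span
    for child in span.children:
        yield from walk(child, depth + 1)


def total_tokens(root):
    """Sum input and output token counts across every span in the trace."""
    return sum(
        s.attributes.get("gen_ai.usage.input_tokens", 0)
        + s.attributes.get("gen_ai.usage.output_tokens", 0)
        for _, s in walk(root)
    )


def slowest_leaf(root):
    """Return the leaf span (one with no children) that took the longest."""
    leaves = [s for _, s in walk(root) if not s.children]
    return max(leaves, key=lambda s: s.duration_ms)


# One agent invocation with a model call, a tool call, and a guardrail check.
trace = Span(
    "invoke_agent travel-planner",
    {"gen_ai.operation.name": "invoke_agent",
     "gen_ai.agent.name": "travel-planner"},
    3200,
    children=[
        Span("chat model-a",
             {"gen_ai.operation.name": "chat",
              "gen_ai.request.model": "model-a",
              "gen_ai.usage.input_tokens": 420,
              "gen_ai.usage.output_tokens": 96},
             900),
        Span("execute_tool search_flights",
             {"gen_ai.operation.name": "execute_tool",
              "gen_ai.tool.name": "search_flights"},
             1800),
        Span("guardrail_check",
             {"example.guardrail.passed": True},  # hypothetical attribute
             150),
    ],
)

# Print the trace as an indented outline.
for depth, span in walk(trace):
    print("  " * depth + f"{span.name} ({span.duration_ms} ms)")
```

Because every tool that reads such a trace sees the same attribute names, helpers like `total_tokens` and `slowest_leaf` would not need to change when the data moves between backends, which is the interoperability argument made above.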

Built for Enterprise Scale and Trust

With open standards and deep integration between Azure Monitor and AI Foundry, organizations can now apply the same discipline they use for traditional applications to their GenAI workloads, complete with compliance, cost governance, and quality assurance.

GenAI is redefining what it means to operate software.

With these innovations, Microsoft is giving customers the visibility, control, and confidence to operate AI responsibly, at enterprise scale.

