Microsoft runs Azure SRE Agent across its own services — 1,300+ agents deployed, 35,000+ incidents mitigated, and over 20,000 engineering hours saved every month. The same platform is now generally available to every customer.
Today, we’re excited to announce the General Availability (GA) of Azure SRE Agent— your AI‑powered operations teammate that helps organizations improve uptime, reduce incident impact, and cut operational toil by accelerating diagnosis and automating response workflows. This GA release represents a fundamental leap forward—shaped by feedback from hundreds of preview customers—in how agentic AI can be applied to cloud operations, so teams can spend less time firefighting and more time delivering new capabilities.
Microsoft runs Azure SRE Agent across its own services — 1,300+ agents deployed, 35,000+ incidents mitigated, and over 20,000 engineering hours saved every month. The same platform is now generally available to every customer.
What’s new with the GA milestone
Preview customers helped shape what would maximize day-to-day operational value. A clear pattern emerged: the most effective agents were those deeply grounded in their team's specific context. That insight drove the architecture decisions behind this GA release. Visit the what's new documentation for complete details.
Deep context — The agent now operates with your source code, tools, and operational context already loaded and ready — not fetched on demand. This means faster, more accurate results grounded in your real architecture not generic knowledge.
Built-in computation — A Code Interpreter writes and executes Python producing downloadable deliverables: PDF reports, charts, Excel workbooks, dashboards, and data exports.
Memory and learning — The agent builds its own knowledge base from every interaction and gets smarter over time. Your team can also teach it directly, so tribal knowledge is never lost.
The GA release also brings continued investment in automation building blocks — skills, subagents, Python tools, and agent hooks — along with deeper ecosystem integrations via MCP connectors.
Attend Agentic DevOps Live!
Agentic DevOps Live is your go-to hub for learning and using AI-driven automation in the software lifecycle. Each session provides practical tips, demos, and playbooks to help you use Agentic DevOps effectively. Join these live Azure SRE Agent sessions or access resources anytime for blueprints, guides, and hands-on tools to improve productivity, speed up delivery, and drive innovation.
|
Livestream |
Topic |
Speakers |
|
Tuesday, March 10, 2026 10:00 AM - 11:00 AM PDT (UTC-07:00)
|
Root Cause Analysis with Code Context: Azure SRE Agent + GitHub Integration
|
Dalibor Kovacevic, Principal Product Manager Shamir Abdul Aziz, Principal Product Manager
|
|
Tuesday, March 17, 2026 10:00 AM - 11:00 AM PDT (UTC-07:00)
|
Extend Azure SRE Agent: Custom Runbooks and Ecosystem Tools for Proactive Reliability |
Deepthi Chelupati, Principal Product Manager Dheeraj Bandaru, Product Manager II
|
How Azure SRE Agent drives business outcomes
Azure SRE Agent is your AI-powered operations teammate that helps teams keep production systems healthy by turning operational complexity into faster, more consistent outcomes. It operates with deep context – your source code, logs, metrics, traces, infrastructure state, and tools—all accessible whenever the agent works on your behalf. Based on your organization’s governance controls, it can not only recommend, investigate, and compute, but also execute actions to keep your services running smoothly.
Azure SRE Agent combines a set of capabilities that map directly to measurable operational outcomes:
1) Accelerate root-cause analysis Azure SRE Agent automates multi-layer investigations across apps, platform, and infrastructure—correlating telemetry, testing hypotheses, and explaining root cause clearly. The result: shorter mean time to resolution (MTTR), less on-call load, and more consistent diagnostics at scale.
2) Improve incident response with knowledge reuse
The agent captures and reuses operational knowledge from every incident—surfacing past resolutions, runbooks, and context in real time. This helps teams avoid repeated work, reduce blind spots, and respond with greater confidence as the system improves over time.
3) Automate mitigation with governance intact
Teams can choose how autonomous the agent should be—ranging from recommended actions to automated responses within defined guardrails. This reduces manual effort, speeds recovery, and helps maintain compliance and operational control.
4) Standardize operations across teams to scale reliability
By tailoring Azure SRE Agent to your tools, policies, and runbooks, organizations can roll out automation faster while ensuring the agent follows the same standards and procedures teams rely on today. And because it integrates with existing tools and systems, teams can reduce integration effort and scale consistent operations without reinventing workflows.
5) Shift from reactive firefighting to proactive reliability
By identifying emerging risks, misconfigurations, and trends early, Azure SRE Agent helps teams prioritize what matters and take action sooner—improving resilience and lowering the cost of failure.
Ecolab, as an early adopter, reduced daily performance alerts from 30–40 to under 10, sped up root‑cause analysis, and minimized time spent on urgent fixes. With Azure SRE Agent, teams resolve incidents faster, receive consolidated reports, and focus more on reliability.
Resources to Help You Get Started
Everything you need to onboard, deploy, and explore SRE Agent.
Microsoft Learn & Docs
- Product documentation: https://aka.ms/sreagent/docs
- Self-paced hands-on labs: https://aka.ms/sreagent/lab
- Technical videos and demos: https://aka.ms/sreagent/youtube
- Azure SRE Agent home page: https://www.azure.com/sreagent
- Azure SRE Agent pricing page: https://aka.ms/sreagent/pricing
💬 We Want Your Feedback
Azure SRE Agent is just the beginning of the agentic cloud era. As you explore the GA release and free trial, your insights will shape the next wave of improvements. We’d love to hear your feedback through:
- Use thumbs‑up/down feedback buttons in the agent experience
- Explore GitHub issues in the SRE Agent repo
- Follow the Azure SRE Agent product team on X
- Connect with the broader SRE community on the Tech Community Hub
- Have conversations with your Microsoft account team
Get Started Today
The future of reliable cloud operations is increasingly agentic—and it starts now. Create your Azure SRE Agent, explore the capabilities, and evaluate how AI-powered operations can improve reliability, reduce toil, and accelerate delivery.
👉 Try Azure SRE Agent today: https://aka.ms/sreagent