Apps on Azure Blog

Azure SRE Agent Now Builds Expertise Like Your Best Engineer: Introducing Deep Context

dchelupati (Microsoft)
Mar 10, 2026

What if SRE Agent already knew your system before the next incident?

Your most experienced SRE didn't become an expert overnight. Day one: reading runbooks, studying architecture diagrams, asking a lot of questions. Month three: knowing which services are fragile, which config changes cascade, which log patterns mean real trouble. Year two: diagnosing a production issue at 2 AM from a single alert because they'd built deep, living context about your systems. 

That learning process (absorbing documentation, reading code, handling incidents, building intuition from every interaction) is what makes an expert.

Azure SRE Agent can now do the same thing.

From pulling context to living in it 

Azure SRE Agent already connects to Azure Monitor, PagerDuty, and ServiceNow. It queries Kusto logs, checks resource health, reads your code, and delivers root cause analysis, often resolving incidents without waking anyone up. Thousands of incidents handled. Thousands of engineering hours saved.

Deep Context takes this to the next level. Instead of accessing context on demand, your agent now lives in it: continuously reading your code and knowledge, building persistent memory from every interaction, and evolving its understanding of your systems in the background.

Three things make Deep Context work:

Continuous access. Source code, terminal, Python runtime, and Azure environment are available whenever the agent needs them. Connected repos are cloned into the agent's workspace automatically. The agent knows your code structure from the first message. 

Persistent memory. Insights from previous investigations, architecture understanding, and team context all persist across sessions. The next time the agent picks up an alert, it already knows what happened last time.

Background intelligence. Even when you're not chatting, background services continuously learn. After every conversation, the agent extracts what worked, what failed, what the root cause was. It aggregates these across all past investigations to build evolving operational insights. The agent recognizes patterns you haven't noticed yet. One example: connected to Kusto, background scanning auto-discovers every table, documents schemas, and builds reusable query templates. But this learning applies broadly — every conversation, every incident, every data source makes the agent sharper. 
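To make the Kusto example concrete, here is a minimal sketch of how discovered schemas could be turned into reusable query templates. It is illustrative only and not the agent's actual implementation: the schema dictionary, the table and column names, and the `build_query_templates` helper are all assumptions made for this example.

```python
from string import Template

# Hypothetical output of a background schema scan: table -> {column: type}.
# These table and column names are made up for illustration.
DISCOVERED_SCHEMA = {
    "OrdersApiRequests": {
        "Timestamp": "datetime",
        "StatusCode": "int",
        "Route": "string",
    },
    "RegionLookup": {"Region": "string", "Owner": "string"},
}


def build_query_templates(schema):
    """Return one 'recent rows' KQL template per table that has a datetime column."""
    templates = {}
    for table, columns in schema.items():
        time_columns = [name for name, kind in columns.items() if kind == "datetime"]
        if not time_columns:
            continue  # no time column, so a time-window query does not apply
        projected = ", ".join(columns)
        templates[table] = Template(
            f"{table}\n"
            f"| where {time_columns[0]} > ago($window)\n"
            f"| project {projected}\n"
            "| take $limit"
        )
    return templates


templates = build_query_templates(DISCOVERED_SCHEMA)
query = templates["OrdersApiRequests"].substitute(window="1h", limit=50)
print(query)
```

The template keeps the table-specific parts fixed and leaves only the time window and row limit as parameters, so the same query shape can be reused across investigations.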

Expertise that compounds with every incident

 

| | New on-call engineer | SRE Agent with Deep Context |
|---|---|---|
| Alert fires | Opens runbook, looks up which service this maps to | Already knows the service, its dependencies, and failure patterns from prior incidents |
| Investigation | Reads logs, searches code, asks teammates | Goes straight to the relevant code path, correlates with logs and persistent insights from similar incidents |
| After 100 incidents | Becomes the team expert: irreplaceable institutional knowledge | Same institutional knowledge: always available, never forgets, scales across your entire organization |

 

A human expert takes months to build this depth. An agent with Deep Context builds it in days, and the knowledge compounds with every interaction.

You shape what your agent learns

Deep Context learns automatically, but the best results come when your team actively guides what the agent retains.

  • Type #remember in chat to save important facts your agent should always know: environment details, escalation paths, team preferences. For example: "#remember our Redis cache uses Premium tier with 6GB" or "#remember database failover takes approximately 15 minutes." These are recalled automatically during future investigations.
  • Turn investigations into knowledge. After a good investigation, ask your agent to turn the resolution into a runbook: "Create a troubleshooting guide from the steps we just followed and save it to Knowledge settings." The agent generates a structured document, uploads it, and indexes it — so the next time a similar issue occurs, the agent finds and follows that guide automatically. 
  • The agent captures insights from every conversation on its own. Your guidance tells it which ones matter most. This is exactly how Microsoft’s own SRE team gets the best results: “Whenever it gets stuck, we talk to it and teach it, ask it to update its memory, and it doesn’t fail that class of problem again.” Read the full story in The Agent That Investigates Itself. 
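As a rough mental model of the #remember flow, the sketch below saves tagged facts and recalls them by simple keyword overlap. It is a toy illustration, not how the agent stores or retrieves memory: the `MemoryStore` class, the regular expression, and the matching rule are all assumptions.

```python
import re

# Matches chat messages that begin with the #remember tag.
REMEMBER_PATTERN = re.compile(r"^\s*#remember\s+(.+)$", re.IGNORECASE)


def _words(text):
    """Lowercase alphanumeric tokens in the text."""
    return re.findall(r"[a-z0-9]+", text.lower())


class MemoryStore:
    """Hypothetical in-memory fact store, for illustration only."""

    def __init__(self):
        self.facts = []

    def ingest(self, message):
        """Save the fact if the message starts with #remember; return True if saved."""
        match = REMEMBER_PATTERN.match(message)
        if not match:
            return False
        self.facts.append(match.group(1).strip())
        return True

    def recall(self, query):
        """Return saved facts sharing at least one word of 4+ characters with the query."""
        keywords = {word for word in _words(query) if len(word) >= 4}
        return [fact for fact in self.facts if keywords & set(_words(fact))]


store = MemoryStore()
store.ingest("#remember our Redis cache uses Premium tier with 6GB")
store.ingest("#remember database failover takes approximately 15 minutes")
print(store.recall("Redis latency spike on cache"))
```

A real system would likely use semantic retrieval rather than word overlap. The point here is only that explicitly tagged facts become retrievable context in later investigations.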

See it in action: an Azure Monitor alert, end to end 

An HTTP 5xx spike fires on your container app. Your agent is in autonomous mode. It acknowledges the alert, checks resource health, reads logs, and delivers a diagnosis — that's what it already does well. Deep Context makes this dramatically better. 

Two things change everything: 

  1. The agent already knows your environment. It has already read your code and runbooks and built context from previous investigations. Your route handlers, database layer, deployment configs, and operational procedures: it knows all of them. So when this alert fires, it doesn't start from scratch. It goes straight to the relevant code path, correlates a recent connection pooling commit with the deployment timeline, and confirms the root cause.
  2. The agent remembers. It has seen this pattern before: a similar incident last week was investigated but never permanently fixed. It recognizes the recurrence from persistent memory, skips rediscovery, confirms the issue is still in the code, and this time fixes it.
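The commit-to-timeline correlation in step 1 can be pictured with a small sketch. It assumes commit records carry a deployment timestamp and ranks the ones deployed shortly before the alert, putting keyword matches first. The record shape, the `suspect_commits` helper, the sample data, and the six-hour lookback are hypothetical choices for this example, not the agent's actual logic.

```python
from datetime import datetime, timedelta

# Hypothetical deployment history; SHAs, messages, and times are made up.
COMMITS = [
    {"sha": "c3", "message": "Bump logging version", "deployed_at": datetime(2026, 3, 8, 10, 0)},
    {"sha": "a1", "message": "Refactor connection pooling in db layer", "deployed_at": datetime(2026, 3, 9, 1, 10)},
    {"sha": "b2", "message": "Update README", "deployed_at": datetime(2026, 3, 9, 1, 40)},
]


def suspect_commits(commits, alert_time, lookback=timedelta(hours=6), keywords=()):
    """Return commits deployed within `lookback` before the alert.

    Commits whose message matches more keywords come first; ties are
    broken by how close the deployment was to the alert.
    """
    window_start = alert_time - lookback
    recent = [c for c in commits if window_start <= c["deployed_at"] <= alert_time]

    def rank(commit):
        message = commit["message"].lower()
        hits = sum(1 for word in keywords if word.lower() in message)
        age_seconds = (alert_time - commit["deployed_at"]).total_seconds()
        return (-hits, age_seconds)

    return sorted(recent, key=rank)


alert_time = datetime(2026, 3, 9, 2, 0)
suspects = suspect_commits(COMMITS, alert_time, keywords=("connection", "pool"))
print([c["sha"] for c in suspects])
```

Keywords would come from the symptoms seen in the logs, for example connection errors pointing at pooling changes.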

Because it's in autonomous mode, the agent edits the source code, restarts the container, pushes the fix to a new branch, creates a PR, opens a GitHub Issue, and verifies service health, all before you wake up.

The agent delivers a complete remediation summary (alert, root cause with code references, fix applied, PR created) without a single message from you.

Code access turns diagnosis into action. Persistent memory turns recurring problems into solved problems.
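One way to picture how persistent memory surfaces a recurring problem is a fingerprint lookup over past incidents. This is a simplified sketch under stated assumptions: the `Incident` record, the (service, signal) fingerprint, and the `find_recurrence` helper are hypothetical and do not describe the agent's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical record of a past investigation."""
    service: str
    signal: str  # for example "http_5xx"
    root_cause: str
    permanently_fixed: bool


def fingerprint(service, signal):
    """Normalize the fields that identify 'the same kind of incident'."""
    return (service.strip().lower(), signal.strip().lower())


def find_recurrence(history, service, signal):
    """Return the most recent unfixed incident with the same fingerprint, or None.

    `history` is assumed to be ordered oldest first.
    """
    target = fingerprint(service, signal)
    for past in reversed(history):
        if fingerprint(past.service, past.signal) == target and not past.permanently_fixed:
            return past
    return None


history = [
    Incident("orders-api", "http_5xx", "bad certificate", permanently_fixed=True),
    Incident("orders-api", "http_5xx", "connection pool exhaustion", permanently_fixed=False),
]
match = find_recurrence(history, "Orders-API", "HTTP_5xx")
print(match.root_cause if match else "no known recurrence")
```

When a match is found, the investigation can start from the recorded root cause instead of rediscovering it.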

Give your agent your code — here's why it matters 

If you're on an IT operations, SRE, or DevOps team, you might think: "Code access? That's for developers." We'd encourage you to rethink that. Your infrastructure-as-code, deployment configs, Helm charts, Terraform files, pipeline definitions — that's all code. And it's exactly the context your agent needs to go from good to extraordinary. 

When your agent can read your actual configuration and infrastructure code, investigations transform. Instead of generic troubleshooting, you get root cause analysis that points to the exact file, the exact line, the exact config change. It correlates a deployment failure with a specific commit. It reads your Helm values and spots the misconfiguration that caused the pod crash loop. 

"Will the agent modify our production code?" No. The agent works in a secure sandbox — a copy of your repository, not your production environment. When it identifies a fix, it creates a pull request on a new branch. Your code review process, your CI/CD pipeline, your approval gates — all untouched. The agent proposes. Your team decides. 

Whether you're a developer, an SRE, or an IT operator managing infrastructure you didn't write — connecting your code is the single highest-impact thing you can do to make your agent smarter. 

The compound effects 

Deep Context amplifies every other SRE Agent capability: 

Deep Context + Incident management → Alerts fire, the agent correlates logs with actual code. Root cause references specific files and line numbers. 

Deep Context + Scheduled tasks → Automated code analysis, compliance checks, and drift detection — inspecting your actual infrastructure code, not just metrics. 

Deep Context + MCP connectors → Datadog, Splunk, PagerDuty data combined with source code context. The full picture in one conversation. 

Deep Context + Knowledge files → Upload runbooks, architecture docs, postmortems — in any format. The agent cross-references your team's knowledge with live code, logs, and infrastructure state. 

 

Logs tell the agent what happened. Code tells it why. Your knowledge files tell it what to do about it. 
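Drift detection, mentioned under scheduled tasks, comes down to comparing what your infrastructure code declares with what is actually running. The sketch below shows that comparison on plain dictionaries. The setting names, the `detect_drift` helper, and the `MISSING` marker are assumptions for illustration, and a real check would first load the declared and live values from your IaC files and your environment.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def detect_drift(declared, live):
    """Return {setting: (declared_value, live_value)} for every setting that differs."""
    drift = {}
    for key in sorted(set(declared) | set(live)):
        declared_value = declared.get(key, MISSING)
        live_value = live.get(key, MISSING)
        if declared_value != live_value:
            drift[key] = (declared_value, live_value)
    return drift


# Hypothetical values: what the infrastructure code declares vs. what is running.
declared = {"replicas": 3, "cache_tier": "Premium", "min_tls": "1.2"}
live = {"replicas": 2, "cache_tier": "Premium", "debug_logging": True}

for setting, (want, have) in detect_drift(declared, live).items():
    print(f"{setting}: declared={want!r} live={have!r}")
```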

Get started 

Deep Context is available today as part of Azure SRE Agent GA. New agents have it enabled by default. For a step-by-step walkthrough connecting your code, logs, incidents, and knowledge files, see What It Takes to Give an SRE Agent a Useful Starting Point.

Updated Mar 10, 2026
Version 1.0