Apps on Azure Blog

Never Explain Context Twice: Introducing Azure SRE Agent memory

Microsoft

Dec 08, 2025

In our recent blog post, we highlighted how Azure SRE Agent has evolved into an extensible AI-powered operations platform. One of the most requested capabilities from customers has been the ability for agents to retain knowledge across sessions: learning from past incidents, remembering team preferences, and continuously improving troubleshooting accuracy. Today, we're excited to dive deeper into Azure SRE Agent memory, a feature that changes how your operations teams work with AI.

Why Memory Matters for AI Operations

Every seasoned SRE knows that institutional knowledge is invaluable. The most effective on-call engineers aren't just technically skilled; they remember the quirks of specific services, recall solutions from past incidents, and know the team's preferred diagnostic approaches. Until now, AI assistants started every conversation from scratch, forcing teams to repeatedly explain context that experienced engineers would simply know.

Azure SRE Agent memory changes this. It enables agents to:

Remember team facts, preferences, and context across all conversations
Retrieve relevant runbooks and documentation during troubleshooting
Learn from past sessions to improve future responses
Share knowledge across your entire team automatically

Context Engineering: The Key to Better AI Outcomes

At the heart of the memory is a concept we call context engineering: the practice of purposefully curating and optimizing the information you provide to the agent to get better results. Rather than hoping the AI figures things out, you systematically build a knowledge foundation that makes every interaction smarter.

The workflow is simple:

Identify gaps: Use Session Insights to see where the agent struggled or lacked knowledge
Add targeted context: Upload runbooks to the Knowledge Base or save facts with User Memories
Track improvement: Review subsequent sessions to measure whether your additions improved outcomes
Iterate: Continuously refine your context based on real session data

This feedback loop transforms ad-hoc troubleshooting into a systematically improving process, where each session makes future sessions more effective.
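One way to make the "track improvement" step concrete is to compare Investigation Quality Scores from sessions before and after a context change. The sketch below is illustrative only: it assumes the 1-5 scores were copied out of Session Insights by hand, and `score_change` is a hypothetical helper, not part of Azure SRE Agent.

```python
from statistics import mean


def score_change(before: list[int], after: list[int]) -> float:
    """Return the change in mean score after new context was added.

    Both lists hold 1-5 Investigation Quality Scores transcribed manually
    from Session Insights (an assumption; no export API is implied).
    """
    if not before or not after:
        raise ValueError("need at least one session on each side")
    return mean(after) - mean(before)


# Hypothetical scores for sessions before and after uploading a runbook.
delta = score_change(before=[2, 3, 2], after=[4, 3, 4])
print(f"Mean score change: {delta:+.2f}")
```

A handful of sessions is a small sample, so a change like this is a prompt to read the individual insights rather than proof that the new context helped.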

Memory Components at a Glance

The memory consists of three complementary components that work together to give your agents comprehensive knowledge:

🧠 User Memories: Quick Chat Commands for Team Knowledge

Save facts, preferences, and context using simple chat commands. User Memories are ideal for team standards, service configurations, and workflow patterns that should persist across all conversations.

Key benefits:

✅ Instant setup, no configuration required
✅ Managed directly in chat with #remember, #forget, and #retrieve commands
✅ Shared across all team members automatically
✅ Works across all conversations and agents

Example commands:

#remember Team owns app-service-prod in East US region
#remember For latency issues, check Redis cache first
#remember Production deployments happen Tuesdays at 2 PM PST

When you save a memory, it's instantly available across all your team's conversations. The agent automatically retrieves relevant memories during reasoning; no additional configuration is needed.

Saving team knowledge with the #remember command

Use #retrieve to search and display your saved memories:

Retrieving saved memories with the #retrieve command
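To make the three commands easier to reason about, here is a toy model of their behavior. It is a sketch, not the agent's implementation: `TeamMemory` is a hypothetical class, and the case-insensitive substring matching used for #forget and #retrieve is an assumption, since the post does not describe how the agent matches memories.

```python
class TeamMemory:
    """Toy in-memory stand-in for a shared memory store (hypothetical)."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def handle(self, message: str) -> list[str]:
        """Apply one chat message and return the facts it touched."""
        command, _, text = message.strip().partition(" ")
        text = text.strip()
        if command == "#remember" and text:
            self.facts.append(text)
            return [text]
        if command == "#forget" and text:
            # Assumption: forget every fact containing the given text.
            removed = [f for f in self.facts if text.lower() in f.lower()]
            self.facts = [f for f in self.facts if f not in removed]
            return removed
        if command == "#retrieve":
            # Assumption: an empty query lists everything.
            return [f for f in self.facts if text.lower() in f.lower()]
        return []  # not a memory command


memory = TeamMemory()
memory.handle("#remember Team owns app-service-prod in East US region")
memory.handle("#remember For latency issues, check Redis cache first")
found = memory.handle("#retrieve redis")
print(found)
memory.handle("#forget redis")
print(memory.facts)
```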

📚 Knowledge Base: Direct Document Uploads for Runbooks and Guides

Upload markdown and text files directly to the agent's knowledge base. Documents are automatically indexed using semantic search and available for agent retrieval during troubleshooting.

The Knowledge Base uses intelligent indexing that combines keyword matching with semantic similarity. Documents are automatically split into optimal chunks, so agents retrieve the most relevant sections, not entire documents.
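The idea of chunking a document and ranking chunks by a blend of keyword and semantic scores can be sketched in a few lines. This is an illustration under stated assumptions, not the Knowledge Base's actual indexing: splitting on markdown headings, the `alpha` weighting, and every function name are hypothetical, and character-trigram cosine similarity stands in for a real embedding model.

```python
import math
import re
from collections import Counter


def split_by_heading(markdown: str) -> list[str]:
    """Split a markdown document into chunks, one per heading section."""
    chunks = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [c.strip() for c in chunks if c.strip()]


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query: str, chunk: str) -> float:
    """Fraction of query terms that appear in the chunk."""
    terms = set(tokens(query))
    return len(terms & set(tokens(chunk))) / len(terms) if terms else 0.0


def trigram_cosine(a: str, b: str) -> float:
    """Character-trigram cosine similarity; a crude placeholder for embeddings."""
    def grams(text: str) -> Counter:
        s = " ".join(tokens(text))
        return Counter(s[i:i + 3] for i in range(len(s) - 2))

    ga, gb = grams(a), grams(b)
    dot = sum(count * gb[g] for g, count in ga.items())
    norm = math.sqrt(sum(v * v for v in ga.values())) * math.sqrt(
        sum(v * v for v in gb.values())
    )
    return dot / norm if norm else 0.0


def best_chunk(query: str, markdown: str, alpha: float = 0.5) -> str:
    """Return the chunk with the highest blended score (empty string if none)."""
    return max(
        split_by_heading(markdown),
        key=lambda c: alpha * keyword_score(query, c)
        + (1 - alpha) * trigram_cosine(query, c),
        default="",
    )


RUNBOOK = (
    "# Redis latency\n"
    "Check cache hit ratio and connection count.\n\n"
    "# SQL failover\n"
    "Verify the secondary replica before failing over.\n"
)
print(best_chunk("redis cache latency", RUNBOOK))
```

Returning a single section rather than the whole runbook mirrors the point above: the agent gets the most relevant part of a document, not all of it.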

Key benefits:

✅ Supports .md and .txt files (up to 16MB per file)
✅ Automatic chunking and semantic indexing
✅ Simple file upload interface
✅ Instant availability after upload

Best for: Static runbooks, troubleshooting guides, internal documentation, and configuration templates.

Navigate to Settings > Knowledge Base to access document management. There you will find Add File, which lets you upload one or more .txt and .md files, and Delete, which lets you delete files individually or in bulk.
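Before uploading a batch of files, it can help to check them locally against the stated limits. The helper below is a hypothetical pre-flight check, not part of the product. It assumes "16MB" means 16 * 1024 * 1024 bytes, which the post does not specify.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".md", ".txt"}
# Assumption: "16MB" is read as 16 * 1024 * 1024 bytes; the post does not say
# whether the limit is in decimal or binary megabytes.
MAX_BYTES = 16 * 1024 * 1024


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return reasons a file would likely be rejected; empty means it looks fine."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


print(upload_problems("redis-latency.md", 48_000))
print(upload_problems("incident-export.pdf", 20 * 1024 * 1024))
```

For a file on disk, pass `path.name` and `path.stat().st_size` from a `pathlib.Path`.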

📊 Session Insights: Automated Analysis of Your Troubleshooting Sessions

Get automated feedback on your troubleshooting sessions with timelines, performance analysis, and key learnings. Session Insights help you understand what happened, learn from mistakes, and continuously improve.

Key benefits:

✅ Automatic analysis after conversations complete
✅ Chronological timeline of actions taken
✅ Performance scoring with specific improvement suggestions
✅ Key learnings for future sessions

Navigate to Settings > Session Insights to view your troubleshooting analysis:

Session Insights dashboard showing analysis of past troubleshooting sessions

You can also manually trigger insight generation for any conversation by clicking the Generate Session Insights icon in the chat footer:

Manually triggering Session Insights generation

Each insight includes:

Timeline: A chronological narrative showing what actions were taken and their outcomes
What Went Well: Highlights correct understanding and effective actions
Areas for Improvement: Shows what could be done better with specific remediation steps
Key Learnings: Actionable takeaways for future sessions
Investigation Quality Score: Sessions rated on a 1-5 scale for completeness
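If you track insights outside the portal, the five parts listed above map naturally onto a small record type. This is a hypothetical representation for your own notes: the field names, the `needs_more_context` helper, and its threshold of 3 are assumptions, not a documented schema.

```python
from dataclasses import dataclass


@dataclass
class SessionInsight:
    """One session's insight, mirroring the five parts described in the post."""

    timeline: list[str]
    what_went_well: list[str]
    areas_for_improvement: list[str]
    key_learnings: list[str]
    quality_score: int  # Investigation Quality Score on the 1-5 scale

    def __post_init__(self) -> None:
        if not 1 <= self.quality_score <= 5:
            raise ValueError("quality_score must be between 1 and 5")

    def needs_more_context(self, threshold: int = 3) -> bool:
        """Flag sessions worth reviewing for missing memories or runbooks.

        The threshold is an arbitrary illustrative choice.
        """
        return self.quality_score <= threshold and bool(self.areas_for_improvement)


insight = SessionInsight(
    timeline=["Checked app-service-prod metrics", "Restarted redis-cache-prod"],
    what_went_well=["Identified the affected service quickly"],
    areas_for_improvement=["Did not check Redis cache health first"],
    key_learnings=["For latency issues, check Redis cache first"],
    quality_score=3,
)
print(insight.needs_more_context())
```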

How Azure SRE Agent Uses Memory: The SearchMemory Tool

During conversations, incident handling, and scheduled tasks, Azure SRE Agents search across memory sources to retrieve relevant context using the SearchMemory tool.

Enabling Memory Retrieval in Custom Sub-Agents

When building custom sub-agents with the Sub-Agent Builder, you can enable memory retrieval by adding the SearchMemory tool to your sub-agent's toolset. This allows your custom automation to leverage all the knowledge stored in User Memories and the Knowledge Base.

How it works:

In the Sub-Agent Builder, add the SearchMemory tool to your sub-agent's available tools
The tool automatically searches across all memory sources using intelligent retrieval
Your sub-agent receives relevant context to inform its responses and actions

This means your custom sub-agents, whether handling specific incident types, automating runbook execution, or performing scheduled health checks, can all benefit from your team's accumulated knowledge.
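Conceptually, a search across memory sources takes one query, looks in every source, and returns a single ranked list. The sketch below shows that shape only. `search_memory` is a hypothetical function, the source names are made up, and simple term overlap stands in for whatever retrieval the real SearchMemory tool performs.

```python
import re


def search_memory(
    query: str, sources: dict[str, list[str]], top_k: int = 3
) -> list[tuple[str, str]]:
    """Return up to top_k (source, text) pairs ranked by terms shared with the query."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    scored = []
    for source, entries in sources.items():
        for text in entries:
            overlap = len(terms & set(re.findall(r"[a-z0-9]+", text.lower())))
            if overlap:  # skip entries that share nothing with the query
                scored.append((overlap, source, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(source, text) for _, source, text in scored[:top_k]]


# Hypothetical contents for the two shared memory sources.
sources = {
    "user_memories": [
        "For latency issues, check Redis cache first",
        "Production deployments happen Tuesdays at 2 PM PST",
    ],
    "knowledge_base": [
        "Redis latency runbook: check cache hit ratio and connection count",
    ],
}
results = search_memory("high latency on redis cache", sources)
for source, text in results:
    print(f"[{source}] {text}")
```

The sub-agent never needs to know which source a fact came from; it just receives the most relevant context for the task at hand.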

Choosing the Right Memory Type

Feature	User Memories	Knowledge Base
Setup	Instant (chat commands)	Quick (file upload)
Management	Chat commands	Portal UI
Content Size	Short facts	Documents (up to 16MB)
Best Use Case	Team preferences	Static runbooks
Team Sharing	✅ Shared	✅ Shared

Quick guidance:

User Memories: Short, focused facts (1-2 sentences) for immediate team context
Knowledge Base: Well-structured documents with clear headers for procedural knowledge
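The quick guidance above reduces to a simple rule of thumb, sketched here as a hypothetical helper. The two-sentence cutoff comes from the guidance itself. Treating any multi-line text as a document is an added assumption.

```python
import re


def suggest_memory_type(content: str) -> str:
    """Suggest where a piece of knowledge belongs, based on its size and shape."""
    text = content.strip()
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    if "\n" not in text and len(sentences) <= 2:
        return "User Memories"
    return "Knowledge Base"


print(suggest_memory_type("For latency issues, check Redis cache first"))
print(suggest_memory_type("# Redis latency\n1. Check cache hit ratio.\n2. Check connections."))
```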

Getting Started in Minutes

1. Start with User Memories

Open any chat with your Azure SRE Agent and save immediate team knowledge:

#remember Team owns services: app-service-prod, redis-cache-prod, and sql-db-prod
#remember For latency issues, check Redis cache health first
#remember Team uses East US for production workloads

That's it. These facts are now available across all conversations.

2. Upload Key Documents

Add critical runbooks and guides to the Knowledge Base:

Navigate to Settings > Knowledge Base
Upload .md or .txt files
Files are automatically indexed and available immediately

3. Review Session Insights

After troubleshooting sessions, check Settings > Session Insights to see what went well and where the agent needs more context. Use this feedback to identify gaps and add targeted memories or documentation.

Best Practices for Building Agent Memory

Content Organization

Keep memories focused and specific
Use consistent terminology across your team
Avoid duplication: choose one source of truth for each piece of information

Security

Never store:

❌ Credentials, API keys, or secrets
❌ Personally identifiable information (PII)
❌ Customer data or logs
❌ Confidential business information
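A lightweight local check can catch obvious slips before text is saved as a memory or uploaded. The patterns below are illustrative assumptions and far from exhaustive; `sensitive_findings` is a hypothetical helper, not a feature of Azure SRE Agent, and it does not replace a proper secret scanner or human review.

```python
import re

# Illustrative patterns only; real secrets and PII take many more forms.
SENSITIVE_PATTERNS = {
    "possible secret assignment": re.compile(
        r"(?i)\b(password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*\S+"
    ),
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def sensitive_findings(text: str) -> list[str]:
    """Return the labels of every pattern found in the text."""
    return [
        label for label, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    ]


print(sensitive_findings("#remember For latency issues, check Redis cache first"))
print(sensitive_findings("#remember db password = hunter2"))
```

An empty list means only that none of these patterns matched, not that the text is safe to store.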

Maintenance

Regularly review and update memories
Remove outdated information using #forget
Consolidate duplicate entries
Use #retrieve to audit what's been saved
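When auditing, exact and near-exact duplicates are the easiest entries to consolidate. The sketch below assumes the saved memories were copied out of a #retrieve response into a list by hand. `duplicate_groups` is a hypothetical helper that treats entries as duplicates when they differ only in case or punctuation.

```python
import re
from collections import defaultdict


def duplicate_groups(memories: list[str]) -> list[list[str]]:
    """Group memories that are identical after lowercasing and dropping punctuation."""
    groups: dict[str, list[str]] = defaultdict(list)
    for memory in memories:
        key = " ".join(re.findall(r"[a-z0-9]+", memory.lower()))
        groups[key].append(memory)
    return [entries for entries in groups.values() if len(entries) > 1]


saved = [
    "Team uses East US for production workloads",
    "team uses east us for production workloads.",
    "For latency issues, check Redis cache first",
]
dupes = duplicate_groups(saved)
print(dupes)
```

Entries that say the same thing in different words will not be caught this way and still need a manual read.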

The Impact: Smarter Troubleshooting, Lower MTTR

Azure SRE Agent memory is designed to deliver improvements in four areas:

Faster troubleshooting: Agents immediately understand your environment and preferences
Reduced toil: No more repeatedly explaining the same context
Institutional knowledge capture: Critical team knowledge persists even as team members change
Continuous improvement: Each session makes future sessions more effective

By systematically building your agent's knowledge foundation, you create an operations assistant that truly understands your environment, reducing mean time to resolution (MTTR) and freeing your team to focus on high-value work.

What's Next?

We're continually enhancing the memory based on customer feedback. Your input is critical: use the thumbs up/down feedback in the agent, or share your thoughts in our GitHub repo.

What operational knowledge would you like your AI agent to remember? Let us know!

This blog post is part of our ongoing series on Azure SRE Agent capabilities. See our previous post on automation, integration, and extensibility features.

Updated Dec 08, 2025

Version 2.0

Dalibor_Kovacevic

Microsoft
