Forum Discussion
BYO Thread Storage in Azure AI Foundry using Python
Build scalable, secure, and persistent multi-agent memory with your own storage backend
As AI agents evolve beyond one-off interactions, persistent context becomes a critical architectural requirement. Azure AI Foundry’s latest update introduces a powerful capability — Bring Your Own (BYO) Thread Storage — enabling developers to integrate custom storage solutions for agent threads.
This feature empowers enterprises to control how agent memory is stored, retrieved, and governed, aligning with compliance, scalability, and observability goals.
What Is “BYO Thread Storage”?
In Azure AI Foundry, a thread represents a conversation or task execution context for an AI agent. By default, thread state (messages, actions, results, metadata) is stored in Foundry’s managed storage.
With BYO Thread Storage, you can now:
- Store threads in your own database — Azure Cosmos DB, SQL, Blob, or even a Vector DB.
- Apply custom retention, encryption, and access policies.
- Integrate with your existing data and governance frameworks.
- Enable cross-region disaster recovery (DR) setups seamlessly.
This gives enterprises full control of data lifecycle management — a big step toward AI-first operational excellence.
Architecture Overview
A typical setup involves:
- Azure AI Foundry Agent Service — Hosts your multi-agent setup.
- Custom Thread Storage Backend — e.g., Azure Cosmos DB, Azure Table, or PostgreSQL.
- Thread Adapter — Python class implementing the Foundry storage interface (a minimal sketch follows this list).
- Disaster Recovery (DR) replication — Optional replication of threads to a secondary region.
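The exact thread-storage contract can vary across Foundry SDK versions, so treat the adapter below as an illustrative shape rather than an official interface; the Cosmos DB implementation later in this post provides these same operations.

from abc import ABC, abstractmethod

class ThreadStore(ABC):
    """Illustrative adapter contract for pluggable thread storage (not an official Foundry interface)."""

    @abstractmethod
    def create_thread(self, user_id: str, metadata: dict | None = None) -> str:
        """Create a thread and return its ID."""

    @abstractmethod
    def add_message(self, thread_id: str, role: str, content: str) -> dict:
        """Append a message to a thread and return the stored message."""

    @abstractmethod
    def get_thread_messages(self, thread_id: str) -> list[dict]:
        """Return the ordered message history for a thread."""

    @abstractmethod
    def delete_thread(self, thread_id: str) -> None:
        """Remove a thread and all of its messages."""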
Implementing BYO Thread Storage using Python
Prerequisites
First, install the necessary Python packages:
pip install azure-ai-projects azure-ai-inference azure-cosmos azure-identity
Setting Up the Storage Layer
from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential
from datetime import datetime

class ThreadStorageManager:
    def __init__(self, cosmos_endpoint, database_name, container_name):
        credential = DefaultAzureCredential()
        self.client = CosmosClient(cosmos_endpoint, credential=credential)
        self.database = self.client.get_database_client(database_name)
        self.container = self.database.get_container_client(container_name)

    def create_thread(self, user_id, metadata=None):
        """Create a new conversation thread"""
        thread_id = f"thread_{user_id}_{datetime.utcnow().timestamp()}"
        thread_data = {
            'id': thread_id,
            'user_id': user_id,
            'messages': [],
            'created_at': datetime.utcnow().isoformat(),
            'updated_at': datetime.utcnow().isoformat(),
            'metadata': metadata or {}
        }
        self.container.create_item(body=thread_data)
        return thread_id

    def add_message(self, thread_id, role, content):
        """Add a message to an existing thread"""
        thread = self.container.read_item(item=thread_id, partition_key=thread_id)
        message = {
            'role': role,
            'content': content,
            'timestamp': datetime.utcnow().isoformat()
        }
        thread['messages'].append(message)
        thread['updated_at'] = datetime.utcnow().isoformat()
        self.container.replace_item(item=thread_id, body=thread)
        return message

    def get_thread(self, thread_id):
        """Retrieve a complete thread"""
        try:
            return self.container.read_item(item=thread_id, partition_key=thread_id)
        except Exception as e:
            print(f"Thread not found: {e}")
            return None

    def get_thread_messages(self, thread_id):
        """Get all messages from a thread"""
        thread = self.get_thread(thread_id)
        return thread['messages'] if thread else []

    def delete_thread(self, thread_id):
        """Delete a thread"""
        self.container.delete_item(item=thread_id, partition_key=thread_id)
Integrating with Azure AI Foundry
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

class ConversationManager:
    def __init__(self, project_endpoint, storage_manager):
        self.ai_client = AIProjectClient.from_connection_string(
            credential=DefaultAzureCredential(),
            conn_str=project_endpoint
        )
        # Chat completions are served through the azure-ai-inference client
        # that the project client exposes (requires the azure-ai-inference package).
        self.chat_client = self.ai_client.inference.get_chat_completions_client()
        self.storage = storage_manager

    def start_conversation(self, user_id, system_prompt):
        """Initialize a new conversation"""
        thread_id = self.storage.create_thread(
            user_id=user_id,
            metadata={'system_prompt': system_prompt}
        )
        # Add system message
        self.storage.add_message(thread_id, 'system', system_prompt)
        return thread_id

    def send_message(self, thread_id, user_message, model_deployment):
        """Send a message and get AI response"""
        # Store user message
        self.storage.add_message(thread_id, 'user', user_message)
        # Retrieve conversation history
        messages = self.storage.get_thread_messages(thread_id)
        # Call the model with the full conversation history
        response = self.chat_client.complete(
            model=model_deployment,
            messages=[
                {"role": msg['role'], "content": msg['content']}
                for msg in messages
            ]
        )
        assistant_message = response.choices[0].message.content
        # Store assistant response
        self.storage.add_message(thread_id, 'assistant', assistant_message)
        return assistant_message
Usage Example
# Initialize storage and conversation manager
storage = ThreadStorageManager(
cosmos_endpoint="https://your-cosmos-account.documents.azure.com:443/",
database_name="conversational-ai",
container_name="threads"
)
conversation_mgr = ConversationManager(
project_endpoint="your-project-connection-string",
storage_manager=storage
)
# Start a new conversation
thread_id = conversation_mgr.start_conversation(
user_id="user123",
system_prompt="You are a helpful AI assistant."
)
# Send messages
response1 = conversation_mgr.send_message(
thread_id=thread_id,
user_message="What is machine learning?",
model_deployment="gpt-4"
)
print(f"AI: {response1}")
response2 = conversation_mgr.send_message(
thread_id=thread_id,
user_message="Can you give me an example?",
model_deployment="gpt-4"
)
print(f"AI: {response2}")
# Retrieve full conversation history
history = storage.get_thread_messages(thread_id)
for msg in history:
    print(f"{msg['role']}: {msg['content']}")
Key Highlights:
- Threads are stored in Cosmos DB under your control.
- You can attach metadata such as region, owner, or compliance tags (see the example after this list).
- Integrates natively with your existing Azure identity (Microsoft Entra ID) and Azure Key Vault setup.
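For example, governance tags can be attached when a thread is created. The tag names below are illustrative and are enforced by your own policies and cleanup jobs, not by Foundry:

thread_id = storage.create_thread(
    user_id="user123",
    metadata={
        "region": "eastus",            # data-residency tag
        "owner": "support-team",       # business owner for audit trails
        "compliance": ["GDPR"],        # illustrative compliance label
        "retention_days": 90           # honoured by your own retention job
    }
)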
Disaster Recovery & Resilience
When coupled with geo-replicated Cosmos DB or Azure Storage RA-GRS, your BYO thread storage becomes resilient by design:
- Primary writes in East US replicate to Central US.
- Your storage client (here, the Cosmos SDK) detects the failover and transparently reconnects to the secondary region.
- Threads remain available during outages — ensuring operational continuity.
This aligns perfectly with the AI-First Operational Excellence architecture theme, where reliability and observability drive intelligent automation.
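As a minimal sketch, with a geo-replicated Cosmos DB account you can pass the SDK a preferred-region list so requests follow the account's failover order (the account URL and region names below are placeholders):

from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

# Assumes the Cosmos DB account is replicated to East US (primary) and Central US (secondary).
dr_aware_client = CosmosClient(
    "https://your-cosmos-account.documents.azure.com:443/",
    credential=DefaultAzureCredential(),
    preferred_locations=["East US", "Central US"]  # SDK falls back to the next region on outage
)

ThreadStorageManager could accept such a pre-configured client (or the preferred_locations list) in its constructor so every thread operation inherits the failover behaviour.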
Best Practices
| Area | Recommendation |
|---|---|
| Security | Use Azure Key Vault for credentials & encryption keys. |
| Compliance | Configure data residency & retention in your own DB. |
| Observability | Log thread CRUD operations to Azure Monitor or Application Insights. |
| Performance | Use async I/O and partition keys for large workloads (see the async sketch after this table). |
| DR | Enable geo-redundant storage and run failover tests regularly. |
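A minimal async read sketch using the asynchronous Cosmos client (azure.cosmos.aio), assuming the same database, container, and /id partition key as above:

import asyncio
from azure.cosmos.aio import CosmosClient
from azure.identity.aio import DefaultAzureCredential

async def read_thread_async(cosmos_endpoint: str, thread_id: str) -> dict:
    """Read a thread without blocking the event loop."""
    async with DefaultAzureCredential() as credential:
        async with CosmosClient(cosmos_endpoint, credential=credential) as client:
            container = client.get_database_client("conversational-ai").get_container_client("threads")
            return await container.read_item(item=thread_id, partition_key=thread_id)

# asyncio.run(read_thread_async("https://your-cosmos-account.documents.azure.com:443/", thread_id))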
When to Use BYO Thread Storage
| Scenario | Why it helps |
|---|---|
| Regulated industries (BFSI, Healthcare, etc.) | Maintain data control & audit trails |
| Multi-region agent deployments | Support DR and data sovereignty |
| Advanced analytics on conversation data | Query threads directly from your DB (see the sketch after this table) |
| Enterprise observability | Unified monitoring across Foundry + Ops |
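Because threads live in your own container, conversation analytics is a plain Cosmos SQL query. A small sketch (cross-partition, since user_id is not the partition key in this design):

def list_threads_for_user(container, user_id):
    """Return thread summaries for a user straight from Cosmos DB."""
    query = (
        "SELECT c.id, c.created_at, ARRAY_LENGTH(c.messages) AS message_count "
        "FROM c WHERE c.user_id = @user_id"
    )
    return list(container.query_items(
        query=query,
        parameters=[{"name": "@user_id", "value": user_id}],
        enable_cross_partition_query=True
    ))

# summaries = list_threads_for_user(storage.container, "user123")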
The Future
BYO Thread Storage opens doors to advanced use cases — federated agent memory, semantic retrieval over past conversations, and dynamic workload failover across regions.
For architects, this feature is a key enabler for secure, scalable, and compliant AI system design.
For developers, it means more flexibility, transparency, and integration power.
Summary
| Feature | Benefit |
|---|---|
| Custom thread storage | Full control over data |
| Python adapter support | Easy extensibility |
| Multi-region DR ready | Business continuity |
| Azure-native security | Enterprise-grade safety |
Conclusion
Implementing BYO thread storage in Azure AI Foundry gives you the flexibility to build AI applications that meet your specific requirements for data governance, performance, and scalability. By taking control of your storage, you can create more robust, compliant, and maintainable AI solutions.
2 Replies
- LalitChandra
Microsoft
Very well explained. It’s a good topic and you’ve added some excellent insights.
- akumargupta
Microsoft
Thank you LalitChandra for your kind feedback!