BYO Thread Storage in Azure AI Foundry using Python

Question

Build scalable, secure, and persistent multi-agent memory with your own storage backend
As AI agents evolve beyond one-off interactions, persistent context becomes a critical architectural requirement. Azure AI Foundry’s latest update introduces a powerful capability — Bring Your Own (BYO) Thread Storage — enabling developers to integrate custom storage solutions for agent threads.
This feature empowers enterprises to control how agent memory is stored, retrieved, and governed, aligning with compliance, scalability, and observability goals.
&nbsp;
What Is “BYO Thread Storage”?
In Azure AI Foundry, a thread represents a conversation or task execution context for an AI agent. By default, thread state (messages, actions, results, metadata) is stored in Foundry’s managed storage.
With BYO Thread Storage, you can now:

Store threads in your own database — Azure Cosmos DB, SQL, Blob, or even a Vector DB.
Apply custom retention, encryption, and access policies.
Integrate with your existing data and governance frameworks.
Enable cross-region disaster recovery (DR) setups seamlessly.

This gives enterprises full control of data lifecycle management — a big step toward AI-first operational excellence.
&nbsp;
Architecture Overview
A typical setup involves:

Azure AI Foundry Agent Service — Hosts your multi-agent setup.
Custom Thread Storage Backend — e.g., Azure Cosmos DB, Azure Table, or PostgreSQL.
Thread Adapter — Python class implementing the Foundry storage interface.
Disaster Recovery (DR) replication — Optional replication of threads to secondary region.

&nbsp;
Implementing BYO Thread Storage using Python
Prerequisites
First, install the necessary Python packages:
pip install azure-ai-projects azure-cosmos azure-identity
Setting Up the Storage Layer
from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential
import json
from datetime import datetime

class ThreadStorageManager:
    def __init__(self, cosmos_endpoint, database_name, container_name):
        credential = DefaultAzureCredential()
        self.client = CosmosClient(cosmos_endpoint, credential=credential)
        self.database = self.client.get_database_client(database_name)
        self.container = self.database.get_container_client(container_name)
    
    def create_thread(self, user_id, metadata=None):
        """Create a new conversation thread"""
        thread_id = f"thread_{user_id}_{datetime.utcnow().timestamp()}"
        thread_data = {
            'id': thread_id,
            'user_id': user_id,
            'messages': [],
            'created_at': datetime.utcnow().isoformat(),
            'updated_at': datetime.utcnow().isoformat(),
            'metadata': metadata or {}
        }
        self.container.create_item(body=thread_data)
        return thread_id
    
    def add_message(self, thread_id, role, content):
        """Add a message to an existing thread"""
        thread = self.container.read_item(item=thread_id, partition_key=thread_id)
        
        message = {
            'role': role,
            'content': content,
            'timestamp': datetime.utcnow().isoformat()
        }
        
        thread['messages'].append(message)
        thread['updated_at'] = datetime.utcnow().isoformat()
        
        self.container.replace_item(item=thread_id, body=thread)
        return message
    
    def get_thread(self, thread_id):
        """Retrieve a complete thread"""
        try:
            return self.container.read_item(item=thread_id, partition_key=thread_id)
        except Exception as e:
            print(f"Thread not found: {e}")
            return None
    
    def get_thread_messages(self, thread_id):
        """Get all messages from a thread"""
        thread = self.get_thread(thread_id)
        return thread['messages'] if thread else []
    
    def delete_thread(self, thread_id):
        """Delete a thread"""
        self.container.delete_item(item=thread_id, partition_key=thread_id)
Integrating with Azure AI Foundry
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

class ConversationManager:
    def __init__(self, project_endpoint, storage_manager):
        self.ai_client = AIProjectClient.from_connection_string(
            credential=DefaultAzureCredential(),
            conn_str=project_endpoint
        )
        self.storage = storage_manager
    
    def start_conversation(self, user_id, system_prompt):
        """Initialize a new conversation"""
        thread_id = self.storage.create_thread(
            user_id=user_id,
            metadata={'system_prompt': system_prompt}
        )
        
        # Add system message
        self.storage.add_message(thread_id, 'system', system_prompt)
        return thread_id
    
    def send_message(self, thread_id, user_message, model_deployment):
        """Send a message and get AI response"""
        # Store user message
        self.storage.add_message(thread_id, 'user', user_message)
        
        # Retrieve conversation history
        messages = self.storage.get_thread_messages(thread_id)
        
        # Call Azure AI with conversation history
        response = self.ai_client.inference.get_chat_completions(
            model=model_deployment,
            messages=[
                {"role": msg['role'], "content": msg['content']}
                for msg in messages
            ]
        )
        
        assistant_message = response.choices[0].message.content
        
        # Store assistant response
        self.storage.add_message(thread_id, 'assistant', assistant_message)
        
        return assistant_message
Usage Example
# Initialize storage and conversation manager
storage = ThreadStorageManager(
    cosmos_endpoint="https://your-cosmos-account.documents.azure.com:443/",
    database_name="conversational-ai",
    container_name="threads"
)

conversation_mgr = ConversationManager(
    project_endpoint="your-project-connection-string",
    storage_manager=storage
)

# Start a new conversation
thread_id = conversation_mgr.start_conversation(
    user_id="user123",
    system_prompt="You are a helpful AI assistant."
)

# Send messages
response1 = conversation_mgr.send_message(
    thread_id=thread_id,
    user_message="What is machine learning?",
    model_deployment="gpt-4"
)
print(f"AI: {response1}")

response2 = conversation_mgr.send_message(
    thread_id=thread_id,
    user_message="Can you give me an example?",
    model_deployment="gpt-4"
)
print(f"AI: {response2}")

# Retrieve full conversation history
history = storage.get_thread_messages(thread_id)
for msg in history:
    print(f"{msg['role']}: {msg['content']}")
Key Highlights:

Threads are stored in Cosmos DB under your control.
You can attach metadata such as region, owner, or compliance tags.
Integrates natively with existing Azure identity and Key Vault.

Disaster Recovery &amp; Resilience
When coupled with geo-replicated Cosmos DB or Azure Storage RA-GRS, your BYO thread storage becomes resilient by design:

Primary writes in East US replicate to Central US.
Foundry auto-detects failover and reconnects to secondary region.
Threads remain available during outages — ensuring operational continuity.

This aligns perfectly with the AI-First Operational Excellence architecture theme, where reliability and observability drive intelligent automation.
Best Practices
AreaRecommendationSecurityUse Azure Key Vault for credentials &amp; encryption keys.ComplianceConfigure data residency &amp; retention in your own DB.ObservabilityLog thread CRUD operations to Azure Monitor or Application Insights.PerformanceUse async I/O and partition keys for large workloads.DREnable geo-redundant storage &amp; failover tests regularly.
When to Use BYO Thread Storage
ScenarioWhy it helpsRegulated industries (BFSI, Healthcare, etc.)Maintain data control &amp; audit trailsMulti-region agent deploymentsSupport DR and data sovereigntyAdvanced analytics on conversation dataQuery threads directly from your DBEnterprise observabilityUnified monitoring across Foundry + Ops
The Future
BYO Thread Storage opens doors to advanced use cases — federated agent memory, semantic retrieval over past conversations, and dynamic workload failover across regions.
For architects, this feature is a key enabler for secure, scalable, and compliant AI system design.For developers, it means more flexibility, transparency, and integration power.
Summary
FeatureBenefitCustom thread storageFull control over dataPython adapter supportEasy extensibilityMulti-region DR readyBusiness continuityAzure-native securityEnterprise-grade safety
Conclusion
Implementing BYO thread storage in Azure AI Foundry gives you the flexibility to build AI applications that meet your specific requirements for data governance, performance, and scalability. By taking control of your storage, you can create more robust, compliant, and maintainable AI solutions.

lalitchandra · Answer

Very well explained. It’s a good topic and you’ve added some excellent insights.

akumargupta · Answer

Thank you LalitChandra​ for your kind feedback!

Area	Recommendation
Security	Use Azure Key Vault for credentials & encryption keys.
Compliance	Configure data residency & retention in your own DB.
Observability	Log thread CRUD operations to Azure Monitor or Application Insights.
Performance	Use async I/O and partition keys for large workloads.
DR	Enable geo-redundant storage & failover tests regularly.

Forum Discussion

BYO Thread Storage in Azure AI Foundry using Python

Build scalable, secure, and persistent multi-agent memory with your own storage backend

What Is “BYO Thread Storage”?

Architecture Overview

Implementing BYO Thread Storage using Python

Prerequisites

First, install the necessary Python packages:

Setting Up the Storage Layer

Integrating with Azure AI Foundry

Usage Example

Key Highlights:

Disaster Recovery & Resilience

Best Practices

When to Use BYO Thread Storage

The Future

Summary

Conclusion

2 Replies

Resources

Scenario	Why it helps
Regulated industries (BFSI, Healthcare, etc.)	Maintain data control & audit trails
Multi-region agent deployments	Support DR and data sovereignty
Advanced analytics on conversation data	Query threads directly from your DB
Enterprise observability	Unified monitoring across Foundry + Ops

Feature	Benefit
Custom thread storage	Full control over data
Python adapter support	Easy extensibility
Multi-region DR ready	Business continuity
Azure-native security	Enterprise-grade safety