Forum Discussion

akumargupta's avatar
akumargupta
Icon for Microsoft rankMicrosoft
Nov 07, 2025

BYO Thread Storage in Azure AI Foundry using Python

Build scalable, secure, and persistent multi-agent memory with your own storage backend

As AI agents evolve beyond one-off interactions, persistent context becomes a critical architectural requirement. Azure AI Foundry’s latest update introduces a powerful capability — Bring Your Own (BYO) Thread Storage — enabling developers to integrate custom storage solutions for agent threads.

This feature empowers enterprises to control how agent memory is stored, retrieved, and governed, aligning with compliance, scalability, and observability goals.

 

What Is “BYO Thread Storage”?

In Azure AI Foundry, a thread represents a conversation or task execution context for an AI agent. By default, thread state (messages, actions, results, metadata) is stored in Foundry’s managed storage.

With BYO Thread Storage, you can now:

  • Store threads in your own database — Azure Cosmos DB, SQL, Blob, or even a Vector DB.
  • Apply custom retention, encryption, and access policies.
  • Integrate with your existing data and governance frameworks.
  • Enable cross-region disaster recovery (DR) setups seamlessly.

This gives enterprises full control of data lifecycle management — a big step toward AI-first operational excellence.

 

Architecture Overview

A typical setup involves:

  1. Azure AI Foundry Agent Service — Hosts your multi-agent setup.
  2. Custom Thread Storage Backend — e.g., Azure Cosmos DB, Azure Table, or PostgreSQL.
  3. Thread Adapter — Python class implementing the Foundry storage interface.
  4. Disaster Recovery (DR) replication — Optional replication of threads to secondary region.

 

Implementing BYO Thread Storage using Python

Prerequisites

First, install the necessary Python packages:
pip install azure-ai-projects azure-cosmos azure-identity
Setting Up the Storage Layer
from azure.cosmos import CosmosClient, PartitionKey
from azure.identity import DefaultAzureCredential
import json
from datetime import datetime

class ThreadStorageManager:
    def __init__(self, cosmos_endpoint, database_name, container_name):
        credential = DefaultAzureCredential()
        self.client = CosmosClient(cosmos_endpoint, credential=credential)
        self.database = self.client.get_database_client(database_name)
        self.container = self.database.get_container_client(container_name)
    
    def create_thread(self, user_id, metadata=None):
        """Create a new conversation thread"""
        thread_id = f"thread_{user_id}_{datetime.utcnow().timestamp()}"
        thread_data = {
            'id': thread_id,
            'user_id': user_id,
            'messages': [],
            'created_at': datetime.utcnow().isoformat(),
            'updated_at': datetime.utcnow().isoformat(),
            'metadata': metadata or {}
        }
        self.container.create_item(body=thread_data)
        return thread_id
    
    def add_message(self, thread_id, role, content):
        """Add a message to an existing thread"""
        thread = self.container.read_item(item=thread_id, partition_key=thread_id)
        
        message = {
            'role': role,
            'content': content,
            'timestamp': datetime.utcnow().isoformat()
        }
        
        thread['messages'].append(message)
        thread['updated_at'] = datetime.utcnow().isoformat()
        
        self.container.replace_item(item=thread_id, body=thread)
        return message
    
    def get_thread(self, thread_id):
        """Retrieve a complete thread"""
        try:
            return self.container.read_item(item=thread_id, partition_key=thread_id)
        except Exception as e:
            print(f"Thread not found: {e}")
            return None
    
    def get_thread_messages(self, thread_id):
        """Get all messages from a thread"""
        thread = self.get_thread(thread_id)
        return thread['messages'] if thread else []
    
    def delete_thread(self, thread_id):
        """Delete a thread"""
        self.container.delete_item(item=thread_id, partition_key=thread_id)
Integrating with Azure AI Foundry
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

class ConversationManager:
    def __init__(self, project_endpoint, storage_manager):
        self.ai_client = AIProjectClient.from_connection_string(
            credential=DefaultAzureCredential(),
            conn_str=project_endpoint
        )
        self.storage = storage_manager
    
    def start_conversation(self, user_id, system_prompt):
        """Initialize a new conversation"""
        thread_id = self.storage.create_thread(
            user_id=user_id,
            metadata={'system_prompt': system_prompt}
        )
        
        # Add system message
        self.storage.add_message(thread_id, 'system', system_prompt)
        return thread_id
    
    def send_message(self, thread_id, user_message, model_deployment):
        """Send a message and get AI response"""
        # Store user message
        self.storage.add_message(thread_id, 'user', user_message)
        
        # Retrieve conversation history
        messages = self.storage.get_thread_messages(thread_id)
        
        # Call Azure AI with conversation history
        response = self.ai_client.inference.get_chat_completions(
            model=model_deployment,
            messages=[
                {"role": msg['role'], "content": msg['content']}
                for msg in messages
            ]
        )
        
        assistant_message = response.choices[0].message.content
        
        # Store assistant response
        self.storage.add_message(thread_id, 'assistant', assistant_message)
        
        return assistant_message
Usage Example
# Initialize storage and conversation manager
storage = ThreadStorageManager(
    cosmos_endpoint="https://your-cosmos-account.documents.azure.com:443/",
    database_name="conversational-ai",
    container_name="threads"
)

conversation_mgr = ConversationManager(
    project_endpoint="your-project-connection-string",
    storage_manager=storage
)

# Start a new conversation
thread_id = conversation_mgr.start_conversation(
    user_id="user123",
    system_prompt="You are a helpful AI assistant."
)

# Send messages
response1 = conversation_mgr.send_message(
    thread_id=thread_id,
    user_message="What is machine learning?",
    model_deployment="gpt-4"
)
print(f"AI: {response1}")

response2 = conversation_mgr.send_message(
    thread_id=thread_id,
    user_message="Can you give me an example?",
    model_deployment="gpt-4"
)
print(f"AI: {response2}")

# Retrieve full conversation history
history = storage.get_thread_messages(thread_id)
for msg in history:
    print(f"{msg['role']}: {msg['content']}")

Key Highlights:

  • Threads are stored in Cosmos DB under your control.
  • You can attach metadata such as region, owner, or compliance tags.
  • Integrates natively with existing Azure identity and Key Vault.

Disaster Recovery & Resilience

When coupled with geo-replicated Cosmos DB or Azure Storage RA-GRS, your BYO thread storage becomes resilient by design:

  • Primary writes in East US replicate to Central US.
  • Foundry auto-detects failover and reconnects to secondary region.
  • Threads remain available during outages — ensuring operational continuity.

This aligns perfectly with the AI-First Operational Excellence architecture theme, where reliability and observability drive intelligent automation.

Best Practices

AreaRecommendation
SecurityUse Azure Key Vault for credentials & encryption keys.
ComplianceConfigure data residency & retention in your own DB.
ObservabilityLog thread CRUD operations to Azure Monitor or Application Insights.
PerformanceUse async I/O and partition keys for large workloads.
DREnable geo-redundant storage & failover tests regularly.

When to Use BYO Thread Storage

ScenarioWhy it helps
Regulated industries (BFSI, Healthcare, etc.)Maintain data control & audit trails
Multi-region agent deploymentsSupport DR and data sovereignty
Advanced analytics on conversation dataQuery threads directly from your DB
Enterprise observabilityUnified monitoring across Foundry + Ops

The Future

BYO Thread Storage opens doors to advanced use cases — federated agent memory, semantic retrieval over past conversations, and dynamic workload failover across regions.

For architects, this feature is a key enabler for secure, scalable, and compliant AI system design.
For developers, it means more flexibility, transparency, and integration power.

Summary

FeatureBenefit
Custom thread storageFull control over data
Python adapter supportEasy extensibility
Multi-region DR readyBusiness continuity
Azure-native securityEnterprise-grade safety

Conclusion

Implementing BYO thread storage in Azure AI Foundry gives you the flexibility to build AI applications that meet your specific requirements for data governance, performance, and scalability. By taking control of your storage, you can create more robust, compliant, and maintainable AI solutions.

Resources