# Building a Scalable Contract Data Extraction Pipeline with Microsoft Foundry and Python
## Architecture Overview

*Alt text: Architecture diagram showing Blob Storage triggering Azure Function, calling Document Intelligence, transforming data, and storing in Cosmos DB*

Flow:
1. Upload contract files (PDF or ZIP) to Azure Blob Storage
2. Azure Function triggers automatically on file upload
3. Azure AI Document Intelligence extracts layout and tables
4. A transformation layer converts output into a canonical JSON format
5. Data is stored in Azure Cosmos DB

## Step 1: Trigger Processing with Azure Functions

An Azure Function with a Blob trigger enables automatic processing when a file is uploaded.

```python
import io
import logging
import zipfile

import azure.functions as func

def main(myblob: func.InputStream):
    logging.info(f"Processing blob: {myblob.name}")
    if myblob.name.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(myblob.read())) as z:
            for file_name in z.namelist():
                logging.info(f"Extracting {file_name}")
                file_data = z.read(file_name)
                # Pass file_data to extraction step
```

**Best Practices**
- Keep functions stateless and idempotent
- Handle retries for transient failures
- Store configuration in environment variables

## Step 2: Extract Layout Using Document Intelligence

The prebuilt layout model helps extract tables, text, and structure from documents.

```python
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential

client = DocumentIntelligenceClient(
    endpoint="<your-endpoint>",
    credential=AzureKeyCredential("<your-key>")
)

poller = client.begin_analyze_document(
    "prebuilt-layout",
    document=file_data
)
result = poller.result()
```

**Output Includes**
- Structured tables
- Paragraphs and text blocks
- Bounding regions for layout context

## Step 3: Handle Multi-Page Table Continuity

Contract documents often contain tables split across multiple pages. These need to be merged to preserve data integrity.
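The merge routine for this step calls an `extract_rows` helper that the post does not define. A minimal sketch, assuming each layout-model table cell exposes `row_index`, `column_index`, and `content` attributes, and treating row 0 as the header (which the merge logic tracks separately):

```python
def extract_rows(table):
    """Group a Document Intelligence table's cells into ordered data rows,
    skipping the header row (row_index == 0)."""
    rows = {}
    for cell in table.cells:
        if cell.row_index == 0:
            continue  # header cells are collected separately by the merge step
        rows.setdefault(cell.row_index, {})[cell.column_index] = cell.content
    # Emit rows in row order, with cells ordered by column index
    return [
        [row[col] for col in sorted(row)]
        for _, row in sorted(rows.items())
    ]
```

Sparse tables (missing cells) would need a fallback value per column; this sketch assumes every data row is fully populated.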
```python
def merge_tables(tables):
    merged = []
    current = None
    for table in tables:
        headers = [cell.content for cell in table.cells if cell.row_index == 0]
        if current and headers == current["headers"]:
            current["rows"].extend(extract_rows(table))
        else:
            if current:
                merged.append(current)
            current = {
                "headers": headers,
                "rows": extract_rows(table)
            }
    if current:
        merged.append(current)
    return merged
```

**Key Considerations**
- Match headers to detect continuation
- Preserve row order
- Avoid duplicate headers

## Step 4: Transform to a Canonical JSON Schema

A consistent schema ensures compatibility across downstream systems.

```json
{
  "id": "contract_123",
  "documentType": "contract",
  "vendorName": "ABC Corp",
  "invoiceDate": "2023-05-05",
  "tables": [
    {
      "name": "Line Items",
      "headers": ["Item", "Qty", "Price"],
      "rows": [
        ["Service A", "2", "100"]
      ]
    }
  ],
  "metadata": {
    "sourceFile": "contract.pdf",
    "processedAt": "2026-04-22T10:00:00Z"
  }
}
```

**Design Tips**
- Keep the schema flexible and extensible
- Include metadata for traceability
- Avoid excessive nesting

## Step 5: Persist Data in Cosmos DB

Store the transformed data in a scalable NoSQL database.
```python
from azure.cosmos import CosmosClient

client = CosmosClient("<cosmos-uri>", "<key>")
database = client.get_database_client("contracts-db")
container = database.get_container_client("documents")

container.upsert_item(canonical_json)
```

**Best Practices**
- Choose an appropriate partition key (for example, documentType or vendorName)
- Optimize indexing policies
- Monitor request unit (RU) usage

## Observability and Monitoring

To ensure reliability:
- Enable logging with Application Insights
- Track processing time and failures
- Monitor document extraction accuracy

## Security Considerations

- Store secrets securely using Azure Key Vault
- Use Managed Identity for service authentication
- Apply role-based access control (RBAC) to storage resources

## Conclusion

This approach provides a scalable and maintainable solution for contract data extraction:
- Event-driven processing with Azure Functions
- Accurate extraction using Document Intelligence
- Clean transformation into a reusable schema
- Efficient storage with Cosmos DB

This foundation can be extended with validation layers, review workflows, or analytics dashboards depending on your business requirements.

## Resources

- Contract data extraction – Document Intelligence: Foundry Tools | Microsoft Learn
- microsoft/content-processing-solution-accelerator: Programmatically extract data and apply schemas to unstructured documents across text-based and multi-modal content using Azure AI Foundry, Azure OpenAI, Azure AI Content Understanding, and Cosmos DB.

# 🖼️ Streamline Image Generation Workflow in Foundry Toolkit
Integrating image generation into a production AI application has historically meant juggling multiple surfaces — browsing models in the Foundry portal, setting up deployments via the Azure CLI, testing prompts in a separate tool, then stitching together API credentials before writing a single line of app code. That context-switching adds friction at exactly the moment you want to be experimenting. With this release, the full image generation workflow — discover, deploy, prompt, iterate, export code — lives inside your editor.

A few things this unlocks for developers:

## 🎨 GPT-Image-2 in the Model Catalog

GPT-Image-2 via Microsoft Foundry is now listed in the Foundry Toolkit Model Catalog. You can browse its capabilities, review inference parameters, and deploy it to any Azure AI Foundry project directly from the sidebar — no portal tab-switching required.

To get started:
1. Open FOUNDRY TOOLKIT → My Resources → Model Catalog.
2. Search for gpt-image-2 and select it to view model details and inference parameters.
3. Click Deploy to add it to your active Foundry project.

## ✨ Image Playground

With GPT-Image-2 deployed, the Playground automatically surfaces an Image Playground mode. Describe what you want, hit generate, and see results side by side — no REST client, no extra tooling. Use the View Code shortcut to copy the API call directly into your project.

To generate your first image:
1. Click + New Playground in the Playground tab — the mode auto-selects Image Playground when gpt-image-2 is the active model.
2. Type a prompt and send — generated images appear in the canvas with download controls.
3. Click View Code (top right) to get a ready-to-paste code snippet for your application.

Image generation is one of the fastest-growing use cases in production AI applications — from dynamic content creation to data augmentation to UI asset generation. This update ensures developers building on Microsoft Foundry have a first-class path to ship those capabilities faster.
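The exact snippet that View Code exports depends on your project, but the request shape of an image generation call is small. The sketch below only assembles the payload and URL; the endpoint layout, API version string, and the deployment name `gpt-image-2` are illustrative assumptions, not values taken from this post:

```python
def build_image_request(resource: str, deployment: str, prompt: str,
                        size: str = "1024x1024", n: int = 1,
                        api_version: str = "<api-version>"):
    """Assemble the URL and JSON body for an Azure OpenAI-style image
    generation call (hypothetical endpoint layout, for illustration)."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/images/generations?api-version={api_version}")
    body = {"prompt": prompt, "n": n, "size": size}
    return url, body

url, body = build_image_request("my-resource", "gpt-image-2",
                                "a watercolor fox in a misty forest")
```

In practice you would POST `body` to `url` with your credential attached; the snippet exported by View Code handles authentication and response decoding for you.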
## 🚀 Get Started Today

Ready to experience the future of AI development? Here's how to get started:
- 📥 **Download**: Install the Foundry Toolkit from the Visual Studio Code Marketplace
- 📖 **Learn**: Explore our comprehensive Foundry Toolkit Documentation

We'd love to hear from you! Whether it's a feature request, bug report, or feedback on your experience, join the conversation and contribute directly on our GitHub repository.

Happy Coding!

# AG-UI: The Future of Agent-Driven User Interfaces
As AI agents evolve from simple chatbots into workflow engines, decision-makers, and copilots, one challenge has remained stubbornly unsolved: how do we connect intelligent agents to rich, dynamic, user-facing interfaces without custom plumbing every single time?

Introducing AG-UI (Agent–User Interface) — a unified protocol that finally standardizes how agents talk to frontends. It enables streaming, declarative UI generation, state synchronization, and human-in-the-loop workflows out of the box. AG-UI is the missing piece that bridges backend intelligence with real-time, interactive user experiences.

## Why Is AG‑UI a Game‑Changer?

### 1. A Unified Standard for Agent ↔ UI Interaction

Before AG-UI, developers juggled:
- REST APIs for request–response
- WebSockets/SSE for streaming
- Custom JSON structures for tool calls
- Ad-hoc approaches for dynamic UI

The result? Fragmented systems, brittle integrations, and zero interoperability.

AG-UI replaces all of this with one standard protocol covering:
- Chat messages
- UI component proposals
- Tool calls
- Shared state updates
- Interrupts for human approvals

### 2. Built-In Real-Time Streaming

Token-level streaming is a first-class citizen. No need to hand-roll WebSockets or patch SSE responses — AG-UI handles it through a consistent event protocol.

### 3. Human-in-the-Loop as a First-Class Concept

AG-UI supports:
- Interrupt events
- Approval dialogs
- State diffs
- Resume logic

All natively, with no custom workflow engines required.

## Problems AG-UI Solves

- One standard protocol for chat, UI, interrupts, and shared state
- Token-level streaming support
- Declarative UI proposals from agents
- Built-in HITL workflows
- Clear separation of concerns
- Cross-framework interoperability

## Key Use Cases

1. **Interactive Copilot Applications**: Agents propose UI components, update shared state, and respond in real time.
2. **Enterprise Workflow Automation**: Approvals, forms, provisioning flows, and automated processes.
3. **Multi-Agent Orchestration**: Nested workflows and delegated subtasks.
4. **Tool-Integrated Applications**: Agents call backend tools and instantly reflect results in the UI.

## How to Get Started

1. Create an AG-UI server using Microsoft Agent Framework.
2. Connect a frontend using libraries like CopilotKit.
3. Implement streaming chat + declarative UI events.
4. Deploy over HTTPS, SSE, or WebSockets.

## Human-in-the-Loop (HITL)

AG-UI's interrupt model lets agents pause execution, request human approval, accept modifications, and resume safely.

Common HITL workflows:
- Financial operations
- Security actions
- External communication drafts
- Compliance reviews

### Interrupt Event Example

```json
{
  "event": "interrupt",
  "reason": "manager_approval_required",
  "ui": {
    "type": "approval_dialog",
    "fields": [
      { "label": "Total", "value": 1234.56 },
      { "label": "Notes", "type": "text" }
    ]
  },
  "shared_state": {
    "reportId": "R-2025-1101",
    "status": "pending"
  }
}
```

### User Decision Event

```json
{
  "event": "user_decision",
  "action": "approve",
  "actor": "manager@contoso.com",
  "shared_state_diff": {
    "status": "approved"
  }
}
```

### Backend HITL Logic (Python)

The handler must be a coroutine so it can await the user's decision; the decision object carries the chosen action and any state diff:

```python
async def process_expense(report):
    validation = tools.expense_validate(report)
    if validation.requires_manager_approval:
        emit_interrupt(
            ui=approval_dialog(report),
            shared_state={"status": "pending"}
        )
        decision = await wait_for_user_decision()
        if decision.action == "approve":
            tools.expense_post_to_erp(report)
        elif decision.action == "edit":
            apply_shared_state_diff(decision.diff)
            return await process_expense(report)
        else:
            tools.expense_reject(report)
```

## Security Best Practices

- Protect endpoints with Azure AD
- Enforce HTTPS, HSTS, rate limits
- Use managed identity + Key Vault
- Apply content safety filters
- Isolate tool permissions
- Enforce approvals for sensitive actions
- Maintain session-level audit trails
- Minimize PII and enforce encryption

## Minimal AG-UI Example (Python)

```python
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def agent_stream():
    messages = ["Hello!", "I am your AI agent.", "How can I assist you today?"]
    for msg in messages:
        yield msg + "\n"
        await asyncio.sleep(1)

@app.get("/chat")
async def chat():
    return StreamingResponse(agent_stream(), media_type="text/plain")

@app.get("/ui")
async def ui_component():
    return {
        "type": "form",
        "fields": [
            {"label": "Name", "type": "text"},
            {"label": "Email", "type": "email"}
        ]
    }
```

## Conclusion

AG-UI is more than a protocol — it is the foundation for building next-generation, agent-driven applications. By standardizing how agents interact with user interfaces, it enables richer experiences, safer workflows, and faster development.

# Making Sense of Azure AI Foundry IQ
As enterprise teams build AI agents, the hardest design decisions often have nothing to do with models. Instead, they revolve around a more fundamental question: how should an agent access organizational knowledge in a way that is accurate, secure, and sustainable over time?

Azure AI Foundry IQ is designed to address a specific version of that problem. It is not a general‑purpose data access layer, and it is not a replacement for every retrieval pattern. Understanding where it fits and where it does not is key to using it effectively. This post explores those boundaries and grounds them in concrete, enterprise‑relevant scenarios, before showing how Foundry IQ can be implemented directly via Azure AI Search APIs and SDKs.

## What Azure AI Foundry IQ Is (and Is Not)

Azure AI Foundry IQ is a managed knowledge layer built on Azure AI Search. It allows you to define a knowledge base that spans multiple content sources (SharePoint, Azure Blob Storage, OneLake, existing Azure AI Search indexes, and selected external sources) and exposes them through a single, permission‑aware endpoint.

When an agent queries a knowledge base, Foundry IQ:
1. Plans how the query should be executed
2. Selects relevant knowledge sources
3. Runs retrieval (optionally in multiple steps)
4. Enforces user permissions
5. Returns grounded results with citations

A single knowledge base can be reused across multiple agents or applications, avoiding duplicated indexing and inconsistent retrieval logic.

What Foundry IQ is not: it does not execute SQL queries, perform aggregations, or provide real‑time numeric accuracy. Foundry IQ retrieves unstructured text, not transactional or analytical data.

## Where Foundry IQ Is a Good Fit

### 1. Multi‑Source, Distributed Knowledge

Foundry IQ is most valuable when relevant knowledge is spread across multiple systems. It removes the need for each agent to manage source‑specific routing and retrieval logic.
This benefit increases as the number of sources grows; with a single source, the overhead is rarely justified.

### 2. Complex or Multi‑Part Questions

Foundry IQ's agentic retrieval model is designed for questions that require:
- Decomposition into sub‑questions
- Retrieval from multiple documents
- Synthesis across sources

Its multi‑step retrieval approach is especially effective when a single document cannot answer the question on its own.

### 3. Reduced Custom Retrieval Engineering

Foundry IQ automates indexing, chunking, vectorization, and orchestration across sources. This makes it a strong choice for teams that want to focus on agent behavior rather than building and maintaining custom RAG pipelines.

### 4. Enterprise Security and Governance

Foundry IQ integrates with Microsoft Entra ID and supports document‑level permissions and Purview sensitivity labels where the underlying source allows it. This makes it suitable for internal or regulated scenarios where permission trimming is a hard requirement.

### 5. Shared Knowledge Across Multiple Agents

A single knowledge base can serve multiple agents or applications, reducing operational overhead and ensuring consistent retrieval behavior across experiences.

### 6. High Emphasis on Answer Quality and Trust

For scenarios where correctness, grounding, and citations matter more than latency or cost, Foundry IQ's multi‑step retrieval consistently outperforms basic RAG approaches.

## Example Scenarios Where Foundry IQ Works Well

### Scenario A: Internal Policy and Operations Assistant

An enterprise builds an internal assistant for store managers. Relevant information lives in:
- HR policies in SharePoint
- Safety procedures in Blob Storage
- Operations manuals in OneLake

Questions often span multiple documents. A single Foundry IQ knowledge base unifies these sources and enforces permissions automatically.
### Scenario B: Compliance or Regulatory Knowledge Assistant

A compliance team needs answers strictly grounded in approved documents, with citations and access control. Foundry IQ ensures only authorized content is retrieved, reducing the risk of accidental data exposure.

### Scenario C: Shared Knowledge Layer for Multiple Internal Agents

Multiple internal agents (chat assistants, workflow helpers, embedded copilots) rely on the same procedural content. A shared knowledge base avoids duplicate indexing and centralizes governance.

## Where Foundry IQ Is Not a Good Fit

### 1. Simple or Single‑Source Q&A

For a single, well‑defined source, Foundry IQ's orchestration adds complexity without proportional benefit.

### 2. Structured or Analytical Data Queries

Foundry IQ does not execute live queries or calculations. It retrieves text, not metrics.

### 3. Ultra‑Low Latency or High‑Throughput Requirements

Agentic retrieval introduces LLM‑in‑the‑loop latency and token costs. For sub‑second responses at scale, simpler retrieval pipelines are more appropriate.

### 4. Highly Customized Retrieval Logic

Foundry IQ abstracts the retrieval pipeline. If you require fine‑grained control over scoring or transformations, a fully custom search pipeline may be preferable.

## Example Scenarios Where Foundry IQ Is the Wrong Tool

### Scenario D: Sales and Inventory Analytics Agent

Questions like "What were Q4 sales by region?" require live data queries. Indexing reports leads to stale answers. A direct SQL or analytics tool is the correct solution.

### Scenario E: High‑Volume, Low‑Latency Assistant

Voice‑based assistants requiring sub‑second responses cannot tolerate the latency of agentic retrieval.

## A Common Architecture Pattern

Most successful implementations combine:
- Foundry IQ for unstructured documents and policies
- Structured data tools for analytics and live queries
- An application or agent layer that routes questions based on intent

This avoids forcing a single tool to solve every problem.
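The routing layer in this pattern can start as a simple classifier in front of two handlers. A sketch, with hypothetical `query_knowledge_base` and `query_analytics` callables standing in for a Foundry IQ client and a SQL/analytics tool:

```python
# Crude keyword heuristics for detecting analytical questions; the hint
# list is illustrative, not exhaustive.
ANALYTIC_HINTS = ("how many", "total", "sum of", "average",
                  "by region", "q1", "q2", "q3", "q4")

def route_question(question: str, query_knowledge_base, query_analytics):
    """Send analytical questions to the structured-data tool and
    everything else to the knowledge base."""
    q = question.lower()
    if any(hint in q for hint in ANALYTIC_HINTS):
        return query_analytics(question)
    return query_knowledge_base(question)
```

In production the intent check is typically an LLM classification call (or an agent deciding between tools) rather than keyword matching, but the separation of concerns is the same.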
## Querying Foundry IQ Knowledge Bases Directly via the Azure AI Search SDK

You can query Azure AI Foundry IQ knowledge bases directly using the azure-search-documents Python SDK, without using Foundry Agent Service.

Your App → Azure AI Search SDK → Foundry IQ Knowledge Base → Grounded Results

This approach is ideal when you want full orchestration control while still benefiting from managed, agentic retrieval.

### How This Works

Note: this is a reference implementation.

Install:

```bash
pip install --pre azure-search-documents azure-identity
```

### Setup (High Level)

1. Provision Azure AI Search (Basic or higher)
2. Enable Azure AD and API key authentication
3. Enable a system‑assigned managed identity

### Ingest Content via Knowledge Sources

- Blob Storage, SharePoint, or OneLake
- Index, indexer, data source, and skillset are created automatically
- Knowledge sources and knowledge bases are created via the REST API (2025‑11‑01‑preview)

### Create a Knowledge Base

- minimal reasoning → semantic retrieval only (no LLM)
- low / medium reasoning → requires an Azure OpenAI model
- The search service managed identity needs the Cognitive Services User role

### Querying the Knowledge Base (Python)

Initialize the client:

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient

client = KnowledgeBaseRetrievalClient(
    endpoint="https://<search-service>.search.windows.net",
    knowledge_base_name="<kb-name>",
    credential=DefaultAzureCredential(),
)
```

Minimal reasoning (fast, no LLM):

```python
from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseRetrievalRequest,
    KnowledgeRetrievalSemanticIntent,
    KnowledgeRetrievalMinimalReasoningEffort,
    KnowledgeRetrievalOutputMode,
)

request = KnowledgeBaseRetrievalRequest(
    intents=[KnowledgeRetrievalSemanticIntent(search="your question here")],
    retrieval_reasoning_effort=KnowledgeRetrievalMinimalReasoningEffort(),
    output_mode=KnowledgeRetrievalOutputMode.EXTRACTIVE_DATA,
)

response = client.retrieve(retrieval_request=request)
```

Conversational reasoning (LLM‑backed):

```python
from azure.search.documents.knowledgebases.models import (
    KnowledgeBaseRetrievalRequest,
    KnowledgeBaseMessage,
    KnowledgeBaseMessageTextContent,
    KnowledgeRetrievalLowReasoningEffort,
    KnowledgeRetrievalOutputMode,
)

request = KnowledgeBaseRetrievalRequest(
    messages=[
        KnowledgeBaseMessage(
            role="user",
            content=[KnowledgeBaseMessageTextContent(text="<first user question>")]
        ),
        KnowledgeBaseMessage(
            role="assistant",
            content=[KnowledgeBaseMessageTextContent(text="<assistant response>")]
        ),
        KnowledgeBaseMessage(
            role="user",
            content=[KnowledgeBaseMessageTextContent(text="<follow-up question>")]
        ),
    ],
    retrieval_reasoning_effort=KnowledgeRetrievalLowReasoningEffort(),
    output_mode=KnowledgeRetrievalOutputMode.EXTRACTIVE_DATA,
)

response = client.retrieve(retrieval_request=request)
```

Keep in mind:
- intents → minimal reasoning only
- messages → low / medium reasoning only

They are not interchangeable.

### Processing the Response

```python
# Extracted content
for msg in (response.response or []):
    for item in (msg.content or []):
        print(item.text)

# Citations (handles blob, SharePoint, OneLake, and search index references)
for ref in (response.references or []):
    ref_id = getattr(ref, "id", None)
    url = getattr(ref, "blob_url", None) or getattr(ref, "url", None)
    print(f"[{ref_id}] {url}")

# Retrieval diagnostics
for record in (response.activity or []):
    elapsed = getattr(record, "elapsed_ms", None) or ""
    print(f"{record.type}: {elapsed}ms")
```

### Output Modes

| Mode | When to Use |
| --- | --- |
| extractiveData | Feed grounded chunks into your own LLM |
| answerSynthesis | Return a ready‑made answer with citations (LLM required) |

### Security & Permissions

- RBAC: Search Index Data Reader with DefaultAzureCredential
- Permission trimming must be enabled at ingestion (ingestionPermissionOptions)
- Trimming is enforced at query time by passing the user's bearer token:

```python
response = client.retrieve(
    retrieval_request=request,
    x_ms_query_source_authorization="Bearer <user-token>"
)
```

Foundry IQ won't solve every retrieval problem.
But when your agents need grounded, permission-aware answers from content scattered across SharePoint, Blob Storage, and OneLake, it handles the hard parts — so you can focus on what your agent actually does.

# Building an Enterprise Knowledge Copilot with Foundry IQ and Agentic Retrieval on Azure AI
Every enterprise has the same problem: knowledge scattered across SharePoint, file shares, wikis, and email. This article walks through building a knowledge copilot that unifies that data behind a single conversational interface — using Microsoft's Foundry IQ knowledge bases and the agentic retrieval engine in Azure AI Search.

## The Problem: Fragmented Knowledge, Fragmented Answers

Enterprise AI projects today share a common pain point. Each new agent or copilot that needs to answer questions from company data must rebuild its own retrieval pipeline from scratch — data connections, chunking logic, embeddings, routing, permissions — all duplicated project after project. The result is a tangle of fragmented, siloed pipelines that are expensive to maintain and inconsistent in quality.

Consider a field technician troubleshooting equipment. The answer might span a vendor manual stored in OneLake, a company repair policy on SharePoint, and a public electrical standard on the web. Traditional single-index RAG cannot orchestrate across those sources in one pass. The technician waits, the issue escalates, and productivity drops.

Foundry IQ, announced in public preview in November 2025, addresses this directly. It provides a unified knowledge layer for agents — a single endpoint that replaces per-project RAG pipelines with a reusable, topic-centric knowledge base that any number of agents can consume.

## What Is Foundry IQ?

Foundry IQ introduces four capabilities built on top of Azure AI Search:

- **Knowledge Bases**: Reusable, topic-centric collections (e.g., "employee policies," "product documentation") available directly in the Foundry portal. Rather than wiring retrieval logic into every agent, you define a knowledge base once and ground multiple agents through a single API.
- **Indexed and Federated Knowledge Sources**: A knowledge base can draw from Azure Blob Storage, OneLake, SharePoint, Azure AI Search indexes, the web, and MCP servers (MCP in private preview).
Developers do not need to manage different retrieval strategies per source; the knowledge base presents a unified endpoint.

- **Agentic Retrieval Engine**: A self-reflective query engine that uses AI to plan, search, and synthesize answers with configurable retrieval reasoning effort.
- **Enterprise-Grade Security**: Document-level access control and alignment with existing permissions models. Microsoft Purview sensitivity labels are respected through the indexing and retrieval pipeline, so classified content remains governed as it flows into knowledge bases.

For indexed sources, Foundry IQ automatically manages the full indexing pipeline: content is ingested, chunked, vectorized, and prepared for hybrid retrieval. When Azure Content Understanding is enabled, complex documents gain layout-aware enrichment — tables, figures, and headers are extracted and structured without extra engineering work.

## How Agentic Retrieval Works

Single-shot RAG — one query, one index, one pass — breaks down when questions are ambiguous, multi-hop, or span several data silos. Foundry IQ's agentic retrieval engine treats retrieval as a multi-step reasoning task rather than a keyword lookup:

1. **Plan**: The engine analyzes the conversation and decomposes the query into focused sub-queries, deciding which knowledge sources to consult.
2. **Search**: Sub-queries run concurrently against selected sources using keyword, vector, or hybrid techniques.
3. **Rank**: Semantic reranking identifies the most relevant results.
4. **Reflect**: If the information gathered is insufficient, the engine iterates — issuing follow-up queries autonomously.
5. **Synthesize**: Results are unified into a natural-language answer with source references.

Developers control this behaviour through a high-level retrieval reasoning effort setting. Lower effort suits fast, lightweight lookups; higher effort enables iterative search and richer planning across the entire data estate.
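The plan → search → rank → reflect → synthesize cycle can be sketched as a loop. The callables below are stand-ins for the managed engine's internals (this is not the Foundry IQ API, just a toy model of the control flow):

```python
def agentic_retrieve(question, sources, plan, search, rank, sufficient,
                     synthesize, max_iterations=3):
    """Toy model of agentic retrieval: decompose the question, search,
    rerank, and iterate until the evidence looks sufficient."""
    evidence = []
    query = question
    for _ in range(max_iterations):
        sub_queries = plan(query, sources)                    # Plan
        hits = [h for sq in sub_queries for h in search(sq)]  # Search
        evidence = rank(evidence + hits)                      # Rank
        if sufficient(question, evidence):                    # Reflect
            break
        query = f"follow-up for: {question}"  # refine and try again
    return synthesize(question, evidence)                     # Synthesize
```

`max_iterations` plays the role of the retrieval reasoning effort setting: a minimal effort is one pass with no reflection, while higher effort allows more follow-up rounds.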
Real-world impact: AT&T integrated Azure AI Search and retrieval-augmented generation into its multi-agent framework, reducing customer resolution times by 33 percent, cutting average handle time by nearly 10 percent, and scaling 71 AI solutions to 100,000 employees. Ontario Power Generation used agentic retrieval to sift through over 40 years of nuclear operating experience, enabling data-driven decision-making and helping new staff learn from decades of institutional knowledge.

## Architecture Overview

## Step-by-Step: Setting Up the Knowledge Copilot

### Provision Resources

You need an Azure AI Search service (Basic tier or above), a Microsoft Foundry project, an embedding model deployment (e.g., text-embedding-3-large), and an LLM deployment (e.g., gpt-4.1) for query planning and answer generation. .NET 8 or later is required for the C# SDK.

### Create a Knowledge Base in Azure AI Search

Using the Azure.Search.Documents preview SDK, define an index, a knowledge source pointing to your data, and a knowledge base with OutputMode set to AnswerSynthesis for natural-language answers with citations.
The following C# snippet (adapted from the official Azure AI Search quickstart) shows the knowledge base creation:

```csharp
using Azure;
using Azure.Identity;
using Azure.Search.Documents.Indexes;

var searchEndpoint = "https://<your-service>.search.windows.net";
var aoaiEndpoint = "https://<your-resource>.openai.azure.com/";

var indexClient = new SearchIndexClient(
    new Uri(searchEndpoint),
    new DefaultAzureCredential());

// Configure the LLM for query planning and answer synthesis
var openAiParameters = new AzureOpenAIVectorizerParameters
{
    ResourceUri = new Uri(aoaiEndpoint),
    DeploymentName = "gpt-4.1",
    ModelName = "gpt-4.1"
};
var model = new KnowledgeBaseAzureOpenAIModel(openAiParameters);

// Create the knowledge base with answer synthesis enabled
var knowledgeBase = new KnowledgeBase("<knowledge-base-name>")
{
    OutputMode = KnowledgeBaseOutputMode.AnswerSynthesis,
    AnswerInstructions = "Provide a concise answer based on the retrieved documents.",
    Models = { model }
};

await indexClient.CreateOrUpdateKnowledgeBaseAsync(knowledgeBase);
```

### Connect an Agent to the Knowledge Base via MCP

Each knowledge base exposes a Model Context Protocol (MCP) endpoint that MCP-compatible agents can call. The Foundry IQ-specific agent SDK currently offers full code samples for Python and REST API, but you can use the general-purpose MCP tooling in C# to achieve the same connection.
The following pattern is drawn from the official Microsoft Learn documentation on MCP tools with Foundry Agents:

```csharp
using Azure.AI.Projects;
using Azure.Identity;

var endpoint = "https://<your-resource>.services.ai.azure.com/api/projects/<your-project>";
var model = "gpt-4.1-mini";

// Point the MCP tool at the knowledge base's MCP endpoint
var mcpTool = new MCPToolDefinition(
    serverLabel: "enterprise_kb",
    serverUrl: "https://<search-service>.search.windows.net" +
               "/knowledgebases/<kb-name>/mcp?api-version=2025-11-01-preview");
mcpTool.AllowedTools.Add("knowledge_base_retrieve");

// Create the agent with the MCP tool attached
var projectClient = new AIProjectClient(new Uri(endpoint), new DefaultAzureCredential());
var agentVersion = await projectClient.AgentAdministrationClient
    .CreateAgentVersionAsync(
        "enterprise-copilot",
        new ProjectsAgentVersionCreationOptions(
            new DeclarativeAgentDefinition(model)
            {
                Instructions = "You are a company knowledge assistant. " +
                               "Always search the knowledge base before answering. " +
                               "If the knowledge base has no answer, say so clearly.",
                Tools = { mcpTool }
            }));
```

The agent instructions are critical — explicitly requiring the agent to use the knowledge base prevents it from answering purely from the LLM's training data.

### Query the Copilot

Once the agent is published, your application layer simply sends user questions via the Azure AI Projects SDK or REST API. The agent autonomously invokes the knowledge base tool, retrieves grounded context, and returns an answer with citations referencing the original documents.

## Trade-offs and Considerations

| Dimension | Detail |
| --- | --- |
| Maturity | Foundry IQ is in public preview — not recommended for production workloads without accepting preview SLA terms. |
| Cost | Agentic retrieval has two billing streams: token-based billing from Azure AI Search for retrieval, and billing from Azure OpenAI for query planning and answer synthesis. |
| Latency vs. Quality | Higher retrieval reasoning effort produces better answers but adds latency due to iterative search. For sub-second lookups, use minimal effort; for complex multi-hop questions, use medium. |
| C# SDK Coverage | The Foundry IQ-specific agent connection SDK currently supports Python and REST API. C# support is available for the underlying agentic retrieval queries and for general MCP tool integration. |
| Security | Document-level ACLs from SharePoint are enforced at query time. For per-user authorization in Foundry Agent Service, the current preview does not support per-request MCP headers — use the Azure OpenAI Responses API as an alternative. |

## Key Takeaways

Foundry IQ transforms enterprise RAG from a bespoke, per-project exercise into a managed, reusable knowledge layer. You define a knowledge base once, connect it to your data sources, and any number of agents or apps can consume it. The agentic retrieval engine handles query planning, multi-source search, semantic reranking, and iterative refinement — capabilities that previously required significant custom engineering. For .NET developers, the Azure AI Search C# SDK and the MCP tooling in the Agent Framework provide the building blocks to integrate this into your applications today.

## References

- What is Foundry IQ?
- Create a knowledge base in Azure AI Search
- Foundry IQ: Unlocking ubiquitous knowledge for agents

# When Anthropic's Managed Agents Meet Microsoft Hosted Agents
Let's start from a real engineering pain point. In their engineering blog post on Managed Agents, Anthropic describes a sobering observation: while building the agent scaffolding (the "harness") for Claude Sonnet 4.5, they noticed the model suffered from "context anxiety," so they added context-reset logic into the harness. But when the same harness ran on the more capable Claude Opus 4.5, those resets became dead weight — the stronger model no longer needed them, yet the harness was actively holding it back.

This is the fundamental dilemma of the harness: it encodes assumptions about the current model's capabilities, and those assumptions rot quickly as models evolve. That's not a minor concern. In an era when AI capabilities shift qualitatively every few months, any infrastructure tightly coupled to a specific model's abilities becomes a bottleneck on engineering progress.

## Anthropic's Answer: A Three-Part Decoupled Architecture

Anthropic's answer borrows from a problem operating systems solved decades ago: how do you provide stable abstractions for programs that haven't been imagined yet? The answer is virtualization. Just as the OS virtualizes physical hardware into stable abstractions — processes, files, sockets — Managed Agents virtualizes the Agent runtime into three independent interface layers. For readability this post follows Anthropic's own metaphors — Brain / Hands / Session — which map to the more conventional engineering terms reasoning orchestrator / execution sandbox / durable event log.

Note: Brain, Hands, and Session are not standard industry terminology — they are metaphors Anthropic uses in its engineering blog. In more conventional engineering vocabulary they correspond, respectively, to the agent reasoning loop / orchestrator, the tool executor / execution sandbox, and the durable event log / state store. The rest of this post uses the two styles interchangeably.
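Read as interfaces, the three layers are small. A sketch using Python protocols, assuming only the contracts described in this post (a stateless wake, a single execute call returning a string, and an append-only event log), plus a trivial in-memory Session for illustration:

```python
from typing import Protocol

class Session(Protocol):
    """Durable event log: append-only, sliceable on demand."""
    def append(self, event: dict) -> None: ...
    def get_events(self, start: int, end: int) -> list[dict]: ...

class Hands(Protocol):
    """Execution sandbox: one brutally simple contract."""
    def execute(self, name: str, input: str) -> str: ...

class Brain(Protocol):
    """Stateless reasoning loop: all state lives in the Session,
    so any crash is recoverable by waking the session again."""
    def wake(self, session_id: str) -> None: ...

class InMemorySession:
    """Minimal Session implementation for local experimentation."""
    def __init__(self) -> None:
        self._events: list[dict] = []

    def append(self, event: dict) -> None:
        self._events.append(event)

    def get_events(self, start: int, end: int) -> list[dict]:
        return self._events[start:end]
```

The narrow surfaces are the point: a harness that only ever touches these three interfaces can swap model, sandbox, or storage independently.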
Brain (the reasoning orchestrator): a stateless reasoning loop

This layer is the harness itself — it calls the model and routes tool calls; think of it as the agent’s reasoning orchestrator. Key design point: it must be stateless. All its state comes from the event log; as long as it can call wake(sessionId) to resume, any harness crash is recoverable. This means the harness can evolve independently as model capabilities evolve, without disturbing in-flight tasks.

Hands (the execution sandbox layer): replaceable execution sandboxes

This layer holds the execution environments the orchestrator calls into — Python REPLs, shells, HTTP clients, even remote containers — i.e. the tool executor / execution sandbox. The contract is brutally simple:

execute(name, input) -> string

Just that one interface. The orchestrator doesn’t care whether a sandbox is a local process or a remote container; if a sandbox crashes, it is treated as an ordinary tool error — the model decides whether to retry on a fresh sandbox. This is the “cattle, not pets” philosophy applied to AI engineering.

Session (the durable event log): externalized memory

This layer is an append-only, durable event log. It is not the model’s context window. This distinction matters enormously. When a task outgrows the context window, the harness can use getEvents(start, end) to slice history on demand, and filter, summarize, or transform it before feeding back to the model — all without changing the underlying interface. The event log also plays a key role in credential isolation: when execute calls are logged, the Vault redacts first, so raw tokens never enter the log — and never enter the model’s context window.

Performance gains

This decoupling yields measurable wins:

Median time-to-first-token (TTFT) down ~60%
P95 latency improved by more than 90%

The reason: the old architecture had to provision a container before inference could begin.
After decoupling, inference can start as soon as the event log is readable, with sandboxes provisioned lazily on demand.

Microsoft’s Answer: Foundry Agent Service and Hosted Agents

Microsoft gives the enterprise-grade infrastructure answer in Microsoft Foundry. Foundry Agent Service offers three Agent types:

Prompt Agent — no code required; fully managed hosting; best for rapid prototyping
Workflow Agent — no code required (optional YAML); fully managed hosting; best for multi-step automation
Hosted Agent — code required; containerized hosting; best for fully custom logic

This post focuses on Hosted Agents. They let developers package their own Agent code (LangGraph, Microsoft Agent Framework, or fully custom) as a container image and deploy it on Microsoft’s fully managed, pay-per-use infrastructure.

Hosting Adapter: the key abstraction

The core abstraction in Hosted Agents is the Hosting Adapter. It does three things:

1. Local testing: starts an HTTP server at localhost:8088; no containerization needed for local runs.
2. Protocol translation: automatically converts between Foundry’s Responses API format and Agent Framework’s native data structures.
3. Observability: plugs into OpenTelemetry and exports traces, metrics, and logs to Azure Monitor.

Microsoft Agent Framework: the model-agnostic orchestration layer

Microsoft Agent Framework (9.7k stars on GitHub, now generally available) is a multi-language, multi-provider Agent orchestration framework that supports:

Azure OpenAI, OpenAI, GitHub Copilot
Anthropic Claude
AWS Bedrock, Ollama
Protocol standards like A2A, AG-UI, MCP

This matters a lot: Microsoft’s own Agent framework natively supports Anthropic’s Claude models, providing an official path for cross-ecosystem integration.

This Project: Two Philosophies Shake Hands in Code

Now let’s see how this real project fuses the two architectural philosophies.

Project layout

HostedAgentDemo/
├── main.py              # reasoning orchestrator (a.k.a. Brain): main agent loop
├── agent.yaml           # Hosted Agent declaration
├── azure.yaml           # azd deployment config
├── Dockerfile           # containerization
├── harness/
│   ├── session.py       # durable event log (a.k.a. Session)
│   ├── sandbox.py       # execution sandbox pool (a.k.a. Hands)
│   └── vault.py         # credential vault
└── requirements.txt

Reasoning orchestrator (a.k.a. Brain): FoundryChatClient + Agent Framework

# main.py (excerpt)
from agent_framework import Agent
from agent_framework.foundry import FoundryChatClient
from agent_framework_foundry_hosting import ResponsesHostServer

async with DefaultAzureCredential() as credential:
    client = FoundryChatClient(
        project_endpoint=PROJECT_ENDPOINT,
        model=MODEL_DEPLOYMENT_NAME,
        credential=credential,
        allow_preview=True,
    )
    agent = Agent(
        client,
        instructions=INSTRUCTIONS,
        name="ManagedStyleAgent",
        tools=[execute, list_tools, get_events, emit_note],
    )
    server = ResponsesHostServer(agent)
    await server.run_async()

What’s happening here?

FoundryChatClient: the Foundry model client from Microsoft Agent Framework; talks to a model deployed on Microsoft Foundry.
Agent: the stateless reasoning orchestrator (what Anthropic calls the “Brain”), with a fixed toolset of four: execute, list_tools, get_events, emit_note.
ResponsesHostServer: the Hosting Adapter; exposes the Agent as an HTTP service compatible with Foundry’s Responses API.

The orchestrator’s toolset follows Anthropic Managed Agents’ minimalism strictly — every capability funnels through the single gateway execute(name, input_json); the reasoning layer knows nothing about concrete sandbox implementations.

Execution sandbox layer (a.k.a. Hands): a cattle-style sandbox pool

# harness/sandbox.py (core logic)
def execute(self, name: str, input: dict[str, Any]) -> str:
    """The one and only contract between the orchestrator and the sandboxes."""
    if name not in self._tools:
        return f"ERROR: unknown tool '{name}'. Available: {self.list_tools()}"
    sandbox_id = self.provision(kind=name)
    try:
        out = self._tools[name](input or {}, self._vault)
        out = self._vault.redact(out)  # redact credentials
        return out
    except Exception as e:
        return f"ERROR: sandbox '{sandbox_id}' failed: {type(e).__name__}: {e}"
    finally:
        self.retire(sandbox_id)  # forcibly destroy after every call

Look at the finally block: every execute call destroys its sandbox afterwards, success or failure. That guarantees sandboxes are genuinely stateless units — leftover processes, temp files, in-memory state all vanish with the sandbox.

Built-in sandboxes include:

python_exec: isolated Python subprocess (15s timeout, no leaked env vars)
shell_exec: argv-list execution (no shell metacharacter injection)
http_fetch: auth headers injected via the Vault proxy

Durable event log (a.k.a. Session): externalized memory

# harness/session.py (core interface)
class SessionStore:
    def emit_event(self, session_id, type, payload) -> SessionEvent:
        """Append-only — never overwritten, never deleted."""

    def get_events(self, session_id, start=0, end=None) -> list[SessionEvent]:
        """Positional slice. The harness can transform before passing to the model."""

    def wake(self, session_id) -> list[SessionEvent]:
        """Recovery entry point after harness crash."""

The event log is a .jsonl append-only file — one JSON event per line. In production you can drop in Azure Cosmos DB, Event Hub, or any durable store; the interface doesn’t change.
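To make the .jsonl idea concrete, here is a minimal, runnable sketch of such an append-only session store. It is an illustration of the interface described above, not the project's actual session.py; the class name and event field names are invented for the demo.

```python
import json
import os
import tempfile

class JsonlSessionStore:
    """Append-only event log backed by a .jsonl file: one JSON event per line."""

    def __init__(self, path: str):
        self.path = path

    def emit_event(self, session_id: str, type: str, payload: dict) -> dict:
        event = {"session": session_id, "type": type, "payload": payload}
        with open(self.path, "a") as f:  # append-only: never overwrite, never delete
            f.write(json.dumps(event) + "\n")
        return event

    def get_events(self, session_id: str, start: int = 0, end=None) -> list:
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            events = [json.loads(line) for line in f]
        # Positional slice over this session's history only.
        mine = [e for e in events if e["session"] == session_id]
        return mine[start:end]

    def wake(self, session_id: str) -> list:
        # Recovery entry point: replay the full history for this session.
        return self.get_events(session_id)

store = JsonlSessionStore(os.path.join(tempfile.mkdtemp(), "session.jsonl"))
store.emit_event("s1", "user_message", {"text": "hello"})
store.emit_event("s1", "tool_result", {"out": "42"})
print(len(store.wake("s1")))  # 2
```

Because every read goes through get_events, the harness can summarize or filter the slice before handing it to the model, exactly as the interface above intends, and a crash loses nothing that was already appended.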
Vault: credentials never touch the model

# harness/vault.py
class CredentialVault:
    def build_auth_headers(self, logical_name: str) -> dict[str, str]:
        token = self.resolve(logical_name)
        return {"Authorization": f"Bearer {token}"}

    def redact(self, value: Any) -> Any:
        """Replace every known secret in logs and tool return values."""
        s = str(value)
        for secret in self._secrets.values():
            if secret and secret in s:
                s = s.replace(secret, "***REDACTED***")
        return s

The model references credentials by logical name: execute("http_fetch", {"url": "...", "credential": "github"}) — it only knows the logical name "github". The real token is injected by the Vault inside the sandbox, and tool return values are redacted before being written to the event log.

Deploying: From Local to Azure in One Command

# 1. Install the azd Agent extension
azd ext install azure.ai.agents

# 2. Test locally (no container needed)
python main.py
# → Managed-style Agent running on http://localhost:8088

# 3. Deploy to Azure (Provision + Build + Deploy)
azd up

Architectures Compared: Two Ecosystems in Philosophical Resonance

Core abstraction — Anthropic: reasoning orchestrator / execution sandbox / durable event log (Brain / Hands / Session); Microsoft: Hosting Adapter + Agent Framework
Sandbox strategy — Anthropic: cattle (destroyed after use); Microsoft: container (managed lifecycle)
Credential security — Anthropic: Vault proxy injection, invisible to the model; Microsoft: Managed Identity + RBAC
Context management — Anthropic: external event log, sliced on demand; Microsoft: Responses API session management
Observability — Anthropic: event log + custom; Microsoft: OpenTelemetry → Azure Monitor
Scaling — Anthropic: many orchestrators × many sandboxes, concurrent; Microsoft: minReplicas / maxReplicas
Cross-model support — Anthropic: Claude model family; Microsoft: many providers (Claude included)

The core philosophy of both architectures aligns tightly: decouple reasoning (the orchestrator), tool execution (the sandbox layer), and memory (the event log) so each layer can evolve independently.
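The credential-isolation property described in the Vault section above can be exercised in isolation. The sketch below is a simplified re-implementation of that idea, not the project's actual vault.py; MiniVault and the sample token value are invented for the demo.

```python
class MiniVault:
    """Resolves logical credential names to real tokens and redacts them from output."""

    def __init__(self, secrets: dict):
        self._secrets = secrets  # logical name -> real token

    def build_auth_headers(self, logical_name: str) -> dict:
        # The model only ever supplies the logical name; the token stays in here.
        return {"Authorization": f"Bearer {self._secrets[logical_name]}"}

    def redact(self, value) -> str:
        # Strip every known secret before anything reaches the event log.
        s = str(value)
        for secret in self._secrets.values():
            if secret and secret in s:
                s = s.replace(secret, "***REDACTED***")
        return s

vault = MiniVault({"github": "ghp_example_token"})
headers = vault.build_auth_headers("github")  # injected inside the sandbox
logged = vault.redact(f"fetched with {headers['Authorization']}")
print(logged)  # fetched with Bearer ***REDACTED***
```

The point of the design choice: the raw token exists only between build_auth_headers and the outbound request inside the sandbox; everything written back through redact carries only the placeholder, so neither the event log nor the model's context ever sees it.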
The difference is emphasis:

Anthropic prioritizes interface stability — so today’s infrastructure can run tomorrow’s stronger models.
Microsoft prioritizes enterprise-grade operations — so agents get production-grade security, scaling, and observability.

That’s the value of this project: it proves the two philosophies can live together in one codebase.

Running it

# Clone and configure
git clone <your-repo>
cd HostedAgentDemo
cp .env.example .env
# Edit .env with FOUNDRY_PROJECT_ENDPOINT and MODEL_DEPLOYMENT_NAME

# Install dependencies
pip install -r requirements.txt

# Run locally
python main.py

# Test a conversation
curl http://localhost:8088/responses \
  -H "Content-Type: application/json" \
  -d '{"input": [{"role": "user", "content": "List the tools you can use"}]}'

# Deploy to Azure
azd ext install azure.ai.agents
azd up

Summary

Back in 2016 the industry was still arguing whether microservices were over-engineering. Today nobody doubts the value of service decoupling at scale. I believe 2025–2026 is the “microservices moment” for Agent engineering — people are starting to realize that an Agent that couples reasoning, tool execution, and state memory inside a single monolithic container simply cannot keep pace with model evolution. Anthropic’s Managed Agents supplies the architectural philosophy; Microsoft’s Foundry Hosted Agents supplies the enterprise infrastructure; and this open-source project shows that they are not an either/or choice — they are complementary, and they make each other better.

References

Sample Code: https://github.com/microsoft/Agent-Framework-Samples/tree/main/09.Cases/maf_harness_managed_hosted_agent
Anthropic Engineering Blog, Managed Agents: https://www.anthropic.com/engineering/managed-agents
Microsoft Learn, Hosted agents in Foundry Agent Service (preview): https://learn.microsoft.com/en-us/azure/foundry/agents/concepts/hosted-agents?view=foundry
Microsoft Foundry Samples, agent-framework Python hosted-agent samples: https://github.com/microsoft-foundry/foundry-samples/tree/main/samples/python/hosted-agents/agent-framework
Microsoft Agent Framework: https://github.com/microsoft/agent-framework
Microsoft Agent Framework Samples: https://github.com/microsoft/agent-framework-samples
Microsoft Learn, What is Microsoft Foundry Agent Service: https://learn.microsoft.com/en-us/azure/foundry/agents/overview

Stop Experimenting, Start Building: AI Apps & Agents Dev Days Has You Covered
The AI landscape has shifted. The question is no longer “Can we build AI applications?” it’s “Can we build AI applications that actually work in production?” Demos are easy. Reliable, scalable, resilient AI systems that handle real-world complexity? That’s where most teams struggle. If you’re an AI developer, software engineer, or solution architect who’s ready to move beyond prototypes and into production-grade AI, there’s a series built specifically for you.

What Is AI Apps & Agents Dev Days?

AI Apps & Agents Dev Days is a monthly technical series from Microsoft Reactor, delivered in partnership with Microsoft and NVIDIA. You can explore the full series at https://developer.microsoft.com/en-us/reactor/series/s-1590/

This isn’t a slide deck marathon. The series tagline says it best: “It’s not about slides, it’s about building.” Each session tackles real-world challenges, shares patterns that actually work, and digs into what’s next in AI-driven app and agent design. You bring your curiosity, your code, and your questions. You leave with something you can ship.

The sessions are led by experienced engineers and advocates from both Microsoft and NVIDIA, people like Pamela Fox, Bruno Capuano, Anthony Shaw, Gwyneth Peña-Siguenza, and solutions architects from NVIDIA’s Cloud AI team. These aren’t theorists; they’re practitioners who build and ship the tools you use every day.

What You’ll Learn

The series covers the full spectrum of building AI applications and agent-based systems. Here are the key themes:

Building AI Applications with Azure, GitHub, and Modern Tooling

Sessions walk through how to wire up AI capabilities using Azure services, GitHub workflows, and the latest SDKs. The focus is always on code-first learning: you’ll see real implementations, not abstract architecture diagrams.

Designing and Orchestrating AI Agents

Agent development is one of the series’ strongest threads.
Sessions cover how to build agents that orchestrate long-running workflows, persist state automatically, recover from failures, and pause for human-in-the-loop input, without losing progress. For example, the session “AI Agents That Don’t Break Under Pressure” demonstrates building durable, production-ready AI agents using the Microsoft Agent Framework, running on Azure Container Apps with NVIDIA serverless GPUs.

Scaling LLM Inference and Deploying to Production

Moving from a working prototype to a production deployment means grappling with inference performance, GPU infrastructure, and cost management. The series covers how to leverage NVIDIA GPU infrastructure alongside Azure services to scale inference effectively, including patterns for serverless GPU compute.

Real-World Architecture Patterns

Expect sessions on container-based deployments, distributed agent systems, and enterprise-grade architectures. You’ll learn how to use services like Azure Container Apps to host resilient AI workloads, how Foundry IQ fits into agent architectures as a trusted knowledge source, and how to make architectural decisions that balance performance, cost, and scalability.

Why This Matters for Your Day Job

There’s a critical gap between what most AI tutorials teach and what production systems actually require. This series bridges that gap:

Production-ready patterns, not demos. Every session focuses on code and architecture you can take directly into your projects. You’ll learn patterns for state persistence, failure recovery, and durable execution — the things that break at 2 AM.

Enterprise applicability. The scenarios covered — travel planning agents, multi-step workflows, GPU-accelerated inference — map directly to enterprise use cases. Whether you’re building internal tooling or customer-facing AI features, the patterns transfer.

Honest trade-off discussions. The speakers don’t shy away from the hard questions: When do you need serverless GPUs versus dedicated compute?
How do you handle agent failures gracefully? What does it actually cost to run these systems at scale?

Watch On-Demand, Build at Your Own Pace

Every session is available on-demand. You can watch, pause, and build along at your own pace, no need to rearrange your schedule. The full playlist is available on the series page linked above.

This is particularly valuable for technical content. Pause a session while you replicate the architecture in your own environment. Rewind when you need to catch a configuration detail. Build alongside the presenters rather than just watching passively.

What You’ll Walk Away With

After working through the series, you’ll have:

Practical agent development skills — how to design, orchestrate, and deploy AI agents that handle real-world complexity, including state management, failure recovery, and human-in-the-loop patterns

Production architecture patterns — battle-tested approaches for deploying AI workloads on Azure Container Apps, leveraging NVIDIA GPU infrastructure, and building resilient distributed systems

Infrastructure decision-making confidence — a clearer understanding of when to use serverless GPUs, how to optimise inference costs, and how to choose the right compute strategy for your workload

Working code and reference implementations — the sessions are built around live coding and sample applications (like the Travel Planner agent demo), giving you starting points you can adapt immediately

A framework for continuous learning — with new sessions each month, you’ll stay current as the AI platform evolves and new capabilities emerge

Start Building

The AI applications that will matter most aren’t the ones with the flashiest demos — they’re the ones that work reliably, scale gracefully, and solve real problems. That’s exactly what this series helps you build.
Whether you’re designing your first AI agent system or hardening an existing one for production, the AI Apps & Agents Dev Days sessions give you the patterns, tools, and practical knowledge to move forward with confidence. Explore the series at https://developer.microsoft.com/en-us/reactor/series/s-1590/ and start watching the on-demand sessions at the link above. The best time to level up your AI engineering skills was yesterday. The second-best time is right now, and these sessions make it easy to start.

Join our free livestream series on hosting agents in Microsoft Foundry
Join us for a new 3-part livestream series where we deploy AI agents on Microsoft Foundry using Microsoft Agent Framework and LangChain/LangGraph, then level them up with tools, observability, and evals. You'll learn how to:

Deploy Python agents to Foundry Hosted agents using the Azure Developer CLI
Build hosted agents with Microsoft Agent Framework, including Foundry IQ integration
Build hosted agents with LangChain + LangGraph, including built-in tools like Web Search
Run quality and safety evaluations: continuous evals, scheduled evals, guardrails, and red-teaming

Throughout the series, we’ll use Python for all examples and share full code so you can run everything yourself in your own Foundry projects.

👉 Register for the full series.

Spanish speaker? ¡Tendremos una serie para hispanohablantes! Regístrese aquí

In addition to the live streams, you can join the Microsoft Foundry Discord to ask follow-up questions after each stream.

If you are new to generative AI with Python, start with our 9-part Python + AI series, which covers topics such as LLMs, embeddings, RAG, tool calling, MCP, and agents. If you are new to Microsoft Agent Framework, watch our 6-part Python + Agent series, which dives deep into agents and workflows.

To learn more about each live stream or register for individual sessions, scroll down:

Host your agents on Foundry: Microsoft Agent Framework
27 April, 2026 | 5:00 PM - 6:00 PM (UTC) Coordinated Universal Time
Register for the stream on Reactor

In our first session, we'll deploy agents built with Microsoft Agent Framework (the successor of AutoGen and Semantic Kernel). Starting with a simple agent, we'll add Foundry tools like Code Interpreter, ground the agent in enterprise data with Foundry IQ, and finally deploy multi-agent workflows. Along the way, we'll use the Foundry UI to interact with the hosted agent, testing it out in the playground and observing the traces from the reasoning and tool calls.
Host your agents on Foundry: LangChain + LangGraph
29 April, 2026 | 5:00 PM - 6:00 PM (UTC) Coordinated Universal Time
Register for the stream on Reactor

In our second session, we'll deploy agents built with the popular open-source libraries LangChain and LangGraph. Starting with a simple agent, we'll add Foundry tools like Bing Web Search, ground the agent in Foundry IQ, then deploy more complex agents using the LangGraph orchestration framework. Along the way, we'll use the Foundry UI to interact with the hosted agent, testing it out in the playground and observing the traces from the reasoning and tool calls.

Host your agents on Foundry: Quality & safety evaluations
30 April, 2026 | 5:00 PM - 6:00 PM (UTC) Coordinated Universal Time
Register for the stream on Reactor

In our third session, we'll ensure that our AI agents are producing high-quality outputs and operating safely and responsibly. First we'll explore what it means for agent outputs to be high quality, using built-in evaluators to check overall task adherence and then building custom evaluators for domain-specific checks. With Foundry hosted agents, we can run bulk evaluations on demand, set up scheduled evaluations, and even enable continuous evaluation on a subset of live agent traces. Next we'll discuss safety systems that can be layered on top of agents and audit agents for potential safety risks. To improve compliance with an organization's goals, we can configure custom policies and guardrails that can be shared across agents. Finally, we can ensure that adversarial inputs can't produce unsafe outputs by running automated red-teaming scans on agents, and even schedule those to run regularly as well. With all of these evaluation and compliance features available in Foundry, you can have more confidence hosting your agents in production.

If You're Building AI on Azure, ECS 2026 is Where You Need to Be
Let me be direct: there's a lot of noise in the conference calendar. Generic cloud events. Vendor showcases dressed up as technical content. Sessions that look great on paper but leave you with nothing you can actually ship on Monday. ECS 2026 isn't that. As someone who will be on stage at Cologne this May, I can tell you the European Collaboration Summit, combined with the European AI & Cloud Summit and European BizApps Summit, is one of the few events I've seen where engineers leave with real, production-applicable knowledge. Three days. Three summits. 3,000+ attendees. One of the largest Microsoft-focused events in Europe, and it keeps getting better.

If you're building AI systems on Azure, designing cloud-native architectures, or trying to figure out how to take your AI experiments to production — this is where the conversation is happening.

What ECS 2026 Actually Is

ECS 2026 runs May 5–7 at Confex in Cologne, Germany. It brings together three co-located summits under one roof:

European Collaboration Summit — Microsoft 365, Teams, Copilot, and governance
European AI & Cloud Summit — Azure architecture, AI agents, cloud security, responsible AI
European BizApps Summit — Power Platform, Microsoft Fabric, Dynamics

For Azure engineers and AI developers, the European AI & Cloud Summit is your primary destination. But don't ignore the overlap: some of the most interesting AI conversations happen at the intersection of collaboration tooling and cloud infrastructure.

The scale matters here: 3,000+ attendees, 100+ sessions, multiple deep-dive tracks, and a speaker lineup that includes Microsoft executives, Regional Directors, and MVPs who have built, broken, and rebuilt production systems.

The Azure + AI Track: What's Actually On the Agenda

The AI & Cloud Summit agenda is built around real technical depth. Not "intro to AI" content, but actual architecture decisions, patterns that work, and lessons from things that didn't.
Here's what you can expect:

AI Agents and Agentic Systems

This is where the energy is right now, and ECS is leaning in. Expect sessions covering how to design agent workflows, chain reasoning steps, handle memory and state, and integrate with Azure AI services. Marco Casalaina, VP of Products for Azure AI at Microsoft, is speaking; if you want to understand the direction of the Azure AI platform from the people building it, this is a direct line.

Azure Architecture at Scale

Cloud-native patterns, microservices, containers, and the architectural decisions that determine whether your system holds up under real load. These sessions go beyond theory: you'll hear from engineers who've shipped these designs at enterprise scale.

Observability, DevOps, and Production AI

Getting AI to production is harder than the demos suggest. Sessions here cover monitoring AI systems, integrating LLMs into CI/CD pipelines, and building the operational practices that keep AI in production reliable and governable.

Cloud Security and Compliance

Security isn't optional when you're putting AI in front of users or connecting it to enterprise data. Tracks cover identity, access patterns, responsible AI governance, and how to design systems that satisfy compliance requirements without becoming unmaintainable.

Pre-Conference Deep Dives

One underrated part of ECS: the pre-conference workshops. These are extended, hands-on sessions, typically 3–6 hours, that let you go deep on a single topic with an expert. Think of them as intensive short courses where you can actually work through the material, not just watch slides. If you're newer to a particular area of Azure AI, or you want to build fluency in a specific pattern before the main conference sessions, these are worth the early travel.

The Speaker Quality Is Different Here

The ECS speaker roster includes Microsoft executives, Microsoft MVPs, and Regional Directors — people who have real accountability for the products and patterns they're presenting.
You'll hear from over 20 Microsoft speakers, including:

Marco Casalaina — VP of Products, Azure AI at Microsoft
Adam Harmetz — VP of Product at Microsoft, Enterprise Agent

And dozens of MVPs and Regional Directors who are in the field every day, solving the same problems you are. These aren't keynote-only speakers — they're in the session rooms, at the hallway track, available for real conversations.

The Hallway Track Is Not a Cliché

I know "networking" sounds like a corporate afterthought. At ECS it genuinely isn't. When you put 3,000 practitioners (engineers, architects, DevOps leads, security specialists) in one venue for three days, the conversations between sessions are often more valuable than the sessions themselves. You get candid answers to "how are you actually handling X in production?" that you won't find in documentation. The European Microsoft community is tight-knit and collaborative. ECS is where that community concentrates.

Why This Matters Right Now

We're in a period where AI development is moving fast but the engineering discipline around it is still maturing. Most teams are figuring out:

How to move from AI prototype to production system
How to instrument and observe AI behaviour reliably
How to design agent systems that don't become unmaintainable
How to satisfy security and compliance requirements in AI-integrated architectures

ECS 2026 is one of the few places where you can get direct answers to these questions from people who've solved them — not theoretically, but in production, on Azure, in the last 12 months. If you go, you'll come back with practical patterns you can apply immediately. That's the bar I hold events to. ECS consistently clears it.

Register and Explore the Agenda

Register for ECS 2026: ecs.events
Explore the AI & Cloud Summit agenda: cloudsummit.eu/en/agenda
Dates: May 5–7, 2026 | Location: Confex, Cologne, Germany

Early registration is worth it: the pre-conference workshops fill up.
And if you're coming, find me; I'll be the one talking too much about AI agents and Azure deployments. See you in Cologne.

Published agent from Foundry doesn't work at all in Teams and M365
I've switched to the new version of Azure AI Foundry (New) and created a project there. Within this project, I created an Agent and connected two custom MCP servers to it. The agent works correctly inside Foundry Playground and responds to all test queries as expected. My goal was to make this agent available for my organization in Microsoft Teams / Microsoft 365 Copilot, so I followed all the steps described in the official Microsoft documentation: https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/publish-copilot?view=foundry

Issue description

The first problems started at Step 8 (publishing the agent).

Organization scope publishing

I published the agent using Organization scope. The agent appeared in Microsoft Admin Center in the list of agents. However, when an administrator from my organization attempted to approve it, the approval always failed with a generic error: “Sorry, something went wrong”. No diagnostic information, error codes, or logs were provided. We tried recreating and republishing the agent multiple times, but the result was always the same.

Shared scope publishing

As a workaround, I published the agent using Shared scope. In this case, the agent finally appeared in Microsoft Teams and Microsoft 365 Copilot. I can now see the agent here:

Microsoft Teams → Copilot
Microsoft Teams → Applications → Manage applications

However, this revealed the main issue.

Main problem

The published agent cannot complete any query in Teams, despite the fact that:

The agent works perfectly in Foundry Playground
The agent responds correctly to the same prompts before publishing

In Teams, every query results in messages such as: “Sorry, something went wrong. Try to complete a query later.”

Simplification test

To exclude MCP or instruction-related issues, I performed the following:

Disabled all MCP tools
Removed all complex instructions
Left only a minimal system prompt: “When the user types 123, return 456”

I then republished the agent.
The agent appeared in Teams again, but the behavior did not change — it does not respond at all.

Permissions warning in Teams

When I go to Teams → Applications → Manage Applications → My agent → View details, I see a red warning label: “Permissions needed. Ask your IT admin to add InfoConnect Agent to this team/chat/meeting.”

This message is confusing because:

The administrator has already added all required permissions
All relevant permissions were granted in Microsoft Entra ID
Admin consent was provided

Because of this warning, I also cannot properly share the agent with my colleagues.

Additional observation

I have a similar agent configured in Copilot Studio:

It shows the same permissions warning
However, that agent still responds correctly in Teams
It can also successfully call some MCP tools

This suggests that the issue is specific to Azure AI Foundry agents, not to Teams or tenant-wide permissions in general.

Steps already taken to resolve the issue

Configured all required RBAC roles in Azure Portal according to: https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/rbac-foundry?view=foundry-classic

During publishing, an agent-bot application was automatically created.
I added my account to this bot with the Azure AI User role, and also assigned Azure AI User to:

The project’s Managed Identity
The project resource itself

Other steps taken:

Verified all permissions related to AI agent publishing in Microsoft Admin Center and Microsoft Teams Admin Center
Simplified and republished the agent multiple times
Deleted the automatically created agent-bot and allowed Foundry to recreate it
Created a new Foundry project, configured several simple agents, and published them — the same issue occurs
Tried publishing with different models: gpt-4.1, o4-mini
Manually configured permissions in Microsoft Entra ID → App registrations / Enterprise applications → API permissions
Added both Delegated and Application permissions and granted Admin consent
Added myself and my colleagues as Azure AI User in Foundry → Project → Project users
Followed all steps mentioned in this related discussion: https://techcommunity.microsoft.com/discussions/azure-ai-foundry-discussions/unable-to-publish-foundry-agent-to-m365-copilot-or-teams/4481420

Questions

How can I make a Foundry agent work correctly in Microsoft Teams?
Why does the agent fail to process requests in Teams while working correctly in Foundry?
What does the “Permissions needed” warning actually mean for Foundry agents?
How can I properly share the agent with other users in my organization?

Any guidance, diagnostics, or clarification on the correct publishing and permission model for Foundry agents in Teams would be greatly appreciated.