<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Educator Developer Blog articles</title>
    <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/bg-p/EducatorDeveloperBlog</link>
    <description>Educator Developer Blog articles</description>
    <pubDate>Sun, 31 May 2026 21:35:37 GMT</pubDate>
    <dc:creator>EducatorDeveloperBlog</dc:creator>
    <dc:date>2026-05-31T21:35:37Z</dc:date>
    <item>
      <title>Building a hands-free voice concierge with Microsoft Foundry Voice Live and a Hosted Agent</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-a-hands-free-voice-concierge-with-microsoft-foundry/ba-p/4523960</link>
      <description>&lt;P&gt;This post walks through a small, working sample that wires the browser microphone to&amp;nbsp;&lt;STRONG&gt;Azure AI Speech Voice Live&lt;/STRONG&gt;, binds the realtime session to a &lt;STRONG&gt;Foundry hosted agent&lt;/STRONG&gt;, and lets the agent answer travel questions using tool calls. The full source, infrastructure, and labs live in the repository linked at the end.&lt;/P&gt;
&lt;H2&gt;Why this combination matters&lt;/H2&gt;
&lt;P&gt;Voice user interfaces have historically been hard to build well. Streaming audio, partial transcripts, barge-in, voice activity detection, tool dispatch, and audio playback have traditionally meant stitching together five or six services. The combination of Voice Live and a Foundry hosted agent collapses that into &lt;STRONG&gt;one realtime WebSocket session&lt;/STRONG&gt; with a single binding field.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Voice Live&lt;/STRONG&gt; owns the audio loop: speech to text, neural text to speech, semantic turn detection, noise suppression, and echo cancellation.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;The Foundry hosted agent&lt;/STRONG&gt; owns the brain: instructions, memory, model selection, evaluators, and tool calling.&lt;/LI&gt;
&lt;LI&gt;The link between them is one query parameter on the WebSocket URL.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;What this means in practice: the browser never sees a model API key, never instantiates a tool, and never owns the agent prompt. The browser does microphone capture and audio playback. Everything else lives server-side.&lt;/P&gt;
&lt;H2&gt;The scenario&lt;/H2&gt;
&lt;P&gt;The sample is called &lt;STRONG&gt;Contoso Travel Concierge&lt;/STRONG&gt;. The user is mid-journey, hands and eyes busy, and wants to ask things like:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;What is the weather in Tokyo this weekend?&lt;/LI&gt;
&lt;LI&gt;Is BA005 from Heathrow on time?&lt;/LI&gt;
&lt;LI&gt;What time is check-in at the Marriott Marquis?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Each question triggers a tool call on the hosted agent. The reply is short, speakable, and synthesised back to the user in under a second on a warm connection.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Architecture&lt;/H2&gt;
&lt;P&gt;There are four moving parts. Three of them are managed Azure services. Only the broker is your code.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Browser client&lt;/STRONG&gt; – captures PCM16 audio at 24 kHz and streams it over a WebSocket to the broker. Plays back audio chunks the broker forwards from Voice Live.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Session broker&lt;/STRONG&gt; (FastAPI) – authenticates to Azure with &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt;, builds the Voice Live WebSocket URL with a short-lived bearer token, and relays frames in both directions.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Voice Live&lt;/STRONG&gt; – the Azure AI Speech realtime endpoint. Transcribes the user, hands the text to the bound agent, and synthesises the agent’s reply.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Foundry hosted agent&lt;/STRONG&gt; – a prompt-kind agent in Azure AI Foundry with instructions, tool definitions, and the &lt;CODE&gt;microsoft.voice-live.enabled&lt;/CODE&gt; metadata flag set to &lt;CODE&gt;true&lt;/CODE&gt;.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Two design choices are worth calling out.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The broker is small on purpose.&lt;/STRONG&gt; It does authentication, URL construction, and WebSocket relay. It does not transcode audio, run business logic, or hold conversation state. Voice Live and the agent already do those things well.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The agent binding is a URL query parameter, not an SDK call.&lt;/STRONG&gt; There is no per-turn HTTP request to the agent runtime. Voice Live opens a session against the agent once and streams turns through it for the lifetime of the WebSocket. That is what keeps latency low.&lt;/P&gt;
&lt;H2&gt;The Voice Live URL contract&lt;/H2&gt;
&lt;P&gt;This is the single most important thing to get right. The public Microsoft sample that ships under &lt;CODE&gt;liupeirong/ai-foundry-voice-agent&lt;/CODE&gt; targets a different URL shape (&lt;CODE&gt;services.ai.azure.com&lt;/CODE&gt; host, &lt;CODE&gt;agent-id&lt;/CODE&gt; + &lt;CODE&gt;agent-access-token&lt;/CODE&gt; parameters, an &lt;CODE&gt;Authorization&lt;/CODE&gt; header). That shape is rejected by Foundry resources that expose voice-live-enabled agents. The shape below is the one the portal itself uses, and the one this sample dials.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Three details cause most failures:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The host must be &lt;CODE&gt;&amp;lt;resource&amp;gt;.cognitiveservices.azure.com&lt;/CODE&gt;, not &lt;CODE&gt;services.ai.azure.com&lt;/CODE&gt;. The broker rewrites this automatically from &lt;CODE&gt;VOICE_LIVE_ENDPOINT&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI&gt;The bearer token travels in the &lt;CODE&gt;authorization&lt;/CODE&gt; query parameter, URL-encoded, with a literal &lt;CODE&gt;Bearer&lt;/CODE&gt; prefix and a &lt;CODE&gt;+&lt;/CODE&gt; (or &lt;CODE&gt;%20&lt;/CODE&gt;) before the token. No &lt;CODE&gt;Authorization&lt;/CODE&gt; header is sent.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;agent-name&lt;/CODE&gt; and &lt;CODE&gt;model&lt;/CODE&gt; are both the agent’s display name. &lt;CODE&gt;agent-version&lt;/CODE&gt; is empty when you want the latest published version.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Walkthrough: from clone to spoken reply&lt;/H2&gt;
&lt;H3&gt;Prerequisites&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Python 3.11 or later (the sample is developed on 3.13).&lt;/LI&gt;
&lt;LI&gt;The Azure CLI, signed in with &lt;CODE&gt;az login --tenant &amp;lt;your-tenant-id&amp;gt;&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI&gt;An Azure AI Foundry project in a Voice Live region (&lt;CODE&gt;eastus2&lt;/CODE&gt;, &lt;CODE&gt;swedencentral&lt;/CODE&gt;, or &lt;CODE&gt;westus2&lt;/CODE&gt;).&lt;/LI&gt;
&lt;LI&gt;A deployed prompt-kind agent in that project with &lt;STRONG&gt;Enable Voice Live&lt;/STRONG&gt; turned on.&lt;/LI&gt;
&lt;LI&gt;The &lt;STRONG&gt;Cognitive Services User&lt;/STRONG&gt; role on the Foundry resource for the identity the broker will use.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Configure the broker&lt;/H3&gt;
&lt;P&gt;Copy &lt;CODE&gt;.env.sample&lt;/CODE&gt; to &lt;CODE&gt;.env&lt;/CODE&gt; and fill in four values:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;AZURE_AI_PROJECT_ENDPOINT=https://&amp;lt;your-resource&amp;gt;.services.ai.azure.com
AZURE_AI_PROJECT_NAME=&amp;lt;your-foundry-project-name&amp;gt;
VOICE_LIVE_ENDPOINT=wss://&amp;lt;your-resource&amp;gt;.services.ai.azure.com/voice-live/realtime
VOICE_LIVE_API_VERSION=2025-10-01
FOUNDRY_AGENT_ID=&amp;lt;your-agent-name&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The agent name is what the Foundry portal shows on the agent card. The broker uses it for both the &lt;CODE&gt;agent-name&lt;/CODE&gt; and &lt;CODE&gt;model&lt;/CODE&gt; query parameters.&lt;/P&gt;
&lt;H3&gt;Install and run&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
.\scripts\start-local.ps1
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The broker exposes three endpoints:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;GET /healthz&lt;/CODE&gt; – liveness probe.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;GET /config&lt;/CODE&gt; – returns the &lt;CODE&gt;session.update&lt;/CODE&gt; the browser sends as its first frame.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;WS /ws&lt;/CODE&gt; – the bi-directional relay to Voice Live.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Smoke test&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;.\scripts\test-session.ps1
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;A successful run prints:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;[OK] /ws upgraded
   -&amp;gt; sent session.update
   &amp;lt;- {"type":"session.created",…}
   &amp;lt;- {"type":"session.updated",…}
[OK] session.updated received -- E2E works
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This confirms the entire chain: local broker, &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt; token, Foundry Portal URL shape, Voice Live handshake, and the bound agent acknowledging the session.&lt;/P&gt;
&lt;H3&gt;Open the browser UI&lt;/H3&gt;
&lt;P&gt;Browse to &lt;CODE&gt;http://localhost:8000/&lt;/CODE&gt;, click &lt;STRONG&gt;Start talking&lt;/STRONG&gt;, and ask one of the sample questions. Transcripts appear in real time and the spoken reply plays back through the audio context.&lt;/P&gt;
&lt;H2&gt;Inside the broker&lt;/H2&gt;
&lt;P&gt;The relay logic is tiny – the heavy lifting is the URL construction. The function below is the canonical reference; copy it if you are porting the pattern to another language.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;def build_voice_live_ws_url(agent_access_token: str) -&amp;gt; str:
    """
    Build the Foundry Portal style Voice Live WebSocket URL.

    Auth lives in the query string only. No Authorization header is sent.
    """
    host = _ws_host_from_endpoint(VOICE_LIVE_ENDPOINT)
    qs = urlencode(
        {
            "trafficType": "FoundryPortal",
            "agent-name": FOUNDRY_AGENT_ID,
            "agent-version": "",
            "agent-project-name": AZURE_AI_PROJECT_NAME,
            "api-version": VOICE_LIVE_API_VERSION,
            "model": FOUNDRY_AGENT_ID,
            "client-request-id": str(uuid.uuid4()),
            "authorization": f"Bearer {agent_access_token}",
        },
        quote_via=quote,
    )
    return f"wss://{host}/voice-live/realtime?{qs}"
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The relay itself is a pair of asyncio tasks: one forwarding browser frames upstream, one forwarding Voice Live frames back. Audio bytes are passed straight through – the broker never decodes them.&lt;/P&gt;
&lt;H2&gt;Deploying the hosted agent&lt;/H2&gt;
&lt;P&gt;The most reliable way to create a voice-live-enabled agent is the Foundry portal. Agents created via the Assistants v2 SDK do not carry the required metadata by default and will be rejected by the Voice Live URL shape above.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;The portal steps are:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Open the Foundry project, go to &lt;STRONG&gt;Agents&lt;/STRONG&gt;, and click &lt;STRONG&gt;New agent&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;Choose &lt;STRONG&gt;Prompt agent&lt;/STRONG&gt; as the kind, name it (for example &lt;CODE&gt;travel-concierge&lt;/CODE&gt;), and pick a model deployment.&lt;/LI&gt;
&lt;LI&gt;Paste the contents of &lt;CODE&gt;agent/src/prompts/system.txt&lt;/CODE&gt; into the instructions box.&lt;/LI&gt;
&lt;LI&gt;On the &lt;STRONG&gt;Voice&lt;/STRONG&gt; tab, switch &lt;STRONG&gt;Enable Voice Live&lt;/STRONG&gt; on. This is what sets the &lt;CODE&gt;microsoft.voice-live.enabled = true&lt;/CODE&gt; metadata.&lt;/LI&gt;
&lt;LI&gt;Add the three tools (&lt;CODE&gt;get_weather&lt;/CODE&gt;, &lt;CODE&gt;get_flight_status&lt;/CODE&gt;, &lt;CODE&gt;get_hotel_info&lt;/CODE&gt;) from &lt;CODE&gt;agent/agent.yaml&lt;/CODE&gt; on the &lt;STRONG&gt;Tools&lt;/STRONG&gt; tab.&lt;/LI&gt;
&lt;LI&gt;Publish the version and write the agent name back to &lt;CODE&gt;.env&lt;/CODE&gt; as &lt;CODE&gt;FOUNDRY_AGENT_ID&lt;/CODE&gt;.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The full deployment guide, including how to host the broker on Azure Container Apps with a managed identity, is in &lt;CODE&gt;docs/deployment.md&lt;/CODE&gt; in the repository.&lt;/P&gt;
&lt;H2&gt;Three lessons from getting this to production&lt;/H2&gt;
&lt;H3&gt;1. Voice output must be written for speech, not for screens&lt;/H3&gt;
&lt;P&gt;Foundry agents tend to format answers in markdown with citations like &lt;CODE&gt;([data.jma.go.jp](https://…))&lt;/CODE&gt;. When Voice Live synthesises that text, the user hears the URL read aloud, character by character. The fix is to write the agent instructions so the spoken text never contains URLs, markdown, or symbols. A short block at the end of the agent instructions does the job:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Voice output rules
- This output is read aloud by TTS. Never include URLs, domain names, or
  citation markers like "(source.com)" in your reply. Cite by speakable
  source name only.
- Never use markdown for formatting. No asterisks, brackets, backticks,
  bullets, or hashes. Write in plain spoken sentences.
- Keep numbers speakable: say "thirty degrees Celsius", not "30C / 86F".
- Keep replies under about 40 words unless the user asks for detail.
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The browser transcript can still render markdown for the eyes. The sample does so with a small, escaping markdown renderer that whitelists bold, italic, code, and &lt;CODE&gt;http(s)&lt;/CODE&gt; links only, so the same agent reply looks polished on screen even though the spoken version contains none of it.&lt;/P&gt;
&lt;H3&gt;2. Identity is simpler than it looks&lt;/H3&gt;
&lt;P&gt;The broker uses &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt; and requests the &lt;CODE&gt;https://ai.azure.com/.default&lt;/CODE&gt; scope. Locally that resolves to your &lt;CODE&gt;az login&lt;/CODE&gt; credentials. In Azure Container Apps it resolves to the user-assigned managed identity. In both cases the only role assignment you need on the Foundry account is &lt;STRONG&gt;Cognitive Services User&lt;/STRONG&gt;. There is no API key path on the working URL shape – it is bearer tokens all the way down.&lt;/P&gt;
&lt;H3&gt;3. The wrong sample wastes a day&lt;/H3&gt;
&lt;P&gt;If you start from the public &lt;CODE&gt;liupeirong/ai-foundry-voice-agent&lt;/CODE&gt; repository against a portal-provisioned voice-live agent, the WebSocket either returns HTTP 400 or closes silently with code 1006. The cause is the URL shape, not your code. The reference probe in &lt;CODE&gt;scripts/probe_portal_shape.py&lt;/CODE&gt; is the single source of truth for the working contract – keep it as a regression test.&lt;/P&gt;
&lt;H2&gt;Responsible AI and security notes&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Credentials never reach the browser.&lt;/STRONG&gt; Tokens are minted server-side and travel only on the upstream Voice Live URL.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;No secrets in source.&lt;/STRONG&gt; The &lt;CODE&gt;.env&lt;/CODE&gt; file is gitignored. The &lt;CODE&gt;.env.sample&lt;/CODE&gt; contains only placeholders.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Markdown rendering is escape-first.&lt;/STRONG&gt; The browser HTML-escapes the agent reply before applying its small markdown whitelist, and links are restricted to &lt;CODE&gt;http(s)&lt;/CODE&gt; URLs so the rule cannot emit &lt;CODE&gt;javascript:&lt;/CODE&gt; hrefs.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Tool calls are auditable.&lt;/STRONG&gt; Every turn shows up as a run in the Foundry portal under the agent, with the prompt, model output, and tool inputs and outputs visible for review.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Voice biometric considerations.&lt;/STRONG&gt; If you plan to handle account verification by voice, plug in dedicated speaker recognition rather than relying on the conversational model.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Key takeaways&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Voice Live plus a Foundry hosted agent is a session-level integration, not an API integration. One URL, one binding field, one WebSocket.&lt;/LI&gt;
&lt;LI&gt;The browser is a thin client. Authentication, URL construction, and relay all live in a small FastAPI broker.&lt;/LI&gt;
&lt;LI&gt;Get the URL shape right (&lt;CODE&gt;cognitiveservices.azure.com&lt;/CODE&gt;, token in the query string, &lt;CODE&gt;agent-name&lt;/CODE&gt; equals &lt;CODE&gt;model&lt;/CODE&gt; equals the agent display name) and the rest is plumbing.&lt;/LI&gt;
&lt;LI&gt;Use the Foundry portal to create the agent so the voice-live metadata is set correctly.&lt;/LI&gt;
&lt;LI&gt;Write agent instructions for the ear, not the eye, then layer screen formatting on top in the browser.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Get the code and try it&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Repository:&lt;/STRONG&gt; &lt;A href="https://github.com/microsoft/foundry-agent-voice-mode-sample" target="_blank" rel="noopener"&gt;github.com/microsoft/foundry-agent-voice-mode-sample&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Deployment guide:&lt;/STRONG&gt; &lt;CODE&gt;docs/deployment.md&lt;/CODE&gt; in the repository.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Labs:&lt;/STRONG&gt; three progressive workshops under &lt;CODE&gt;labs/&lt;/CODE&gt; – basic voice, adding tools, and binding a hosted agent.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Reference docs:&lt;/STRONG&gt; &lt;A href="https://learn.microsoft.com/azure/ai-services/speech-service/voice-live" target="_blank" rel="noopener"&gt;Voice Live in Azure AI Speech&lt;/A&gt; and &lt;A href="https://learn.microsoft.com/azure/ai-foundry/agents/overview" target="_blank" rel="noopener"&gt;Agents in Microsoft Foundry&lt;/A&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If you build something on top of this pattern, open an issue or pull request on the repository. The sample is intentionally small so it stays easy to fork.&lt;/P&gt;</description>
      <pubDate>Fri, 29 May 2026 10:16:09 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-a-hands-free-voice-concierge-with-microsoft-foundry/ba-p/4523960</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-05-29T10:16:09Z</dc:date>
    </item>
    <item>
      <title>Building Reliable AI Coding Workflows Using Modular AI Agent Optimization</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-reliable-ai-coding-workflows-using-modular-ai-agent/ba-p/4523252</link>
      <description>&lt;P class="lia-align-justify"&gt;Artificial Intelligence is rapidly transforming the modern software development industry. AI-powered coding assistants such as GitHub Copilot, Claude Code, and other Large Language Model (LLM)-based systems are helping developers automate repetitive coding tasks, improve productivity, and accelerate software development processes. These tools can generate code, assist with debugging, provide recommendations, and support developers during implementation. However, despite their growing capabilities, many AI coding assistants still face challenges related to reliability, maintainability, project-specific conventions, and structured software engineering workflows.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Most coding assistants perform well for generic programming tasks but often struggle when working with domain-specific development requirements, API integrations, project architectures, validation workflows, and coding standards. In real-world software engineering environments, developers require systems that not only generate code but also follow project conventions, maintain readability, support modular development, and improve long-term maintainability.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-align-justify"&gt;The project &lt;A class="lia-external-url" href="https://github.com/shardakaurr/ai-agent-optimization" target="_blank"&gt;“AI Agents Optimization”&lt;/A&gt; focuses on improving the reliability and effectiveness of AI coding agents by designing structured workflows, modular configurations, validation mechanisms, and optimized task execution strategies. The objective of the project is to investigate how AI agents can become dependable collaborators in practical software engineering tasks instead of functioning only as autocomplete systems.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The project explores different approaches for organizing AI agent workflows using structured instruction handling, modular task division, context management, validation systems, and integration of external tools and documentation sources. Different agent configurations are analyzed and evaluated to understand how workflow optimization affects software development quality and performance.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Why Existing AI Coding Workflows Often Fail&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;Most AI coding assistants perform well for isolated coding tasks but struggle in real-world engineering environments where projects involve multiple files, coding standards, APIs, validation requirements, and contextual dependencies.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;For example, a generic prompt such as:&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;“Build authentication middleware”&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;may generate functional code, but the output often lacks:&lt;/P&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Project-specific architecture&lt;/LI&gt;
&lt;LI&gt;Error handling consistency&lt;/LI&gt;
&lt;LI&gt;Validation logic&lt;/LI&gt;
&lt;LI&gt;Security best practices&lt;/LI&gt;
&lt;LI&gt;Dependency awareness&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-align-justify"&gt;This project approaches the problem differently by introducing a structured workflow pipeline where AI agents operate in defined stages rather than generating outputs in a single step.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The workflow separates planning, generation, validation, and refinement into independent modules. This improves maintainability, reduces inconsistent outputs, and supports iterative refinement similar to real software engineering workflows.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Project Objectives&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The primary objective of this project is to optimize AI coding agents for real-world software engineering workflows. The project aims to improve how AI systems handle development tasks such as code generation, debugging, testing, validation, feature implementation, and workflow management.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Another major objective is to design modular AI workflows where different stages of software development are managed systematically. The workflow focuses on task planning, instruction processing, validation, refinement, and output evaluation. This structured approach improves transparency, maintainability, and consistency in AI-generated outputs.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The project also aims to evaluate how AI coding agents perform under different configurations and development scenarios. By testing multiple workflows and structured instruction methods, the project analyzes how optimization techniques improve development reliability and coding quality.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Technologies and Tools Used&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The project utilizes multiple modern technologies and development tools for experimentation and workflow optimization.&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN lia-align-justify"&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Technology / Tool&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Purpose&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Python&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Automation and scripting&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;GitHub Copilot&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AI-assisted coding&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Claude / LLM APIs&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;AI workflow experimentation&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Visual Studio Code&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Development environment&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Git &amp;amp; GitHub&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Version control and repository management&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Structured Prompting&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Workflow optimization&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;MCP Concepts&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Tool and context integration&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;These tools collectively support the implementation and testing of optimized AI coding workflows.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Implementation Workflow&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The system was implemented using a modular AI workflow pipeline where each stage performs a dedicated engineering task.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;Step 1 — Task Parsing&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The user submits a development task or coding requirement. The Instruction Processing Module extracts:&lt;/P&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Objective&lt;/LI&gt;
&lt;LI&gt;Constraints&lt;/LI&gt;
&lt;LI&gt;Project context&lt;/LI&gt;
&lt;LI&gt;Expected output format&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-align-justify"&gt;Example structured prompt:&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Task: Create JWT authentication middleware&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Language: Node.js&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Constraints:&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;- Use Express.js&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;- Add token validation&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;- Follow modular architecture&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;- Include error handling&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;Step 2 — Planning &amp;amp; Reasoning&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The Planning Module divides the task into subtasks such as:&lt;/P&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Route handling&lt;/LI&gt;
&lt;LI&gt;Token verification&lt;/LI&gt;
&lt;LI&gt;Error management&lt;/LI&gt;
&lt;LI&gt;Security validation&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-align-justify"&gt;This improves reasoning consistency before generation begins.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;Step 3 — Code Generation&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The Code Generation Module produces outputs using structured prompts and contextual references instead of generic instructions.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;Step 4 — Validation&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Generated outputs are validated using:&lt;/P&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Syntax checks&lt;/LI&gt;
&lt;LI&gt;Logical consistency checks&lt;/LI&gt;
&lt;LI&gt;Formatting standards&lt;/LI&gt;
&lt;LI&gt;Dependency validation&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-align-justify"&gt;&lt;STRONG&gt;Step 5 — Refinement&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;If validation fails, the workflow loops back into refinement where issues are corrected before final delivery.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;System Workflow&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The workflow of the&lt;A class="lia-external-url" href="https://github.com/shardakaurr/ai-agent-optimization" target="_blank"&gt; AI Agents Optimization system&lt;/A&gt; is based on modular task execution and structured development processes. The workflow begins with task planning and requirement analysis. The AI agent receives structured instructions along with coding constraints, project context, and validation requirements.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The system processes the provided instructions and generates outputs according to defined workflows and development standards. Different configurations are tested to evaluate how instruction structures and modular task handling influence the quality of generated code&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The workflow also includes validation and refinement stages where generated outputs are analyzed for correctness, maintainability, and consistency. The project focuses not only on code generation but also on improving readability, workflow transparency, debugging support, and adherence to project conventions.&lt;/P&gt;
&lt;img /&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Key Features of the Project&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Structured AI workflow design&lt;/LI&gt;
&lt;LI&gt;Modular task execution&lt;/LI&gt;
&lt;LI&gt;AI-assisted software development&lt;/LI&gt;
&lt;LI&gt;Workflow optimization strategies&lt;/LI&gt;
&lt;LI&gt;Validation and refinement mechanisms&lt;/LI&gt;
&lt;LI&gt;Integration of development tools and documentation&lt;/LI&gt;
&lt;LI&gt;Improved maintainability and readability&lt;/LI&gt;
&lt;LI&gt;Support for practical software engineering workflows&lt;/LI&gt;
&lt;/UL&gt;
&lt;img /&gt;
&lt;P class="lia-align-justify"&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Challenges Faced During Development&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;One of the major challenges encountered during the project was maintaining consistency and reliability in AI-generated outputs. Different AI models often produce different responses depending on prompts, context, and task structure. Designing workflows that improve output stability and maintain coding standards required careful experimentation and optimization.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Another challenge involved integrating structured workflows while ensuring flexibility in task execution. AI systems often require clear instructions and contextual information to produce accurate outputs. Balancing automation with maintainability and project-specific requirements was an important aspect of the project.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Managing validation and refinement processes was also challenging because generated outputs needed to be evaluated not only for correctness but also for readability, maintainability, and software engineering best practices.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Observations and Outcomes&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;During experimentation, structured workflows produced more reliable and maintainable outputs compared to single-prompt generation approaches.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Some important observations included:&lt;/P&gt;
&lt;UL class="lia-align-justify"&gt;
&lt;LI&gt;Reduced repetitive corrections during code refinement&lt;/LI&gt;
&lt;LI&gt;Improved consistency in generated outputs&lt;/LI&gt;
&lt;LI&gt;Better adherence to coding structure and formatting&lt;/LI&gt;
&lt;LI&gt;More stable workflow behavior for multi-step tasks&lt;/LI&gt;
&lt;LI&gt;Improved readability and maintainability of generated code&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="lia-align-justify"&gt;The validation and refinement stages were particularly effective in reducing incomplete outputs and improving response quality.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Although the project focuses primarily on workflow architecture and qualitative analysis rather than benchmark testing, the results demonstrate that modular AI pipelines can significantly improve practical software engineering workflows.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Future Enhancements&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The project can be further enhanced by implementing advanced multi-agent collaboration systems where multiple AI agents work together on complex software development tasks. Future versions may also include real-time documentation integration, automated testing frameworks, cloud-based workflow management, and improved reasoning models.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;Additional enhancements may include IDE extensions, intelligent debugging systems, automated code review mechanisms, and adaptive workflow optimization based on project requirements.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;Conclusion&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;P class="lia-align-justify"&gt;The &lt;A class="lia-external-url" href="https://github.com/shardakaurr/ai-agent-optimization" target="_blank"&gt;AI Agents Optimization project &lt;/A&gt;demonstrates how structured workflows and modular configurations can improve the effectiveness of AI-powered coding assistants in modern software engineering environments.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;By focusing on workflow optimization, validation mechanisms, modular task execution, and structured instruction handling, the project highlights the future potential of AI agents as reliable development collaborators capable of supporting real-world software engineering processes.&lt;/P&gt;
&lt;P class="lia-align-justify"&gt;The project represents an important step toward building dependable AI-assisted development systems that improve productivity, maintainability, and software quality while supporting modern engineering practices.&lt;/P&gt;
&lt;DIV class="lia-align-justify"&gt;
&lt;H4&gt;&lt;STRONG&gt;How to Try This Workflow&lt;/STRONG&gt;&lt;/H4&gt;
&lt;/DIV&gt;
&lt;OL class="lia-align-justify"&gt;
&lt;LI&gt;Define a structured development task&lt;/LI&gt;
&lt;LI&gt;Provide project constraints and context&lt;/LI&gt;
&lt;LI&gt;Break the task into subtasks&lt;/LI&gt;
&lt;LI&gt;Generate output using structured prompts&lt;/LI&gt;
&lt;LI&gt;Validate output quality&lt;/LI&gt;
&lt;LI&gt;Refine based on validation feedback&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Thu, 28 May 2026 19:53:45 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-reliable-ai-coding-workflows-using-modular-ai-agent/ba-p/4523252</guid>
      <dc:creator>ShardaKaur</dc:creator>
      <dc:date>2026-05-28T19:53:45Z</dc:date>
    </item>
    <item>
      <title>Hybrid AI Agents in Python: Routing Between Foundry Local and Microsoft Foundry</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/hybrid-ai-agents-in-python-routing-between-foundry-local-and/ba-p/4522979</link>
      <description>&lt;H2&gt;Why hybrid, and why now&lt;/H2&gt;
&lt;P&gt;If you build AI features today, you are caught between three forces. Users want low latency and strong privacy. Product teams want frontier reasoning capability. Finance teams want predictable cost. No single model satisfies all three. Run everything on a small on-device model and you bottleneck on complex questions. Send everything to a frontier cloud model and you pay for trivial requests, leak sensitive data across a network boundary, and add hundreds of milliseconds of latency to greetings.&lt;/P&gt;
&lt;P&gt;The pragmatic answer is hybrid inference: a lightweight local model classifies every request first, simple or sensitive ones stay on the device, and only the genuinely hard or frontier-capability requests escalate to the cloud. Microsoft now ships both halves of that pattern as supported Python SDKs — &lt;A href="https://pypi.org/project/foundry-local-sdk/" target="_blank"&gt;foundry-local-sdk&lt;/A&gt; for on-device inference and &lt;A href="https://pypi.org/project/azure-ai-projects/" target="_blank"&gt;azure-ai-projects&lt;/A&gt; for Microsoft Foundry cloud models. This post walks through a working reference implementation that combines them behind a single &lt;CODE&gt;ask()&lt;/CODE&gt; call.&lt;/P&gt;
&lt;P&gt;The full source is at &lt;A href="https://github.com/leestott/fl-mixedmodel" target="_blank"&gt;github.com/leestott/fl-mixedmodel&lt;/A&gt;. It is Python-only, secretless by design, and ships with a Gradio diagnostics UI, a CLI demo mode, and a full &lt;CODE&gt;pytest&lt;/CODE&gt; suite.&lt;/P&gt;
&lt;H2&gt;The contract: one schema, two paths&lt;/H2&gt;
&lt;P&gt;The most important architectural decision is that callers never know which path served a request. Every response, local or cloud, returns the same dataclass:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;class InferencePath(str, Enum):
    LOCAL = "local"
    CLOUD = "cloud"
    LOCAL_FALLBACK = "local_fallback"   # cloud attempted, fell back to local
    CLOUD_FALLBACK = "cloud_fallback"   # local attempted, fell back to cloud

@dataclass
class AgentResponse:
    answer: str
    path: InferencePath
    model: str
    reason: str
    confidence: float
    latency_ms: float
    correlation_id: str
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    fallback: bool = False
    fallback_reason: Optional[str] = None
    metadata: dict = field(default_factory=dict)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is what makes the design honest. The router can change, the cloud model can be swapped from &lt;CODE&gt;gpt-4o&lt;/CODE&gt; to &lt;CODE&gt;gpt-5.4&lt;/CODE&gt;, fallback policies can flip — and the calling code never breaks. The four &lt;CODE&gt;InferencePath&lt;/CODE&gt; values give you full observability without leaking implementation details into the API surface.&lt;/P&gt;
&lt;H2&gt;Architecture in one diagram&lt;/H2&gt;
&lt;PRE&gt;&lt;CODE&gt;┌─────────────┐   prompt    ┌──────────────────────────┐
│   caller    │ ──────────► │   HybridAgentService     │
└─────────────┘             │      .ask(prompt)        │
                            └────────────┬─────────────┘
                                         │
                            ┌────────────▼─────────────┐
                            │     RoutingPolicy        │
                            │  1. Heuristic gate       │
                            │  2. Local router LLM     │
                            │  3. Hard policy gates    │
                            └─────┬─────────────┬──────┘
                                  │             │
                          LOCAL  ◄┘             └► CLOUD
                                  │             │
                       ┌──────────▼──┐   ┌──────▼───────┐
                       │ Foundry     │   │ Microsoft    │
                       │ Local SDK   │   │ Foundry      │
                       │ (phi-4-mini)│   │ (gpt-5.4)    │
                       └─────────────┘   └──────────────┘
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H2&gt;Best practice: the two-stage router pattern&lt;/H2&gt;
&lt;P&gt;Before walking through the implementation, it is worth stating the design pattern explicitly, because it is the part that generalises beyond this specific repo. &lt;STRONG&gt;The cleanest design for hybrid inference is a two-stage router.&lt;/STRONG&gt;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Stage 1 — local router.&lt;/STRONG&gt; A small local model performs intent and complexity classification first. It does not answer the question; it decides where the question should go.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Stage 2 — route the answer.&lt;/STRONG&gt;
&lt;UL&gt;
&lt;LI&gt;If the prompt is &lt;STRONG&gt;simple, private, latency-sensitive, or clearly within local capability&lt;/STRONG&gt;, route to a &lt;EM&gt;local task model&lt;/EM&gt; on the device.&lt;/LI&gt;
&lt;LI&gt;If the prompt is &lt;STRONG&gt;complex, needs deeper reasoning, a larger context window, or a capability unavailable locally&lt;/STRONG&gt;, escalate to a &lt;EM&gt;cloud frontier model in Microsoft Foundry&lt;/EM&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Microsoft's current guidance for the cloud side is to use the &lt;STRONG&gt;Responses API&lt;/STRONG&gt; and choose one of two control modes:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Pass a &lt;STRONG&gt;specific deployment name&lt;/STRONG&gt; (for example &lt;CODE&gt;gpt-5.4&lt;/CODE&gt;) when you want deterministic control over which model serves the request, which is the right choice for regulated workloads, repeatable evaluations, or cost ceilings.&lt;/LI&gt;
&lt;LI&gt;Pass &lt;STRONG&gt;&lt;CODE&gt;model-router&lt;/CODE&gt;&lt;/STRONG&gt; as the deployment when you want Microsoft Foundry to automatically select the best available cloud model for each request. This is a sensible default for general-purpose agents where you would rather let the platform optimise the model choice as new ones are released.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The reference repo exposes both as environment variables so you can switch without code changes:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# .env.example
FOUNDRY_CLOUD_MODEL_DEPLOYMENT=gpt-5.4        # deterministic
FOUNDRY_CLOUD_ROUTER_DEPLOYMENT=model-router  # auto-select
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H2&gt;Best practice: pin the right SDK versions&lt;/H2&gt;
&lt;P&gt;Two SDKs do the heavy lifting and both have had recent breaking changes, so version discipline matters.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Local development — &lt;CODE&gt;foundry-local-sdk&lt;/CODE&gt;.&lt;/STRONG&gt; The current public guidance is to use the Foundry Local SDK package &lt;A href="https://pypi.org/project/foundry-local-sdk/" target="_blank"&gt;&lt;CODE&gt;foundry-local-sdk&lt;/CODE&gt;&lt;/A&gt;, which provides model discovery, download, cache, load, unload, chat completions, embeddings, audio transcription, and an optional built-in web service. Use &lt;STRONG&gt;version 1.1.0&lt;/STRONG&gt;, released on &lt;STRONG&gt;5 May 2026&lt;/STRONG&gt;. Earlier versions used an OpenAI-compatible client surface that has since been replaced by the &lt;CODE&gt;FoundryLocalManager → load_model → get_chat_client → complete_chat&lt;/CODE&gt; chain shown above. Pin it explicitly:
&lt;PRE&gt;&lt;CODE&gt;# requirements.txt
foundry-local-sdk&amp;gt;=1.1.0&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cloud orchestration and agents — &lt;CODE&gt;azure-ai-projects&lt;/CODE&gt;.&lt;/STRONG&gt; For cloud-side orchestration, Microsoft's current Python guidance is to use &lt;A href="https://pypi.org/project/azure-ai-projects/" target="_blank"&gt;&lt;CODE&gt;azure-ai-projects&lt;/CODE&gt;&lt;/A&gt;, which the docs describe as part of the Microsoft Foundry SDK and as the entry point for &lt;EM&gt;agents, deployments, connections, datasets, evaluations&lt;/EM&gt;, and an OpenAI-compatible client returned by &lt;CODE&gt;get_openai_client()&lt;/CODE&gt;. The current PyPI listing shows &lt;STRONG&gt;azure-ai-projects 2.1.0&lt;/STRONG&gt;. Pin it explicitly:
&lt;PRE&gt;&lt;CODE&gt;# requirements.txt
azure-ai-projects&amp;gt;=2.1.0
azure-identity&amp;gt;=1.17.0&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;If you find yourself reading old samples that import &lt;CODE&gt;azure.ai.inference&lt;/CODE&gt; as the cloud entry point, or that initialise Foundry Local through a raw &lt;CODE&gt;openai.OpenAI(base_url=...)&lt;/CODE&gt; client, you are looking at pre-2026 patterns. The current shape is what the reference repo uses: &lt;CODE&gt;FoundryLocalManager.initialize(Configuration(...))&lt;/CODE&gt; for the device and &lt;CODE&gt;AIProjectClient(...).get_openai_client()&lt;/CODE&gt; for the cloud.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Stage 1: a deterministic privacy gate&lt;/H2&gt;
&lt;P&gt;Before any model touches a prompt, a deterministic heuristic classifier scans for sensitive patterns — passwords, API keys, SSN/NHS numbers, PII signals, explicit "do not share" flags. If the heuristic returns &lt;CODE&gt;PrivacyClass.RESTRICTED&lt;/CODE&gt;, the prompt is forced local. The router LLM is not called. The cloud provider is not called. The decision is auditable from a single regex pass.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# app/routing/policy.py
def decide(self, prompt: str, correlation_id: str = "") -&amp;gt; RoutingDecision:
    hint, privacy, complexity, h_reason = self._heuristic.classify(prompt)

    # Hard gate: restricted content never leaves the device
    if privacy == PrivacyClass.RESTRICTED:
        return self._make_decision(
            target=RouteTarget.LOCAL,
            confidence=1.0,
            reason=f"Policy hard-gate: {h_reason}",
            privacy=privacy,
            complexity=complexity,
            deterministic=True,
            correlation_id=correlation_id,
        )

    # Hard gate: very high complexity always goes to cloud
    if complexity == ComplexityBand.VERY_HIGH:
        return self._make_decision(
            target=RouteTarget.CLOUD,
            confidence=1.0,
            reason="Policy hard-gate: very_high complexity requires frontier model",
            ...
        )
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is the most important responsible-AI control in the whole system. If your privacy review depends on an LLM correctly classifying every prompt, you do not have a privacy control — you have a probability distribution. Deterministic gates first, model judgement second.&lt;/P&gt;
&lt;H2&gt;Stage 2: a local LLM as the router&lt;/H2&gt;
&lt;P&gt;For everything that passes the privacy gate, a small local model classifies whether the prompt needs frontier capability. This is the bit that surprises most engineers: &lt;EM&gt;you can do useful routing with a 4B parameter model running on a laptop CPU&lt;/EM&gt;. The router does not need to answer the question. It only needs to classify it.&lt;/P&gt;
&lt;P&gt;The reference implementation uses &lt;A href="https://huggingface.co/microsoft/Phi-4-mini-instruct" target="_blank"&gt;phi-4-mini&lt;/A&gt; via Foundry Local. Initialising it is two lines:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# app/providers/local_provider.py (excerpt)
from foundry_local import FoundryLocalManager
from foundry_local.models import Configuration

self._manager = FoundryLocalManager.initialize(
    Configuration(app_name="hybrid-agent")
)
self._router_model = self._manager.load_model(self._config.local_router_alias)
self._chat_client  = self._router_model.get_chat_client()

response = self._chat_client.complete_chat(
    messages=[
        {"role": "system", "content": ROUTER_SYSTEM_PROMPT},
        {"role": "user",   "content": prompt},
    ],
)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The router prompt asks for a strict JSON response: &lt;CODE&gt;{ "target": "local|cloud", "confidence": 0.0-1.0, "complexity": "low|medium|high|very_high", "reason": "..." }&lt;/CODE&gt;. The application parses it, applies the confidence threshold from config (default 0.6), and falls back to the heuristic decision if the router LLM is unsure or its JSON is malformed. &lt;STRONG&gt;The router never blocks the answer path&lt;/STRONG&gt; — that is a deliberate reliability choice.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Cloud inference via Microsoft Foundry&lt;/H2&gt;
&lt;P&gt;When the policy returns &lt;CODE&gt;RouteTarget.CLOUD&lt;/CODE&gt;, the request goes through &lt;CODE&gt;AIProjectClient&lt;/CODE&gt;, which gives you an &lt;CODE&gt;openai.OpenAI&lt;/CODE&gt;-compatible client wired to your Foundry project with &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt;. No API keys. No secrets in &lt;CODE&gt;.env&lt;/CODE&gt;.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;# app/providers/cloud_provider.py (excerpt)
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

self._project = AIProjectClient(
    endpoint=self._config.foundry_project_endpoint,
    credential=DefaultAzureCredential(),
)
self._openai_client = self._project.get_openai_client()

response = self._openai_client.chat.completions.create(
    model=self._config.foundry_cloud_model_deployment,  # e.g. "gpt-5.4"
    messages=messages,
    max_completion_tokens=max_tokens,
)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;A subtle gotcha worth flagging: gpt-5 and o-series deployments reject the legacy &lt;CODE&gt;max_tokens&lt;/CODE&gt; parameter and require &lt;CODE&gt;max_completion_tokens&lt;/CODE&gt;. They also reject custom &lt;CODE&gt;temperature&lt;/CODE&gt; values. The reference repo handles this by trying the new parameter first and falling back to the legacy one only when the API returns the specific &lt;CODE&gt;unsupported parameter&lt;/CODE&gt; error. That keeps the same code working against older deployments without forking the provider.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Graceful degradation: the fallback paths&lt;/H2&gt;
&lt;P&gt;Hybrid systems fail in interesting ways. The cloud can be down. The local model can throw because the GPU ran out of memory. A reasoning model can return an empty completion. The service handles all of these by attempting the alternative path and labelling the response so observability stays honest:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Cloud route fails → local fallback.&lt;/STRONG&gt; The response carries &lt;CODE&gt;path=LOCAL_FALLBACK&lt;/CODE&gt;, &lt;CODE&gt;fallback=true&lt;/CODE&gt;, and a populated &lt;CODE&gt;fallback_reason&lt;/CODE&gt;. The user gets an answer instead of an error.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Local route fails → cloud fallback,&lt;/STRONG&gt; &lt;EM&gt;but only if privacy class is not RESTRICTED.&lt;/EM&gt; A sensitive prompt that the local model could not handle never leaks to the cloud as a fallback. It returns a clear error instead. This is the second hard gate in the system.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Both fail.&lt;/STRONG&gt; A structured error response with a correlation ID, never a stack trace.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;That last rule — fallback respects privacy class — is the kind of decision that is easy to skip and impossible to bolt on later. Encode it once in the service layer and your privacy reviewers will thank you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;H2&gt;What it looks like in practice&lt;/H2&gt;
&lt;P&gt;The diagnostics panel in the Gradio UI shows the routing decision live: path, model, confidence, latency, privacy class, complexity band, and the full JSON response. Five canonical scenarios shake out the entire decision tree:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;CODE&gt;"hello"&lt;/CODE&gt; → &lt;CODE&gt;path=local, confidence=1.0, complexity=low&lt;/CODE&gt;. Heuristic only. No router LLM call. ~3 seconds end-to-end with phi-4-mini cached.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;"explain transformer self-attention in depth with maths"&lt;/CODE&gt; → &lt;CODE&gt;path=cloud, model=gpt-5.4, complexity=high&lt;/CODE&gt;. Router LLM classifies, hard gate confirms.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;"my password is hunter2, suggest a stronger one"&lt;/CODE&gt; → &lt;CODE&gt;path=local, privacy=restricted, deterministic=true&lt;/CODE&gt;. Privacy gate fires before any model sees it.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;"summarise this 8 KB document"&lt;/CODE&gt; with cloud unavailable → &lt;CODE&gt;path=cloud_fallback&lt;/CODE&gt; (local handles it, response is labelled).&lt;/LI&gt;
&lt;LI&gt;Complex prompt with local model error → &lt;CODE&gt;path=local_fallback&lt;/CODE&gt;, &lt;CODE&gt;fallback_reason&lt;/CODE&gt; populated.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;You can reproduce all five without any models installed by running &lt;CODE&gt;python -m app.main --demo&lt;/CODE&gt;. The demo mode swaps the providers for deterministic stubs so you can validate the routing logic and the response schema in under a second on any machine.&lt;/P&gt;
&lt;H2&gt;Operational lessons learned&lt;/H2&gt;
&lt;P&gt;Some things the reference implementation only gets right because it got them wrong first:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Pick a non-reasoning model for the router.&lt;/STRONG&gt; Reasoning-tuned local models (Phi-4-reasoning, o-style) wrap their output in &lt;CODE&gt;&amp;lt;think&amp;gt;&lt;/CODE&gt; blocks and blow your JSON parser. &lt;CODE&gt;phi-4-mini&lt;/CODE&gt; is faster and more reliable for classification.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Cache the local model.&lt;/STRONG&gt; First load can take 30–60 seconds while Foundry Local downloads weights. Initialise the service once at process startup, not per request.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Use correlation IDs everywhere.&lt;/STRONG&gt; The service attaches one per request and the structured JSON logger emits it on every event. When you are debugging a fallback path across two model providers, this is the difference between five minutes and five hours.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Run the privacy heuristic on every fallback path too.&lt;/STRONG&gt; A naive implementation might route locally, fail, and then send the same sensitive prompt to the cloud as a "graceful" fallback. That is not graceful, it is a data leak.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Keep configuration in &lt;CODE&gt;.env&lt;/CODE&gt; and out of code.&lt;/STRONG&gt; Privacy mode, fallback toggles, confidence threshold, model aliases — all environment-driven. The &lt;CODE&gt;config.py&lt;/CODE&gt; module is the only place that reads them.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Responsible AI in a hybrid topology&lt;/H2&gt;
&lt;P&gt;Hybrid does not make responsible AI harder, but it does make it different. Three controls earn their keep:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Data residency by default.&lt;/STRONG&gt; The local path keeps prompts and answers on the device. For RESTRICTED content this is mandatory; for everything else it is a free latency and cost win.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Auditability.&lt;/STRONG&gt; Every routing decision is logged with the deterministic reason, the heuristic class, the router LLM output, the confidence, and the correlation ID. You can answer "why did this prompt go to the cloud?" months later.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Keyless auth.&lt;/STRONG&gt; &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt; means there is no API key to leak, rotate, or commit by accident. The repo's &lt;CODE&gt;.gitignore&lt;/CODE&gt;, &lt;CODE&gt;SECURITY.md&lt;/CODE&gt;, and pre-push checklist enforce this end-to-end.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Try it&lt;/H2&gt;
&lt;P&gt;Five minutes, no Azure account needed for the demo:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;git clone https://github.com/leestott/fl-mixedmodel.git
cd fl-mixedmodel

python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux

pip install -r requirements.txt
python -m app.main --demo       # all five scenarios, no models required
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;To run with real models, install &lt;A href="https://learn.microsoft.com/azure/ai-foundry/foundry-local/" target="_blank"&gt;Foundry Local&lt;/A&gt;, copy &lt;CODE&gt;.env.example&lt;/CODE&gt; to &lt;CODE&gt;.env&lt;/CODE&gt;, set your &lt;CODE&gt;FOUNDRY_PROJECT_ENDPOINT&lt;/CODE&gt;, then:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;az login
python -m app.main --ui --port 7860
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H2&gt;Where to go next&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Repository:&lt;/STRONG&gt; &lt;A href="https://github.com/leestott/fl-mixedmodel" target="_blank"&gt;github.com/leestott/fl-mixedmodel&lt;/A&gt; — full source, tests, specification, screenshots.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Foundry Local SDK:&lt;/STRONG&gt; &lt;A href="https://pypi.org/project/foundry-local-sdk/" target="_blank"&gt;pypi.org/project/foundry-local-sdk&lt;/A&gt; and the &lt;A href="https://learn.microsoft.com/azure/ai-foundry/foundry-local/" target="_blank"&gt;Foundry Local docs&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure AI Projects SDK:&lt;/STRONG&gt; &lt;A href="https://pypi.org/project/azure-ai-projects/" target="_blank"&gt;pypi.org/project/azure-ai-projects&lt;/A&gt; and the &lt;A href="https://learn.microsoft.com/azure/ai-foundry/" target="_blank"&gt;Microsoft Foundry docs&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Identity:&lt;/STRONG&gt; &lt;A href="https://learn.microsoft.com/python/api/azure-identity/azure.identity.defaultazurecredential" target="_blank"&gt;DefaultAzureCredential reference&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Phi-4-mini:&lt;/STRONG&gt; &lt;A href="https://huggingface.co/microsoft/Phi-4-mini-instruct" target="_blank"&gt;Phi-4-mini on Hugging Face&lt;/A&gt;.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Key takeaways&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;The best-practice pattern is a &lt;STRONG&gt;two-stage router&lt;/STRONG&gt;: local model classifies first, then either a local task model or a Microsoft Foundry cloud model answers.&lt;/LI&gt;
&lt;LI&gt;For cloud control, use the &lt;STRONG&gt;Responses API&lt;/STRONG&gt; with either a named deployment (deterministic) or &lt;CODE&gt;model-router&lt;/CODE&gt; (auto-select).&lt;/LI&gt;
&lt;LI&gt;Pin &lt;STRONG&gt;&lt;CODE&gt;foundry-local-sdk &amp;gt;= 1.1.0&lt;/CODE&gt;&lt;/STRONG&gt; (5 May 2026) and &lt;STRONG&gt;&lt;CODE&gt;azure-ai-projects &amp;gt;= 2.1.0&lt;/CODE&gt;&lt;/STRONG&gt;. The 2026 SDK surfaces are not backwards-compatible with pre-2026 samples.&lt;/LI&gt;
&lt;LI&gt;Hybrid inference is a routing problem, not a model problem. A small local model is enough to classify the request.&lt;/LI&gt;
&lt;LI&gt;Deterministic privacy gates beat probabilistic ones. Code the rules; let the LLM judge only what is left.&lt;/LI&gt;
&lt;LI&gt;Return the same response schema from every path. Label fallbacks honestly. Carry a correlation ID everywhere.&lt;/LI&gt;
&lt;LI&gt;Keep auth keyless with &lt;CODE&gt;DefaultAzureCredential&lt;/CODE&gt; and your &lt;CODE&gt;.env&lt;/CODE&gt; out of git.&lt;/LI&gt;
&lt;LI&gt;Test the routing decisions, not just the model outputs. Demo mode and a strong &lt;CODE&gt;pytest&lt;/CODE&gt; suite pay back every time you swap a model.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Hybrid AI is not a compromise between local and cloud. It is the supervisor pattern applied to inference — fast and private where you can be, frontier where you have to be, observable everywhere. The hard part is the contract, not the models.&lt;/P&gt;</description>
      <pubDate>Wed, 27 May 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/hybrid-ai-agents-in-python-routing-between-foundry-local-and/ba-p/4522979</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-05-27T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Build AI RAG Apps with Azure DocumentDB (with MongoDB compatibility) and OpenAI: Step-by-Step Guide</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-ai-rag-apps-with-azure-documentdb-with-mongodb/ba-p/4497192</link>
      <description>&lt;H3&gt;Scenario&lt;/H3&gt;
&lt;P data-start="0" data-end="694" data-is-last-node="" data-is-only-node=""&gt;Imagine you are building your company’s RAG chat application using &lt;STRONG data-start="67" data-end="91"&gt;Azure OpenAI Service&lt;/STRONG&gt; and orchestrating the flow with &lt;STRONG data-start="124" data-end="165"&gt;LangChain&lt;/STRONG&gt;. The chat experience works, but now it needs to be grounded in your company’s data. You generate embeddings and want to store and query them without adding another database or complex sync pipeline. Instead of stitching services together, you use &lt;STRONG data-start="413" data-end="462"&gt;Azure DocumentDB (with MongoDB compatibility)&lt;/STRONG&gt; with built-in vector search to store your JSON data and embeddings in one place. You deploy the app to &lt;STRONG data-start="566" data-end="587"&gt;Azure App Service&lt;/STRONG&gt; and quickly compare vector search alone versus a full RAG pipeline, sharing it with your team for testing.&lt;/P&gt;
&lt;H3 id="what-will-you-learn"&gt;What will you learn?&lt;/H3&gt;
&lt;P&gt;In this blog, you'll learn to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Create an Azure DocumentDB (with MongoDB compatibility) resource.&lt;/LI&gt;
&lt;LI&gt;Create an embeddings and a chat deployment in Azure Foundry portal.&lt;/LI&gt;
&lt;LI&gt;Create an Azure App service website with Continuous deployment from GitHub.&lt;/LI&gt;
&lt;LI&gt;Configure Azure App service application settings to enable communication between Azure resources.&lt;/LI&gt;
&lt;LI&gt;Configure GitHub workflow to work successfully.&amp;nbsp;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;What is the main objective?&lt;/H3&gt;
&lt;P&gt;Build AI Powered RAG Application using the LangChain, Azure OpenAI, and Azure DocumentDB (with MongoDB compatibility&lt;SPAN style="font-style: var(--lia-blog-font-style); font-family: var(--lia-blog-font-family); font-size: var(--lia-bs-font-size-base);"&gt;): Step-by-Step Guide&lt;/SPAN&gt;&lt;/P&gt;
&lt;img /&gt;
&lt;H3&gt;Prerequisites&lt;/H3&gt;
&lt;UL&gt;
&lt;LI class="graf graf--p"&gt;An Azure subscription.
&lt;UL&gt;
&lt;LI&gt;If you don’t already have one, you can sign up for an&amp;nbsp;&lt;A class="markup--anchor markup--li-anchor" title="Sign up for an Azure free account" href="https://azure.microsoft.com/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener noreferrer" data-href="https://azure.microsoft.com/"&gt;Azure free account&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;For students, you can use the free&amp;nbsp;&lt;A class="markup--anchor markup--li-anchor" href="https://aka.ms/Azure4StudentsActivate" target="_blank" rel="noopener noreferrer" data-href="https://aka.ms/Azure4StudentsActivate"&gt;Azure for Students offer&lt;/A&gt;&amp;nbsp;which doesn’t require a credit card only your school email.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;A GitHub Account.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Summary of the steps:&lt;/H3&gt;
&lt;P data-unlink="true"&gt;Step 1: Create an Azure DocumentDB (with MongoDB compatibility) resource&lt;/P&gt;
&lt;P data-unlink="true"&gt;Step 2: Create an Azure OpenAI resource and Deploy chat and embedding Models&lt;/P&gt;
&lt;P data-unlink="true"&gt;Step 3: Create an Azure App Service and Deploy the RAG Chat Application&lt;/P&gt;
&lt;H2 id="h_3686287661702581784350"&gt;Step 1: Create an Azure Cosmos DB for MongoDB vCore Cluster&lt;/H2&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Open the&amp;nbsp;Azure Portal.&lt;/LI&gt;
&lt;LI&gt;Create an Azure Cosmos DB for MongoDB vCore Cluster.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="toc-hId-1515534585"&gt;Open the Azure Portal&lt;/H3&gt;
&lt;P class="graf graf--p"&gt;1. Visit the Azure Portal&amp;nbsp;&lt;A class="markup--anchor markup--p-anchor" href="https://portal.azure.com/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener nofollow noreferrer" data-href="https://portal.azure.com"&gt;https://portal.azure.com&lt;/A&gt;&amp;nbsp;in your browser&amp;nbsp;and&amp;nbsp;sign in.&lt;/P&gt;
&lt;FIGURE class="graf graf--figure"&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;P class="graf graf--p"&gt;Now you are inside the&amp;nbsp;&lt;STRONG class="markup--strong markup--p-strong"&gt;Azure portal&lt;/STRONG&gt;!&lt;/P&gt;
&lt;FIGURE class="graf graf--figure"&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3 id="toc-hId--291919878"&gt;Create a new Azure DocumentDB (with MongoDB compatibility) resource&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure Cosmos DB for MongoDB vCore Cluster to store your data, vector embedding, and perform vector search.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type&amp;nbsp;&lt;EM&gt;mongodb vcore&lt;/EM&gt;&amp;nbsp;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;Azure Cosmos DB for MongoDB (vCore) &lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Select &lt;STRONG&gt;Create&amp;nbsp;&lt;/STRONG&gt;from the toolbar to start provisioning your new cluster.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3.&amp;nbsp;Add the following information to create a resource:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription&lt;/td&gt;&lt;td&gt;Use your preferred subscription. It's advised to use the same subscription across all the resources that communicate with each other on Azure.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Resource group&lt;/td&gt;&lt;td&gt;Select &lt;STRONG&gt;Create new&amp;nbsp;&lt;/STRONG&gt;to create a new resource group. Enter a unique name for the resource group.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cluster name&lt;/td&gt;&lt;td&gt;Enter a globally unique name.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Location&lt;/td&gt;&lt;td&gt;Select a region close to you for the best response time. For example, Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;MongoDB version&lt;/td&gt;&lt;td&gt;Select the latest available version of MongoDB.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. Select &lt;STRONG&gt;Configure&lt;/STRONG&gt; to configure your cluster tier.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5.&amp;nbsp;Add the following information to configure the cluster tier. You can scale it up later:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cluster tier&lt;/td&gt;&lt;td&gt;Select &lt;STRONG&gt;M25 &lt;/STRONG&gt;tier, 2 (Burstable) vCores.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Storage&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Select &lt;STRONG&gt;32 GiB&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;6. Select &lt;STRONG&gt;Save&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;7. Enter the cluster &lt;STRONG&gt;Admin&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;Username&lt;/STRONG&gt; and &lt;STRONG&gt;Password&lt;/STRONG&gt; and store them in a secure location.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;8. Select &lt;STRONG&gt;Next&lt;/STRONG&gt; to configure the networking settings.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;9. Select &lt;STRONG&gt;Allow Public Access from Azure&lt;/STRONG&gt; services and resources within the Azure to this cluster.&lt;/P&gt;
&lt;P&gt;10. Select &lt;STRONG&gt;Add current IP address&lt;/STRONG&gt; to the firewall rules to allow local access to the cluster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;11. Select&lt;STRONG&gt; Review + create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;12.&amp;nbsp;Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt;&amp;nbsp;to start provisioning the resource.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: The cluster creation can take up to 10 minutes. It's recommended to move on with the rest of the steps and get back to it later.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;H2 id="h_282699227321702581811866"&gt;Step 2:&amp;nbsp;Create an Azure OpenAI resource and Deploy chat and embedding Models&lt;/H2&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Create an Azure OpenAI resource.&lt;/LI&gt;
&lt;LI&gt;Create chat and embedding model deployments.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Create an Azure OpenAI resource&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure OpenAI Service resource that enables you to interact with different large language models (LLMs).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type&amp;nbsp;&lt;EM&gt;openai&lt;/EM&gt;&amp;nbsp;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;Azure OpenAI&amp;nbsp;&lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Select &lt;STRONG&gt;Create&lt;/STRONG&gt; from the toolbar to provision a new OpenAI resource.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Add the following information to create a resource:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription&lt;/td&gt;&lt;td&gt;Use the same subscription you used to apply for Azure OpenAI access.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Resource group&lt;/td&gt;&lt;td&gt;Use the resource group you created in the previous step.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Region&lt;/td&gt;&lt;td&gt;Select a region close to you for the best response time. For example, Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Name&lt;/td&gt;&lt;td&gt;Enter a globally unique name.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pricing tier&lt;/td&gt;&lt;td&gt;
&lt;DIV class="has-inner-focus"&gt;&amp;nbsp;Select &lt;STRONG&gt;S0&lt;/STRONG&gt;. Currently, this is the only available pricing tier.&lt;/DIV&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. Now that the basic information is added, select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt;&amp;nbsp;to confirm your details and proceed to the next page.&lt;/P&gt;
&lt;P&gt;5. Select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt;&amp;nbsp;to confirm your network details.&lt;/P&gt;
&lt;P&gt;6. Select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt;&amp;nbsp;to confirm your tag details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;7. Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt;&amp;nbsp;to start provisioning the resource. Wait for the deployment to finish.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;8. After the deployment finishes, select&amp;nbsp;&lt;STRONG&gt;Go to resource&lt;/STRONG&gt;&amp;nbsp;to inspect your created resource. Here, you can manage your resource and find important information like the endpoint URL and API keys.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Create chat and embedding model deployments&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure OpenAI embedding model deployment and a chat model deployment.&amp;nbsp;Creating a deployment on your previously provisioned resource allows you to generate text embeddings (i.e. numerical representation for text) and have a natural language conversation with your data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Select&amp;nbsp;&lt;STRONG&gt;Go to Azure OpenAI&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;Studio&lt;/STRONG&gt;&amp;nbsp;from the toolbar to open the studio.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Select&amp;nbsp;&lt;STRONG&gt;Create new deployment&amp;nbsp;&lt;/STRONG&gt;to go to the deployments tab.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Select&amp;nbsp;&lt;STRONG&gt;+ Create new deployment&amp;nbsp;&lt;/STRONG&gt;from the toolbar. A&lt;STRONG&gt;&amp;nbsp;Deploy model&lt;/STRONG&gt;&amp;nbsp;window opens.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4.&amp;nbsp;Add the following information to create a chat model deployment:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Select a model&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;gpt-35-turbo&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Model version&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;0301&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Deployment name&lt;/td&gt;&lt;td&gt;Add a name that's unique for this cloud instance. For example,&lt;STRONG&gt;&amp;nbsp;chat-model&lt;/STRONG&gt;&amp;nbsp;because this model type is optimized for having conversations.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5. Select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6. Select&amp;nbsp;&lt;STRONG&gt;+ Create new deployment&amp;nbsp;&lt;/STRONG&gt;from the toolbar. A&lt;STRONG&gt;&amp;nbsp;Deploy model&lt;/STRONG&gt;&amp;nbsp;window opens.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;7.&amp;nbsp;Add the following information to create an embedding model deployment:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Select a model&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;text-embedding-ada-002&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Model version&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;2&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Deployment name&lt;/td&gt;&lt;td&gt;Add a name that's unique for this cloud instance. For example,&lt;STRONG&gt;&amp;nbsp;embedding-model&lt;/STRONG&gt;&amp;nbsp;because this model type is optimized for creating embeddings.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;8. Select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2 id="h_637567739611702581961458"&gt;Step 3: Create an Azure App Service and&amp;nbsp;Deploy the RAG Chat Application&lt;/H2&gt;
&lt;DIV&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Fork the sample Repository on GitHub.&lt;/LI&gt;
&lt;LI&gt;Create an Azure App service resource with a deployment from GitHub.&lt;/LI&gt;
&lt;LI&gt;Modify Azure App service Application settings in the Azure portal.&lt;/LI&gt;
&lt;LI&gt;Configure the Workflow to deploy your application from GitHub.&lt;/LI&gt;
&lt;LI&gt;Test the website before and After adding the data.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Fork the sample Repository on GitHub&lt;/H3&gt;
&lt;P&gt;In this step, you create a copy from the source code on your GitHub account to be able to edit it and use it later.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Visit the sample&amp;nbsp;&lt;A class="markup--anchor markup--p-anchor" href="https://github.com/john0isaac/rag-semantic-kernel-mongodb-vcore?wt.mc_id=studentamb_71460" target="_blank" rel="noopener noreferrer" data-href="https://portal.azure.com"&gt;github.com/john0isaac/rag-semantic-kernel-mongodb-vcore&lt;/A&gt;&amp;nbsp;in your browser&amp;nbsp;and&amp;nbsp;sign in.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Select&amp;nbsp;&lt;STRONG&gt;Fork&amp;nbsp;&lt;/STRONG&gt;from the top of the sample page.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Select an owner for the fork then, select&amp;nbsp;&lt;STRONG&gt;Create fork&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Create an Azure App service resource with a deployment from GitHub&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure App service resource and connect it with your GitHub account to deploy a Python application.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type &lt;EM&gt;app service&amp;nbsp;&lt;/EM&gt;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;App Services&amp;nbsp;&lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Select &lt;STRONG&gt;Create Web App&lt;/STRONG&gt; from the toolbar to start provisioning a new web application.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3.&amp;nbsp;&amp;nbsp;Add the following information to fill in the basic configuration of the application:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription&lt;/td&gt;&lt;td&gt;Use the same subscription you used to apply for Azure OpenAI access.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Resource group&lt;/td&gt;&lt;td&gt;Use the same resource group you created before.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Name&lt;/td&gt;&lt;td&gt;Enter a unique name for your website. For example,&amp;nbsp;&lt;STRONG&gt;rag-mongodb-demo&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Publish?&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;Code&lt;/STRONG&gt;. This option specifies whether your deployment consists of code or a container.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Runtime stack&lt;/td&gt;&lt;td&gt;Select &lt;STRONG&gt;Python 3.10&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Operating System&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;Linux&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Region&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;. This is the region where the rest of the resources you created reside.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. Add the following information to create the app service plan. You can scale it up later:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Linux Plan&lt;/td&gt;&lt;td&gt;Select a pre-existing plan or create a new plan.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pricing Plan&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&amp;nbsp;Select &lt;STRONG&gt;Basic B1&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5. Select &lt;STRONG&gt;Deployment&lt;/STRONG&gt; from the toolbar to move to the deployment configuration tab.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6. Add the following information to enable continuous deployment from GitHub:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;What&lt;/td&gt;&lt;td&gt;Value&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Continuous deployment&lt;/td&gt;&lt;td&gt;Select&lt;STRONG&gt;&amp;nbsp;Enable&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;GitHub account&lt;/td&gt;&lt;td&gt;Select your GitHub account.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Organization&lt;/td&gt;&lt;td&gt;Select your organization. If you are using your personal account then select it.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Repository&lt;/td&gt;&lt;td&gt;Select&lt;STRONG&gt; rag-semantic-kernel-mongodb-vcore&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Branch&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;main&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;7. Select &lt;STRONG&gt;Review + create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;8. Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt;&amp;nbsp;to start provisioning the resource. Wait for the deployment to finish.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;9. After the deployment finishes, select&amp;nbsp;&lt;STRONG&gt;Go to resource&lt;/STRONG&gt;&amp;nbsp;to inspect your created resource. Here, you can manage your resource and find important information like the application settings and logs.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3 id="toc-hId--503538492"&gt;Modify Azure App service Application settings in the Azure portal&lt;/H3&gt;
&amp;nbsp;In this step, you configure the Application settings to make the website able to communicate with other cloud resources.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. In the Web App resource, select&amp;nbsp;&lt;STRONG&gt;Configuration&lt;/STRONG&gt;&amp;nbsp;from the left side menu.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;2. Select&amp;nbsp;&lt;STRONG&gt;+ New application&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;setting&lt;/STRONG&gt;&amp;nbsp;to add new environment variables to the function configuration.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;3. Add the following names and values one by one and select&amp;nbsp;&lt;STRONG&gt;Ok&lt;/STRONG&gt;. Make sure to add your own values.&lt;/DIV&gt;
&lt;DIV&gt;These application settings are for the Azure OpenAI resources that you created:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_OPENAI_CHAT_DEPLOYMENT_NAME&lt;/td&gt;&lt;td&gt;&amp;lt;chatModelDeploymentName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME&lt;/td&gt;&lt;td&gt;&amp;lt;embeddingModelDeploymentName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_OPENAI_DEPLOYMENT_NAME&lt;/td&gt;&lt;td&gt;&amp;lt;azureOpenAiResourceName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_OPENAI_ENDPOINT&lt;/td&gt;&lt;td&gt;https://&amp;lt;azureOpenAiResourceName&amp;gt;.openai.azure.com/&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZURE_OPENAI_API_KEY&lt;/td&gt;&lt;td&gt;&amp;lt;azureOpenAiResourceKey&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;You can get the Azure OpenAI key from the Azure OpenAI resource page.&lt;/DIV&gt;
&lt;DIV&gt;Select &lt;STRONG&gt;Keys and Endpoint&lt;/STRONG&gt; and copy any of the available keys.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV&gt;These application settings are for Azure Cosmos DB for MongoDB vCore:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;AZCOSMOS_API&lt;/td&gt;&lt;td&gt;mongo-vcore&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZCOSMOS_CONNSTR&lt;/td&gt;&lt;td&gt;mongodb+srv://&amp;lt;mongoAdminUser&amp;gt;:&amp;lt;mongoAdminPassword&amp;gt;@&amp;lt;mongoClusterName&amp;gt;.global.mongocluster.cosmos.azure.com/?tls=true&amp;amp;authMechanism=SCRAM-SHA-256&amp;amp;retrywrites=false&amp;amp;maxIdleTimeMS=120000&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;You can get the Cosmo DB connection string from the Azure Cosmos DB for MongoDB vCore resource page.&lt;/DIV&gt;
&lt;DIV&gt;Select &lt;STRONG&gt;Connection strings&amp;nbsp;&lt;/STRONG&gt;and copy the connection string. Make sure to replace the user and password with the ones you created.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;These application settings are &lt;STRONG&gt;new&lt;/STRONG&gt; and will be created when the application starts:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;AZCOSMOS_DATABASE_NAME&lt;/td&gt;&lt;td&gt;&amp;lt;cosmosDatabaseName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AZCOSMOS_CONTAINER_NAME&lt;/td&gt;&lt;td&gt;&amp;lt;cosmosContainerName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any value should work for them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. Select &lt;STRONG&gt;Save&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5. Select &lt;STRONG&gt;General settings&amp;nbsp;&lt;/STRONG&gt;to edit the application startup command.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6. Type &lt;EM&gt;entrypoint.sh&lt;/EM&gt;&amp;nbsp;in the startup command field and select &lt;STRONG&gt;Save&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;H3&gt;Configure the Workflow to deploy your application from GitHub&lt;/H3&gt;
In this step, you modify the GitHub deployment workflow to point to the folder that contains the application.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. Visit your forked repository on GitHub and notice the failing workflow.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;2.&amp;nbsp;Open the function workflow file&amp;nbsp;&lt;EM&gt;.github/workflows/main_ragmongodbdemo.yml&lt;/EM&gt;.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3.&amp;nbsp;&amp;nbsp;Open the file and select the pen icon to edit it.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;4. Modify lines 31 and 36 to the following:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;31&lt;/td&gt;&lt;td&gt;run: cd src &amp;amp;&amp;amp; pip install -r ./requirements.txt&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;36&lt;/td&gt;&lt;td&gt;run: cd src &amp;amp;&amp;amp; zip ../release.zip ./* -r&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5. Select&amp;nbsp;&lt;STRONG&gt;Commit changes&lt;/STRONG&gt;, and review your commit message and description. Select&amp;nbsp;&lt;STRONG&gt;Commit changes&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
6. Select &lt;STRONG&gt;Actions&lt;/STRONG&gt; to review the workflow run status.
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Test the website before and After adding the data&lt;/H3&gt;
In this step, you test the application before adding the data, add the data, and test again.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. Select the workflow name to open it and get the website URL.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Type in the chat message&lt;EM&gt; What is Azure Functions?&lt;/EM&gt;&amp;nbsp;and it should respond with &lt;EM&gt;I don't know&lt;/EM&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
3. Navigate to your Azure App service resource page and select &lt;STRONG&gt;SSH&lt;/STRONG&gt;.
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. Select &lt;STRONG&gt;Go&lt;/STRONG&gt; to open a new SSH page.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;5.&amp;nbsp;In the SSH terminal, run this command:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;CODE&gt;python ./scripts/add_data.py&lt;/CODE&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 100.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6.&amp;nbsp;Navigate back to the live website and type in the chat message&lt;EM&gt; What is Azure Functions?&lt;/EM&gt;&amp;nbsp;and it should respond with the correct answer now.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Congratulations!! You successfully built the full application.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P data-unlink="true"&gt;If you want to learn how to add your own data see &lt;A href="https://github.com/john0isaac/rag-semantic-kernel-mongodb-vcore?tab=readme-ov-file#add-your-own-data" target="_self"&gt;this guide&lt;/A&gt;&amp;nbsp;on the repository's main readme.&lt;/P&gt;
&lt;P data-unlink="true"&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;H2 id="toc-hId-1503395954"&gt;Clean Up&lt;/H2&gt;
&lt;P&gt;Once you finish experimenting on&amp;nbsp;Microsoft Azure you might want to delete the resources to not consume any more money from your subscription.&lt;/P&gt;
&lt;P&gt;You can delete the resource group and it will delete everything inside it or delete the resources one by one that's totally up to you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;H2&gt;Conclusion&lt;/H2&gt;
&lt;P&gt;Congratulations! You've learned how to create an Azure Cosmos DB for MongoDB vCore cluster, how to create an Azure OpenAI resource, how to deploy an embedding model and a chat model from Azure OpenAI studio, how to create an Azure App service and configure continuous deployment with GitHub, and how to modify application settings to enable the communication across Azure resources. By using these technologies, you can build a RAG chat application with the option to perform vector search too over your own data and provide grounded (relevant) responses.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Next steps&lt;/H2&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Documentation&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/openai/overview?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;What is Azure OpenAI Service? - Azure AI services&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/openai/concepts/understand-embeddings?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Azure OpenAI Service embeddings - Azure OpenAI - embeddings and cosine similarity&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Azure Cosmos DB for MongoDB vCore documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/vector-search/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Vector Search - Azure Cosmos DB for MongoDB vCore&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/semantic-kernel/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Semantic Kernel documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Training Content&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/training/paths/develop-ai-solutions-azure-openai/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Develop Generative AI solutions with Azure OpenAI Service&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/training/paths/azure-cosmos-db-api-for-mongodb/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Azure Cosmos DB for MongoDB&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="graf graf--p"&gt;Found this useful? Share it with others and follow me to get updates on:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Twitter (&lt;A href="https://twitter.com/john00isaac?wt.mc_id=studentamb_71460" target="_blank" rel="nofollow noopener noreferrer"&gt;twitter.com/john00isaac&lt;/A&gt;)&lt;/LI&gt;
&lt;LI&gt;LinkedIn (&lt;A href="https://www.linkedin.com/in/john0isaac/?wt.mc_id=studentamb_71460" target="_blank" rel="nofollow noopener noreferrer"&gt;linkedin.com/in/john0isaac&lt;/A&gt;)&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE class="graf graf--pullquote"&gt;Feel free to share your comments and/or inquiries in the comment section below..
&lt;P class="1702586402308"&gt;See you in future&amp;nbsp;demos!&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Tue, 26 May 2026 22:08:40 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-ai-rag-apps-with-azure-documentdb-with-mongodb/ba-p/4497192</guid>
      <dc:creator>JohnAziz</dc:creator>
      <dc:date>2026-05-26T22:08:40Z</dc:date>
    </item>
    <item>
      <title>CI/CD for AI Agents on Microsoft Foundry</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/ci-cd-for-ai-agents-on-microsoft-foundry/ba-p/4522218</link>
      <description>&lt;H1&gt;Introduction&lt;/H1&gt;
&lt;P&gt;Building an AI agent is the straightforward part. Shipping it reliably to production with version control, evaluation-driven quality gates, multi-environment promotion, and enterprise governance is where most teams run into friction.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Microsoft Foundry&lt;/STRONG&gt; changes this. It is Microsoft's AI app and agent factory: a fully managed platform for building, deploying, and governing AI agents at scale. It provides a first-class agent runtime with built-in lifecycle management, making it possible to apply the same CI/CD rigour you already use for application software to AI agents — regardless of whether you are building containerised hosted agents or declarative prompt-based agents.&lt;/P&gt;
&lt;P&gt;This post walks through a complete, production-ready reference architecture for doing exactly that. You will find the GitHub Actions workflow, the Azure DevOps pipeline YAML, and the architecture diagram linked throughout.&lt;/P&gt;
&lt;P&gt;Reference implementation repository: &lt;A href="https://github.com/ericchansen/foundry-agents-lifecycle" target="_blank" rel="noopener"&gt;foundry-agents-lifecycle&lt;/A&gt;&lt;BR /&gt;and&amp;nbsp;&lt;A class="lia-external-url" href="https://github.com/leestott/foundry-cicd" target="_blank" rel="noopener"&gt;CI/CD for AI Agents on Microsoft Foundry&lt;/A&gt;&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Why Agent CI/CD Is Different&lt;/H2&gt;
&lt;P&gt;Traditional software pipelines gate releases on test pass/fail. Agent pipelines require an additional, critical layer: &lt;STRONG&gt;evaluation-driven quality gates&lt;/STRONG&gt;. Before any agent version can be promoted to the next environment, it must pass three categories of evaluation:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Quality&lt;/STRONG&gt; — answer correctness, task completion rate, hallucination rate&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Safety&lt;/STRONG&gt; — grounded responses, policy compliance, tool usage validation&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Performance&lt;/STRONG&gt; — token usage per query, p95 response latency&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;A second key difference is the &lt;STRONG&gt;deployment unit&lt;/STRONG&gt;. You are not deploying a binary or a container tag in isolation. You are deploying an &lt;EM&gt;agent version&lt;/EM&gt; — an immutable artefact that bundles the model selection, system instructions, tool definitions, and configuration together. This is what enables deterministic promotion and full auditability across environments.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;"Agents follow a standard CI/CD pattern, but with a critical shift: promotion happens at the &lt;STRONG&gt;agent version&lt;/STRONG&gt; level, and release gates are driven by &lt;STRONG&gt;evaluation outcomes&lt;/STRONG&gt;, not just test results."&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;HR /&gt;
&lt;H2&gt;Reference Architecture&lt;/H2&gt;
&lt;img /&gt;
&lt;P&gt;&lt;EM&gt;Figure 1: End-to-end CI/CD reference architecture for hosted and prompt-based agents on Microsoft Foundry.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;The architecture has five logical layers, flowing from developer commit to production monitoring:&lt;/P&gt;
&lt;H3&gt;Layer 1 — Developer Layer&lt;/H3&gt;
&lt;P&gt;The developer layer is a standard source-controlled repository in GitHub or Azure DevOps. It contains:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Agent code written in Python or .NET&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;agent.yaml&lt;/CODE&gt; or prompt definition files for prompt-based agents&lt;/LI&gt;
&lt;LI&gt;Tool configurations: MCP servers, REST API connectors, or other integrations&lt;/LI&gt;
&lt;LI&gt;Infrastructure as Code: Bicep or ARM templates for provisioning the Foundry project and dependencies&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Layer 2 — CI Pipeline (Build · Validate · Evaluate)&lt;/H3&gt;
&lt;P&gt;Every push or pull request triggers the CI pipeline. It performs five steps:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Docker build&lt;/STRONG&gt; — for hosted agents, build and tag the container image&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Static checks&lt;/STRONG&gt; — lint with &lt;CODE&gt;ruff&lt;/CODE&gt;, security scan with &lt;CODE&gt;bandit&lt;/CODE&gt;, agent YAML schema validation&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Unit and tool tests&lt;/STRONG&gt; — &lt;CODE&gt;pytest&lt;/CODE&gt; suites covering agent logic and tool integrations&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Evaluation gate&lt;/STRONG&gt; — run evaluation datasets; fail the pipeline if thresholds are breached&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Image push&lt;/STRONG&gt; — push the validated container to Azure Container Registry (ACR)&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Prompt-based agents skip the Docker build step. Instead, the YAML definition and prompt bundle are validated against schema and evaluated against golden datasets.&lt;/P&gt;
&lt;H3&gt;Layer 3 — CD Pipeline (Multi-stage Promotion)&lt;/H3&gt;
&lt;P&gt;A single agent version is promoted through three Foundry project environments:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Stage&lt;/th&gt;&lt;th&gt;Environment&lt;/th&gt;&lt;th&gt;Activities&lt;/th&gt;&lt;th&gt;Gate&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Stage 1&lt;/td&gt;&lt;td&gt;Dev Foundry Project&lt;/td&gt;&lt;td&gt;Deploy vNext version, smoke tests, developer evals&lt;/td&gt;&lt;td&gt;Eval quality thresholds&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Stage 2&lt;/td&gt;&lt;td&gt;Test / QA Foundry Project&lt;/td&gt;&lt;td&gt;Scenario tests, HITL validation, safety evaluation&lt;/td&gt;&lt;td&gt;Eval gates + human approval&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Stage 3&lt;/td&gt;&lt;td&gt;Production Foundry Project&lt;/td&gt;&lt;td&gt;Promote version, enable endpoint, post-deploy smoke test&lt;/td&gt;&lt;td&gt;Required reviewer approval&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Rollback is straightforward: switch the active version pointer back to the previous agent version. No re-deployment is needed.&lt;/P&gt;
&lt;H3&gt;Layer 4 — Microsoft Foundry Agent Service&lt;/H3&gt;
&lt;P&gt;The Foundry Agent Service runtime provides:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Hosted agent runtime&lt;/STRONG&gt; — managed container execution supporting Agent Framework, LangGraph, Semantic Kernel, or custom code&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Prompt-based agent runtime&lt;/STRONG&gt; — declarative agent definitions, no container required&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Built-in lifecycle operations&lt;/STRONG&gt; — version, start, stop, rollback&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Entra Agent Identity&lt;/STRONG&gt; — each deployed version receives a dedicated Microsoft Entra managed identity&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;RBAC and policy enforcement&lt;/STRONG&gt; — Azure role-based access controls per project&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Observability&lt;/STRONG&gt; — distributed traces, structured logs, and evaluation signals&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Layer 5 — Monitoring, Governance, and Control Plane&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Foundry control plane: agent registry, environment configuration, version history&lt;/LI&gt;
&lt;LI&gt;OpenTelemetry forwarded to Azure Monitor and Application Insights&lt;/LI&gt;
&lt;LI&gt;Continuous evaluation pipelines for ongoing quality, grounding, and safety monitoring&lt;/LI&gt;
&lt;LI&gt;Azure Policy and RBAC enforcement at the platform level&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;Environment Topology&lt;/H2&gt;
&lt;P&gt;There are two topology options. We recommend &lt;STRONG&gt;Option A&lt;/STRONG&gt; for all production workloads:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Option&lt;/th&gt;&lt;th&gt;Structure&lt;/th&gt;&lt;th&gt;Best for&lt;/th&gt;&lt;th&gt;Trade-off&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;A — Recommended&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Dev Project → Test Project → Prod Project (separate Foundry projects)&lt;/td&gt;&lt;td&gt;Enterprise workloads&lt;/td&gt;&lt;td&gt;Full isolation, clean RBAC boundaries, easier governance&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;B — Lightweight&lt;/td&gt;&lt;td&gt;Single Foundry project with agent version tags (dev/test/prod)&lt;/td&gt;&lt;td&gt;Small teams, prototyping&lt;/td&gt;&lt;td&gt;Simpler setup, but weaker environment separation&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Separate projects mean separate RBAC policies, separate connection strings, and separate evaluation signals. A developer service principal has access only to the Dev project; the CI/CD identity has restricted access to promote to Test and Production.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Evaluation Gates — The Core Difference&lt;/H2&gt;
&lt;P&gt;Evaluation gates transform a standard software pipeline into an AI-safe deployment pipeline. They run at two points: pre-merge (CI) and pre-promotion (CD).&lt;/P&gt;
&lt;H3&gt;Defining the Gates&lt;/H3&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Category&lt;/th&gt;&lt;th&gt;Metric&lt;/th&gt;&lt;th&gt;CI threshold&lt;/th&gt;&lt;th&gt;Prod threshold&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Quality&lt;/td&gt;&lt;td&gt;Hallucination rate&lt;/td&gt;&lt;td&gt;&amp;lt; 5%&lt;/td&gt;&lt;td&gt;&amp;lt; 3%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Quality&lt;/td&gt;&lt;td&gt;Task completion rate&lt;/td&gt;&lt;td&gt;&amp;gt; 90%&lt;/td&gt;&lt;td&gt;&amp;gt; 95%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Safety&lt;/td&gt;&lt;td&gt;Grounded response rate&lt;/td&gt;&lt;td&gt;&amp;gt; 95%&lt;/td&gt;&lt;td&gt;&amp;gt; 98%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Safety&lt;/td&gt;&lt;td&gt;Policy violations&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Performance&lt;/td&gt;&lt;td&gt;p95 latency&lt;/td&gt;&lt;td&gt;&amp;lt; 4 000 ms&lt;/td&gt;&lt;td&gt;&amp;lt; 3 000 ms&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cost&lt;/td&gt;&lt;td&gt;Token usage per query&lt;/td&gt;&lt;td&gt;Track only&lt;/td&gt;&lt;td&gt;Alert on &amp;gt; 20% regression&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3&gt;Gate Enforcement (Python)&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;import json
import sys

def check_gates(results_path: str) -&amp;gt; None:
    with open(results_path) as f:
        results = json.load(f)

    failures = []

    if results["hallucination_rate"] &amp;gt; 0.05:
        failures.append(f"Hallucination rate {results['hallucination_rate']:.1%} exceeds 5% threshold")

    if results["task_completion_rate"] &amp;lt; 0.90:
        failures.append(f"Task completion {results['task_completion_rate']:.1%} below 90% threshold")

    if results["latency_p95_ms"] &amp;gt; 4000:
        failures.append(f"p95 latency {results['latency_p95_ms']}ms exceeds 4000ms threshold")

    if results.get("policy_violations", 0) &amp;gt; 0:
        failures.append(f"Policy violations detected: {results['policy_violations']}")

    if failures:
        for f in failures:
            print(f"GATE FAILED: {f}", file=sys.stderr)
        sys.exit(1)

    print("All evaluation gates passed — proceeding to deployment")

if __name__ == "__main__":
    check_gates(sys.argv[1])
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;
&lt;H2&gt;Hosted vs Prompt-Based Agents — Pipeline Differences&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Capability&lt;/th&gt;&lt;th&gt;Hosted Agents&lt;/th&gt;&lt;th&gt;Prompt-Based Agents&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Deployment unit&lt;/td&gt;&lt;td&gt;Container image + agent definition&lt;/td&gt;&lt;td&gt;YAML / prompt configuration bundle&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Build step required&lt;/td&gt;&lt;td&gt;Yes — Docker build + ACR push&lt;/td&gt;&lt;td&gt;No — YAML validation only&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Supported frameworks&lt;/td&gt;&lt;td&gt;Agent Framework, LangGraph, Semantic Kernel, custom&lt;/td&gt;&lt;td&gt;Foundry declarative runtime&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Promotion artefact&lt;/td&gt;&lt;td&gt;Versioned agent with container image reference&lt;/td&gt;&lt;td&gt;Versioned prompt/config bundle&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;CI focus&lt;/td&gt;&lt;td&gt;Code quality, tool tests, evaluation&lt;/td&gt;&lt;td&gt;Prompt schema validation, evaluation&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Rollback mechanism&lt;/td&gt;&lt;td&gt;Switch active agent version&lt;/td&gt;&lt;td&gt;Switch active agent version&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Runtime management&lt;/td&gt;&lt;td&gt;Foundry manages container lifecycle&lt;/td&gt;&lt;td&gt;Foundry manages declarative runtime&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;HR /&gt;
&lt;H2&gt;CI Pipeline Walkthrough&lt;/H2&gt;
&lt;P&gt;The following steps are representative of the full GitHub Actions workflow available in &lt;CODE&gt;github-actions-pipeline.yml&lt;/CODE&gt; alongside this post.&lt;/P&gt;
&lt;H3&gt;Hosted Agent CI&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;# 1. Static checks
ruff check .
bandit -r src/ -ll
python scripts/validate_agent_config.py --config agent.yaml

# 2. Tests
pytest tests/unit/ -v --tb=short
pytest tests/tools/ -v --tb=short

# 3. Evaluation gate
python scripts/run_evaluations.py \
    --dataset eval/datasets/golden_set.jsonl \
    --output  eval/results/results.json

python scripts/check_eval_gates.py \
    --results eval/results/results.json \
    --max-hallucination   0.05 \
    --min-task-completion 0.90 \
    --max-latency-p95     4000

# 4. Push container image
az acr build \
    --registry myregistry.azurecr.io \
    --image    "myagent:$SHA" \
    --file     Dockerfile .
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Prompt-Based Agent CI&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;# Validate YAML / prompt definitions
python scripts/validate_agent_config.py --config agent.yaml

# Evaluation against golden dataset
python scripts/run_evaluations.py \
    --dataset eval/datasets/golden_set.jsonl \
    --output  eval/results/results.json

python scripts/check_eval_gates.py \
    --results eval/results/results.json
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;
&lt;H2&gt;CD Pipeline Walkthrough&lt;/H2&gt;
&lt;H3&gt;Stage 1 — Dev Deployment&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;python scripts/deploy_agent.py \
    --env              dev \
    --image            "myregistry.azurecr.io/myagent:$SHA" \
    --foundry-endpoint $FOUNDRY_ENDPOINT_DEV \
    --agent-config     agent.yaml

# Returns the new agent version ID, stored for promotion
AGENT_VERSION=$(python scripts/get_active_version.py --env dev)
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Stage 2 — Promote to Test (after approval gate)&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;python scripts/promote_agent.py \
    --from-env         dev \
    --to-env           test \
    --agent-version    $AGENT_VERSION \
    --foundry-endpoint $FOUNDRY_ENDPOINT_TEST

# Run scenario tests and safety evaluation
python scripts/run_evaluations.py \
    --dataset  eval/datasets/scenario_set.jsonl \
    --output   eval/results/test-results.json

python scripts/check_eval_gates.py \
    --results              eval/results/test-results.json \
    --max-hallucination    0.03 \
    --min-task-completion  0.95
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Stage 3 — Promote to Production (after required reviewer approval)&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;python scripts/promote_agent.py \
    --from-env         test \
    --to-env           prod \
    --agent-version    $AGENT_VERSION \
    --foundry-endpoint $FOUNDRY_ENDPOINT_PROD

# Enable the production endpoint
python scripts/enable_agent_endpoint.py \
    --agent-version    $AGENT_VERSION \
    --foundry-endpoint $FOUNDRY_ENDPOINT_PROD
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3&gt;Rollback&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;# Switch the active version to the previous known-good version
python scripts/promote_agent.py \
    --from-env         prod \
    --to-env           prod \
    --agent-version    $PREVIOUS_AGENT_VERSION \
    --foundry-endpoint $FOUNDRY_ENDPOINT_PROD

# OR delete the failing version
python scripts/delete_agent_version.py \
    --agent-version    $AGENT_VERSION \
    --foundry-endpoint $FOUNDRY_ENDPOINT_PROD
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;
&lt;H2&gt;Deployment Using the Azure AI Projects SDK&lt;/H2&gt;
&lt;P&gt;The &lt;CODE&gt;azure-ai-projects&lt;/CODE&gt; SDK provides programmatic control over the full agent lifecycle. This is the recommended approach for CI/CD scripts where you need deterministic, scriptable deployment.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

# Connect to the Foundry project
client = AIProjectClient(
    endpoint=FOUNDRY_PROJECT_ENDPOINT,
    credential=DefaultAzureCredential()
)

# List existing agents (useful for idempotent deploy scripts)
for agent in client.agents.list():
    print(f"Agent: {agent.name}  version: {agent.id}")

# Create a new agent version (hosted agent)
agent = client.agents.create_agent(
    model="gpt-4o",
    name="my-enterprise-agent",
    instructions="You are a helpful assistant ...",
    tools=[...],  # tool definitions
    metadata={"version": GIT_SHA, "environment": "dev"}
)
print(f"Created agent version: {agent.id}")
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For hosted agents, the SDK call also references the container image pushed to ACR. Refer to the &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt; Deploy a hosted agent — Microsoft Foundry&lt;/A&gt; documentation for the full SDK flow including container image registration and version polling.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Reference Implementation Stack&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Concern&lt;/th&gt;&lt;th&gt;Technology&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Source control and pipelines&lt;/td&gt;&lt;td&gt;GitHub Actions or Azure DevOps Pipelines&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Infrastructure and agent deployment&lt;/td&gt;&lt;td&gt;Azure Developer CLI (&lt;CODE&gt;azd up&lt;/CODE&gt;)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Programmatic agent lifecycle&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;azure-ai-projects&lt;/CODE&gt; Python SDK&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Agent evaluation&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;azure-ai-evaluation&lt;/CODE&gt; Python SDK&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Agent runtime&lt;/td&gt;&lt;td&gt;Microsoft Foundry Agent Service&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Container registry&lt;/td&gt;&lt;td&gt;Azure Container Registry (hosted agents only)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Observability&lt;/td&gt;&lt;td&gt;OpenTelemetry, Azure Monitor, Application Insights&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Identity and access&lt;/td&gt;&lt;td&gt;Microsoft Entra (Agent ID, OIDC workload identity federation)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Governance&lt;/td&gt;&lt;td&gt;Azure Policy, RBAC, Foundry control plane&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;HR /&gt;
&lt;H2&gt;Governance and Responsible AI&lt;/H2&gt;
&lt;P&gt;Shipping AI agents at enterprise scale requires governance beyond what a traditional CI/CD pipeline provides. Microsoft Foundry addresses this at the platform level:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;RBAC per environment&lt;/STRONG&gt; — each Foundry project has independent access controls. Developers deploy to Dev; only CI/CD service principals (with audited OIDC tokens) can promote to Test and Production.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Agent registry and audit trail&lt;/STRONG&gt; — the Foundry control plane records which agent version is active in each environment, who deployed it, and when. This satisfies enterprise audit requirements without additional tooling.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Content safety and policy enforcement&lt;/STRONG&gt; — Azure Policy governs model access, data handling, and content safety rules at the infrastructure level, not just at the application code level. Policy violations block deployment automatically.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Entra Agent Identity&lt;/STRONG&gt; — each deployed agent version receives a dedicated, short-lived managed identity. Agents authenticate to downstream services using least-privilege credentials scoped to that specific deployment.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Continuous evaluation in production&lt;/STRONG&gt; — evaluation pipelines run on sampled production traffic, alerting when quality, safety, or cost metrics drift from their baseline.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;A key trade-off to be transparent about: evaluation datasets must be maintained and updated as the agent's tasks evolve. Stale datasets produce misleading pass/fail signals. Treat your golden evaluation set as a first-class engineering artefact alongside the agent code itself.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Pipeline Files&lt;/H2&gt;
&lt;P&gt;Two pipeline files accompany this reference architecture. Both implement the same four-stage pipeline (CI Build, CI Evaluate, CD Dev, CD Test, CD Production) with environment-appropriate approval gates.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;CODE&gt;&lt;A class="lia-external-url" href="https://github.com/leestott/foundry-cicd" target="_blank" rel="noopener"&gt;github-actions-pipeline.yml&lt;/A&gt;&lt;/CODE&gt;&lt;/STRONG&gt; — GitHub Actions workflow. Uses GitHub Environments for approval gates and OIDC Workload Identity Federation for passwordless Azure authentication. No stored Azure credentials required.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;&lt;CODE&gt;&lt;A class="lia-external-url" href="https://github.com/leestott/foundry-cicd" target="_blank" rel="noopener"&gt;azure-devops-pipeline.yml&lt;/A&gt;&lt;/CODE&gt;&lt;/STRONG&gt; — Azure DevOps multi-stage YAML pipeline. Uses ADO Environments with required approvers and variable groups per environment.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Both pipelines share these security practices:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;OIDC / Workload Identity Federation — no long-lived Azure credentials stored in pipeline secrets&lt;/LI&gt;
&lt;LI&gt;Per-environment variable groups, each with scoped connection strings and endpoints&lt;/LI&gt;
&lt;LI&gt;Evaluation quality gates enforced before every promotion step&lt;/LI&gt;
&lt;LI&gt;Mandatory human approval before production deployment&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;Summary&lt;/H2&gt;
&lt;P&gt;The full pipeline in one view:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Developer commit
        |
   CI Pipeline
   ├── Docker build (hosted agents) / YAML validation (prompt agents)
   ├── Static checks + unit tests + tool tests
   └── Evaluation gate  ←  quality · safety · performance
        |
   Agent Version created  ← immutable, versioned artefact
        |
   CD Pipeline
   ├── Deploy to Dev       → smoke tests + eval gate
   ├── Promote to Test     → scenario tests + HITL + approval gate
   └── Promote to Prod     → enable endpoint + monitoring
        |
   Microsoft Foundry Agent Service
   └── Versioned runtime · Entra identity · RBAC · Observability
        |
   Control Plane
   └── Agent registry · Governance · Continuous evaluation
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Microsoft Foundry provides the platform primitives — versioned agent deployments, multi-environment Foundry projects, built-in lifecycle management, and an enterprise observability stack — needed to operate AI agents with the same confidence as any production software system.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The key takeaway:&lt;/STRONG&gt; treat the agent version as your deployment artefact, and evaluation outcomes as your release gate. The rest follows familiar CI/CD patterns you already know and trust.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Next Steps&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Clone the CI/CD Repo at &lt;A href="https://github.com/leestott/foundry-cicd" target="_blank" rel="noopener"&gt;leestott/foundry-cicd&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Clone the reference demo:&amp;nbsp;&lt;A href="https://github.com/ericchansen/foundry-agents-lifecycle" target="_blank" rel="noopener"&gt;foundry-agents-lifecycle on GitHub&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Set up your environment: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/environment-setup" target="_blank" rel="noopener"&gt;Set up your environment for Foundry Agent Service&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Deploy your first hosted agent: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/quickstarts/quickstart-hosted-agent" target="_blank" rel="noopener"&gt;Quickstart: Deploy your first hosted agent&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Understand hosted agent concepts: &lt;A href="https://learn.microsoft.com/en-us/agent-framework/hosting/foundry-hosted-agent" target="_blank" rel="noopener"&gt;Foundry Hosted Agents concepts&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Automate deployments in CI/CD: &lt;A href="https://microsoft.github.io/TechWorkshop-L300-AI-Apps-and-agents/docs/05_agentic_devops/05_02.html" target="_blank" rel="noopener"&gt;Automate deployment of Microsoft Foundry agents&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Manage agent versions: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/manage-hosted-agent" target="_blank" rel="noopener"&gt;Manage hosted agents — Microsoft Foundry&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Deploy via SDK: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt;Deploy a hosted agent — Microsoft Foundry&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;SDK and endpoint reference: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/sdk-endpoints-reference" target="_blank" rel="noopener"&gt;Microsoft Foundry SDK and Endpoints reference&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Azure AI Projects SDK: &lt;A href="https://learn.microsoft.com/en-us/python/api/overview/azure/ai-projects-readme" target="_blank" rel="noopener"&gt;azure-ai-projects Python SDK&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Azure Developer CLI: &lt;A href="https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/overview" target="_blank" rel="noopener"&gt;Azure Developer CLI (azd) overview&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Microsoft Foundry documentation hub: &lt;A href="https://learn.microsoft.com/en-us/azure/foundry/" target="_blank" rel="noopener"&gt;Microsoft Foundry on Microsoft Learn&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Fri, 22 May 2026 08:50:02 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/ci-cd-for-ai-agents-on-microsoft-foundry/ba-p/4522218</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-05-22T08:50:02Z</dc:date>
    </item>
    <item>
      <title>Student Devs: Build AI Agents, Compete for $55K in Prizes</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/student-devs-build-ai-agents-compete-for-55k-in-prizes/ba-p/4521764</link>
      <description>&lt;H1&gt;Student Devs: Build AI Agents, Compete for $55K in Prizes&lt;/H1&gt;
&lt;P&gt;&lt;EM&gt;🎮 AI Skills Fest • June 4–14, 2026 • Free to Enter&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;$55K&lt;/STRONG&gt;&lt;BR /&gt;Prize Pool&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;3&lt;/STRONG&gt;&lt;BR /&gt;Challenge Tracks&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;10&lt;/STRONG&gt;&lt;BR /&gt;Days of Hacking&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Free&lt;/STRONG&gt;&lt;BR /&gt;To Enter&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;Whether you're a first-year CS student or a final-year senior with a portfolio full of projects, &lt;STRONG&gt;Agents League&lt;/STRONG&gt; is the best way to gain hands-on experience with agentic AI this summer and walk away with real skills employers are hiring for right now.&lt;/BLOCKQUOTE&gt;
&lt;HR /&gt;
&lt;H2&gt;What You'll Actually Learn&lt;/H2&gt;
&lt;P&gt;Forget passive tutorials. Agents League is project-based learning at full speed. By the end of the hackathon, you'll have built a working AI agent and gained practical experience with the tools shaping the future of software development.&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;🤖 AI-Assisted Development&lt;/STRONG&gt;&lt;BR /&gt;Use GitHub Copilot to accelerate your coding workflow — from scaffolding to debugging — the way professional developers do today.&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;🧩 Multi-Step Reasoning&lt;/STRONG&gt;&lt;BR /&gt;Build agents with Microsoft Foundry that can plan, reason, and execute complex tasks — the core of agentic AI.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;🏢 Enterprise AI Patterns&lt;/STRONG&gt;&lt;BR /&gt;Learn to build production-ready agents that integrate with Microsoft 365 and Copilot Studio — skills that translate directly to industry jobs.&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;🔧 Prompt Engineering&lt;/STRONG&gt;&lt;BR /&gt;Design effective prompts and orchestration flows that make AI agents reliable and useful in the real world.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;📦 GitHub Workflows&lt;/STRONG&gt;&lt;BR /&gt;Submit your project through GitHub — practising version control, README writing, and open-source collaboration.&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;🎯 Competitive Problem-Solving&lt;/STRONG&gt;&lt;BR /&gt;Work under real constraints with deadlines, judging criteria, and peer competition — just like industry hackathons and sprints.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;HR /&gt;&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Pick Your Track (or Try All Three)&lt;/H2&gt;
&lt;P&gt;Agents League has three challenge tracks, each using different Microsoft AI tools. Choose based on your interests or stretch yourself by competing in multiple tracks.&lt;/P&gt;
&lt;H3&gt;Track 01. Creative Apps&lt;/H3&gt;
&lt;P&gt;Build an innovative application with AI-assisted development. This track rewards creativity, dream big and let GitHub Copilot help you bring ideas to life faster than ever.&lt;BR /&gt;&lt;STRONG&gt;Tool:&lt;/STRONG&gt; GitHub Copilot&lt;/P&gt;
&lt;H3&gt;Track 02. Reasoning Agents&lt;/H3&gt;
&lt;P&gt;Create intelligent agents that solve complex problems through multi-step reasoning. Think: agents that can research, plan, and act. This is the cutting edge of AI.&lt;BR /&gt;&lt;STRONG&gt;Tool:&lt;/STRONG&gt; Microsoft Foundry&lt;/P&gt;
&lt;H3&gt;Track 03. Enterprise Agents&lt;/H3&gt;
&lt;P&gt;Build knowledge agents that integrate with Microsoft 365 Copilot. Learn how businesses are deploying AI today and add enterprise AI to your skillset.&lt;BR /&gt;&lt;STRONG&gt;Tool:&lt;/STRONG&gt; Copilot Studio • M365&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Opportunities You Won't Want to Miss&lt;/H2&gt;
&lt;P&gt;Agents League isn't just a competition, it's a launchpad. Here's what's in it for you beyond the code:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;💰 Win from a $55,000 USD Prize Pool&lt;/STRONG&gt;&lt;BR /&gt;Prizes are awarded across all three tracks smaller teams and solo hackers have a real shot.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;📺 Watch Live Coding Battles at Microsoft Reactor&lt;/STRONG&gt;&lt;BR /&gt;See industry experts go head-to-head building AI agents live. Learn advanced techniques you can apply immediately to your own project.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;🎓 Free Learning Resources on Microsoft Learn&lt;/STRONG&gt;&lt;BR /&gt;Access curated learning paths and the AI Skills Navigator, structured content designed to get you from zero to submission-ready.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;🌍 Join a Global Developer Community&lt;/STRONG&gt;&lt;BR /&gt;Connect with thousands of developers on the Agents League Discord. Find teammates, ask questions, and build your professional network.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;📂 Build Your Portfolio with a Real Project&lt;/STRONG&gt;&lt;BR /&gt;Every submission lives on GitHub. Walk away with a polished, public project that demonstrates your AI skills to future employers and grad schools.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;🏆 Gain Recognition from Microsoft and the Community&lt;/STRONG&gt;&lt;BR /&gt;Top projects get visibility across the Microsoft developer ecosystem. Stand out from the crowd in internship and job applications.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;Key Dates to Remember&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Event&lt;/th&gt;&lt;th&gt;Date&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Hacking Period Opens&lt;/td&gt;&lt;td&gt;June 4, 2026&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Registration Deadline&lt;/td&gt;&lt;td&gt;June 12, 2026 — 12:00 PM PT&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Submission Deadline&lt;/td&gt;&lt;td&gt;June 14, 2026 — 11:59 PM PT&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;HR /&gt;
&lt;H2&gt;How to Get Started (Right Now)&lt;/H2&gt;
&lt;P&gt;You don't have to wait until June 4th to start preparing. Here's your pre-hackathon game plan:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Register for the hackathon&lt;/STRONG&gt;&amp;nbsp; it's free and open to everyone.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Pick a track&lt;/STRONG&gt; that matches your interests or curiosity.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Explore the learning resources&lt;/STRONG&gt; on Microsoft Learn and the AI Skills Navigator.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Join the Discord community&lt;/STRONG&gt; to find teammates and get early tips.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Watch the Reactor event series&lt;/STRONG&gt; for live coding battles and expert walkthroughs.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Set up your GitHub repo&lt;/STRONG&gt; and start experimenting before the hacking window opens.&lt;/LI&gt;
&lt;/OL&gt;
&lt;HR /&gt;
&lt;H2&gt;Helpful Links&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://aka.ms/agents-league" target="_blank" rel="noopener"&gt;Register for Agents League&lt;/A&gt;&amp;nbsp; Free entry, sign up now&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://developer.microsoft.com/en-us/reactor/" target="_blank" rel="noopener"&gt;Microsoft Reactor Events&lt;/A&gt;&amp;nbsp; Live coding battles &amp;amp; workshops&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://aiskillsfest.com" target="_blank" rel="noopener"&gt;AI Skills Fest&lt;/A&gt; The broader event&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com" target="_blank" rel="noopener"&gt;Microsoft Learn&lt;/A&gt; Free learning paths&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;The Arena Awaits 🏆&lt;/H2&gt;
&lt;P&gt;Ten days. Three tracks. $55K in prizes. Whether you go solo or squad up, this is your chance to build something real with AI and have a blast doing it.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;A href="https://aka.ms/agents-league" target="_blank" rel="noopener"&gt;Register Now It's Free&lt;/A&gt;&lt;/STRONG&gt; &amp;nbsp;|&amp;nbsp; &lt;A href="https://developer.microsoft.com/en-us/reactor/" target="_blank" rel="noopener"&gt;Watch Reactor Events&lt;/A&gt;&lt;/P&gt;
&lt;HR /&gt;
&lt;P&gt;&lt;SMALL&gt; Agents League is part of &lt;A href="https://aiskillsfest.com" target="_blank" rel="noopener"&gt;AI Skills Fest&lt;/A&gt; and is open to the public at no cost.&lt;BR /&gt;Review the &lt;A href="https://aka.ms/agents-league" target="_blank" rel="noopener"&gt;Hackathon Rules and Regulations&lt;/A&gt; and the &lt;A href="https://developer.microsoft.com/en-us/reactor/" target="_blank" rel="noopener"&gt;Microsoft Event Code of Conduct&lt;/A&gt; before participating. &lt;/SMALL&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2026 07:23:46 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/student-devs-build-ai-agents-compete-for-55k-in-prizes/ba-p/4521764</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-05-21T07:23:46Z</dc:date>
    </item>
    <item>
      <title>Signing in to Microsoft Foundry from OpenClaw using Azure AD: a smoother way to bring your models in</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/signing-in-to-microsoft-foundry-from-openclaw-using-azure-ad-a/ba-p/4519034</link>
      <description>&lt;P&gt;This post is a quick update to walk through the new flow. If you read the previous one, think of this as the easier path I wish I had the first time round. If you have not seen the original, you can find it here: &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/educatordeveloperblog/integrating-microsoft-foundry-with-openclaw-step-by-step-model-configuration/4495586" data-lia-auto-title="Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub" data-lia-auto-title-active="0" target="_blank"&gt;Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Pre-requisite:&lt;/H3&gt;
&lt;P&gt;You will need the Azure CLI (azure-cli) installed on your machine. The official install guide for Linux is here: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;I am on Linux so I went the Homebrew route, which keeps things simple. The formula is here: &lt;A class="lia-external-url" href="https://formulae.brew.sh/formula/azure-cli" target="_blank"&gt;https://formulae.brew.sh/formula/azure-cli&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Microsoft also has official docs covering the Homebrew/Linuxbrew install: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once Homebrew is ready, run this in your terminal:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;brew install azure-cli&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Why this matters:&lt;/H3&gt;
&lt;P&gt;Before this update, every Foundry model you wanted to use in OpenClaw needed its own API key and endpoint pasted into the config. It worked, but it was tedious, and keys are easy to leak if you are copying them around. The Azure AD path solves both problems. You authenticate as yourself (or a service principal), OpenClaw asks Azure for the list of Foundry resources you have access to, and it brings the models in automatically.&lt;/P&gt;
&lt;H2 data-om-id="30f7755d:12"&gt;Signing in to Microsoft Foundry from OpenClaw via Azure AD&lt;/H2&gt;
&lt;P data-om-id="30f7755d:13"&gt;A device-code OAuth handshake replaces the old static-API-key flow. OpenClaw delegates auth to the local Azure CLI; the CLI handles the browser-side sign-in, holds the resulting tokens, and refreshes them silently. OpenClaw then walks the Azure resource graph, subscriptions → Foundry resources → model deployments and registers each model into its own config. No API keys move through OpenClaw at any point.&lt;/P&gt;
&lt;img&gt;&lt;STRONG data-om-id="30f7755d:157"&gt;Figure.&lt;/STRONG&gt; Sequence diagram of the OAuth 2.0 device-authorization flow as orchestrated by OpenClaw. Phases 1–3 establish identity (the developer authenticates once, in a real browser, against Azure AD). Phases 4–5 perform service discovery (OpenClaw walks the ARM resource hierarchy, subscriptions → Foundry accounts → model deployments and persists the result to a local provider config). After registration, every model call OpenClaw makes against Foundry reuses the same Azure-CLI-managed token cache: tokens refresh transparently, and access is gated by the Foundry resource's RBAC assignments rather than a static API key. Dashed lines denote return values; the teal line in step 7 marks the single token-issuance event the rest of the system pivots on.&lt;/img&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H4&gt;Walking through the new flow:&lt;/H4&gt;
&lt;P&gt;Start with the command to onboard openclaw as if you were setting up OpenClaw for the first time:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;openclaw onboard&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kick things off with the OpenClaw onboard command, the same one you would use when setting up OpenClaw for the first time. When it prompts you, choose update values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Next, you will be asked to configure your models. Scroll down a little and you will see Microsoft Foundry listed as a supported provider. Pick it.&lt;/P&gt;
&lt;P&gt;From here, you have two options. You can sign in with an API key, which is what I covered in the previous blog post, or you can sign in through Azure AD. The Azure AD path is easier and more secure, so that is the one we will use.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;OpenClaw will give you a URL and a device code. Copy the URL into your browser and use the code to complete the sign in. (This is where the az CLI from the pre-requisite section earns its keep.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If everything worked, you should see a success prompt similar to this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once you are signed in, OpenClaw will ask you to pick the Azure subscription that your Microsoft Foundry resource lives in. Pick the subscription, then pick the Foundry resource where your models are deployed.&lt;/P&gt;
&lt;P&gt;And that is pretty much it. All the models you have deployed to that Foundry resource get pulled into OpenClaw automatically. Compared to the old way of pasting API keys and endpoints one by one, this is a huge time saver, and you do not have to babysit any keys.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From here you can start using your Foundry-deployed models inside OpenClaw straight away:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Wrapping up&lt;/H3&gt;
&lt;P&gt;The Azure AD sign-in option in OpenClaw is one of those small updates that quietly removes a real pain point. If you have ever juggled multiple Foundry endpoints and rotated keys across them, you already know why. With this flow, you sign in once, your models show up, and you can get back to actually building.&lt;/P&gt;
&lt;P&gt;If you have not tried OpenClaw with Microsoft Foundry yet, this is a good time to give it a go. And if you were holding off because of the key management overhead, that excuse is gone now.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;References&lt;/H3&gt;
&lt;P&gt;Previous post on integrating Microsoft Foundry with OpenClaw using API keys: &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/educatordeveloperblog/integrating-microsoft-foundry-with-openclaw-step-by-step-model-configuration/4495586" data-lia-auto-title="Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub" data-lia-auto-title-active="0" target="_blank"&gt;Integrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration | Microsoft Community Hub&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Install the Azure CLI on Linux: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-linux?view=azure-cli-latest&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Install the Azure CLI on macOS: &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew" target="_blank"&gt;https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos?view=azure-cli-latest#install-with-homebrew&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Homebrew formula for azure-cli: &lt;A class="lia-external-url" href="https://formulae.brew.sh/formula/azure-cli" target="_blank"&gt;https://formulae.brew.sh/formula/azure-cli&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 20 May 2026 07:11:23 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/signing-in-to-microsoft-foundry-from-openclaw-using-azure-ad-a/ba-p/4519034</guid>
      <dc:creator>suzarilshah</dc:creator>
      <dc:date>2026-05-20T07:11:23Z</dc:date>
    </item>
    <item>
      <title>Spec-Driven Development for AI-Enabled Enterprise Systems</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/spec-driven-development-for-ai-enabled-enterprise-systems/ba-p/4520807</link>
      <description>&lt;ARTICLE&gt;&lt;HEADER&gt;
&lt;H1&gt;Spec-Driven Development for AI-Enabled Enterprise Systems&lt;/H1&gt;
&lt;P&gt;&lt;EM&gt;How to make specs the single source of truth for your React frontends, backend services, data, and AI agents.&lt;/EM&gt;&lt;/P&gt;
&lt;/HEADER&gt;
&lt;SECTION&gt;
&lt;P&gt;If you are building an enterprise system with a React frontend, backend APIs and services, a database layer, and shared libraries, moving to Spec-Driven Development (SDD) can feel like a big cultural shift. For AI developers and engineers, though, it is a gift: structured, machine-readable specifications are exactly what both humans and AI coding agents need to stay aligned and productive.&lt;/P&gt;
&lt;P&gt;This post walks through how to structure specs, version contracts, design workflows, and integrate AI agents in a way that scales. Along the way, it references Microsoft’s public guidance on microservices, APIs, DevOps, and architecture so you can go deeper where needed.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;/ARTICLE&gt;
&lt;img /&gt;
&lt;ARTICLE&gt;
&lt;SECTION&gt;
&lt;H2&gt;1. Structuring specifications for an enterprise system&lt;/H2&gt;
&lt;P&gt;For a serious enterprise system, treat specs as layered and modular rather than a single monolithic document. A good mental model is Domain-Driven Design (DDD) and bounded contexts (see &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/architecture/microservices/model/domain-analysis" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/microservices/model/domain-analysis&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Business and domain layer&lt;/H3&gt;
&lt;P&gt;This layer is technology-agnostic and captures:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Business capabilities and problem statements&lt;/LI&gt;
&lt;LI&gt;Domain language and key entities&lt;/LI&gt;
&lt;LI&gt;Business rules and workflows&lt;/LI&gt;
&lt;LI&gt;Non-functional requirements (performance, security, compliance, SLAs)&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Solution and architecture layer&lt;/H3&gt;
&lt;P&gt;Here you define how the system is shaped:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;System context and C4-style diagrams&lt;/LI&gt;
&lt;LI&gt;Service boundaries and ownership&lt;/LI&gt;
&lt;LI&gt;Integration patterns and event flows&lt;/LI&gt;
&lt;LI&gt;Data ownership and high-level models&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Microsoft’s microservices guidance is a solid reference: &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/architecture/microservices/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/microservices/&lt;/A&gt;.&lt;/P&gt;
&lt;H3&gt;Implementation-oriented specs per component&lt;/H3&gt;
&lt;P&gt;For each concrete component, keep a focused spec:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Frontend / UI (React):&lt;/STRONG&gt; screen catalogue, UX flows, state contracts, API dependencies, validation rules, accessibility and performance requirements.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;APIs / services:&lt;/STRONG&gt; OpenAPI or AsyncAPI contracts, error models, authentication and authorisation, rate limits, SLAs, observability requirements.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Database / schema:&lt;/STRONG&gt; logical data model, ownership per service, migration strategy, retention, indexing, partitioning.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Shared libraries:&lt;/STRONG&gt; responsibilities, versioning policy, supported runtimes, compatibility matrix.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Integrations:&lt;/STRONG&gt; protocols, payloads, sequencing, idempotency, retry and backoff, SLAs, failure modes.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;In practice, this usually means:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;One “master” business and architecture spec per domain or product&lt;/LI&gt;
&lt;LI&gt;Separate specs per service or module (frontend app, each backend service, shared library, integration)&lt;/LI&gt;
&lt;LI&gt;Everything linked via IDs (for example REQ-123, SVC-ORDER-001) so you can trace from requirement to spec, implementation, and tests&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;2. Templates and standards that scale&lt;/H2&gt;
&lt;P&gt;To keep things consistent across teams, use a base template that all components share, then extend it with technology-specific sections. This works well for both human readers and AI agents consuming the specs.&lt;/P&gt;
&lt;H3&gt;Base specification template&lt;/H3&gt;
&lt;P&gt;Every spec, regardless of component type, should include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Purpose and scope&lt;/LI&gt;
&lt;LI&gt;Stakeholders and dependencies&lt;/LI&gt;
&lt;LI&gt;Requirements mapping (list of requirement IDs covered)&lt;/LI&gt;
&lt;LI&gt;Architecture and interaction overview&lt;/LI&gt;
&lt;LI&gt;Contracts (APIs, events, data)&lt;/LI&gt;
&lt;LI&gt;Non-functional requirements&lt;/LI&gt;
&lt;LI&gt;Risks and open questions&lt;/LI&gt;
&lt;LI&gt;Test and acceptance criteria&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Extended templates per component&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Frontend:&lt;/STRONG&gt; UX flows, wireframes or Figma links, accessibility, performance budgets, offline behaviour, error states.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;API / service:&lt;/STRONG&gt; OpenAPI or AsyncAPI link, auth and authorisation, throttling, logging and metrics, health endpoints. See logging and monitoring guidance at &lt;A class="lia-external-url" href="Https://learn.microsoft.com/azure/architecture/microservices/logging-monitoring" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/microservices/logging-monitoring&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Database:&lt;/STRONG&gt; schema definition, migration plan, backup and restore, data lifecycle, multi-tenant strategy.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Integration:&lt;/STRONG&gt; sequence diagrams, error handling, retry and idempotency, message contracts, security.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;3. Contracts, versioning, and change management&lt;/H2&gt;
&lt;H3&gt;API contracts&lt;/H3&gt;
&lt;P&gt;For SDD, API contracts are first-class citizens. Define them via OpenAPI or AsyncAPI and treat the spec as the source of truth. Use contract testing to keep providers and consumers aligned, and version APIs explicitly (for example v1, v2) rather than breaking changes in place. Microsoft’s API design guidance is a good starting point: &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/architecture/best-practices/api-design" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/best-practices/api-design&lt;/A&gt;&amp;nbsp;and Azure API Management at &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/api-management/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/api-management/&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Database migrations&lt;/H3&gt;
&lt;P&gt;Any spec change that affects data should include a migration plan. Use migration tooling such as EF Core migrations, Flyway, or Liquibase, and treat migration scripts as code. Document backward-compatibility windows so APIs can support both old and new fields for a defined period.&lt;/P&gt;
&lt;H3&gt;Shared DTOs and models&lt;/H3&gt;
&lt;P&gt;Prefer sharing contracts (OpenAPI, JSON Schema) over large shared code libraries. If you must share code, version the shared library independently and document compatibility (for example, “Service A supports SharedLib 2.x”). Keep DTOs at the edges and map to internal domain models inside each service.&lt;/P&gt;
&lt;H3&gt;Cross-service dependencies&lt;/H3&gt;
&lt;P&gt;Capture dependencies explicitly in specs, such as “Order Service depends on Customer v1.3+ for endpoint /customers/{id}”. Use consumer-driven contracts and CI checks to prevent breaking changes. For event-driven systems, document event contracts and evolution rules. See event-driven architecture guidance at &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/architecture/reference-architectures/event-driven/event-driven-architecture-overview" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/reference-architectures/event-driven/event-driven-architecture-overview&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Spec versioning and change management&lt;/H3&gt;
&lt;P&gt;Version specs semantically (for example OrderServiceSpec v1.2.0) and record what changed, why, impact, and migration steps. Link spec versions to releases or tags in Git and to work items in Azure DevOps or GitHub Issues. Azure Boards is useful here: &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/devops/boards/?view=azure-devops" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/devops/boards/?view=azure-devops&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;4. A mature Spec-Driven Development workflow&lt;/H2&gt;
&lt;P&gt;A realistic SDD workflow for AI-enabled teams might look like this:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Discovery and domain analysis:&lt;/STRONG&gt; capture business capabilities, domain language, and high-level workflows.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Business and architecture specs:&lt;/STRONG&gt; define bounded contexts, service boundaries, integration patterns, and NFRs.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Contract design:&lt;/STRONG&gt; design API specs (OpenAPI or AsyncAPI), event schemas, data models, and validation rules.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Task generation:&lt;/STRONG&gt; derive work items from specs, such as “Implement endpoint X”, “Add migration Y”, “Add UI flow Z”. This is a great place to use AI agents to read specs and generate tasks.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Implementation:&lt;/STRONG&gt; code is generated or written to satisfy the spec; the spec remains the reference, not the code.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Validation and testing:&lt;/STRONG&gt; contract tests, unit tests, integration tests, and end-to-end tests all trace back to spec IDs. Use quality gates in CI and CD, as described in &lt;A class="lia-external-url" href="Https://learn.microsoft.com/azure/architecture/framework/devops/devops-quality" target="_blank" rel="noopener"&gt;Https://learn.microsoft.com/azure/architecture/framework/devops/devops-quality&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Review and sign-off:&lt;/STRONG&gt; architecture and product review against the spec; update the spec if reality diverges.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Release and observability:&lt;/STRONG&gt; dashboards and alerts tied to specified SLIs and SLOs.&lt;/LI&gt;
&lt;/OL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;5. Governance, traceability, and avoiding drift&lt;/H2&gt;
&lt;H3&gt;Traceability across the lifecycle&lt;/H3&gt;
&lt;P&gt;Use IDs everywhere: requirements, spec sections, tasks, tests, and deployment artefacts. In Azure DevOps or GitHub, link:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Requirement (for example Azure DevOps Feature)&lt;/LI&gt;
&lt;LI&gt;Spec (stored in the repo)&lt;/LI&gt;
&lt;LI&gt;User stories and tasks&lt;/LI&gt;
&lt;LI&gt;Pull requests&lt;/LI&gt;
&lt;LI&gt;Tests&lt;/LI&gt;
&lt;LI&gt;Releases&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For key decisions, adopt Architecture Decision Records (ADRs). Microsoft’s guidance on ADRs is here: &lt;A class="lia-external-url" href="Https://learn.microsoft.com/azure/architecture/framework/devops/adrs" target="_blank" rel="noopener"&gt;Https://learn.microsoft.com/azure/architecture/framework/devops/adrs&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Keeping humans and AI agents aligned&lt;/H3&gt;
&lt;P&gt;To avoid implementation drift:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Make specs as machine-readable as possible (OpenAPI, JSON Schema, YAML, BPMN).&lt;/LI&gt;
&lt;LI&gt;Enforce spec checks in CI: API implementation must match OpenAPI, DB schema must match migration plan, generated clients must be up to date.&lt;/LI&gt;
&lt;LI&gt;For AI coding agents, always provide the relevant spec files as context and constrain them to files linked to specific spec IDs.&lt;/LI&gt;
&lt;LI&gt;Add automated checks that compare generated code to contracts and fail builds when they diverge.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;6. Enterprise best practices for repos and governance&lt;/H2&gt;
&lt;H3&gt;Example repository structure&lt;/H3&gt;
&lt;PRE&gt;/docs
  /business
  /architecture
  /decisions (ADRs)
/specs
  /frontend
  /services
    /orders
    /customers
  /integrations
  /data
/src
  /frontend
  /services
  /shared
/tests
/ops
  /pipelines
  /infra-as-code
      &lt;/PRE&gt;
&lt;H3&gt;Governance practices&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;An architecture review group that reviews spec changes, not just code changes.&lt;/LI&gt;
&lt;LI&gt;Definition of Done includes: spec updated, tests linked, contracts validated.&lt;/LI&gt;
&lt;LI&gt;Regular “spec health” reviews to identify what is out of date or drifting.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;For broader architectural guidance, see:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Azure microservices and DDD: &lt;A class="lia-external-url" href="Https://learn.microsoft.com/azure/architecture/microservices/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/microservices/&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Cloud design patterns: &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/architecture/patterns/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/architecture/patterns/&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Azure Well-Architected Framework: &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/well-architected/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/well-architected/&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;7. Integrating AI and agentic workflows into SDD&lt;/H2&gt;
&lt;P&gt;Spec-Driven Development is a natural fit for AI and multi-agent systems because specs provide structured, reliable context. Here are some practical patterns.&lt;/P&gt;
&lt;H3&gt;LangGraph and multi-agent orchestration using &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/agent-framework/overview/" target="_blank"&gt;Microsoft Agent Framework&lt;/A&gt;&lt;/H3&gt;
&lt;P&gt;You can design a graph where:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;A “spec agent” reads and validates specs.&lt;/LI&gt;
&lt;LI&gt;An “implementation agent” writes or updates code based on those specs.&lt;/LI&gt;
&lt;LI&gt;A “test agent” generates tests from contracts and acceptance criteria.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The graph flow can mirror your SDD workflow: Spec → Contract → Code → Tests → Review, with each agent responsible for a stage.&lt;/P&gt;
&lt;H3&gt;MCP (Model Context Protocol)&lt;/H3&gt;
&lt;P&gt;Expose your spec repository, OpenAPI definitions, and ADRs as MCP tools so agents can query the true source of truth instead of hallucinating. For example, provide a tool that returns the OpenAPI for a given service and version, or a tool that returns the ADRs relevant to a particular domain. Learn more about MCP at &lt;A class="lia-external-url" href="https://aka.ms/mcp-for-beginners" target="_blank"&gt;https://aka.ms/mcp-for-beginners&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;BPMN and process flows&lt;/H3&gt;
&lt;P&gt;Store BPMN diagrams as part of the spec. Agents can read them to generate workflow code, state machines, or tests. For process-oriented integrations, see Azure Logic Apps guidance at &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/logic-apps/" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/logic-apps/&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;CI/CD pipelines on Azure&lt;/H3&gt;
&lt;P&gt;In your pipelines, validate that implementation matches the spec:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Contract tests for APIs and events&lt;/LI&gt;
&lt;LI&gt;Schema checks for databases&lt;/LI&gt;
&lt;LI&gt;Linting and static analysis for spec conformance&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Use pipeline gates to block deployments if contracts or migrations are out of sync. &lt;BR /&gt;Azure Pipelines &lt;A class="lia-external-url" href="Https://learn.microsoft.com/azure/devops/pipelines/?view=azure-devops" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/azure/devops/pipelines/?view=azure-devops&lt;/A&gt; &lt;BR /&gt;GitHub Agentic Workflow Patterns &lt;A class="lia-external-url" href="https://github.github.com/gh-aw/" target="_blank"&gt;https://github.github.com/gh-aw/&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Where to start&lt;/H2&gt;
&lt;P&gt;The key is not to boil the ocean. Pick one domain, such as “Orders”, and design a thin but end-to-end SDD flow: spec → contract → tasks → code → tests. Run it with your AI agents in the loop, learn where the friction is, and iterate. Once that feels natural, you can roll the patterns out across the rest of your system.&lt;/P&gt;
&lt;P&gt;For AI developers and engineers, SDD is more than process hygiene. It is how you give your agents high-quality, unambiguous context so they can generate code, tests, and documentation that actually match what the business needs.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;/ARTICLE&gt;
&lt;P&gt;`&lt;/P&gt;</description>
      <pubDate>Mon, 18 May 2026 18:40:17 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/spec-driven-development-for-ai-enabled-enterprise-systems/ba-p/4520807</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-05-18T18:40:17Z</dc:date>
    </item>
    <item>
      <title>Building the Solution Teams Need to Secure AI Against Prompt Injection</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-the-solution-teams-need-to-secure-ai-against-prompt/ba-p/4514198</link>
      <description>&lt;P&gt;As artificial intelligence continues to evolve, teams are prioritising rapid advancements and deployment of applications while often overlooking security considerations. Emerging threats such as prompt injection remain poorly understood, and this is putting systems, users, and infrastructure at serious risk.&lt;/P&gt;
&lt;P&gt;Much of the expertise required to mitigate these risks is currently fragmented and inaccessible, concentrated among a small group of cybersecurity specialists. Meanwhile, developers, under pressure to ship quickly, often lack both the tools and frameworks needed to systematically test their AI systems for vulnerabilities.&lt;/P&gt;
&lt;P&gt;This disconnect is creating a significant gap between the development and security assurance of AI applications.&lt;/P&gt;
&lt;P data-start="1116" data-end="1344"&gt;To address this gap, we developed a unified Prompt Injection Testing Platform and knowledge base, powered by Microsoft Foundry,&amp;nbsp;designed to make LLM security testing accessible, structured, and understandable for developers.&lt;/P&gt;
&lt;H1 data-start="1116" data-end="1344"&gt;Project Overview&lt;/H1&gt;
&lt;P data-start="1377" data-end="1667"&gt;Developers are rapidly integrating LLMs and agents into applications, but:&lt;/P&gt;
&lt;UL&gt;
&lt;LI data-start="1377" data-end="1667"&gt;Security testing is not standardised&lt;/LI&gt;
&lt;LI data-start="1377" data-end="1667"&gt;Prompt injection risks are increasingly understood in research, but poorly mitigated in practice by developers&lt;/LI&gt;
&lt;LI data-start="1377" data-end="1667"&gt;There is a lack of accessible, actionable tooling&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="1669" data-end="1768"&gt;This creates a dangerous gap: applications are being deployed faster than they are being secured.&lt;/P&gt;
&lt;P data-start="1770" data-end="2275"&gt;As part of our UCL Industry Exchange Network (IXN) project in collaboration with Avanade, we built a Prompt Injection Testing Platform designed to solve this exact issue by:&lt;/P&gt;
&lt;UL&gt;
&lt;LI data-start="1770" data-end="2275"&gt;Providing a knowledge base of vulnerabilities and mitigations&lt;/LI&gt;
&lt;LI data-start="1770" data-end="2275"&gt;Helping teams identify vulnerabilities within their AI systems&lt;/LI&gt;
&lt;LI data-start="1770" data-end="2275"&gt;Enabling custom and automated testing pipelines&lt;/LI&gt;
&lt;LI data-start="1770" data-end="2275"&gt;Integrating tools like Garak for adversarial testing&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="2277" data-end="2371"&gt;With this, we aim to make prompt injection testing accessible, standard, and understandable.&lt;/P&gt;
&lt;H1 data-start="2277" data-end="2371"&gt;Project Journey&lt;/H1&gt;
&lt;P&gt;We divided our project into several phases:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 1: Understanding our Users’ Needs.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;We began by identifying the core users of our platform: AI developers and broader stakeholders across development, security, and safety disciplines integrating LLMs into their applications. By meeting with them, we uncovered a few key challenges:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Developers have limited awareness of prompt injection risks&lt;/LI&gt;
&lt;LI&gt;There is a generalised lack of accessible tools for testing&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This first exploration set a core principle: We must build a developer-first solution which does not depend on extensive technical knowledge to be used. We concluded that to be as useful as possible, our solution should not require prior prompt injection knowledge.&lt;/P&gt;
&lt;P&gt;In order to solve the two challenges presented by our users, we concluded a platform would be the best approach, as it enables us to centralise fragmented knowledge while providing a structured, scalable environment for testing LLM vulnerabilities in practice.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 2: Understanding the Threat Landscape&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Building on our user research, we focused on developing a deep understanding of the prompt injection threat landscape to inform the design of our platform. This phase involved researching:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Different types of prompt injection vulnerabilities&lt;/LI&gt;
&lt;LI&gt;Common attack scenarios and override techniques&lt;/LI&gt;
&lt;LI&gt;Existing mitigation strategies used in practice&lt;/LI&gt;
&lt;LI&gt;Tools and methodologies for prompt injection security testing&lt;/LI&gt;
&lt;LI&gt;The most widely used models to ensure our platform would be compatible with real-world systems.&lt;/LI&gt;
&lt;LI&gt;We consolidated these findings into a structured technical report, designed to be shared with developers, security testers, and semi-technical stakeholders. The goal was not only to guide our own implementation, but also to contribute to making prompt injection more standard and understandable.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;From our research, we realised prompt injection is not a single vulnerability, but a rapidly evolving attack surface that requires continuous, scalable testing rather than one-time validation.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Phase 3: Building the Platform&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Guided by both our user insights and the threat landscape analysis, we moved to designing and developing a unified prompt injection testing platform and knowledge base.&lt;/P&gt;
&lt;P&gt;To do this, we defined three core principles:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Developer first: no deep security knowledge would be required&lt;/LI&gt;
&lt;LI&gt;Unified: combines education (knowledge base) and execution (testing tools)&lt;/LI&gt;
&lt;LI&gt;Scalable: Expert users could extend the platform by bringing their own models, tests, and mitigations.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;During this stage, we built a platform which allows teams to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Connect their own LLM endpoints&lt;/LI&gt;
&lt;LI&gt;Run custom prompt injection tests&lt;/LI&gt;
&lt;LI&gt;Execute automated adversarial testing through Garak&lt;/LI&gt;
&lt;LI&gt;Access a centralised knowledge base of vulnerabilities and mitigation strategies.&lt;/LI&gt;
&lt;LI&gt;Export knowledge base information and test results as PDFs.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;By the end, we had developed a unified platform that enables developers to systematically test, understand, and mitigate prompt injection vulnerabilities in their AI applications.&lt;/P&gt;
&lt;P&gt;To understand how our platform works in practice, you can view&amp;nbsp;&lt;A class="lia-external-url" href="https://www.youtube.com/watch?v=K-qNQL2Tb2Q" target="_blank" rel="noopener"&gt;our demo video&lt;/A&gt;.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;div data-video-id="https://www.youtube.com/watch?v=K-qNQL2Tb2Q/1778392541259" data-video-remote-vid="https://www.youtube.com/watch?v=K-qNQL2Tb2Q/1778392541259" class="lia-video-container lia-media-is-center lia-media-size-large"&gt;&lt;iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FK-qNQL2Tb2Q%3Ffeature%3Doembed&amp;amp;display_name=YouTube&amp;amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DK-qNQL2Tb2Q&amp;amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FK-qNQL2Tb2Q%2Fhqdefault.jpg&amp;amp;type=text%2Fhtml&amp;amp;schema=youtube" allowfullscreen="" style="max-width: 100%"&gt;&lt;/iframe&gt;&lt;/div&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img&gt;&lt;STRONG&gt;Figure 4: &lt;/STRONG&gt;Platform home interface presenting an overview of prompt injection concepts and a structured vulnerability catalogue for exploring attack types and mitigation strategies.&lt;/img&gt;
&lt;H1&gt;Key Features&lt;/H1&gt;
&lt;H4&gt;Model Integration and Configuration&lt;/H4&gt;
&lt;P&gt;Users can use models included in the platform or connect their own LLM endpoints, allowing the platform to work across different providers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Supports multiple model providers through Microsoft Foundry&lt;/LI&gt;
&lt;LI&gt;Supports custom model integration via HTTP endpoints&lt;/LI&gt;
&lt;LI&gt;Enables model configurations such as custom system prompts and mitigation layers.&lt;/LI&gt;
&lt;LI&gt;Ensures flexibility as new models and mitigations emerge&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4&gt;Testing Suite&lt;/H4&gt;
&lt;P&gt;The platform allows users to create and run custom prompt injection tests tailored to their applications. This involves:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Creating and executing targeted prompts&lt;/LI&gt;
&lt;LI&gt;Simulating real-world attack scenarios&lt;/LI&gt;
&lt;LI&gt;Running predefined adversarial testing suites (integrating NVIDIA Garak)&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img&gt;&lt;STRONG&gt;Figure 2:&lt;/STRONG&gt; Testing interface showing configuration of prompt injection tests and execution of automated scans, with results and risk evaluation displayed.&lt;/img&gt;
&lt;H4&gt;Knowledge Base&lt;/H4&gt;
&lt;P&gt;A core component of our platform is a structured knowledge base, which is designed to make prompt injection concepts accessible and understandable. This is divided into two key areas:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Vulnerabilities: &lt;/STRONG&gt;Provides information on different types of prompt injection attacks, including explanations of how each vulnerability works, with real-world examples and scenarios, as well as references to reputable external sources&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Mitigations: &lt;/STRONG&gt;Focuses on how to defend against these vulnerabilities, and it includes clear implementation strategies and code examples demonstrating how to integrate each mitigation.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;To support exploration, we also included a chatbot interface, which answers questions using knowledge base data and trusted sources. This helps users quickly navigate vulnerabilities and mitigation strategies by providing contextual, reliable information and redirecting users to the appropriate page of our platform.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img&gt;
&lt;P data-start="631" data-end="806"&gt;&lt;STRONG data-start="631" data-end="806"&gt;Figure 3: &lt;/STRONG&gt;Direct prompt injection analysis view, where users can explore attack techniques, observe unsafe model responses, and review corresponding mitigation approaches.&lt;/P&gt;
&lt;/img&gt;
&lt;H4&gt;Prompt Enhancer&lt;/H4&gt;
&lt;P&gt;In addition to testing and learning, our platform integrates a prompt enhancer, designed to help users actively improve the security of their system prompts. It works in the following way:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Takes an existing prompt as input&lt;/LI&gt;
&lt;LI&gt;Draws on the knowledge base insights and best practices&lt;/LI&gt;
&lt;LI&gt;Restructures the prompt to improve clarity and robustness&lt;/LI&gt;
&lt;LI&gt;Incorporates selected prompt-layer mitigations to reduce prompt injection risk&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img&gt;&lt;STRONG&gt;Figure 4: &lt;/STRONG&gt;Prompt Enhancer interface showing the application of prompt-layer mitigations (e.g. delimiter tokens, instruction hierarchy enforcement) to restructure and secure a system prompt against prompt injection attacks.&lt;/img&gt;
&lt;H1&gt;Technical Details&lt;/H1&gt;
&lt;P&gt;To support a flexible and scalable testing system, we designed our platform with a modular, layered architecture. This allows different components to operate independently while remaining integrated through clearly defined interfaces, ensuring both extensibility and maintainability.&lt;/P&gt;
&lt;H3&gt;System Architecture&lt;/H3&gt;
&lt;P&gt;We divided our platform into four main layers:&lt;/P&gt;
&lt;H4&gt;Frontend Layer&lt;/H4&gt;
&lt;P&gt;An interactive user interface that allows developers to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Explore the prompt injection knowledge base&lt;/LI&gt;
&lt;LI&gt;Configure and run tests&lt;/LI&gt;
&lt;LI&gt;View results and vulnerability analysis&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4&gt;API Layer&lt;/H4&gt;
&lt;P&gt;The API layer acts as the orchestration and communication layer between the frontend and the core system.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Handles requests from the frontend to create and run tests.&lt;/LI&gt;
&lt;LI&gt;Provides frontend with available models, mitigations, and configurations.&lt;/LI&gt;
&lt;LI&gt;Ensures any newly added models and mitigations can be automatically reflected in the frontend without requiring manual updates.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4&gt;Domain Layer&lt;/H4&gt;
&lt;P&gt;The layer which defines the core structure and logic of the system:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Defines interfaces for key components such as mitigations, models, and test runners&lt;/LI&gt;
&lt;LI&gt;Establishes the test structure and data models&lt;/LI&gt;
&lt;LI&gt;Encapsulates logic to ensure consistency&lt;/LI&gt;
&lt;/UL&gt;
&lt;H4&gt;Integration Layer&lt;/H4&gt;
&lt;P&gt;The layer which implements the abstractions defined in the domain layer and connects the platform to external services&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Implements model providers such as OpenAI, Anthropic, and other external HTTP-based endpoints&lt;/LI&gt;
&lt;LI&gt;Implements test runners, including custom prompt runners and external tools such as Garak.&lt;/LI&gt;
&lt;LI&gt;Implements database connections and repository classes.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H1&gt;Results and Outcomes&lt;/H1&gt;
&lt;P&gt;Through the research and development of our platform, we were able to gain several key insights into the behaviour and security of LLM-based applications:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Prompt injection vulnerabilities are more prevalent than expected. Even simple prompts with carefully crafted inputs can unsafely manipulate a model’s behaviour.&lt;/LI&gt;
&lt;LI&gt;Lack of structured testing leads to hidden risks. Without a systematic approach, many vulnerabilities remain undetected. It is sometimes time consuming to manually craft unsafe prompts.&lt;/LI&gt;
&lt;LI&gt;Combining custom testing with framework-based testing improves coverage. Using both custom prompts (targeted and application-specific scenarios) and framework-driven testing (e.g. Garak) enables a more comprehensive evaluation of model safety, as both expected and unexpected vulnerabilities can be captured&lt;/LI&gt;
&lt;LI&gt;Structured prompts can significantly improve robustness. We observed that prompts with a clear structure and embedded mitigations are less susceptible to injection attacks.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;By the end of our project, we successfully developed a platform that:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Bridges the gap between prompt injection knowledge and practical testing.&lt;/LI&gt;
&lt;LI&gt;Enables repeatable and structured testing of prompt injection vulnerabilities&lt;/LI&gt;
&lt;LI&gt;Provides a unified workflow for learning, testing, and improving prompt security.&lt;/LI&gt;
&lt;LI&gt;Supports multiple models and testing approaches, to cover the entire vulnerability safety.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;We demonstrated that prompt injection risks can be systematically identified, tested, and mitigated through a structured and repeatable approach.&lt;/P&gt;
&lt;H1&gt;Lessons Learned&lt;/H1&gt;
&lt;P&gt;Throughout the project, we identified several key insights that shaped both our technical approach and our understanding of AI security.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;AI is rapidly evolving, and systems must be designed accordingly&lt;/STRONG&gt;. AI models and attack techniques are advancing extremely fast. As a result, static solutions are quickly becoming obsolete. We learned that it is essential to design a platform that is modular, extensible and adaptable. Through well-defined interfaces and generic services, we ensured our platform can evolve alongside attacks and mitigations.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Security must be built into development, not considered at testing. &lt;/STRONG&gt;Many developers are focusing on functionality first and security often takes a backseat. In the context of LLMs, vulnerabilities can fundamentally affect the security of the system and its users. As such, security should be treated as a core part of the development cycle. Models and external tools should only be connected if their safety is guaranteed.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Bridging the gap between developers and security testers is necessary.&lt;/STRONG&gt; We identified a major disconnect between developers building AI applications and the security testers evaluating them. These groups often operate with different priorities and levels of knowledge. We are bridging this gap by making prompt injection knowledge more accessible and creating workflows that are usable by developers while still grounded in robust security practices.&lt;/P&gt;
&lt;H1&gt;Further Development&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/H1&gt;
&lt;P&gt;While our platform provides a strong foundation for prompt injection testing and knowledge, there are several areas for future exploration:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Expanding our testing framework integrations, by adding a broader coverage of attack techniques&lt;/LI&gt;
&lt;LI&gt;Integration with MCP servers and external systems, supporting interactions with tools, APIs and external data sources.&lt;/LI&gt;
&lt;LI&gt;Addressing additional indirect prompt injection vulnerabilities, including file uploads, website scraping, and multi-step workflows.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Looking ahead, we also aim to integrate our platform more deeply into development workflows by introducing CI/CD integrations for continuous security testing and versioned tracking of model robustness over time.&lt;/P&gt;
&lt;P&gt;Our goal is to evolve the platform into a comprehensive security layer, capable of testing entire AI-driven systems in dynamic, real-world contexts.&lt;/P&gt;
&lt;H1&gt;Conclusion&lt;/H1&gt;
&lt;P data-start="237" data-end="471"&gt;As AI becomes increasingly integrated into real-world applications, ensuring their security is essential. As our research highlights, current practices have not kept pace with the rapid evolution of AI systems and attack techniques.&lt;/P&gt;
&lt;P data-start="473" data-end="794"&gt;Through our work, we demonstrated that prompt injection risks can be systematically identified, tested, and mitigated using a structured approach. By combining a unified knowledge base with a flexible testing platform powered by &lt;A class="lia-external-url" href="https://ai.azure.com" target="_blank" rel="noopener"&gt;Microsoft Foundry&lt;/A&gt;, we are taking a step towards making AI systems safer and more reliable.&lt;/P&gt;
&lt;P data-start="796" data-end="1196"&gt;More importantly, our project reinforces a broader idea: a developer-first approach to security, supported by collaboration across development, security, and safety disciplines, is essential for building AI at scale. Security should not remain confined to specialist teams but should be embedded directly into the development process, alongside practices such as red-teaming and continuous testing.&lt;/P&gt;
&lt;P data-start="1198" data-end="1394" data-is-last-node="" data-is-only-node=""&gt;Our project empowers teams with the knowledge and tools they need to build safer and more reliable AI systems.&amp;nbsp; If you’re interested in building more secure AI systems or exploring prompt injection in practice, we invite you to join us through the &lt;A class="lia-external-url" href="https://aka.ms/foundry/discord" target="_blank" rel="noopener"&gt;Foundry Community&lt;/A&gt; on the 3rd of June at 2pm BST, when we will be showcasing our platform live, walking through real-world examples, and discussing how teams can integrate prompt injection testing into their development workflows.&lt;/P&gt;
&lt;H1&gt;Team&lt;/H1&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/teomontero" target="_blank" rel="noopener"&gt;Teo Montero Bonet&lt;/A&gt;, UCL Computer Science&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/mario-mojarro-ruiz-b6b249366/" target="_blank" rel="noopener"&gt;Mario Mojarro Ruiz&lt;/A&gt;, UCL Computer Science&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/davidthomasg/" target="_blank" rel="noopener"&gt;David Thomas Garcia&lt;/A&gt;, UCL Computer Science&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/nathaniel-gibbon-323ba9387/" target="_blank" rel="noopener"&gt;Nathaniel Gibbon&lt;/A&gt;, UCL Computer Science&lt;/P&gt;
&lt;P&gt;With support from &lt;A class="lia-external-url" href="https://www.linkedin.com/in/joshmcdonaldmvp/" target="_blank" rel="noopener"&gt;Josh McDonald&lt;/A&gt;, Avanade&lt;/P&gt;</description>
      <pubDate>Mon, 25 May 2026 17:24:52 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-the-solution-teams-need-to-secure-ai-against-prompt/ba-p/4514198</guid>
      <dc:creator>teo-montero</dc:creator>
      <dc:date>2026-05-25T17:24:52Z</dc:date>
    </item>
    <item>
      <title>Stop Hallucinating, Start Evaluating — A Tour of AgentEval</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/stop-hallucinating-start-evaluating-a-tour-of-agenteval/ba-p/4519068</link>
      <description>&lt;P class="lia-align-center"&gt;&lt;EM&gt;Your AI agent works great… until it doesn't. AgentEval catches the failures before your users do.&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;&lt;STRONG&gt;&lt;EM&gt;Your AI agent works great… until it doesn't. AgentEval catches the failures before your users do.&lt;/EM&gt;&lt;/STRONG&gt;&lt;/img&gt;
&lt;H2&gt;The biggest problem for Agentic Engineers&lt;/H2&gt;
&lt;P&gt;Shipping an AI agent feels great — until you hit the wall every Agentic Engineer eventually hits: non-determinism. In September 2025 OpenAI researchers published &lt;A href="https://arxiv.org/abs/2509.04664" target="_blank" rel="noopener"&gt;"Why Language Models Hallucinate"&lt;/A&gt; (Kalai, Nachum, Vempala, Zhang). The punchline matches what every agent developer has lived through at 2 a.m.: even the best models confidently produce plausible falsehoods. So how do we — .NET developers shipping production AI agents — actually KNOW our agents work?&lt;/P&gt;
&lt;P&gt;Four questions a traditional unit test cannot answer:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Did the agent call the right tools, in the right order, with the right arguments?&lt;/LI&gt;
&lt;LI&gt;Is the answer actually grounded in the retrieved context — or is the model making it up?&lt;/LI&gt;
&lt;LI&gt;Will it survive a prompt injection next Tuesday?&lt;/LI&gt;
&lt;LI&gt;And — if we repeat this execution — will it work consistently every single time?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;That last question is the one that keeps me up at night. It's the one that started this whole journey for me.&lt;/P&gt;
&lt;H2&gt;What is AgentEval?&lt;/H2&gt;
&lt;P&gt;AgentEval is the .NET evaluation toolkit for AI agents. Open-source, MIT-licensed, available on NuGet, and built first for Microsoft Agent Framework (MAF) and Microsoft.Extensions.AI. What RAGAS and DeepEval do for Python, AgentEval does for .NET — with the fluent assertion APIs we actually expect.&lt;/P&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;AgentEval Public Beta — built on MAF and MEAI.&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;
&lt;P&gt;&lt;EM&gt;AgentEval Public Beta — built on MAF and MEAI.&lt;/EM&gt;&lt;/P&gt;
&lt;/img&gt;
&lt;P&gt;Today it ships with over 30 runnable samples, 192 red-team probes covering 6 of the 10 OWASP LLM Top 10 categories, 5 memory metrics (more on that later — it's the "one more thing" of this post), and exports to JSON, JUnit XML, Markdown, SARIF and PDF.&lt;/P&gt;
&lt;P&gt;Docs live at &lt;A href="https://agenteval.dev/" target="_blank" rel="noopener"&gt;https://agenteval.dev&lt;/A&gt;. Public Beta — moving fast, APIs may shift, but already battle-tested by the .NET AI community.&lt;/P&gt;
&lt;H2&gt;Why I built AgentEval&lt;/H2&gt;
&lt;P&gt;&lt;EM&gt;Honest version: AgentEval started as a side effect of another side project.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;I was building an auto-investment system for fun. Aggregate financial news, market data, a couple of feeds — then asked an agent to produce a "bull/bear" assessment on individual stocks. Sometimes it worked. Sometimes — same prompt, same data — a completely different conclusion. Worse, the inconsistencies didn't fail loudly. They quietly biased the next decision.&lt;/P&gt;
&lt;P&gt;So I went down the loophole, as we say. I had to understand why a state-of-the-art model gave different answers to the same question — and how to systematically catch that before it touched real positions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;Talking AI non-determinism — the talk that started it all.&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;Talking AI non-determinism — the talk that started it all.&lt;/EM&gt;&lt;/img&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I started small. Tool-call assertions. RAG faithfulness. Then stochastic runs — because the real question was never "did it pass?" but "how often does it pass?". Then model comparison. Then red teaming. It kept growing.&lt;/P&gt;
&lt;P&gt;In mid-December 2025 I made the call: I paused the auto-investment system (still on halt — I'll come back to it one day) and went all-in on AgentEval.&lt;/P&gt;
&lt;H3&gt;The ecosystem challenge&lt;/H3&gt;
&lt;P&gt;Here's the bit that bothered me the most. Python had RAGAS, DeepEval, PromptFoo. The Python community had a whole shelf of mature evaluation libraries to pick from. .NET had… nothing comparable. The community deserved better. And honestly, I needed it for my own work.&lt;/P&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;Building WITH the Microsoft Agent Framework team — not just FOR them.&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;Building WITH the Microsoft Agent Framework team — not just FOR them.&lt;/EM&gt;&lt;/img&gt;
&lt;P&gt;One quick credibility note: the MAF integration didn't ship by accident. It came from a deep dive — the official docs, a 12,000+ line source-code audit of MAF itself, and direct collaboration with the framework team. The rigor is intentional.&lt;/P&gt;
&lt;P&gt;Today AgentEval is open source, MIT-licensed, and built with love for the .NET AI community.&lt;/P&gt;
&lt;H2&gt;Built for the Microsoft AI Stack&lt;/H2&gt;
&lt;P&gt;Three reasons MAF and MEAI developers care:&lt;/P&gt;
&lt;P&gt;Native first-class integration. AgentEval works directly with AIAgent and IChatClient. Tool calls are tracked automatically. Token counts, cost estimates, latencies — captured for you. No manual wiring.&lt;/P&gt;
&lt;P&gt;Universal adapter for any provider. One line:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var agent = chatClient.AsEvaluableAgent(
    name: "PolicyAssistant",
    systemPrompt: SYSTEM_PROMPT);
 
services.AddAgentEval();   // DI-friendly setup
&lt;/LI-CODE&gt;
&lt;P&gt;That chatClient can be Azure OpenAI, OpenAI, Foundry Local, Ollama, Groq, vLLM — anything that speaks IChatClient. Same evaluation code, swappable provider.&lt;/P&gt;
&lt;P&gt;Record once. Replay forever. Trace recording captures a full execution as JSON. Replay it deterministically in CI — no live API calls, no surprise costs, identical results every run. This is how AI evaluations finally become reproducible.&lt;/P&gt;
&lt;P&gt;Semantic Kernel users aren't left out — there's a bridge via AIFunctionFactory.Create() so SK tools surface in AgentEval just like MAF tools.&lt;/P&gt;
&lt;H2&gt;The Capability Tour&lt;/H2&gt;
&lt;H3&gt;Tool chains, asserted.&lt;/H3&gt;
&lt;P&gt;Tool calls are where agents earn their keep — and where they fail in subtle ways. AgentEval gives us a fluent grammar to assert on them:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;result.ToolUsage!.Should()
    .HaveCalledTool("AuthenticateUser", because: "security first")
        .BeforeTool("FetchUserData")
        .WithArgument("method", "OAuth2")
    .And()
    .HaveCalledTool("SendNotification").AtLeastTimes(1)
    .And()
    .HaveNoErrors();&lt;/LI-CODE&gt;
&lt;P&gt;No more grep-ing logs to confirm a function fired. The assertions read like requirements — because that's exactly what they are.&lt;/P&gt;
&lt;H3&gt;Is your agent hallucinating?&lt;/H3&gt;
&lt;P&gt;Use a separate evaluator LLM to score the response against the retrieved context — faithfulness, relevance, correctness.&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var faithfulness = await new FaithfulnessMetric(evaluator).EvaluateAsync(context);
if (faithfulness.Score &amp;lt; 70)
    throw new HallucinationDetectedException($"Faithfulness: {faithfulness.Score}");&lt;/LI-CODE&gt;
&lt;P&gt;The trick is using two IChatClients — one for the agent, a separate one for the judge. The judge has clean context, so it can't be biased by the original conversation.&lt;/P&gt;
&lt;H3&gt;Stochastic evaluation, or: the bull/bear problem&lt;/H3&gt;
&lt;P&gt;Remember my auto-investment system? Same prompt, different answers. That isn't a bug — that's how LLMs are. So let's stop asking "did it pass?" and start asking "how often does it pass?"&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var result = await stochasticRunner.RunStochasticTestAsync(agent, testCase,
    new StochasticOptions(Runs: 20, SuccessRateThreshold: 0.85, ScoreThreshold: 75));
result.Statistics.Mean.Should().BeGreaterThan(80);
result.Statistics.StandardDeviation.Should().BeLessThan(10);&lt;/LI-CODE&gt;
&lt;P&gt;Run it twenty times. Assert on pass rate and standard deviation. The evaluation that never flakes — because pass/fail isn't based on a single lucky run, it's based on the rate. That is the answer to my bull/bear problem.&lt;/P&gt;
&lt;P&gt;And here's where it gets interesting: I've used this exact technique to figure out which agentic architectures are more deterministic than others — what makes some agents stable and others wobble. You can do the same. Apply your own hypotheses — "does adding a planner reduce variance?", "is GPT-4o-mini consistent enough for this task?", "does my system prompt fix the spread?" — run them through AgentEval, and engineer your way to a more deterministic agent. This is what real Agentic Engineering looks like.&lt;/P&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;Stochastic evaluations&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;Stochastic evaluations&lt;/img&gt;
&lt;H3&gt;Pick the right model — with evidence.&lt;/H3&gt;
&lt;P&gt;"GPT-4o or GPT-4o-mini?" used to be a vibes-based debate. Now it's a function call:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var result = await comparer.CompareModelsAsync(
    factories: new[] { gpt4o, gpt4oMini, claude },
    testCases: testSuite,
    metrics: new[] { new ToolSuccessMetric(), new RelevanceMetric(eval) },
    options: new ComparisonOptions(RunsPerModel: 5));
Console.WriteLine(result.ToMarkdown());&lt;/LI-CODE&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;Model comparison output — winner + best-value recommendation&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;
&lt;P&gt;&lt;EM&gt;Model comparison output — winner + best-value recommendation&lt;/EM&gt;&lt;/P&gt;
&lt;/img&gt;
&lt;P&gt;The output is a leaderboard — pass rate, average score, latency, cost — plus a recommendation. "Why are we using GPT-4o?" Answer: this report.&lt;/P&gt;
&lt;H3&gt;Latency and cost — as code.&lt;/H3&gt;
&lt;LI-CODE lang="csharp"&gt;result.Performance!.Should()
    .HaveFirstTokenUnder(TimeSpan.FromMilliseconds(500))
    .HaveEstimatedCostUnder(0.05m)
    .HaveTokenCountUnder(2000);&lt;/LI-CODE&gt;
&lt;P&gt;If your UX needs sub-500ms time-to-first-token, write it down. Make it fail the build. SLAs as executable evaluations — that's the dream.&lt;/P&gt;
&lt;H3&gt;Red team your agent before users do.&lt;/H3&gt;
&lt;P&gt;192 probes across 9 attack types — prompt injection, jailbreaks, PII leakage, excessive agency, insecure output handling, system prompt extraction, indirect injection, inference abuse, encoding evasion. Coverage: 6 of the 10 OWASP LLM Top 10 (2025), mapped to 6 MITRE ATLAS techniques.&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;// One line.
var result = await agent.QuickRedTeamScanAsync();
result.Should()
    .HavePassed()
    .And().HaveMinimumScore(80)
    .And().HaveASRBelow(0.05);   // Attack Success Rate under 5%&lt;/LI-CODE&gt;
&lt;P&gt;Need a full pipeline? Use AttackPipeline.Create() with specific attacks and intensities. Need to ship findings to the GitHub Security tab? Export to SARIF. Need a PDF for the security review board? Export to PDF.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Does this actually work in the wild? Yes. &lt;A href="https://jkdev.me/blog/agenteval-dotnet" target="_blank"&gt;Jernej Kavka&lt;/A&gt; (Microsoft MVP, SSW) tried AgentEval against one of his agents in a single evening. The initial system prompt let through a 40% attack success rate. After tightening the prompt and re-running the red-team scan, he brought it to &lt;STRONG&gt;zero&lt;/STRONG&gt;. One evening. That is the difference between catching this in dev and learning about it from a postmortem. Link to his experience blog at the bottom references.&lt;/P&gt;
&lt;H3&gt;Toxicity, bias, misinformation.&lt;/H3&gt;
&lt;P&gt;Responsible AI isn't an afterthought. Counterfactual bias testing — the methodology academics will recognise — is one line:&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var bias = await new BiasMetric(evaluator)
    .EvaluateCounterfactualAsync(originalContext, counterfactualContext, "gender");&lt;/LI-CODE&gt;
&lt;P&gt;Toxicity and misinformation metrics live right next to it. Run them in the same suite as your quality metrics, because they ARE quality metrics.&lt;/P&gt;
&lt;H3&gt;Multi-agent flows, as assertions.&lt;/H3&gt;
&lt;P&gt;Agents don't live alone anymore. MAF's WorkflowBuilder graphs route work between multiple executors — and AgentEval asserts against the whole graph.&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var testCase = new WorkflowTestCase {
    Name = "TripPlanner",
    ExpectedExecutors  = ["TripPlanner", "FlightReservation", "HotelReservation", "Presenter"],
    StrictExecutorOrder = true,
    ExpectedTools = ["SearchFlights", "BookHotel"],
};

var result = await new WorkflowEvaluationHarness()
    .RunWorkflowTestAsync(workflowAdapter, testCase);

result.ExecutionResult!.Should()
    .HaveExecutedInOrder("TripPlanner", "FlightReservation", "HotelReservation", "Presenter")
    .HaveTraversedEdge("TripPlanner", "FlightReservation")
    .HaveAnyExecutorCalledTool("SearchFlights")
    .HaveCompletedWithin(TimeSpan.FromMinutes(2));&lt;/LI-CODE&gt;
&lt;P&gt;Four agents. Real WorkflowBuilder. One test. Edge traversal, executor order, per-graph tool calls — all observable, all assertable. Try doing that in Python.&lt;/P&gt;
&lt;H2&gt;Don't write C#? Use the CLI.&lt;/H2&gt;
&lt;P&gt;Not every reader is a .NET developer — and AgentEval has a no-code path for everyone else. The CLI runs against any OpenAI-compatible endpoint, fits inside your CI/CD pipeline, and gives you the same evaluations the SDK does. Install once:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;dotnet tool install -g AgentEval.Cli --prerelease&lt;/LI-CODE&gt;
&lt;H3&gt;1. Scaffold a test dataset&lt;/H3&gt;
&lt;LI-CODE lang="bash"&gt;agenteval init -o agenteval.yaml&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This drops a YAML template you can edit — inputs, expected behaviours, evaluation criteria. Treat it like any other test file: commit it, version it, code-review it.&lt;/P&gt;
&lt;H3&gt;2. Run an evaluation&lt;/H3&gt;
&lt;LI-CODE lang="bash"&gt;# Against Azure OpenAI
agenteval eval --azure \
    --endpoint https://myresource.openai.azure.com/ \
    --deployment-name gpt-4o \
    --dataset agenteval.yaml --runs 5 --success-threshold 0.9

# Against OpenAI directly
agenteval eval --endpoint https://api.openai.com/v1 \
    --model gpt-4o --dataset agenteval.yaml

# Against a local Ollama model — no API key needed
agenteval eval --endpoint http://localhost:11434/v1 \
    --model llama3 --dataset agenteval.yaml&lt;/LI-CODE&gt;
&lt;P&gt;Works with Azure OpenAI, OpenAI, Foundry Local, Ollama, Groq, vLLM, LM Studio, Together.ai — anything that speaks the OpenAI protocol. Same dataset, swappable provider.&lt;/P&gt;
&lt;H3&gt;3. Run a red-team scan&lt;/H3&gt;
&lt;LI-CODE lang="bash"&gt;# All 9 attack types at moderate intensity
agenteval redteam --azure \
    --endpoint https://myresource.openai.azure.com/ \
    --deployment-name gpt-4o --intensity moderate

# Or pick specific attacks and export SARIF for GitHub Security
agenteval redteam --azure \
    --endpoint https://myresource.openai.azure.com/ \
    --deployment-name gpt-4o \
    --attacks PromptInjection,Jailbreak --format sarif&lt;/LI-CODE&gt;
&lt;P&gt;The CLI exports to JSON, Markdown, JUnit XML, SARIF and PDF — so you can wire results into GitHub Actions, Azure DevOps, dashboards, or compliance reports. And here is the real point: the sooner we find these issues, the cheaper they are to fix. Shifting evaluation left — into the CLI, into the build, into the pull request — is the difference between a 5-minute fix and a 5-day incident.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Shift-left for AI quality&lt;/H2&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;Behavioral policies and compliance — as code, in your test suite.&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;
&lt;P&gt;&lt;EM&gt;Behavioral policies and compliance — as code, in your test suite.&lt;/EM&gt;&lt;/P&gt;
&lt;/img&gt;
&lt;P&gt;Evaluations belong in your build, not in a notebook. Here is how that looks in practice:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Write evaluations as xUnit tests — they run on every PR alongside your unit tests.&lt;/LI&gt;
&lt;LI&gt;Export JUnit XML — results show up in the GitHub Actions / Azure DevOps test tab like any other test.&lt;/LI&gt;
&lt;LI&gt;Export SARIF — red-team findings land in the GitHub Security tab automatically.&lt;/LI&gt;
&lt;LI&gt;Auto-saved trace artifacts on failure — record once, replay forever, debug at leisure.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Catch hallucinations, tool-call regressions, and prompt-injection vulnerabilities before they reach production. That is shift-left for AI quality in .NET — and the sooner you do it, the cheaper every fix gets.&lt;/P&gt;
&lt;H2&gt;And one more thing… memory.&lt;/H2&gt;
&lt;P class="lia-align-center"&gt;&lt;EM&gt;First AgentEval memory component and dynamic report showcase, at the MVP Summit&lt;/EM&gt;&lt;/P&gt;
&lt;img&gt;First memory component and dynamic report showcase, at the MVP Summit&lt;/img&gt;
&lt;P&gt;Every serious agent now has "memory" — context providers, chat history compaction, cross-session recall. But here is the uncomfortable question: does it actually remember?&lt;/P&gt;
&lt;P&gt;When your compaction strategy drops a fact at the 50K-token mark, do you notice? When your agent confidently contradicts something the user said three turns ago, can you catch it before they do?&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN lia-align-center"&gt;&lt;STRONG&gt;&lt;EM&gt;Does your agent actually remember? Or is it just guessing convincingly?&lt;/EM&gt;&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;P&gt;Enter AgentEval.Memory — a complete toolkit for evaluating retention, recall depth, temporal reasoning, fact updates, cross-session persistence and resistance to noise. Five memory metrics, five benchmark presets (Quick → Overflow at 192K tokens), and an HTML pentagon report that overlays multiple models for side-by-side comparison.&lt;/P&gt;
&lt;P&gt;The headline: LongMemEval — the ICLR 2025 academic memory benchmark — is fully re-implemented in .NET. That means .NET teams can produce paper-comparable scores against the same benchmark academics are publishing on. For coursework or research, this matters.&lt;/P&gt;
&lt;LI-CODE lang="csharp"&gt;var runner = MemoryBenchmarkRunner.Create(chatClient);
var result = await runner.RunBenchmarkAsync(agent, MemoryBenchmark.Standard);

Console.WriteLine($"Memory: {result.OverallScore:F1}% ({result.Grade})");
await result.ExportHtmlReportAsync("memory-report.html");&lt;/LI-CODE&gt;
&lt;P&gt;MAF-native — it plugs into AIContextProvider, ChatHistoryProvider and CompactionStrategy out of the box. Wire it up and find out what your agent actually forgets.&lt;/P&gt;
&lt;H2&gt;The community is already using it&lt;/H2&gt;
&lt;P&gt;Jernej Kavka (Microsoft MVP, SSW) wrote up his experience getting started with AgentEval — &lt;A href="https://jkdev.me/blog/agenteval-dotnet" target="_blank"&gt;Testing .NET AI Agents with AgentEval&lt;/A&gt;. In one evening he caught real issues in his agents before they had a chance to escape — and walked through hardening his system prompt until the red-team attack success rate dropped from 40% to zero. Read the post; it is a great hands-on introduction.&lt;/P&gt;
&lt;P&gt;Want a full end-to-end sample? Check out ECS2026MAF in the samples folder. Or run all the samples in one shot:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;dotnet run --project samples/AgentEval.Samples&lt;/LI-CODE&gt;
&lt;P&gt;I have also taken this talk on the road — "Stop Hallucinating, Start Evaluating" at the MVP Summit, ECS 2026 and EMPOWER conferences and a handful of community events. If you want to bring it to your .NET user group, your university lab, or your faculty seminar — get in touch.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;Try AgentEval in 5 minutes&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;★ Five minutes, four commands&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1.&amp;nbsp; Install the CLI:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;dotnet tool install -g AgentEval.Cli --prerelease&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;2.&amp;nbsp; Scaffold a test dataset:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;agenteval init -o agenteval.yaml&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3.&amp;nbsp; Run an evaluation:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;agenteval eval --endpoint https://api.openai.com/v1 --model gpt-4o --dataset agenteval.yaml&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;4.&amp;nbsp; Run a red-team scan:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;agenteval redteam --endpoint ... --model gpt-4o --intensity quick&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3&gt;Where to go next&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;⭐ Star the repo: &lt;A href="https://github.com/AgentEvalHQ/AgentEval" target="_blank"&gt;https://github.com/AgentEvalHQ/AgentEval&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;🧰 CLI source: &lt;A href="https://github.com/AgentEvalHQ/AgentEval.Cli" target="_blank"&gt;https://github.com/AgentEvalHQ/AgentEval.Cli&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;📖 Read the docs: &lt;A href="https://agenteval.dev/" target="_blank"&gt;https://agenteval.dev&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;📦 NuGet: &lt;A href="https://www.nuget.org/packages/AgentEval" target="_blank"&gt;https://www.nuget.org/packages/AgentEval&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;🎓 Adopt AgentEval in your courses or research labs — fluent, type-safe, LongMemEval-comparable.&lt;/LI&gt;
&lt;LI&gt;🤝 Contribute — issues, attack probes, samples, ideas. Pull requests are welcome.&lt;/LI&gt;
&lt;LI&gt;💬 Discuss — questions and feedback on &lt;A href="https://github.com/AgentEvalHQ/AgentEval/discussions" target="_blank"&gt;https://github.com/AgentEvalHQ/AgentEval/discussions&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN lia-align-center"&gt;&lt;STRONG&gt;&lt;EM&gt;Stop guessing if your AI agent works. Start proving it. And use AgentEval to make them work better.&lt;/EM&gt;&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;H3&gt;Further reading&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;AgentEval repository: &lt;A href="https://github.com/AgentEvalHQ/AgentEval" target="_blank"&gt;https://github.com/AgentEvalHQ/AgentEval&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;AgentEval.Cli: &lt;A href="https://github.com/AgentEvalHQ/AgentEval.Cli" target="_blank"&gt;https://github.com/AgentEvalHQ/AgentEval.Cli&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Documentation: &lt;A href="https://agenteval.dev/" target="_blank"&gt;https://agenteval.dev&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Jernej Kavka — Testing .NET AI Agents with AgentEval: &lt;A href="https://jkdev.me/blog/agenteval-dotnet" target="_blank"&gt;https://jkdev.me/blog/agenteval-dotnet&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;EM&gt;Kalai, A.T., Nachum, O., Vempala, S.S., Zhang, E. "Why Language Models Hallucinate." arXiv preprint, September 2025 — &lt;/EM&gt;&lt;A href="https://arxiv.org/abs/2509.04664" target="_blank"&gt;https://arxiv.org/abs/2509.04664&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Wed, 13 May 2026 06:12:50 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/stop-hallucinating-start-evaluating-a-tour-of-agenteval/ba-p/4519068</guid>
      <dc:creator>joslat</dc:creator>
      <dc:date>2026-05-13T06:12:50Z</dc:date>
    </item>
    <item>
      <title>City simulator with AI agents for traffic congestion</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/city-simulator-with-ai-agents-for-traffic-congestion/ba-p/4517565</link>
      <description>&lt;H2&gt;Project Overview&lt;/H2&gt;
&lt;P&gt;Hi I'm &lt;A class="lia-external-url" href="https://www.linkedin.com/in/shoib1012/" target="_blank" rel="noopener"&gt;Shoib Muhammad&lt;/A&gt; an MEng Mathematical Computation student at &lt;A class="lia-external-url" href="http://www.ucl.ac.uk" target="_blank" rel="noopener"&gt;UCL&lt;/A&gt;, I set out to tackle one of the most persistent challenges in modern cities: traffic congestion. In London alone, drivers lose around 141 hours each year to congestion, with the wider UK economy losing approximately £5.1 billion in productivity. Traditional fixed-cycle traffic lights cannot adapt to changing demand, and even adaptive systems such as SCOOT and SCATS tend to focus on single intersections rather than understanding the full network.&lt;/P&gt;
&lt;P&gt;The aim of this project was to build something practical and experimental: a high fidelity simulation environment where multi agent AI systems can be tested, broken, and improved without affecting real world traffic. It is a safe space to explore how intelligent systems could one day manage entire city networks.&lt;/P&gt;
&lt;H2&gt;The Stack&lt;/H2&gt;
&lt;P&gt;The project brings together a mix of gaming, backend, and AI technologies. Microsoft Foundry using GPT-4o handles both vision and reasoning, allowing the system to count vehicles from camera feeds and make decisions from a single cloud endpoint. The Microsoft Agent Framework enables multi agent coordination, structuring a pipeline from camera input through to intersection decisions and orchestration, with up to 20 agents running in parallel. Unreal Engine 5 provides the simulation itself, modelling a 20 intersection city with realistic vehicle movement and camera views. FastAPI acts as the backend, linking everything together with schema validation, a polling loop, and a deterministic testing suite to ensure reliability.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table class="lia-border-color-21" border="1" style="width: 624px; border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;&lt;STRONG&gt;Technology&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;&lt;STRONG&gt;Core Function&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;&lt;STRONG&gt;Impact&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Microsoft Foundry (GPT-4o)&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Vision + Reasoning&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Counted vehicles from camera frames and made per-intersection decisions through one cloud endpoint.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Microsoft Agent Framework&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Multi-Agent Orchestration&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Structured the Camera → Intersection → Orchestrator pipeline and fanned out 20 parallel agent runs per cycle.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Unreal Engine 5&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;3D Simulation Environment&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Hosted the 20-intersection city, vehicle physics, and per-intersection cameras.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;FastAPI&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Backend &amp;amp; Schema Validation&lt;/P&gt;
&lt;/td&gt;&lt;td class="lia-border-color-21"&gt;
&lt;P&gt;Bridged UE5 and Azure with a polling loop, schema contracts, and a 32-test deterministic suite.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;How a Decision Cycle Flows&lt;/H2&gt;
&lt;P&gt;The simulation models a small city with 20 intersections. Each decision cycle is powered by three types of AI agents working together. A Camera Agent acts as the perception layer, using GPT-4o vision to count vehicles on each road approaching an intersection. Each intersection is controlled by its own agent, which evaluates traffic pressure and decides whether to change the signal phase. An Orchestrator Agent then coordinates these decisions across the network, ensuring that neighbouring intersections do not switch at the same time and cause gridlock.&lt;/P&gt;
&lt;P&gt;At the start of each cycle, Unreal Engine generates screenshots from each intersection along with traffic flow data. The backend processes and validates this data, then sends it to the Camera Agent. The output is distributed to all intersection agents in parallel, and their decisions are combined by the orchestrator. The final decision is written back into the simulation, updating the traffic lights. If anything fails, the system simply holds its current state and retries on the next cycle, making the system robust even with non deterministic AI outputs.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Camera Agent &lt;/STRONG&gt;– the perception layer, using GPT-4o vision to count vehicles on every approach of every intersection.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Intersection Agents (×20, parallel) &lt;/STRONG&gt;– each owns a single junction, evaluates queue pressure and recommends whether to switch the signal phase.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Orchestrator Agent &lt;/STRONG&gt;– the coordinator, applying an adjacency-aware rule that prevents two neighbouring intersections from switching simultaneously and dumping traffic into each other's red phases.&lt;/LI&gt;
&lt;/UL&gt;
&lt;img&gt;Agent Topology&lt;/img&gt;
&lt;P&gt;When a decision cycle begins, UE5 writes 20 PNG screenshots and a flow-direction JSON to a shared directory. The FastAPI backend polls the directory, validates the batch against a 20-unique-intersection coverage gate, and hands the images to the Camera Agent. Its structured output is fanned out to 20 parallel Intersection Agents through the Microsoft Agent Framework, then fanned back in to the Orchestrator. The Orchestrator's decision JSON is written back to the shared directory; UE5 polls it and updates signal phases. If any layer fails, the simulation simply holds its previous pattern and tries again on the next cycle.&lt;/P&gt;
&lt;P&gt;The JSON payload shared between every layer has to follow strict formats. This matters because GPT-4o output is non-deterministic; early iterations would return the wrong schema, alter key names or silently omit intersections it judged unimportant. Treating prompt engineering and schema validation as one component is what makes an LLM-based control loop robust enough to leave running.&amp;nbsp;&lt;/P&gt;
&lt;img&gt;Decision cycle flow diagram&lt;/img&gt;
&lt;P&gt;One of the key challenges was ensuring consistency. GPT-4o outputs are not always predictable, so strict schema validation was essential. Early versions would return incorrect formats or omit data, which made the system unreliable. Treating prompt design and validation as one combined problem was critical to making the system stable enough to run continuously.&lt;/P&gt;
&lt;H2&gt;Results&lt;/H2&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Metric&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;Baseline&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;&lt;STRONG&gt;AI-Enhanced&lt;/STRONG&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Mean throughput (300s run)&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;204.4&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;127.1&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;95% CI&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;[195.6, 213.4]&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;[121.3, 132.9]&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;Throughput dispersion (IQR)&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;≈ 240&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;≈ 120&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;col style="width: 33.33%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;Across forty simulation runs, each lasting 300 seconds, the traditional fixed cycle system outperformed the AI driven system in terms of throughput. The AI system processed fewer vehicles, with performance affected by several factors including decision latency, small errors in vehicle detection, and the use of a general purpose language model for a task that ideally requires millisecond level control.&lt;/P&gt;
&lt;P&gt;However, the numeric result was not the main success of the project. The primary goal was to explore and demonstrate what can be built using Azure AI Foundry and multi agent systems, and in that respect the project achieved its aim.&lt;/P&gt;
&lt;P&gt;Forty 300-second runs across two demand profiles, ten replicates per condition. The fixed-cycle baseline beat the AI-enhanced mode on throughput by 37.8%. Three factors drove the gap: 3–10 second decision-cycle latency, vision noise of around one vehicle per approach and using a general-purpose LLM for what is effectively a millisecond-scale control task.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The final result is not very important to the success of this project. The primary goal of this project was to demonstrate Azure AI Foundry's capabilities, which was readily achieved.&lt;/P&gt;
&lt;H2&gt;Main Deliverable&lt;/H2&gt;
&lt;P&gt;One of the most valuable outcomes of the project is the dataset produced. For every decision cycle and each intersection, the system logs a full state action outcome record. This includes screenshots, true queue lengths from the simulator, estimated queue lengths from the AI, throughput, signal states, and decision outputs. In total, the project generated around 48,000 labelled data points and associated images, producing roughly 4.8 GB of structured data.&lt;/P&gt;
&lt;P&gt;This dataset has real value beyond the simulation. It can be used to train improved vision models that are more robust to noise, or to train reinforcement learning agents that learn traffic control policies from experience rather than predefined rules.&lt;/P&gt;
&lt;P&gt;Every cycle, for every intersection, the simulation logs a state–action–outcome triple: a 512×512 screenshot, ground-truth queue lengths, GPT-4o-derived queue lengths, throughput, signal state, flow direction, and the orchestrator's recommended action. Forty runs across different modes and traffic densities produced roughly 48,000 triples and 48,000 paired screenshots, about 4.8 GB of structured, labelled data. Each screenshot has both a clean label (simulator ground truth) and a noisy label (vision estimate), making it directly usable for fine-tuning a vehicle-counting model robust to real-world camera noise. The same triples are in the format offline reinforcement learning algorithms expect.&lt;/P&gt;
&lt;H2&gt;Future Work&lt;/H2&gt;
&lt;P&gt;The next step would be to replace the intersection agents with a reinforcement learning model trained on this dataset, using traffic flow improvements and queue lengths as rewards. Another improvement would be to fine tune a dedicated vision model to replace the current GPT-4o based approach, using the existing data as training input. Expanding the simulation to include more realistic road types, such as roundabouts and T junctions, and using real traffic data to drive demand patterns would also make the system more representative of real cities.&lt;/P&gt;
&lt;P&gt;The most direct continuation is replacing the Intersection Agents with a small reinforcement learning policy trained on the dataset above, using throughput delta and the queue length as the reward signals. Beyond that, the priority extensions are:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;A fine-tuned vision model &lt;/STRONG&gt;to replace the GPT-4o vision call - the screenshot–queue-length pairs already in the dataset are the training set.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Larger, more realistic networks &lt;/STRONG&gt;with mixed intersection types (roundabouts, T-junctions) and time-varying demand curves calibrated against real traffic feeds.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Final Thoughts&lt;/H2&gt;
&lt;P&gt;This project is a great example of how modern AI tools can be combined to build complex, experimental systems. If you are a student developer interested in AI, simulation, or multi agent systems, this is a space full of opportunities to explore.&amp;nbsp; &lt;A class="lia-external-url" href="http://ai.azure.com" target="_blank" rel="noopener"&gt;Microsoft Foundry&lt;/A&gt;,&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;If you're exploring the technologies behind this project, the&lt;STRONG&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;A class="lia-external-url" href="https://azure.microsoft.com/en-us/products/ai-foundry/tools" target="_blank" rel="noopener"&gt;Microsoft's AI stack&lt;/A&gt;, &lt;A class="lia-external-url" href="https://github.com/microsoft/agent-framework.git" target="_blank" rel="noopener"&gt;Microsoft Agent Framework&lt;/A&gt;, &lt;A class="lia-external-url" href="https://github.com/leestott/agent_patterns_foundry_demo.git" target="_blank" rel="noopener"&gt;Agent Topologies&lt;/A&gt; and the &lt;A class="lia-external-url" href="https://github.com/microsoft/ai-agents-for-beginners.git" target="_blank" rel="noopener"&gt;AI Agents for Beginners&lt;/A&gt; are the resources I'd recommend.&lt;/P&gt;
&lt;P&gt;I would also like to thank my external supervisor, &lt;A href="https://techcommunity.microsoft.com/users/lee_stott/210546" target="_blank" rel="noopener" data-lia-auto-title="Lee Stott" data-lia-auto-title-active="0"&gt;&lt;STRONG&gt;Lee Stott&lt;/STRONG&gt;&lt;/A&gt;, for his guidance and support throughout the project.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://github.com/Shoib1012/Final-Year-Project.git" target="_blank" rel="noopener"&gt;Link to Project&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="http://www.linkedin.com/in/shoib1012" target="_blank" rel="noopener"&gt;Link to my LinkedIn&amp;nbsp;&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Want to Learn More&lt;/H3&gt;
&lt;P&gt;Join us on Thursday 21 May at 12pm UK time for a live session in the &lt;A class="lia-external-url" href="https://aka.ms/foundry/discord" target="_blank" rel="noopener"&gt;Microsoft Foundry Discord Community &lt;/A&gt;where we will explore how a student built a full 20 intersection AI powered traffic simulation using Azure AI Foundry, GPT-4o and the Microsoft Agent Framework. &lt;BR /&gt;&lt;BR /&gt;This is a great opportunity for student developers who want to go beyond theory and see how modern AI tools can be combined to build real, end to end systems.&lt;/P&gt;
&lt;P&gt;In this session, we will break down how multi agent systems can be used to solve complex real world problems, how Unreal Engine was used to create a realistic simulation environment, and how Microsoft Foundry enabled rapid experimentation with vision and reasoning models. You will also get a behind the scenes look at the challenges of working with non deterministic AI outputs and how to make these systems more reliable through structured design.&lt;/P&gt;
&lt;P&gt;If you are interested in AI, simulation, game development, or building intelligent systems that operate at scale, this session will give you practical ideas you can apply to your own projects. Whether you are just starting out or already experimenting with AI agents, you will leave with a clearer understanding of how to design, test, and iterate on complex systems.&lt;/P&gt;
&lt;P&gt;Bring your questions, get involved in the discussion, and connect with others in the community who are building with Microsoft Foundry.&lt;/P&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://discord.gg/rMh9PkjF?event=1502936656506781827" target="_blank"&gt;https://discord.gg/rMh9PkjF?event=1502936656506781827&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 11 May 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/city-simulator-with-ai-agents-for-traffic-congestion/ba-p/4517565</guid>
      <dc:creator>Shoib</dc:creator>
      <dc:date>2026-05-11T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Build AI RAG Apps with LangChain, Azure DocumentDB and Microsoft Foundry: Step-by-Step Guide</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-ai-rag-apps-with-langchain-azure-documentdb-and-microsoft/ba-p/4513775</link>
      <description>&lt;H3&gt;Scenario&lt;/H3&gt;
&lt;P data-start="0" data-end="694" data-is-last-node="" data-is-only-node=""&gt;Imagine you are building your company’s RAG chat application using &lt;STRONG data-start="67" data-end="91"&gt;Microsoft Foundry - Azure OpenAI&lt;/STRONG&gt;&amp;nbsp;and orchestrating the flow with &lt;STRONG data-start="124" data-end="165"&gt;LangChain&lt;/STRONG&gt;. The chat experience works, but now it needs to be grounded in your company’s data. You generate embeddings and want to store and query them without adding another database or complex sync pipeline. Instead of stitching services together, you use &lt;STRONG data-start="413" data-end="462"&gt;Azure DocumentDB (with MongoDB compatibility)&lt;/STRONG&gt; with built-in vector search to store your JSON data and embeddings in one place. You deploy the app to &lt;STRONG data-start="566" data-end="587"&gt;Azure App Service&lt;/STRONG&gt; and quickly compare vector search alone versus a full RAG pipeline, sharing it with your team for testing.&lt;/P&gt;
&lt;H3 id="what-will-you-learn"&gt;What will you learn?&lt;/H3&gt;
&lt;P&gt;In this blog, you'll learn to:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Create an Azure DocumentDB (with MongoDB compatibility) resource.&lt;/LI&gt;
&lt;LI&gt;Create an embeddings and a chat deployment in Microsoft Foundry Azure OpenAI portal.&lt;/LI&gt;
&lt;LI&gt;Create an Azure App Service website with continuous deployment from GitHub.&lt;/LI&gt;
&lt;LI&gt;Configure Azure App Service application settings to enable communication between Azure resources.&lt;/LI&gt;
&lt;LI&gt;Configure GitHub workflow to work successfully.&amp;nbsp;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;What is the main objective?&lt;/H3&gt;
&lt;P&gt;Build AI Powered RAG Application using LangChain, Microsoft Foundry Azure OpenAI, and Azure DocumentDB (with MongoDB compatibility&lt;SPAN style="font-style: var(--lia-blog-font-style); font-family: var(--lia-blog-font-family); font-size: var(--lia-bs-font-size-base);"&gt;): Step-by-Step Guide&lt;/SPAN&gt;&lt;/P&gt;
&lt;img /&gt;
&lt;H3&gt;Prerequisites&lt;/H3&gt;
&lt;UL&gt;
&lt;LI class="graf graf--p"&gt;An Azure subscription.
&lt;UL&gt;
&lt;LI&gt;If you don’t already have one, you can sign up for an&amp;nbsp;&lt;A class="markup--anchor markup--li-anchor" title="Sign up for an Azure free account" href="https://azure.microsoft.com/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener noreferrer" data-href="https://azure.microsoft.com/"&gt;Azure free account&lt;/A&gt;.&lt;/LI&gt;
&lt;LI&gt;For students, you can use the free&amp;nbsp;&lt;A class="markup--anchor markup--li-anchor" href="https://aka.ms/Azure4StudentsActivate" target="_blank" rel="noopener noreferrer" data-href="https://aka.ms/Azure4StudentsActivate"&gt;Azure for Students offer&lt;/A&gt;&amp;nbsp;which doesn’t require a credit card only your school email.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;A GitHub account.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Summary of the steps:&lt;/H3&gt;
&lt;UL&gt;
&lt;LI data-unlink="true"&gt;Step 1: Create an Azure DocumentDB (with MongoDB compatibility) resource&lt;/LI&gt;
&lt;LI data-unlink="true"&gt;Step 2: Create a Microsoft Foundry - Azure OpenAI resource and Deploy chat and embedding Models&lt;/LI&gt;
&lt;LI data-unlink="true"&gt;Step 3: Create an Azure App Service and Deploy the RAG Chat Application&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2 id="h_3686287661702581784350"&gt;Step 1: Create an Azure DocumentDB (with MongoDB compatibility) resource&lt;/H2&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Open the&amp;nbsp;Azure Portal.&lt;/LI&gt;
&lt;LI&gt;Create an Azure DocumentDB (with MongoDB compatibility) resource.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3 id="toc-hId-1515534585"&gt;Open the Azure Portal&lt;/H3&gt;
&lt;P class="graf graf--p"&gt;1. Visit the Azure Portal&amp;nbsp;&lt;A class="markup--anchor markup--p-anchor" href="https://portal.azure.com/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener nofollow noreferrer" data-href="https://portal.azure.com"&gt;https://portal.azure.com&lt;/A&gt;&amp;nbsp;in your browser&amp;nbsp;and&amp;nbsp;sign in.&lt;/P&gt;
&lt;FIGURE class="graf graf--figure"&gt;&lt;img /&gt;&lt;/FIGURE&gt;
&lt;P class="graf graf--p"&gt;Now you are inside the&amp;nbsp;&lt;STRONG class="markup--strong markup--p-strong"&gt;Azure portal&lt;/STRONG&gt;!&lt;/P&gt;
&lt;FIGURE class="graf graf--figure"&gt;&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3 id="toc-hId--291919878"&gt;Create a new Azure DocumentDB (with MongoDB compatibility) resource&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure DocumentDB (with MongoDB compatibility) resource to store your data, vector embedding, and perform vector search.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type&amp;nbsp;&lt;EM&gt;documentdb&lt;/EM&gt;&amp;nbsp;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;Azure DocumentDB (with MongoDB compatibility)&amp;nbsp;&lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;2. Select&amp;nbsp;&lt;STRONG&gt;Create&amp;nbsp;&lt;/STRONG&gt;from the toolbar to start provisioning your new cluster.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;3. Add the following information to create a resource:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; height: 258px; border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 59px;"&gt;&lt;td style="height: 59px;"&gt;Subscription&lt;/td&gt;&lt;td style="height: 59px;"&gt;Use your preferred subscription. It's advised to use the same subscription across all the resources that communicate with each other on Azure.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Resource group&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select &lt;STRONG&gt;Create new&amp;nbsp;&lt;/STRONG&gt;to create a new resource group. Enter a unique name for the resource group.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Cluster name&lt;/td&gt;&lt;td style="height: 35px;"&gt;Enter a globally unique name.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Location&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select a region close to you for the best response time. For example, Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 59px;"&gt;&lt;td style="height: 59px;"&gt;MongoDB version&lt;/td&gt;&lt;td style="height: 59px;"&gt;Select the latest available version of MongoDB&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;4. Select&amp;nbsp;&lt;STRONG&gt;Configure&lt;/STRONG&gt; to configure your cluster tier.&lt;/P&gt;
&lt;P&gt;5.&amp;nbsp;Add the following information to configure the cluster tier. You can scale it up later:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cluster tier&lt;/td&gt;&lt;td&gt;Select &lt;STRONG&gt;M25 &lt;/STRONG&gt;tier, 2 (Burstable) vCores.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Storage&lt;/td&gt;&lt;td&gt;
&lt;P&gt;Select &lt;STRONG&gt;32 GiB&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;6. Select &lt;STRONG&gt;Save&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;7. Enter the cluster &lt;STRONG&gt;Admin&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;Username&lt;/STRONG&gt; and &lt;STRONG&gt;Password&lt;/STRONG&gt; and store them in a secure location.&lt;/P&gt;
&lt;P&gt;8. Select &lt;STRONG&gt;Next&lt;/STRONG&gt; to configure the networking settings.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;9. Select &lt;STRONG&gt;Allow Public Access from Azure&lt;/STRONG&gt; services and resources within the Azure to this cluster.&lt;/P&gt;
&lt;P&gt;10. Select &lt;STRONG&gt;Add current IP address&lt;/STRONG&gt; to the firewall rules to allow local access to the cluster.&lt;/P&gt;
&lt;P&gt;11. Select&lt;STRONG&gt; Review + create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;img /&gt;
&lt;FIGURE class="graf graf--figure"&gt;
&lt;P&gt;12.&amp;nbsp;Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt; to start provisioning the resource.&lt;/P&gt;
&lt;P&gt;Note: The cluster creation can take up to 10 minutes. It's recommended to move on with the rest of the steps and get back to it later.&lt;/P&gt;
&lt;/FIGURE&gt;
&lt;H2 id="h_282699227321702581811866"&gt;Step 2: Create a Microsoft Foundry - Azure OpenAI resource and Deploy chat and embedding Models&lt;/H2&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Create a Microsoft Foundry Azure OpenAI resource.&lt;/LI&gt;
&lt;LI&gt;Create chat and embedding model deployments.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Create an Azure OpenAI resource&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure OpenAI Service resource that enables you to interact with different large language models (LLMs).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type&amp;nbsp;&lt;EM&gt;openai&lt;/EM&gt;&amp;nbsp;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;Azure OpenAI&amp;nbsp;&lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;2. Select &lt;STRONG&gt;Create&lt;/STRONG&gt; from the toolbar then select Azure OpenAI to provision a new Azure OpenAI resource.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;3. Add the following information to create a resource:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription&lt;/td&gt;&lt;td&gt;Use the same subscription you used to apply for Azure OpenAI access.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Resource group&lt;/td&gt;&lt;td&gt;Use the resource group you created in the previous step.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Region&lt;/td&gt;&lt;td&gt;Select a region close to you for the best response time. For example, Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Name&lt;/td&gt;&lt;td&gt;Enter a globally unique name.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Pricing tier&lt;/td&gt;&lt;td&gt;
&lt;DIV class="has-inner-focus"&gt;&amp;nbsp;Select &lt;STRONG&gt;S0&lt;/STRONG&gt;. Currently, this is the only available pricing tier.&lt;/DIV&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;img /&gt;
&lt;P&gt;4. Now that the basic information is added, select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt;&amp;nbsp;to confirm your details and proceed to the next page.&lt;/P&gt;
&lt;P&gt;5. Select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt;&amp;nbsp;to confirm your network details.&lt;/P&gt;
&lt;P&gt;6. Select&amp;nbsp;&lt;STRONG&gt;Next&lt;/STRONG&gt; to confirm your tag details.&lt;/P&gt;
&lt;P&gt;7. Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt; to start provisioning the resource. Wait for the deployment to finish.&lt;/P&gt;
&lt;P&gt;8. After the deployment finishes, select&amp;nbsp;&lt;STRONG&gt;Go to resource&lt;/STRONG&gt; to inspect your created resource. Here, you can manage your resource and find important information like the endpoint URL and API keys.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Create chat and embedding model deployments&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure OpenAI embedding model deployment and a chat model deployment.&amp;nbsp;Creating a deployment on your previously provisioned resource allows you to generate text embeddings (i.e. numerical representation for text) and have a natural language conversation with your data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Select&amp;nbsp;&lt;STRONG&gt;Go to Foundry portal&amp;nbsp;&lt;/STRONG&gt;from the toolbar to open the studio.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;2. Select&amp;nbsp;&lt;STRONG&gt;Deployments&amp;nbsp;&lt;/STRONG&gt;from the &lt;STRONG&gt;Shared resources&lt;/STRONG&gt; left side menu to go to the deployments tab.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;3. Select&amp;nbsp;&lt;STRONG&gt;+ Deploy model &lt;/STRONG&gt;from the toolbar then select&lt;STRONG&gt; Deploy base model&lt;/STRONG&gt; from the options. A&lt;STRONG&gt;&amp;nbsp;Deploy model&lt;/STRONG&gt; window opens.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;4. Type&amp;nbsp;&lt;EM&gt;gpt-4o-mini&amp;nbsp;&lt;/EM&gt;to search for the model then select it then select&lt;STRONG&gt; Use model&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;5. Select &lt;STRONG&gt;Continue with existing setup&lt;/STRONG&gt; to proceed to next step.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;6. Refresh page and repeat previous steps to select the model then select &lt;STRONG&gt;Confirm&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;7. Review selected options then select &lt;STRONG&gt;Deploy&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;8. Select&amp;nbsp;&lt;STRONG&gt;+ Deploy model &lt;/STRONG&gt;from the toolbar then select&lt;STRONG&gt; Deploy base model&lt;/STRONG&gt; from the options. A&lt;STRONG&gt;&amp;nbsp;Deploy model&lt;/STRONG&gt; window opens.&lt;/P&gt;
&lt;P&gt;9. Type&amp;nbsp;&lt;EM&gt;text-embedding-3-small &lt;/EM&gt;to search for the model then select it then select&lt;STRONG&gt; Confirm&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;10. Review selected options then select &lt;STRONG&gt;Deploy&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;H2 id="h_637567739611702581961458"&gt;Step 3: Create an Azure App Service and&amp;nbsp;Deploy the RAG Chat Application&lt;/H2&gt;
&lt;DIV&gt;
&lt;P&gt;In this step, you'll:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Fork the sample repository on GitHub.&lt;/LI&gt;
&lt;LI&gt;Create an Azure App Service resource with a deployment from GitHub.&lt;/LI&gt;
&lt;LI&gt;Modify Azure App Service Application settings in the Azure portal.&lt;/LI&gt;
&lt;LI&gt;Configure the workflow to deploy your application from GitHub.&lt;/LI&gt;
&lt;LI&gt;Test the website before and after adding the data.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Fork the Sample Repository on GitHub&lt;/H3&gt;
&lt;P&gt;In this step, you create a copy from the source code on your GitHub account to be able to edit it and use it later.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Visit the sample&amp;nbsp;&lt;A class="lia-external-url" href="https://github.com/Azure-Samples/Cosmic-Food-RAG-app?wt.mc_id=studentamb_71460" target="_blank" rel="noopener noreferrer" data-href="https://portal.azure.com"&gt;github.com/Azure-Samples/Cosmic-Food-RAG-app&lt;/A&gt; in your browser and sign in.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;2. Select&amp;nbsp;&lt;STRONG&gt;Fork&amp;nbsp;&lt;/STRONG&gt;from the top of the sample page.&lt;/P&gt;
&lt;P&gt;3. Select an owner for the fork then, select&amp;nbsp;&lt;STRONG&gt;Create fork&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;H3&gt;Create an Azure App Service resource with a deployment from GitHub&lt;/H3&gt;
&lt;P&gt;In this step, you create an Azure App service resource and connect it with your GitHub account to deploy a Python application.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp;Type &lt;EM&gt;app service&amp;nbsp;&lt;/EM&gt;in the&amp;nbsp;&lt;STRONG&gt;search bar&lt;/STRONG&gt;&amp;nbsp;at the top of the portal page and select&amp;nbsp;&lt;STRONG&gt;App Services&amp;nbsp;&lt;/STRONG&gt;from the available options.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;2. Select &lt;STRONG&gt;Create Web App&lt;/STRONG&gt; from the toolbar to start provisioning a new web application.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;3.&amp;nbsp;&amp;nbsp;Add the following information to fill in the basic configuration of the application:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Subscription&lt;/td&gt;&lt;td&gt;Use the same subscription you used to apply for Azure OpenAI access.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Resource group&lt;/td&gt;&lt;td&gt;Use the same resource group you created before.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Name&lt;/td&gt;&lt;td&gt;Enter a unique name for your website. For example,&amp;nbsp;&lt;STRONG&gt;cosmic-food-rag&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Publish?&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;Code&lt;/STRONG&gt;. This option specifies whether your deployment consists of code or a container.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Runtime stack&lt;/td&gt;&lt;td&gt;Select &lt;STRONG&gt;Python 3.12&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Operating System&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;Linux&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Region&lt;/td&gt;&lt;td&gt;Select&amp;nbsp;&lt;STRONG&gt;UK South&lt;/STRONG&gt;. This is the region where the rest of the resources you created reside.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;P&gt;4. Add the following information to create the app service plan. You can scale it up later:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 62.87037%; height: 133px; border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 59px;"&gt;&lt;td style="height: 59px;"&gt;Linux Plan&lt;/td&gt;&lt;td style="height: 59px;"&gt;Select a pre-existing plan or create a new plan.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 39px;"&gt;&lt;td style="height: 39px;"&gt;Pricing Plan&lt;/td&gt;&lt;td style="height: 39px;"&gt;
&lt;P&gt;&amp;nbsp;Select &lt;STRONG&gt;Basic B1&lt;/STRONG&gt;.&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 49.926362%" /&gt;&lt;col style="width: 49.926362%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;P&gt;5. Select &lt;STRONG&gt;Deployment&lt;/STRONG&gt; from the toolbar to move to the deployment configuration tab.&lt;/P&gt;
&lt;P&gt;6. Add the following information to enable continuous deployment from GitHub:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; height: 234px; border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;What&lt;/td&gt;&lt;td style="height: 35px;"&gt;Value&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Continuous deployment&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select&lt;STRONG&gt;&amp;nbsp;Enable&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;GitHub account&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select your GitHub account.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 59px;"&gt;&lt;td style="height: 59px;"&gt;Organization&lt;/td&gt;&lt;td style="height: 59px;"&gt;Select your organization. If you are using your personal account then select it.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Repository&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select&lt;STRONG&gt; Cosmic-Food-RAG-app&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;Branch&lt;/td&gt;&lt;td style="height: 35px;"&gt;Select&amp;nbsp;&lt;STRONG&gt;main&lt;/STRONG&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;P&gt;7. Select &lt;STRONG&gt;Review + create&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;8. Confirm your configuration settings and select&amp;nbsp;&lt;STRONG&gt;Create&lt;/STRONG&gt; to start provisioning the resource. Wait for the deployment to finish.&lt;/P&gt;
&lt;P&gt;9. After the deployment finishes, select&amp;nbsp;&lt;STRONG&gt;Go to resource&lt;/STRONG&gt; to inspect your created resource. Here, you can manage your resource and find important information like the application settings and logs.&lt;/P&gt;
&lt;img /&gt;
&lt;H3 id="toc-hId--503538492"&gt;Modify Azure App service Application settings in the Azure portal&lt;/H3&gt;
&amp;nbsp;In this step, you configure the Application settings to make the website able to communicate with other cloud resources.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. In the Web App resource, select&amp;nbsp;&lt;STRONG&gt;Environment variables&lt;/STRONG&gt;&amp;nbsp;from the left side menu.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV&gt;2. Select&amp;nbsp;&lt;STRONG&gt;+ Add&lt;/STRONG&gt;&amp;nbsp;to add new environment variables to the function configuration.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;3. Add the following names and values one by one and select&amp;nbsp;&lt;STRONG&gt;Ok&lt;/STRONG&gt;. Make sure to add your own values.&lt;/DIV&gt;
&lt;DIV&gt;These application settings are for the Azure OpenAI resources that you created:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 77.12963%; height: 210px; border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;What&lt;/STRONG&gt;&lt;/td&gt;&lt;td style="height: 35px;"&gt;&lt;STRONG&gt;Value&lt;/STRONG&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;OPENAI_API_VERSION&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;2024-10-21&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;AZURE_OPENAI_CHAT_DEPLOYMENT_NAME&lt;/td&gt;&lt;td style="height: 35px;"&gt;
&lt;P&gt;gpt-4o-mini&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_OPENAI_CHAT_MODEL_NAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;gpt-4o-mini&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME&lt;/td&gt;&lt;td style="height: 35px;"&gt;
&lt;P&gt;text-embedding-3-small&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_OPENAI_EMBEDDINGS_MODEL_NAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;text-embedding-3-small&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_OPENAI_EMBEDDINGS_DIMENSIONS&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;1536&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;AZURE_OPENAI_DEPLOYMENT_NAME&lt;/td&gt;&lt;td style="height: 35px;"&gt;&amp;lt;azureOpenAiResourceName&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;AZURE_OPENAI_ENDPOINT&lt;/td&gt;&lt;td style="height: 35px;"&gt;https://&amp;lt;azureOpenAiResourceName&amp;gt;.openai.azure.com/&lt;/td&gt;&lt;/tr&gt;&lt;tr style="height: 35px;"&gt;&lt;td style="height: 35px;"&gt;AZURE_OPENAI_API_KEY&lt;/td&gt;&lt;td style="height: 35px;"&gt;&amp;lt;azureOpenAiResourceKey&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 49.939976%" /&gt;&lt;col style="width: 49.939976%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;You can get the Azure OpenAI key from the Azure OpenAI resource page.&lt;/DIV&gt;
&lt;DIV&gt;Select &lt;STRONG&gt;Keys and Endpoint&lt;/STRONG&gt;&amp;nbsp; from the &lt;STRONG&gt;Resource Management&lt;/STRONG&gt; section and copy any of the available keys.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;DIV&gt;These application settings are for Azure DocumentDB (with MongoDB compatibility):&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="width: 100%; border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_USERNAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;&amp;lt;documentUsername&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_PASSWORD&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;&amp;lt;documentPassword&amp;gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_CONNECTION_STRING&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;
&lt;P&gt;mongodb+srv://&amp;lt;user&amp;gt;:&amp;lt;password&amp;gt;@&amp;lt;clusterName&amp;gt;.global.mongocluster.cosmos.azure.com/?tls=true&amp;amp;authMechanism=SCRAM-SHA-256&amp;amp;retrywrites=false&amp;amp;maxIdleTimeMS=120000&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 30%" /&gt;&lt;col style="width: 70%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;You can get the DocumentDB connection string from the Azure DocumentDB (with MongoDB compatibility) resource page.&lt;/DIV&gt;
&lt;DIV&gt;Select &lt;STRONG&gt;Connection strings&amp;nbsp;&lt;/STRONG&gt;and copy the connection string. Make sure to replace the user and password with the ones you created.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;These application settings are &lt;STRONG&gt;new&lt;/STRONG&gt; and are used for resources that will be created when the application starts you can use any value for them:&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_DATABASE_NAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;&amp;lt;documentDatabaseName&amp;gt; ex.&amp;nbsp;
&lt;P&gt;CosmicDB&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_COLLECTION_NAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;&amp;lt;documentContainerName&amp;gt; ex.&amp;nbsp;
&lt;P&gt;CosmicFoodCollection&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;AZURE_COSMOS_INDEX_NAME&lt;/P&gt;
&lt;/td&gt;&lt;td&gt;&amp;lt;documentIndexName&amp;gt; ex.&amp;nbsp;
&lt;P&gt;CosmicIndex&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 50.00%" /&gt;&lt;col style="width: 50.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;4. Select&amp;nbsp;&lt;STRONG&gt;Apply&amp;nbsp;&lt;/STRONG&gt;to save your newly added environment variables.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;5. Select &lt;STRONG&gt;Configuration &lt;/STRONG&gt;then &lt;STRONG&gt;Stack settings&lt;/STRONG&gt; to edit the application startup command.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;6. Type&amp;nbsp;&lt;EM&gt;entrypoint.sh&lt;/EM&gt; in the startup command field then select &lt;STRONG&gt;Apply&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;H3&gt;Configure the Workflow to deploy your application from GitHub&lt;/H3&gt;
In this step, you modify the GitHub deployment workflow to point to the folder that contains the application.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. Visit your forked repository on GitHub and notice the failing workflow.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;2. Open the workflow file&amp;nbsp;&lt;EM&gt;.github/workflows/main_cosmic-food-rag.yml&lt;/EM&gt;.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P&gt;3.&amp;nbsp;&amp;nbsp;Open the file and select the pen icon to edit it.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;4. Modify line 41 from &lt;EM&gt;.&lt;/EM&gt; to &lt;EM&gt;src/.&lt;/EM&gt;&lt;/P&gt;
&lt;img /&gt;&lt;/DIV&gt;
&lt;P&gt;5. Remove the optional&amp;nbsp;&lt;STRONG&gt;Local Build Section&amp;nbsp;&lt;/STRONG&gt;since the application already has tests that cover this part.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;6. Add this section to Install Node 22 and build the static frontend.&lt;/P&gt;
&lt;img /&gt;
&lt;DIV&gt;
&lt;P&gt;7. Select &lt;STRONG&gt;Commit changes&lt;/STRONG&gt;, and review your commit message and description. Select&amp;nbsp;&lt;STRONG&gt;Commit changes&lt;/STRONG&gt;.&lt;/P&gt;
&lt;img /&gt;
&lt;P class="lia-clear-both"&gt;The final workflow file should look like this:&lt;/P&gt;
&lt;LI-CODE lang="yaml"&gt;# Docs for the Azure Web Apps Deploy action: https://github.com/Azure/webapps-deploy
# More GitHub Actions for Azure: https://github.com/Azure/actions
# More info on Python, GitHub Actions, and Azure App Service: https://aka.ms/python-webapps-actions

name: Build and deploy Python app to Azure Web App - cosmic-food-rag

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read #This is required for actions/checkout

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node 22
        uses: actions/setup-node@v6
        with:
          node-version: 22

      - name: Install Node Packages &amp;amp; Build Static Site
        run: cd frontend &amp;amp;&amp;amp; npm install &amp;amp;&amp;amp; npm run build

      # By default, when you enable GitHub CI/CD integration through the Azure portal, the platform automatically sets the SCM_DO_BUILD_DURING_DEPLOYMENT application setting to true. This triggers the use of Oryx, a build engine that handles application compilation and dependency installation (e.g., pip install) directly on the platform during deployment. Hence, we exclude the antenv virtual environment directory from the deployment artifact to reduce the payload size. 
      - name: Upload artifact for deployment jobs
        uses: actions/upload-artifact@v4
        with:
          name: python-app
          path: |
            src/
            !antenv/

      # 🚫 Opting Out of Oryx Build
      # If you prefer to disable the Oryx build process during deployment, follow these steps:
      # 1. Remove the SCM_DO_BUILD_DURING_DEPLOYMENT app setting from your Azure App Service Environment variables.
      # 2. Refer to sample workflows for alternative deployment strategies: https://github.com/Azure/actions-workflow-samples/tree/master/AppService
      

  deploy:
    runs-on: ubuntu-latest
    needs: build
    permissions:
      id-token: write #This is required for requesting the JWT
      contents: read #This is required for actions/checkout

    steps:
      - name: Download artifact from build job
        uses: actions/download-artifact@v4
        with:
          name: python-app
      
      - name: Login to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZUREAPPSERVICE_CLIENTID_5672547ED09F46D59DD431ACF5A29F28 }}
          tenant-id: ${{ secrets.AZUREAPPSERVICE_TENANTID_0059913572C8467882D3999D0E0DD5B8 }}
          subscription-id: ${{ secrets.AZUREAPPSERVICE_SUBSCRIPTIONID_7C42E3352C5D47F084CB0CD14F549D27 }}

      - name: 'Deploy to Azure Web App'
        uses: azure/webapps-deploy@v3
        id: deploy-to-webapp
        with:
          app-name: 'cosmic-food-rag'
          slot-name: 'Production'
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
8. Select &lt;STRONG&gt;Actions&lt;/STRONG&gt; to review the workflow run status.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;img /&gt;
&lt;H3&gt;Test the website before and After adding the data&lt;/H3&gt;
In this step, you test the application before adding the data, add the data, and test again.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;1. Select the workflow name to open it and get the website URL.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;
&lt;P&gt;2. Select any of the suggested messages or type your own and it should respond with &lt;EM&gt;No results found&lt;/EM&gt;.&lt;/P&gt;
&lt;img /&gt;3. Navigate to your Azure App Service resource page and select &lt;STRONG&gt;SSH&amp;nbsp;&lt;/STRONG&gt;then select Go to open a new SSH page.&lt;/DIV&gt;
&lt;DIV&gt;&lt;img /&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;P&gt;4. In the SSH terminal, run these commands:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;
&lt;P&gt;&lt;CODE&gt;uv sync --active&lt;/CODE&gt;&lt;/P&gt;
&lt;P&gt;&lt;CODE&gt;uv run --active ./scripts/add_data.py --file="./data/food_items.json"&lt;/CODE&gt;&lt;/P&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 100.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;img /&gt;
&lt;P&gt;5. Navigate back to the live website and type in the chat message&lt;EM&gt;&amp;nbsp;Do you have any vegan food dishes?&lt;/EM&gt; and it should respond with the correct answer now.&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;Congratulations!! You successfully built the full application.&lt;/P&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;H2 id="toc-hId-1503395954"&gt;Clean Up&lt;/H2&gt;
&lt;P&gt;Once you finish experimenting on&amp;nbsp;Microsoft Azure you might want to delete the resources to not consume any more money from your subscription.&lt;/P&gt;
&lt;P&gt;You can delete the resource group and it will delete everything inside it or delete the resources one by one that's totally up to you.&lt;/P&gt;
&lt;/DIV&gt;
&lt;H2&gt;Conclusion&lt;/H2&gt;
&lt;P&gt;Congratulations! You've learned how to create an Azure DocumentDB (with MongoDB compatibility) cluster, how to create a Microsoft Foundry - Azure OpenAI resource, how to deploy an embedding model and a chat model from the Foundry portal, how to create an Azure App Service and configure continuous deployment with GitHub, and how to modify application settings to enable the communication across Azure resources. By using these technologies, you can build a RAG chat application with the option to perform vector search too over your own data and provide grounded (relevant) responses.&lt;/P&gt;
&lt;H2&gt;Next steps&lt;/H2&gt;
&lt;H3&gt;Documentation&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/foundry/foundry-models/concepts/models-sold-directly-by-azure?wt.mc_id=studentamb_71460&amp;amp;tabs=global-standard-aoai%2Cglobal-standard&amp;amp;pivots=azure-openai#azure-openai-in-microsoft-foundry-models" target="_blank" rel="noopener"&gt;Azure OpenAI in Microsoft Foundry models&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/ai-services/openai/concepts/understand-embeddings?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Understand embeddings in Azure OpenAI in Microsoft Foundry Models (classic)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/documentdb/overview?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Azure DocumentDB (with MongoDB compatibility) documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/en-gb/azure/documentdb/vector-search?tabs=diskann?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Integrated vector store in Azure DocumentDB&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://docs.langchain.com/oss/python/langchain/overview/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;LangChain Python documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Training Content&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A class="lia-external-url" href="https://learn.microsoft.com/training/paths/develop-generative-ai-apps/?wt.mc_id=studentamb_71460" target="_blank" rel="noopener"&gt;Develop generative AI apps in Azure&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="graf graf--p"&gt;Found this useful? Share it with others and follow me to get updates on:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Twitter (&lt;A href="https://twitter.com/john00isaac?wt.mc_id=studentamb_71460" target="_blank" rel="nofollow noopener noreferrer"&gt;twitter.com/john00isaac&lt;/A&gt;)&lt;/LI&gt;
&lt;LI&gt;LinkedIn (&lt;A href="https://www.linkedin.com/in/john0isaac/?wt.mc_id=studentamb_71460" target="_blank" rel="nofollow noopener noreferrer"&gt;linkedin.com/in/john0isaac&lt;/A&gt;)&lt;/LI&gt;
&lt;/UL&gt;
&lt;BLOCKQUOTE class="graf graf--pullquote"&gt;Feel free to share your comments and/or inquiries in the comment section below..
&lt;P class="1702586402308"&gt;See you in future&amp;nbsp;demos!&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Mon, 27 Apr 2026 08:41:21 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-ai-rag-apps-with-langchain-azure-documentdb-and-microsoft/ba-p/4513775</guid>
      <dc:creator>JohnAziz</dc:creator>
      <dc:date>2026-04-27T08:41:21Z</dc:date>
    </item>
    <item>
      <title>From Demo to Production: Building Microsoft Foundry Hosted Agents with .NET</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/from-demo-to-production-building-microsoft-foundry-hosted-agents/ba-p/4513718</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2&gt;The Gap Between a Demo and a Production Agent&lt;/H2&gt;
&lt;P&gt;Let's be honest. Getting an AI agent to work in a demo takes an afternoon. Getting it to work reliably in production, tested, containerised, deployed, observable, and maintainable by a team. is a different problem entirely.&lt;/P&gt;
&lt;P&gt;Most tutorials stop at the point where the agent prints a response in a terminal. They don't show you how to structure your code, cover your tools with tests, wire up CI, or deploy to a managed runtime with a proper lifecycle. That gap between prototype and production is where developer teams lose weeks.&lt;/P&gt;
&lt;P&gt;Microsoft Foundry Hosted Agents close that gap with a managed container runtime for your own custom agent code. And the &lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET" target="_blank" rel="noopener"&gt;Hosted Agents Workshop for .NET&lt;/A&gt; gives you a complete, copy-paste-friendly path through the entire journey. from local run to deployed agent to chat UI, in six structured labs using .NET 10.&lt;/P&gt;
&lt;P&gt;This post walks you through what the workshop delivers, what you will build, and why the patterns it teaches matter far beyond the workshop itself.&lt;/P&gt;
&lt;H2&gt;What Is a Microsoft Foundry Hosted Agent?&lt;/H2&gt;
&lt;P&gt;Microsoft Foundry supports two distinct agent types, and understanding the difference is the first decision you will make as an agent developer.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Prompt agents&lt;/STRONG&gt; are lightweight agents backed by a model deployment and a system prompt. No custom code required. Ideal for simple Q&amp;amp;A, summarisation, or chat scenarios where the model's built-in reasoning is sufficient.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Hosted agents&lt;/STRONG&gt; are container-based agents that run &lt;EM&gt;your own code&lt;/EM&gt;&amp;nbsp; .NET, Python, or any framework you choose&amp;nbsp; inside Foundry's managed runtime. You control the logic, the tools, the data access, and the orchestration.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;When your scenario requires custom tool integrations, deterministic business logic, multi-step workflow orchestration, or private API access, a hosted agent is the right choice. The Foundry runtime handles the managed infrastructure; you own the code.&lt;/P&gt;
&lt;P&gt;For the official deployment reference, see &lt;A href="https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt;Deploy a hosted agent to Foundry Agent Service&lt;/A&gt; on Microsoft Learn.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;What the Workshop Delivers&lt;/H2&gt;
&lt;P&gt;The &lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET" target="_blank" rel="noopener"&gt;Hosted Agents Workshop for .NET&lt;/A&gt; is a beginner-friendly, hands-on workshop that takes you through the full development and deployment path for a real hosted agent. It is structured around a concrete scenario: a &lt;STRONG&gt;Hosted Agent Readiness Coach&lt;/STRONG&gt; that helps delivery teams answer questions like:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Should this use case start as a prompt agent or a hosted agent?&lt;/LI&gt;
&lt;LI&gt;What should a pilot launch checklist include?&lt;/LI&gt;
&lt;LI&gt;How should a team troubleshoot common early setup problems?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The scenario is purposefully practical. It is not a toy chatbot. It is the kind of tool a real team would build and hand to other engineers, which means it needs to be testable, deployable, and extensible.&lt;/P&gt;
&lt;P&gt;The workshop covers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Local development and validation with .NET 10&lt;/LI&gt;
&lt;LI&gt;Copilot-assisted coding with repo-specific instructions&lt;/LI&gt;
&lt;LI&gt;Deterministic tool implementation with xUnit test coverage&lt;/LI&gt;
&lt;LI&gt;CI pipeline validation with GitHub Actions&lt;/LI&gt;
&lt;LI&gt;Secure deployment to Azure Container Registry and Microsoft Foundry&lt;/LI&gt;
&lt;LI&gt;Chat UI integration using Blazor&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;What You Will Build&lt;/H2&gt;
&lt;P&gt;By the end of the workshop, you will have a code-based hosted agent that exposes an OpenAI Responses-compatible &lt;CODE&gt;/responses&lt;/CODE&gt; endpoint on port &lt;CODE&gt;8088&lt;/CODE&gt;.&lt;/P&gt;
&lt;P&gt;The agent is backed by three deterministic local tools, implemented in &lt;CODE&gt;WorkshopLab.Core&lt;/CODE&gt;:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;RecommendImplementationShape&lt;/STRONG&gt; — analyses a scenario and recommends hosted or prompt agent based on its requirements&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;BuildLaunchChecklist&lt;/STRONG&gt; — generates a pilot launch checklist for a given use case&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;TroubleshootHostedAgent&lt;/STRONG&gt; — returns structured troubleshooting guidance for common setup problems&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;These tools are deterministic by design, no LLM call required to return a result. That choice makes them fast, predictable, and fully testable, which is the right architecture for business logic in a production agent.&lt;/P&gt;
&lt;P&gt;The end-to-end architecture looks like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;H2&gt;The Hands-On Journey: Lab by Lab&lt;/H2&gt;
&lt;P&gt;The workshop follows a deliberate &lt;STRONG&gt;build → validate → ship&lt;/STRONG&gt; progression. Each lab has a clear outcome. You do not move forward until the previous checkpoint passes.&lt;/P&gt;
&lt;H3&gt;Lab 0 — Setup and Local Run&lt;/H3&gt;
&lt;P&gt;Open the repo in VS Code or a GitHub Codespace, configure your Microsoft Foundry project endpoint and model deployment name, then run the agent locally. By the end of Lab 0, your agent is listening on &lt;CODE&gt;http://localhost:8088/responses&lt;/CODE&gt; and responding to test requests.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;dotnet restore
dotnet build
dotnet run --project src/WorkshopLab.AgentHost&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Test it with a single PowerShell call:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Invoke-RestMethod -Method Post `
    -Uri "http://localhost:8088/responses" `
    -ContentType "application/json" `
    -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-0-foundry-setup/lab-0_readme.md" target="_blank" rel="noopener"&gt;Lab 0 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Lab 1 — Copilot Customisation&lt;/H3&gt;
&lt;P&gt;Configure repo-specific GitHub Copilot instructions so that Copilot understands the hosted-agent patterns used in this project. You will also add a Copilot review skill tailored to hosted agent code reviews. This step means every code suggestion you receive from Copilot is contextualised to the workshop scenario rather than giving generic .NET advice.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-1-copilot-config/lab-1_readme.md" target="_blank" rel="noopener"&gt;Lab 1 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Lab 2 — Tool Implementation&lt;/H3&gt;
&lt;P&gt;Extend one of the deterministic tools in &lt;CODE&gt;WorkshopLab.Core&lt;/CODE&gt; with a real feature change. The suggested change adds a stronger recommendation path to &lt;CODE&gt;RecommendImplementationShape&lt;/CODE&gt; for scenarios that require all three hosted-agent strengths simultaneously.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;// In RecommendImplementationShape — add before the final return:
if (requiresCode &amp;amp;&amp;amp; requiresTools &amp;amp;&amp;amp; requiresWorkflow)
{
    return string.Join(Environment.NewLine,
    [
        $"Recommended implementation: Hosted agent (full-stack)",
        $"Scenario goal: {goal}",
        "Why: the scenario requires custom code, external tool access, and " +
        "multi-step orchestration — all three hosted-agent strengths.",
        "Suggested next step: start with a code-based hosted agent, register " +
        "local tools for each integration, and add a workflow layer."
    ]);
}&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You then write an xUnit test to cover it, run &lt;CODE&gt;dotnet test&lt;/CODE&gt;, and validate the change against a live &lt;CODE&gt;/responses&lt;/CODE&gt; call. This is the workshop's most important teaching moment: &lt;STRONG&gt;every tool change is covered by a test before it ships&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-2-implementation-shape/lab-2_readme.md" target="_blank" rel="noopener"&gt;Lab 2 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Lab 3 — CI Validation&lt;/H3&gt;
&lt;P&gt;Wire up a GitHub Actions workflow that builds the solution, runs the test suite, and validates that the agent container builds cleanly. No manual steps — if a change breaks the build or a test, CI catches it before any deployment happens.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-3-ci/lab-3_readme.md" target="_blank" rel="noopener"&gt;Lab 3 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Lab 4 — Deployment to Microsoft Foundry&lt;/H3&gt;
&lt;P&gt;Use the Azure Developer CLI (&lt;CODE&gt;azd&lt;/CODE&gt;) to provision an Azure Container Registry, publish the agent image, and deploy the hosted agent to Microsoft Foundry. The workshop separates provisioning from deployment deliberately: &lt;CODE&gt;azd&lt;/CODE&gt; owns the Azure resources; the Foundry control plane deployment is an explicit, intentional final step that depends on your real project endpoint and &lt;CODE&gt;agent.yaml&lt;/CODE&gt; manifest values.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-4-deploy/lab-4_readme.md" target="_blank" rel="noopener"&gt;Lab 4 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;H3&gt;Lab 5 — Chat UI Integration&lt;/H3&gt;
&lt;P&gt;Connect a Blazor chat UI to the deployed hosted agent and validate end-to-end responses. By the end of Lab 5, you have a fully functioning agent accessible through a real UI, calling your deterministic tools via the Foundry control plane.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/lab-5-ui/lab-5_readme.md" target="_blank" rel="noopener"&gt;Lab 5 instructions →&lt;/A&gt;&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Key Concepts to Take Away&lt;/H2&gt;
&lt;P&gt;The workshop teaches concrete patterns that apply well beyond this specific scenario.&lt;/P&gt;
&lt;H3&gt;Code-first agent design&lt;/H3&gt;
&lt;P&gt;Prompt-only agents are fast to build but hard to test and reason about at scale. A hosted agent with code-backed tools gives you something you can unit test, refactor, and version-control like any other software.&lt;/P&gt;
&lt;H3&gt;Deterministic tools and testability&lt;/H3&gt;
&lt;P&gt;The workshop explicitly avoids LLM calls inside tool implementations. Deterministic tools return predictable outputs for a given input, which means you can write fast, reliable unit tests for them. This is the right pattern for business logic. Reserve LLM calls for the reasoning layer, not the execution layer.&lt;/P&gt;
&lt;H3&gt;CI/CD for agent systems&lt;/H3&gt;
&lt;P&gt;AI agents are software. They deserve the same build-test-deploy discipline as any other service. Lab 3 makes this concrete: you cannot ship without passing CI, and CI validates the container as well as the unit tests.&lt;/P&gt;
&lt;H3&gt;Deployment separation&lt;/H3&gt;
&lt;P&gt;The workshop's split between &lt;CODE&gt;azd&lt;/CODE&gt; provisioning and Foundry control-plane deployment is not arbitrary. It reflects the real operational boundary: your Azure resources are long-lived infrastructure; your agent deployment is a lifecycle event tied to your project's specific endpoint and manifest. Keeping them separate reduces accidents and makes rollbacks easier.&lt;/P&gt;
&lt;H3&gt;Observability and the validation mindset&lt;/H3&gt;
&lt;P&gt;Every lab ends with an explicit checkpoint. The culture the workshop builds is: &lt;EM&gt;prove it works before moving on&lt;/EM&gt;. That mindset is more valuable than any specific tool or command in the labs.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Why Hosted Agents Are Worth the Investment&lt;/H2&gt;
&lt;P&gt;The managed runtime in Microsoft Foundry removes the infrastructure overhead that makes custom agent deployment painful. You do not manage Kubernetes clusters, configure ingress rules, or handle TLS termination. Foundry handles the hosting; you handle the code.&lt;/P&gt;
&lt;P&gt;This matters most for teams making the transition from demo to production. A prompt agent is an afternoon's work. A hosted agent with proper CI, tested tools, and a deployment pipeline is a week's work done properly once, instead of several weeks of firefighting done poorly repeatedly.&lt;/P&gt;
&lt;P&gt;The Foundry agent lifecycle —&amp;gt; create, update, version, deploy —&amp;gt;also gives you the controls you need to manage agents in a real environment: staged rollouts, rollback capability, and clear separation between agent versions. For the full deployment guide, see &lt;A href="https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt;Deploy a hosted agent to Foundry Agent Service&lt;/A&gt;.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;From Workshop to Real Project&lt;/H2&gt;
&lt;P&gt;This workshop is not just a learning exercise. The repository structure, the tooling choices, and the CI/CD patterns are a reference implementation.&lt;/P&gt;
&lt;P&gt;The patterns you can lift directly into a production project include:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The &lt;CODE&gt;WorkshopLab.Core&lt;/CODE&gt; / &lt;CODE&gt;WorkshopLab.AgentHost&lt;/CODE&gt; separation between business logic and agent hosting&lt;/LI&gt;
&lt;LI&gt;The &lt;CODE&gt;agent.yaml&lt;/CODE&gt; manifest pattern for declarative Foundry deployment&lt;/LI&gt;
&lt;LI&gt;The GitHub Actions workflow structure for build, test, and container validation&lt;/LI&gt;
&lt;LI&gt;The &lt;CODE&gt;azd&lt;/CODE&gt; + ACR pattern for image publishing without requiring Docker Desktop locally&lt;/LI&gt;
&lt;LI&gt;The Blazor chat UI as a starting point for internal tooling or developer-facing applications&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The scenario, a readiness coach for hosted agents. This is also something teams evaluating Microsoft Foundry will find genuinely useful. It answers exactly the questions that come up when onboarding a new team to the platform.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Common Mistakes When Building Hosted Agents&lt;/H2&gt;
&lt;P&gt;Having run workshops and spoken with developer teams building on Foundry, a few patterns come up repeatedly:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Skipping local validation before containerising.&lt;/STRONG&gt; Always validate the &lt;CODE&gt;/responses&lt;/CODE&gt; endpoint locally first. Debugging inside a container is slower and harder than debugging locally.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Putting business logic inside the LLM call.&lt;/STRONG&gt; If the answer to a user query can be determined by code, use code. Reserve the model for reasoning, synthesis, and natural language output.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Treating CI as optional.&lt;/STRONG&gt; Agent code changes break things just like any other code change. If you do not have CI catching regressions, you will ship them.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Conflating provisioning and deployment.&lt;/STRONG&gt; Recreating Azure resources on every deploy is slow and error-prone. Provision once with &lt;CODE&gt;azd&lt;/CODE&gt;; deploy agent versions as needed through the Foundry control plane.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Not having a rollback plan.&lt;/STRONG&gt; The Foundry agent lifecycle supports versioning. Use it. Know how to roll back to a previous version before you deploy to production.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;Get Started&lt;/H2&gt;
&lt;P&gt;The workshop is open source, beginner-friendly, and designed to be completed in a single day. You need a .NET 10 SDK, an Azure subscription, access to a Microsoft Foundry project, and a GitHub account.&lt;/P&gt;
&lt;P&gt;Clone the repository, follow the labs in order, and by the end you will have a production-ready reference implementation that your team can extend and adapt for real scenarios.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET" target="_blank" rel="noopener"&gt;Clone the workshop repository →&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Here is the quick start to prove the solution works locally before you begin the full lab sequence:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;git clone https://github.com/microsoft/Hosted_Agents_Workshop_dotNET.git
cd Hosted_Agents_Workshop_dotNET

# Set your Foundry project endpoint and model deployment
$env:AZURE_AI_PROJECT_ENDPOINT = "https://&amp;lt;resource&amp;gt;.services.ai.azure.com/api/projects/&amp;lt;project&amp;gt;"
$env:MODEL_DEPLOYMENT_NAME     = "gpt-4.1-mini"

# Build and run
dotnet restore
dotnet build
dotnet run --project src/WorkshopLab.AgentHost&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then send your first request:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;Invoke-RestMethod -Method Post `
    -Uri "http://localhost:8088/responses" `
    -ContentType "application/json" `
    -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When the agent answers as a Hosted Agent Readiness Coach, you are ready to begin the labs.&lt;/P&gt;
&lt;HR /&gt;
&lt;H2&gt;Key Takeaways&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Hosted agents in Microsoft Foundry let you run custom .NET code in a managed container runtime — you own the logic, Foundry owns the infrastructure.&lt;/LI&gt;
&lt;LI&gt;Deterministic tools are the right pattern for business logic in production agents: fast, testable, and predictable.&lt;/LI&gt;
&lt;LI&gt;CI/CD is not optional for agent systems. Build it in from the start, not as an afterthought.&lt;/LI&gt;
&lt;LI&gt;Separate your provisioning (&lt;CODE&gt;azd&lt;/CODE&gt;) from your deployment (Foundry control plane) — it reduces accidents and simplifies rollbacks.&lt;/LI&gt;
&lt;LI&gt;The workshop is a reference implementation, not just a tutorial. The patterns are production-grade and ready to adapt.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H2&gt;References&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET" target="_blank" rel="noopener"&gt;Hosted Agents Workshop for .NET — GitHub Repository&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_dotNET/blob/main/labs/README.md" target="_blank" rel="noopener"&gt;Workshop Lab Guide&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/ai-foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt;Deploy a Hosted Agent to Foundry Agent Service — Microsoft Learn&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://ai.azure.com/" target="_blank" rel="noopener"&gt;Microsoft Foundry Portal&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/overview" target="_blank" rel="noopener"&gt;Azure Developer CLI (azd) — Microsoft Learn&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://dotnet.microsoft.com/en-us/download/dotnet/10.0" target="_blank" rel="noopener"&gt;.NET 10 SDK Download&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Wed, 22 Apr 2026 17:30:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/from-demo-to-production-building-microsoft-foundry-hosted-agents/ba-p/4513718</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-04-22T17:30:00Z</dc:date>
    </item>
    <item>
      <title>Building an Auditable Security Layer for Agentic AI</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-an-auditable-security-layer-for-agentic-ai/ba-p/4495753</link>
      <description>&lt;P data-start="0" data-end="46"&gt;Most agent failures do not look like breaches.&lt;/P&gt;
&lt;P data-start="48" data-end="224"&gt;They look like a normal chat, a normal answer, and a normal tool call. Until the next morning, when a single question collapses the whole story: who authorized that action.&lt;/P&gt;
&lt;P data-start="226" data-end="347"&gt;You think you deployed an agent. In reality, you deployed an unbounded automation pipeline that happens to speak English.&lt;/P&gt;
&lt;P data-start="420" data-end="762"&gt;I’m &lt;STRONG&gt;&lt;A class="lia-external-url" href="https://www.linkedin.com/in/drhazemali" target="_blank" rel="noopener"&gt;Hazem Ali&lt;/A&gt; &lt;/STRONG&gt;— &lt;A class="lia-external-url" href="https://mvp.microsoft.com/en-US/MVP/profile/4865c7ae-cb5b-4eb5-b128-608b1f9a6ebc" target="_blank" rel="noopener"&gt;Microsoft AI MVP&lt;/A&gt;, Distinguished AI &amp;amp; ML Architect, Founder &amp;amp; CEO at Skytells. For over 20 years, I’ve built secure, scalable enterprise AI across cloud and edge, with a focus on agent security and sovereign, governed AI architectures. My work on these systems is widely referenced by practitioners across multiple regions.&lt;/P&gt;
&lt;img&gt;
&lt;P data-start="321" data-end="698"&gt;&lt;STRONG data-start="0" data-end="24"&gt;Hazem Ali&lt;/STRONG&gt; honored to receive an official speaker invitation under the patronage of H.H. Sheikh Dr. Sultan bin Muhammad Al Qasimi, Member of the UAE Supreme Council and Ruler of Sharjah, to speak at the Sharjah International Conference on Linguistic Intelligence (SICLI), organized by the American University of Sharjah (AUS) and the Emirates Scholar Center for Research and Studies.&lt;/P&gt;
&lt;/img&gt;
&lt;P data-start="420" data-end="762"&gt;This piece is a collaboration with &lt;A class="lia-external-url" href="http://linkedin.com/in/ACoAAAXxehwBKIx99wbwTikXEjLGWGwqbpEkmYc" target="_blank" rel="noopener"&gt;&lt;STRONG data-start="384" data-end="399"&gt;Hammad Atta&lt;/STRONG&gt;&lt;/A&gt; a Practice Lead – AI Security &amp;amp; Cloud Strategy and Dr. Yasir Mehmood , Dr Muhammad Zeeshan Baig, Dr. Muhammad Aatif, Dr. MUHAMMAD AZIZ UL HAQ. We align on one core idea: agent security is not about making the model behave. It is about building enforceable boundaries around the model and proving every privileged step.&lt;/P&gt;
&lt;P data-start="764" data-end="1019"&gt;This article is meant to sit next to my earlier Tech Community piece, &lt;A class="lia-internal-link lia-internal-url lia-internal-url-content-type-blog" href="https://techcommunity.microsoft.com/blog/educatordeveloperblog/zero-trust-agent-architecture-how-to-actually-secure-your-agents/4473995" data-lia-auto-title="Zero-Trust Agent Architecture: How To Actually Secure Your Agents" data-lia-auto-title-active="0" target="_blank"&gt;&lt;STRONG data-start="834" data-end="903"&gt;Zero-Trust Agent Architecture: How To Actually Secure Your Agents&lt;/STRONG&gt;&lt;/A&gt;, and go one level deeper into the mechanics you can implement on Azure today.&lt;/P&gt;
&lt;P data-start="1021" data-end="1042"&gt;Let me break it down.&lt;/P&gt;
&lt;H2 data-start="0" data-end="48"&gt;The Principle: The model is not your boundary&lt;/H2&gt;
&lt;P data-start="50" data-end="116"&gt;Let me break it down in the way I’d explain it in a design review.&lt;/P&gt;
&lt;P data-start="118" data-end="575"&gt;A boundary is something that still holds when the component on the other side is adversarial, confused, or simply wrong. An LLM is none of those reliably. In an agent, the model is not just a generator. It becomes a &lt;STRONG data-start="334" data-end="359"&gt;planner and scheduler&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="118" data-end="575"&gt;It decides when to retrieve, which tool to call, how to shape arguments, and when to loop.&lt;/P&gt;
&lt;P data-start="118" data-end="575"&gt;That means your real attack surface is not “bad output.” It is the &lt;STRONG data-start="519" data-end="541"&gt;control-flow graph&lt;/STRONG&gt; the model is allowed to traverse.&lt;/P&gt;
&lt;P data-start="577" data-end="745"&gt;So if your “security” lives inside the prompt, you are putting policy in the same token stream the attacker can influence. That is not a boundary. That is a suggestion.&lt;/P&gt;
&lt;P data-start="747" data-end="853"&gt;The only stable design is to treat the model like an untrusted proposer and the runtime like the verifier.&lt;/P&gt;
&lt;P data-start="855" data-end="941"&gt;Here is the chain I use. Each gate is external to the model and survives manipulation.&lt;/P&gt;
&lt;UL data-start="943" data-end="1422"&gt;
&lt;LI data-start="943" data-end="1045"&gt;&lt;STRONG data-start="945" data-end="961"&gt;Context Gate&lt;/STRONG&gt;: Everything that enters the model is treated as executable influence, not “text.”&lt;/LI&gt;
&lt;LI data-start="1046" data-end="1147"&gt;&lt;STRONG data-start="1048" data-end="1067"&gt;Capability Gate&lt;/STRONG&gt;: Tools are invoked as constrained capabilities, not free-form function calls.&lt;/LI&gt;
&lt;LI data-start="1148" data-end="1237"&gt;&lt;STRONG data-start="1150" data-end="1167"&gt;Evidence Gate&lt;/STRONG&gt;: Every privileged step produces a verifiable artifact, not a story.&lt;/LI&gt;
&lt;LI data-start="1238" data-end="1351"&gt;&lt;STRONG data-start="1240" data-end="1267"&gt;Retrieval Control Plane&lt;/STRONG&gt;: What the agent can see is governed by labels and identity, not prompt etiquette.&lt;/LI&gt;
&lt;LI data-start="1352" data-end="1422"&gt;&lt;STRONG data-start="1354" data-end="1373"&gt;Detection Layer&lt;/STRONG&gt;: Drift and probing become alerts, not surprises.&lt;/LI&gt;
&lt;/UL&gt;
&lt;img&gt;Figure: Model proposes. Runtime verifies. Input + retrieved context → shields → model → tool gateway → signed intent → governed retrieval → SOC telemetry.&lt;/img&gt;
&lt;P data-start="1622" data-end="1898"&gt;Now the rare part, the part most people miss: the boundary is not “block or allow.” The boundary is &lt;STRONG data-start="1722" data-end="1734"&gt;stateful&lt;/STRONG&gt;. Once the runtime sees a suspicious signal, the entire session must transition into a degraded capability state, and every downstream gate must enforce that state.&lt;/P&gt;
&lt;H4 data-start="1900" data-end="1958"&gt;1. Treat context as executable influence, and preserve provenance&lt;/H4&gt;
&lt;P data-start="2790" data-end="2958"&gt;If you do RAG, your documents are not “supporting info.” They are an input channel. That makes the biggest prompt-injection risk &lt;STRONG data-start="2919" data-end="2935"&gt;not the user&lt;/STRONG&gt;. It is your documents.&lt;/P&gt;
&lt;P data-start="2960" data-end="3253"&gt;Microsoft’s Prompt Shields covers user prompt attacks (scanned at the user input intervention point) and document attacks (scanned at the user input and tool response intervention points). When enabled, each request returns annotation results with detected and filtered values that your runtime can translate into a policy decision: &lt;SPAN class="lia-text-color-14"&gt;block, degrade, or allow.&lt;/SPAN&gt;&lt;/P&gt;
&lt;H5 data-start="3255" data-end="3318"&gt;Provenance Collapse.&lt;/H5&gt;
&lt;P data-start="3320" data-end="3580"&gt;Most teams concatenate prompt + policy + retrieved chunks into one blob. The moment you do that, you lose the one thing you need for a defensible boundary: you can no longer reliably tell which tokens came from where. That is how “context” becomes “authority.”&lt;/P&gt;
&lt;P data-start="3582" data-end="3938"&gt;For indirect/document attacks,&amp;nbsp;&lt;/P&gt;
&lt;P data-start="3582" data-end="3938"&gt;Microsoft guidance recommends delimiting context documents inside the prompt using &lt;SPAN class="lia-text-color-14"&gt;"""&lt;/SPAN&gt;&lt;EM&gt;&lt;SPAN class="lia-text-color-14"&gt;&amp;lt;documents&amp;gt; ... &amp;lt;/documents&amp;gt;""" &lt;/SPAN&gt;&lt;/EM&gt;to improve indirect attack detection.&lt;/P&gt;
&lt;P data-start="3582" data-end="3938"&gt;That delimiter is not formatting. It is a provenance marker that improves indirect attack detection through Prompt Shields.&lt;/P&gt;
&lt;P data-start="4164" data-end="4191"&gt;Minimal, practical pattern:&lt;/P&gt;
&lt;LI-CODE lang="typescript"&gt;// Provenance-preserving prompt construction for indirect/document attack detection
function buildPrompt(system: string, user: string, retrievedDocs: string[]): string {
  const docs = retrievedDocs.map((d) =&amp;gt; `- ${d}`).join("\n");

  return [
    system,
    "",
    `User: ${user}`,
    "",
    `""" &amp;lt;documents&amp;gt;\n${docs}\n&amp;lt;/documents&amp;gt; """`,
  ].join("\n");
}
&lt;/LI-CODE&gt;
&lt;P data-start="4551" data-end="4630"&gt;Then treat Prompt Shields output as a &lt;STRONG data-start="4589" data-end="4615"&gt;session security event&lt;/STRONG&gt;, not a banner:&lt;/P&gt;
&lt;LI-CODE lang="typescript"&gt;type RiskState = "NORMAL" | "SUSPECT" | "BLOCK";
type FilterPolicy = "BLOCK_ON_FILTERED" | "DEGRADE_ON_FILTERED";

function computeRiskState(
  shields: { detected: boolean; filtered?: boolean },
  labels: string[],
  policy: FilterPolicy = "DEGRADE_ON_FILTERED",
): RiskState {
  // detected =&amp;gt; hard stop
  if (shields.detected) return "BLOCK";

  // filtered is an annotation signal: block or degrade by policy
  if (shields.filtered) {
    return policy === "BLOCK_ON_FILTERED" ? "BLOCK" : "SUSPECT";
  }

  // example: sensitivity-based degradation independent of shield hits
  const sensitive = labels.some((l) =&amp;gt;
    ["Confidential", "HighlyConfidential", "Regulated"].includes(l),
  );

  return sensitive ? "SUSPECT" : "NORMAL";
}
&lt;/LI-CODE&gt;
&lt;P data-start="5004" data-end="5117"&gt;When the signal is clear, you block and log. When it is suspicious, you do not warn. You &lt;STRONG data-start="5093" data-end="5116"&gt;downgrade authority&lt;/STRONG&gt;.&lt;/P&gt;
&lt;H5&gt;QSAF Alignment:&lt;/H5&gt;
&lt;P&gt;&lt;STRONG&gt;Prompt Injection Protection (Domain 1): &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;QSAF-PI-001 (static pattern blacklist), QSAF-PI-002 (dynamic LLM analysis), QSAF-PI-003 (semantic embedding comparison)&lt;/P&gt;
&lt;P&gt;All addressed by Prompt Shields and provenance marking.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Context Manipulation (Domain 2):&lt;/STRONG&gt; QSAF-RC-004 (context drift), QSAF-RC-007 (nested prompt injection) – mitigated by stateful risk calculation.&lt;/P&gt;
&lt;H4 data-start="2858" data-end="2919"&gt;2. Tools are capabilities with constraints, not functions&lt;/H4&gt;
&lt;P data-start="2920" data-end="3065"&gt;When the model proposes a tool call, your runtime should re-derive what is allowed from identity plus risk state, then enforce it at the gateway.&lt;/P&gt;
&lt;LI-CODE lang="typescript"&gt;type ToolRequest = {
  tool: string;
  args: unknown;
};

type Capabilities = {
  allowWrite: boolean;
  allowedTools: Set&amp;lt;string&amp;gt;;
};

function deriveCapabilities(risk: RiskState, roles: string[]): Capabilities {
  const baseAllowed = new Set(["search_kb", "get_profile", "summarize"]);
  const isAdmin = roles.includes("Admin");

  if (risk === "SUSPECT") {
    return { allowWrite: false, allowedTools: baseAllowed };
  }

  if (risk === "BLOCK") {
    return { allowWrite: false, allowedTools: new Set() };
  }

  // NORMAL
  const tools = new Set([
    ...baseAllowed,
    ...(isAdmin ? ["update_record", "issue_refund"] : []),
  ]);

  return { allowWrite: isAdmin, allowedTools: tools };
}

function authorizeTool(req: ToolRequest, caps: Capabilities): void {
  if (!caps.allowedTools.has(req.tool)) throw new Error("ToolNotAllowed");
  if (!caps.allowWrite &amp;amp;&amp;amp; req.tool.startsWith("update_")) {
    throw new Error("WriteDenied");
  }
}
&lt;/LI-CODE&gt;
&lt;P data-start="3950" data-end="4003"&gt;The model can ask. It cannot grant itself permission.&lt;/P&gt;
&lt;H5&gt;QSAF Alignment:&lt;/H5&gt;
&lt;P&gt;&lt;STRONG&gt;Plugin Abuse Monitoring (Domain 3): &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;QSAF-PL-001 (whitelist enforcement), QSAF-PL-003 (restrict sensitive plugins), QSAF-PL-006 (rate‑limiting) – implemented via capability derivation and gateway policies.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Behavioral Anomaly Detection (Domain 5): &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;QSAF-BA-006 (plugin execution pattern deviance) – detected by comparing actual calls against derived capabilities.&lt;/P&gt;
&lt;H3 data-start="1036" data-end="1104"&gt;The Integrity Gate: Hash-chain the authority, not the output&lt;/H3&gt;
&lt;P data-start="1106" data-end="1158"&gt;Let me add the part that makes investigations clean.&lt;/P&gt;
&lt;P data-start="1160" data-end="1265"&gt;Most teams treat integrity like an audit log problem. That is not enough. Logs explain. Integrity proves.&lt;/P&gt;
&lt;P data-start="1267" data-end="1551"&gt;The hard truth is that agent authority is assembled out of pieces: the system instruction, the user prompt, retrieved chunks, risk annotations, and finally the tool intent. If you do not bind those pieces together cryptographically, an incident review becomes a story-telling session.&lt;/P&gt;
&lt;P data-start="1553" data-end="1790"&gt;This is why &lt;A class="lia-external-url" href="https://drhazemali.com/blog/qsaf-qorvex-security-ai-framework" target="_blank" rel="noopener"&gt;QSAF&lt;/A&gt; has an entire domain for &lt;STRONG data-start="1595" data-end="1628"&gt;payload integrity and signing&lt;/STRONG&gt;, including prompt hash signing, nonce or replay protection, and a &lt;STRONG data-start="1695" data-end="1717"&gt;hash chain lineage&lt;/STRONG&gt; that tracks how a session evolved.&lt;/P&gt;
&lt;P data-start="1792" data-end="1861"&gt;Here is how you can map that into the runtime verifies.&lt;/P&gt;
&lt;img /&gt;
&lt;P data-start="1863" data-end="1959"&gt;You build a canonical “authority envelope” for every privileged hop, compute a digest, and then:&lt;/P&gt;
&lt;UL data-start="1961" data-end="2180"&gt;
&lt;LI data-start="1961" data-end="2003"&gt;link it to the previous hop (hash chain)&lt;/LI&gt;
&lt;LI data-start="2004" data-end="2038"&gt;include a nonce (replay control)&lt;/LI&gt;
&lt;LI data-start="2039" data-end="2180"&gt;sign the digest with Azure Key Vault (Key Vault signs digests, it does not hash your content for you)&lt;/LI&gt;
&lt;/UL&gt;
&lt;LI-CODE lang="typescript"&gt;import crypto from "crypto";

type AuthorityEnvelope = {
  sessionId: string;
  turnId: number;
  policyVersion: string;

  // provenance-preserved components
  systemHash: string;
  userHash: string;
  documentsHash: string; // hash of structured retrieved chunks (not just rendered text)

  shields: {
    detected: boolean;
    filtered: boolean;
  };

  riskState: "NORMAL" | "SUSPECT" | "BLOCK";

  // proposed action (if any)
  tool?: {
    name: string;
    argsHash: string;
  };

  // anti-replay + lineage
  nonce: string;
  prevDigest?: string;
  ts: string;
};

function sha256(bytes: string): string {
  return crypto.createHash("sha256").update(bytes).digest("hex");
}

// Canonicalization matters. JSON.stringify is OK if you control key order.
// For cross-language, use RFC 8785 (JCS) canonical JSON.
function canonicalJson(x: unknown): string {
  return JSON.stringify(x);
}

function buildEnvelope(
  input: Omit&amp;lt;AuthorityEnvelope, "nonce" | "ts"&amp;gt;,
): AuthorityEnvelope {
  return {
    ...input,
    nonce: crypto.randomUUID(),
    ts: new Date().toISOString(),
  };
}

function digestEnvelope(env: AuthorityEnvelope): string {
  return sha256(canonicalJson(env));
}&lt;/LI-CODE&gt;
&lt;P data-start="3565" data-end="3710"&gt;Then you call Key Vault to sign &lt;STRONG data-start="3597" data-end="3612"&gt;that digest&lt;/STRONG&gt; (REST sign), and optionally verify later (REST verify).&lt;/P&gt;
&lt;P data-start="3712" data-end="3780"&gt;The rare failure mode this blocks is subtle: &lt;STRONG data-start="3757" data-end="3779"&gt;authority splicing&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="3782" data-end="4081"&gt;Without a hash chain, it is possible for the runtime to correctly validate a tool call, but later be unable to prove which retrieved chunk, which Prompt Shields result, and which policy version were in force when that call was authorized. With the chain, every privileged hop becomes tamper-evident.&lt;/P&gt;
&lt;P data-start="4083" data-end="4351"&gt;This is the point: Prompt Shields tells you “this looks dangerous.” Document delimiters preserve provenance. &lt;BR data-start="4229" data-end="4232" /&gt;The integrity gate makes the runtime able to say, later, with evidence: “This is exactly what I accepted as authority.”&lt;/P&gt;
&lt;H5&gt;QSAF Alignment:&lt;/H5&gt;
&lt;P&gt;Payload Integrity &amp;amp; Signing (Domain 6): QSAF-PY-001 (prompt hash signing), QSAF-PY-005 (nonce/replay control), QSAF-PY-006 (hash chain lineage) – directly implemented via the envelope and chaining.&lt;/P&gt;
&lt;H2 data-start="3489" data-end="3541"&gt;Tools must sit behind a wall that can say “no”&lt;/H2&gt;
&lt;P data-start="5178" data-end="5391"&gt;Tool calls are where language becomes authority. If an agent can call APIs that mutate state, your security story is not about the response text. It is about whether the tool call is allowed under explicit policy.&lt;/P&gt;
&lt;P data-start="5393" data-end="5700"&gt;This is exactly where &lt;STRONG data-start="5415" data-end="5439"&gt;Azure API Management&lt;/STRONG&gt; belongs: as the tool gateway that enforces authentication and authorization before any tool request reaches your backend. The validate-jwt policy is the canonical enforcement mechanism for validating JWTs at the gateway.&lt;/P&gt;
&lt;P data-start="5702" data-end="5728"&gt;The design goal is simple:&lt;/P&gt;
&lt;P data-start="5730" data-end="5804"&gt;The model can request a tool call. The gateway decides if it is permitted.&lt;/P&gt;
&lt;P data-start="5806" data-end="5849"&gt;A capability token approach keeps it clean:&lt;/P&gt;
&lt;LI-CODE lang="xml"&gt;&amp;lt;!-- APIM inbound policy sketch --&amp;gt;
&amp;lt;validate-jwt header-name="Authorization" failed-validation-httpcode="401"&amp;gt;
  &amp;lt;required-claims&amp;gt;
    &amp;lt;claim name="scp"&amp;gt;
      &amp;lt;value&amp;gt;tools.read&amp;lt;/value&amp;gt;
    &amp;lt;/claim&amp;gt;
  &amp;lt;/required-claims&amp;gt;
&amp;lt;/validate-jwt&amp;gt;&lt;/LI-CODE&gt;
&lt;P data-start="6100" data-end="6265"&gt;The claim name (scp, roles, or custom claims) depends on your token issuer; the point is enforcing authorization at the gateway, not inside model text.&lt;/P&gt;
&lt;P data-start="6100" data-end="6265"&gt;Now you can enforce “read-only mode” by issuing tokens that simply do not carry write scopes. The model can try to call a write tool. It still gets denied by policy.&lt;/P&gt;
&lt;H2 data-start="6272" data-end="6327"&gt;Evidence is not logs. Evidence is a signed chain.&lt;/H2&gt;
&lt;P data-start="6329" data-end="6375"&gt;Logs help you debug. Evidence helps you prove.&lt;/P&gt;
&lt;P data-start="6377" data-end="6763"&gt;So you hash the session envelope and the tool intent, then sign the digest using &lt;STRONG data-start="6458" data-end="6482"&gt;Azure Key Vault Keys&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P data-start="6377" data-end="6763"&gt;Key Vault sign creates a signature from a digest, and verify verifies a signature against a digest. Key Vault does not hash your content for you. Hash locally, then sign the digest.), and Key Vault documentation is explicit that signing is &lt;EM data-start="6628" data-end="6639"&gt;sign-hash&lt;/EM&gt;, not “sign arbitrary content.” You hash locally, then ask Key Vault to sign the hash.&lt;/P&gt;
&lt;LI-CODE lang="typescript"&gt;import crypto from "crypto";

const sha256 = (x: unknown): string =&amp;gt;
  crypto.createHash("sha256").update(JSON.stringify(x)).digest("hex");

type IntentEnvelope = {
  sessionId: string;
  userId: string;
  promptHash: string;
  documentsHash: string;
  tool: string;
  argsHash: string;
  nonce: string;
  ts: string;
  policyVersion: string;
};

function buildIntent(
  sessionId: string,
  userId: string,
  prompt: string,
  docs: unknown,
  tool: string,
  args: unknown,
  policyVersion: string,
): IntentEnvelope {
  return {
    sessionId,
    userId,
    promptHash: sha256(prompt),
    documentsHash: sha256(docs),
    tool,
    argsHash: sha256(args),
    nonce: crypto.randomUUID(),
    ts: new Date().toISOString(),
    policyVersion,
  };
}&lt;/LI-CODE&gt;
&lt;P data-start="7717" data-end="7785"&gt;Once you do this, your system stops “explaining.” It starts proving.&lt;/P&gt;
&lt;H2 data-start="6054" data-end="6115"&gt;Govern what the agent can see, not only what it can say&lt;/H2&gt;
&lt;P data-start="6117" data-end="6183"&gt;RAG without governance eventually becomes a data exposure feature.&lt;/P&gt;
&lt;P data-start="6185" data-end="6496"&gt;This is why I treat retrieval as a governed operation. &lt;STRONG data-start="6240" data-end="6280"&gt;Microsoft Purview sensitivity labels&lt;/STRONG&gt; give you a practical way to classify content and build retrieval rules on top of that classification. Microsoft documents creating and configuring sensitivity labels in Purview.&lt;/P&gt;
&lt;P data-start="6498" data-end="6520"&gt;The pattern is simple:&lt;/P&gt;
&lt;UL data-start="6522" data-end="6704"&gt;
&lt;LI data-start="6522" data-end="6541"&gt;Label the corpus.&lt;/LI&gt;
&lt;LI data-start="6542" data-end="6590"&gt;Filter retrieval by label and identity policy.&lt;/LI&gt;
&lt;LI data-start="6591" data-end="6631"&gt;Log label distribution per completion.&lt;/LI&gt;
&lt;LI data-start="6632" data-end="6704"&gt;Alert when a low-privilege identity retrieves high-sensitivity labels.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P data-start="6886" data-end="6968"&gt;This is how you keep sovereignty real. Not in a slide deck. In the retrieval path.&lt;/P&gt;
&lt;H2 data-start="8646" data-end="8708"&gt;Operate it like a security system: posture and detection&lt;/H2&gt;
&lt;P data-start="8710" data-end="8833"&gt;Inline gates reduce risk. They do not eliminate it. Systems drift. People add tools. Policies get loosened. Attacks evolve.&lt;/P&gt;
&lt;P data-start="8835" data-end="9089"&gt;Microsoft Defender for Cloud’s Defender CSPM plan includes AI security posture management for generative AI apps and AI agents (Preview), including discovery/inventory of AI agents deployed with Azure AI Foundry.&lt;/P&gt;
&lt;P data-start="9091" data-end="9235"&gt;Then you use &lt;STRONG data-start="9104" data-end="9126"&gt;Microsoft Sentinel&lt;/STRONG&gt; to turn your telemetry into incidents, with scheduled analytics rules.&lt;/P&gt;
&lt;P data-start="9237" data-end="9286"&gt;Your detections should match the gates you built:&lt;/P&gt;
&lt;UL data-start="9288" data-end="9684"&gt;
&lt;LI data-start="9288" data-end="9397"&gt;Repeated Prompt Shields detections from the same identity or session.&lt;/LI&gt;
&lt;LI data-start="9398" data-end="9452"&gt;Tool-call spikes after a suspicious document signal.&lt;/LI&gt;
&lt;LI data-start="9453" data-end="9560"&gt;APIM denials for write endpoints from sessions in read-only mode.&lt;/LI&gt;
&lt;LI data-start="9561" data-end="9684"&gt;High-sensitivity label retrieval by identities that should never touch that tier.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H5&gt;QSAF Alignment:&lt;/H5&gt;
&lt;P&gt;Behavioral Anomaly Detection (Domain 5):&lt;/P&gt;
&lt;P&gt;QSAF-BA-001 (session entropy), QSAF-BA-004 (repeated intent mutation), QSAF-BA-007 (unified risk score) – detected via Sentinel rules.&lt;/P&gt;
&lt;P&gt;Cross‑Environment Defense (Domain 9): QSAF-CE-006 (coordinated alert response) – using Sentinel incidents and playbooks.&lt;/P&gt;
&lt;H2 data-start="9881" data-end="9927"&gt;Where the reference checklist fits, quietly&lt;/H2&gt;
&lt;P data-start="9929" data-end="10238"&gt;Behind the scenes, we use a control checklist lens to ensure we cover prompt/context attacks, tool misuse, integrity, governance, and operational monitoring.&amp;nbsp;The point is not to rename Microsoft features into framework terms. The point is to make the system enforceable and auditable using Azure-native gates.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;H2 data-start="10245" data-end="10255"&gt;Closing&lt;/H2&gt;
&lt;P data-start="10257" data-end="10310"&gt;Zero trust for agents is not a slogan. It is a build.&lt;/P&gt;
&lt;P data-start="10312" data-end="10928"&gt;Prompt Shields gives you a front gate for both user prompt attacks and document attacks, with clear annotations like detected and filtered. &lt;BR data-start="10495" data-end="10498" /&gt;API Management gives you a tool boundary that can say “no” regardless of what the model tries, using validate-jwt. &lt;BR data-start="10654" data-end="10657" /&gt;Signed intent gives you evidence, using Key Vault’s sign-hash semantics. &lt;BR data-start="10769" data-end="10772" /&gt;Purview labels give you governed retrieval. Sentinel and Defender give you an operating model, not wishful thinking.&lt;/P&gt;
&lt;P data-start="10930" data-end="11150"&gt;If you want the conceptual spine and the architectural principles that frame this pipeline, start with my earlier Tech Community pieces, then come back here and implement the gates.&lt;/P&gt;
&lt;P data-start="10930" data-end="11150"&gt;Thanks for reading&lt;/P&gt;
&lt;P data-start="10930" data-end="11150"&gt;— Hazem Ali&lt;/P&gt;</description>
      <pubDate>Wed, 22 Apr 2026 08:06:05 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/building-an-auditable-security-layer-for-agentic-ai/ba-p/4495753</guid>
      <dc:creator>hazem</dc:creator>
      <dc:date>2026-04-22T08:06:05Z</dc:date>
    </item>
    <item>
      <title>Prompt Engineering for Spec-Driven Development with SpecKit</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/prompt-engineering-for-spec-driven-development-with-speckit/ba-p/4512622</link>
      <description>&lt;H2&gt;Introduction&lt;/H2&gt;
&lt;P&gt;Charlotte Yeo, UCL MEng Computer Science &lt;A href="https://www.linkedin.com/in/charlotte-yeo-627476294/" target="_blank" rel="noopener"&gt;https://www.linkedin.com/in/charlotte-yeo-627476294/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Supervisors: Janaina Mourao-Miranda (UCL) and Lee Stott (Microsoft).&lt;/P&gt;
&lt;P&gt;For my final-year MEng project at UCL, I investigated how to get the best results out of&lt;A class="lia-external-url" href="https://speckit.org" target="_blank"&gt; SpecKit&lt;/A&gt;, a spec-driven AI development framework, by systematically testing different prompt strategies. &lt;BR /&gt;&lt;BR /&gt;Here's what I found.&lt;/P&gt;
&lt;H2&gt;Project Overview&lt;/H2&gt;
&lt;P&gt;LLMs are powerful coding assistants, but they struggle to maintain context over long development sessions, leading to hallucinations and inconsistent outputs. SpecKit addresses this by using persistent, structured specification documents as memory throughout the development process. The developer writes a natural language spec; SpecKit builds the software from it.&lt;/P&gt;
&lt;P&gt;The problem is that no one has established best practices for writing those specs. This project aimed to fill that gap.&lt;/P&gt;
&lt;H2&gt;Experiments&lt;/H2&gt;
&lt;P&gt;I ran 10 experiments, each using SpecKit to build the same target system, a multi-agent AI code verification tool, from a different prompt formulation. The variables I tested included prompt authority, format, level of detail, and output format. By keeping the target software constant, the effect of each prompt change on SpecKit's performance is isolated.&lt;/P&gt;
&lt;P&gt;The target system itself used &lt;A class="lia-external-url" href="https://learn.microsoft.com/en-us/agent-framework/" target="_blank"&gt;Microsoft Agent Framework&lt;/A&gt;, &lt;A class="lia-external-url" href="https://learn.microsoft.com/azure/cosmos-db/gen-ai/rag" target="_blank"&gt;Azure Cosmos DB for RAG&lt;/A&gt;, and&lt;A class="lia-external-url" href="https://ai.azure.com" target="_blank"&gt; Microsoft Foundry&lt;/A&gt; to access &lt;A class="lia-external-url" href="https://azure.microsoft.com/en-us/blog/introducing-gpt-5-2-in-microsoft-foundry-the-new-standard-for-enterprise-ai/" target="_blank"&gt;GPT-5.2&lt;/A&gt;, all orchestrated via a Python codebase. This covered a wide range of real-world engineering challenges: multi-agent coordination, cloud service integration, and working with a library new enough that the model hadn't been trained on it.&lt;/P&gt;
&lt;H2&gt;Technical Details&lt;/H2&gt;
&lt;P&gt;SpecKit runs as a series of commands inside GitHub Copilot in VS Code, powered here by Claude Sonnet 4.5. The workflow moves through seven stages: /constitution → /specify → /clarify → /plan → /tasks → /analyze → /implement. At each stage, SpecKit writes and updates Markdown files that serve as persistent memory, so the session can be paused and resumed without losing context.&lt;/P&gt;
&lt;P&gt;Key tools used:&lt;/P&gt;
&lt;UL&gt;
&lt;LI aria-level="1"&gt;Microsoft Agent Framework — agent orchestration&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Microsoft Foundry — access to LLMs (GPT-5.2, Text Embedding 3)&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Azure Cosmos DB — code example database for RAG&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Claude Sonnet 4.5 — model powering SpecKit via &lt;A class="lia-external-url" href="https://github.com/features/copilot" target="_blank"&gt;GitHub Copilot&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Results&lt;/H2&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;These were the key findings:&lt;/P&gt;
&lt;UL&gt;
&lt;LI aria-level="1"&gt;Natural language outperforms machine-readable formats. The JSON prompt (Case 1) took 40% longer and generated significantly more issues than the natural language control.&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Authority is necessary. Removing the authoritative framing from the prompt (Case 3) caused SpecKit to treat specifications as optional, resulting in the multi-agent system not being built at all until manually corrected. Total time: 4h 53m vs. 2h 24m for the control.&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Omit what the model already knows. Removing the scoring rubrics (Case 8) saved 34 minutes with no loss in output quality as the model inferred the rubric from context. However, omitting the Cosmos DB schema or agent architecture descriptions caused major implementation errors.&lt;/LI&gt;
&lt;LI aria-level="1"&gt;The model must be able to read its own outputs. Changing the output to PDF (Case 9), which Claude Sonnet 4.5 cannot read in Copilot, caused the implementation stage to increase significantly to 7h 38m, with 33 required interventions, because the model couldn't verify whether its code was working.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Best Practices Found&lt;/H2&gt;
&lt;P&gt;The biggest insight is that prompt design has as much impact on SpecKit's performance as prompt content. A complete specification written non-authoritatively or in JSON will produce worse results than a slightly shorter specification written in clear, authoritative natural language.&lt;/P&gt;
&lt;P&gt;There is also a trade-off between token count and manual intervention. Shorter prompts are faster, but only when the omitted information is something the model can reliably infer. Leaving out details about unique libraries or architectures will result in higher debugging times later.&lt;/P&gt;
&lt;H2&gt;Future Development&lt;/H2&gt;
&lt;P&gt;These are directions for future work in this area:&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI aria-level="1"&gt;Running each experiment multiple times to account for model non-determinism&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Repeating experiments with newer or different LLMs to test generalisability&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Testing with different target systems beyond code verification&lt;/LI&gt;
&lt;LI aria-level="1"&gt;Supplying SpecKit with tools (e.g. Playwright MCP) to read outputs it currently cannot access, like live webpages or PDFs&lt;/LI&gt;
&lt;/UL&gt;
&lt;H2&gt;Conclusion&lt;/H2&gt;
&lt;P&gt;Spec-driven development with SpecKit is a useful approach for building complex software with LLMs, but the quality of your prompt determines the quality of your outcome. For the most effective results, write in natural language, keep the whole prompt authoritative, include detail on novel or library-specific components, design your system's outputs to be readable by the model building them, and leave out only what the model can confidently infer.&lt;/P&gt;
&lt;P&gt;If you want to explore the tools used in this project, here are some useful starting points:&lt;/P&gt;
&lt;UL&gt;
&lt;LI aria-level="1"&gt;&lt;A href="https://github.com/microsoft/agent-framework" target="_blank" rel="noopener"&gt;Microsoft Agent Framework&lt;/A&gt;&lt;/LI&gt;
&lt;LI aria-level="1"&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/cosmos-db/" target="_blank" rel="noopener"&gt;Azure Cosmos DB documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI aria-level="1"&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/ai-foundry/" target="_blank" rel="noopener"&gt;Azure AI Foundry documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI aria-level="1"&gt;&lt;A href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview" target="_blank" rel="noopener"&gt;Anthropic prompt engineering guide&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 20 Apr 2026 09:14:37 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/prompt-engineering-for-spec-driven-development-with-speckit/ba-p/4512622</guid>
      <dc:creator>charykn</dc:creator>
      <dc:date>2026-04-20T09:14:37Z</dc:date>
    </item>
    <item>
      <title>Build and Deploy a Microsoft Foundry Hosted Agent: A Hands-On Workshop</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-and-deploy-a-microsoft-foundry-hosted-agent-a-hands-on/ba-p/4508426</link>
      <description>&lt;ARTICLE&gt;
&lt;SECTION&gt;&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;P&gt;Agents are easy to demo, hard to ship.&lt;/P&gt;
&lt;P&gt;Most teams can put together a convincing prototype quickly. The harder part starts afterwards: shaping deterministic tools, validating behaviour with tests, building a CI path, packaging for deployment, and proving the experience through a user-facing interface. That is where many promising projects slow down.&lt;/P&gt;
&lt;P&gt;This workshop helps you close that gap without unnecessary friction. You get a guided path from local run to deployment handoff, then complete the journey with a working chat UI that calls your deployed hosted agent through the project endpoint.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;What You Will Build&lt;/H2&gt;
&lt;P&gt;This is a hands-on, end-to-end learning experience for building and deploying AI agents with Microsoft Foundry.&lt;/P&gt;
&lt;P&gt;The lab provides a guided and practical journey through hosted-agent development, including deterministic tool design, prompt-guided workflows, CI validation, deployment preparation, and UI integration.&lt;/P&gt;
&lt;P&gt;It’s designed to reduce setup friction with a ready-to-run experience.&lt;/P&gt;
&lt;P&gt;It is a prompt-based development lab using Copilot guidance and MCP-assisted workflow options during deployment.&lt;/P&gt;
&lt;P&gt;It’s a .NET 10 workshop that includes local development, Copilot-assisted coding, CI, secure deployment to Azure, and a working chat UI.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;A local hosted agent that responds on the responses contract&lt;/LI&gt;
&lt;LI&gt;Deterministic tool improvements in core logic with xUnit coverage&lt;/LI&gt;
&lt;LI&gt;A GitHub Actions CI workflow for restore, build, test, and container validation&lt;/LI&gt;
&lt;LI&gt;An Azure-ready deployment path using azd, ACR image publishing, and Foundry manifest apply&lt;/LI&gt;
&lt;LI&gt;A Blazor chat UI that calls openai/v1/responses with agent_reference&lt;/LI&gt;
&lt;LI&gt;A repeatable implementation shape that teams can adapt to real projects&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Who This Lab Is For&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;AI developers and software engineers who prefer learning by building&lt;/LI&gt;
&lt;LI&gt;Motivated beginners who want a guided, step-by-step path&lt;/LI&gt;
&lt;LI&gt;Experienced developers who want a practical hosted-agent reference implementation&lt;/LI&gt;
&lt;LI&gt;Architects evaluating deployment shape, validation strategy, and operational readiness&lt;/LI&gt;
&lt;LI&gt;Technical decision-makers who need to see how demos become deployable systems&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Why Hosted Agents&lt;/H2&gt;
&lt;P&gt;Hosted agents run your code in a managed environment. That matters because it reduces the amount of infrastructure plumbing you need to manage directly, while giving you a clearer path to secure, observable, team-friendly deployments.&lt;/P&gt;
&lt;P&gt;Prompt-only demos are still useful. They are quick, excellent for ideation, and often the right place to start. Hosted agents complement that approach when you need custom code, tool-backed logic, and a deployment process that can be repeated by a team.&lt;/P&gt;
&lt;P&gt;Think of this lab as the bridge: you keep the speed of prompt-based iteration, then layer in the real-world patterns needed to run reliably.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;What You Will Learn&lt;/H2&gt;
&lt;H3&gt;1) Orchestration&lt;/H3&gt;
&lt;P&gt;You will practise workflow-oriented reasoning through implementation-shape recommendations and multi-step readiness scenarios. The lab introduces orchestration concepts at a practical level, rather than as a dedicated orchestration framework deep dive.&lt;/P&gt;
&lt;H3&gt;2) Tool Integration&lt;/H3&gt;
&lt;P&gt;You will connect deterministic tools and understand how tool calls fit into predictable execution paths. This is a core focus of the workshop and is backed by tests in the solution.&lt;/P&gt;
&lt;H3&gt;3) Retrieval Patterns (What This Lab Covers Today)&lt;/H3&gt;
&lt;P&gt;This workshop does not include a full RAG implementation with embeddings and vector search. Instead, it focuses on deterministic local tools and hosted-agent response flow, giving you a strong foundation before adding retrieval infrastructure in a follow-on phase.&lt;/P&gt;
&lt;H3&gt;4) Observability&lt;/H3&gt;
&lt;P&gt;You will see light observability foundations through OpenTelemetry usage in the host and practical verification during local and deployed checks. This is introductory coverage intended to support debugging and confidence building.&lt;/P&gt;
&lt;H3&gt;5) Responsible AI&lt;/H3&gt;
&lt;P&gt;You will apply production-minded safety basics, including secure secret handling and review hygiene. A full Responsible AI policy and evaluation framework is not the primary goal of this workshop, but the workflow does encourage safe habits from the start.&lt;/P&gt;
&lt;H3&gt;6) Secure Deployment Path&lt;/H3&gt;
&lt;P&gt;You will move from local implementation to Azure deployment with a secure, practical workflow: azd provisioning, ACR publishing, manifest deployment, hosted-agent start, status checks, and endpoint validation.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;The Learning Journey&lt;/H2&gt;
&lt;P&gt;The overall flow is simple and memorable: clone, open, run, iterate, deploy, observe.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;clone -&amp;gt; open -&amp;gt; run -&amp;gt; iterate -&amp;gt; deploy -&amp;gt; observe&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You are not expected to memorize every command. The lab is structured to help you learn through small, meaningful wins that build confidence.&lt;/P&gt;
&lt;H3&gt;Your First 15 Minutes: Quick Wins&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Open the repo and understand the lab structure in a few minutes&lt;/LI&gt;
&lt;LI&gt;Set project endpoint and model deployment environment variables&lt;/LI&gt;
&lt;LI&gt;Run the host locally and validate the responses endpoint&lt;/LI&gt;
&lt;LI&gt;Inspect the deterministic tools in WorkshopLab.Core&lt;/LI&gt;
&lt;LI&gt;Run tests and see how behaviour changes are verified&lt;/LI&gt;
&lt;LI&gt;Review the deployment path so local work maps to Azure steps&lt;/LI&gt;
&lt;LI&gt;Understand how the UI validates end-to-end behaviour after deployment&lt;/LI&gt;
&lt;LI&gt;Leave the first session with a working baseline and a clear next step&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;That first checkpoint is important. Once you see a working loop on your own machine, the rest of the workshop becomes much easier to finish.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Using Copilot and MCP in the Workflow&lt;/H2&gt;
&lt;P&gt;This lab emphasises prompt-based development patterns that help you move faster while still learning the underlying architecture. You are not only writing code, you are learning to describe intent clearly, inspect generated output, and iterate with discipline.&lt;/P&gt;
&lt;P&gt;Copilot supports implementation and review in the coding labs. MCP appears as a practical deployment option for hosted-agent lifecycle actions, provided your tools are authenticated to the correct tenant and project context.&lt;/P&gt;
&lt;P&gt;Together, this creates a development rhythm that is especially useful for learning:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Define intent with clear prompts&lt;/LI&gt;
&lt;LI&gt;Generate or adjust implementation details&lt;/LI&gt;
&lt;LI&gt;Validate behaviour through tests and UI checks&lt;/LI&gt;
&lt;LI&gt;Deploy and observe outcomes in Azure&lt;/LI&gt;
&lt;LI&gt;Refine based on evidence, not guesswork&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;That same rhythm transfers well to real projects. Even if your production environment differs, the patterns from this workshop are adaptable.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Production-Minded Tips&lt;/H2&gt;
&lt;P&gt;As you complete the lab, keep a production mindset from day one:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Reliability: keep deterministic logic small, testable, and explicit&lt;/LI&gt;
&lt;LI&gt;Security: Treat secrets, identity, and access boundaries as first-class concerns&lt;/LI&gt;
&lt;LI&gt;Observability: use telemetry and status checks to speed up debugging&lt;/LI&gt;
&lt;LI&gt;Governance: keep deployment steps explicit so teams can review and repeat them&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;You do not need to solve everything in one pass. The goal is to build habits that make your agent projects safer and easier to evolve.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Start Today:&lt;/H2&gt;
&lt;P&gt;If you have been waiting for the right time to move from “interesting demo” to “practical implementation”, this is the moment. The workshop is structured for self-study, and the steps are designed to keep your momentum high.&lt;/P&gt;
&lt;P&gt;Start here: &lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_Lab" target="_blank" rel="noopener"&gt;https://github.com/microsoft/Hosted_Agents_Workshop_Lab&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;Want deeper documentation while you go? These official guides are great companions:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/quickstarts/quickstart-hosted-agent" target="_blank" rel="noopener"&gt;Hosted agent quickstart&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/deploy-hosted-agent" target="_blank" rel="noopener"&gt;Hosted agent deployment guide&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;When you finish, share what you built. Post a screenshot or short write-up in a GitHub issue/discussion, on social, or in comments with one lesson learned. Your example can help the next developer get unstuck faster.&lt;/P&gt;
&lt;H3&gt;Copy/Paste Progress Checklist&lt;/H3&gt;
&lt;PRE&gt;&lt;CODE&gt;[ ] Clone the workshop repo
[ ] Complete local setup and run the agent
[ ] Make one prompt-based behaviour change
[ ] Validate with tests and chat UI
[ ] Run CI checks
[ ] Provision and deploy via Azure and Foundry workflow
[ ] Review observability signals and refine
[ ] Share what I built + one takeaway&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Common Questions&lt;/H2&gt;
&lt;H3&gt;How long does it take?&lt;/H3&gt;
&lt;P&gt;Most developers can complete a meaningful pass in a few focused sessions of 60-75 mins. You can get the first local success quickly, then continue through deployment and refinement at your own pace.&lt;/P&gt;
&lt;H3&gt;Do I need an Azure subscription?&lt;/H3&gt;
&lt;P&gt;Yes, for provisioning and deployment steps. You can still begin local development and testing before completing all Azure activities.&lt;/P&gt;
&lt;H3&gt;Is it beginner-friendly?&lt;/H3&gt;
&lt;P&gt;Yes. The labs are written for beginners, run in sequence, and include expected outcomes for each stage.&lt;/P&gt;
&lt;H3&gt;Can I adapt it beyond .NET?&lt;/H3&gt;
&lt;P&gt;Yes. The implementation in this workshop is .NET 10, but the architecture and development patterns can be adapted to other stacks.&lt;/P&gt;
&lt;H3&gt;What if I am evaluating for a team?&lt;/H3&gt;
&lt;P&gt;This lab is a strong team evaluation asset because it demonstrates end-to-end flow: local dev, integration patterns, CI, secure deployment, and operational visibility.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;&lt;/SECTION&gt;
&lt;SECTION&gt;&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Closing&lt;/H2&gt;
&lt;P&gt;This workshop gives you more than theory. It gives you a practical path from first local run to deployed hosted agent, backed by tests, CI, and a user-facing UI validation loop. If you want a build-first route into Microsoft Foundry hosted-agent development, this is an excellent place to start.&lt;/P&gt;
&lt;P&gt;Begin now: &lt;A href="https://github.com/microsoft/Hosted_Agents_Workshop_Lab" target="_blank" rel="noopener"&gt;https://github.com/microsoft/Hosted_Agents_Workshop_Lab&lt;/A&gt;&lt;/P&gt;
&lt;/SECTION&gt;
&lt;/ARTICLE&gt;</description>
      <pubDate>Fri, 03 Apr 2026 11:25:45 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-and-deploy-a-microsoft-foundry-hosted-agent-a-hands-on/ba-p/4508426</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-04-03T11:25:45Z</dc:date>
    </item>
    <item>
      <title>Getting Started with Foundry Local: A Student Guide to the Microsoft Foundry Local Lab</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/getting-started-with-foundry-local-a-student-guide-to-the/ba-p/4503604</link>
      <description>&lt;P&gt;If you want to start building AI applications on your own machine, the&amp;nbsp;&lt;A href="https://github.com/microsoft-foundry/foundry-local-lab" target="_blank" rel="noopener"&gt;Microsoft Foundry Local Lab&lt;/A&gt; is one of the most useful places to begin. It is a practical workshop that takes you from first-time setup through to agents, retrieval, evaluation, speech transcription, tool calling, and a browser-based interface. The material is hands-on, cross-language, and designed to show how modern AI apps can run locally rather than depending on a cloud service for every step.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;This blog post is aimed at students, self-taught developers, and anyone learning how AI applications are put together in practice. Instead of treating large language models as a black box, the lab shows you how to install and manage local models, connect to them with code, structure tasks into workflows, and test whether the results are actually good enough. If you have been looking for a learning path that feels more like building real software and less like copying isolated snippets, this workshop is a strong starting point.&lt;/P&gt;
&lt;H2&gt;What Is Foundry Local?&lt;/H2&gt;
&lt;P&gt;&lt;A class="lia-external-url" href="https://foundrylocal.ai" target="_blank" rel="noopener"&gt;Foundry Local&lt;/A&gt; is a local runtime for downloading, managing, and serving AI models on your own hardware. It exposes an OpenAI-compatible interface, which means you can work with familiar SDK patterns while keeping execution on your device. For learners, that matters for three reasons. First, it lowers the barrier to experimentation because you can run projects without setting up a cloud account for every test. Second, it helps you understand the moving parts behind AI applications, including model lifecycle, local inference, and application architecture. Third, it encourages privacy-aware development because the examples are designed to keep data on the machine wherever possible.&lt;/P&gt;
&lt;P&gt;The Foundry Local Lab uses that local-first approach to teach the full journey from simple prompts to multi-agent systems. It includes examples in Python, JavaScript, and C#, so you can follow the language that fits your course, your existing skills, or the platform you want to build on.&lt;/P&gt;
&lt;H2&gt;Why This Lab Works Well for Learners&lt;/H2&gt;
&lt;P&gt;A lot of AI tutorials stop at the moment a model replies to a prompt. That is useful for a first demo, but it does not teach you how to build a proper application. The Foundry Local Lab goes further. It is organised as a sequence of parts, each one adding a new idea and giving you working code to explore. You do not just ask a model to respond. You learn how to manage the service, choose a language SDK, construct retrieval pipelines, build agents, evaluate outputs, and expose the result through a usable interface.&lt;/P&gt;
&lt;P&gt;That sequence is especially helpful for students because the parts build on each other. Early labs focus on confidence and setup. Middle labs focus on architecture and patterns. Later labs move into more advanced ideas that are common in real projects, such as tool calling, evaluation, and custom model packaging. By the end, you have seen not just what a local AI app looks like, but how its different layers fit together.&lt;/P&gt;
&lt;img /&gt;
&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;H2&gt;Before You Start&lt;/H2&gt;
&lt;P&gt;The workshop expects a reasonably modern machine and at least one programming language environment. The core prerequisites are straightforward: install Foundry Local, clone the repository, and choose whether you want to work in Python, JavaScript, or C#. You do not need to master all three. In fact, most learners will get more value by picking one language first, completing the full path in that language, and only then comparing how the same patterns look elsewhere.&lt;/P&gt;
&lt;P&gt;If you are new to AI development, do not be put off by the number of parts. The early sections are accessible, and the later ones become much easier once you have completed the foundations. Think of the lab as a structured course rather than a single tutorial.&lt;/P&gt;
&lt;H2&gt;What You Learn in Each Lab &lt;A class="lia-external-url" href="https://github.com/microsoft-foundry/foundry-local-lab" target="_blank" rel="noopener"&gt;https://github.com/microsoft-foundry/foundry-local-lab&lt;/A&gt;&amp;nbsp;&lt;/H2&gt;
&lt;H3&gt;Part 1: Getting Started with Foundry Local&lt;/H3&gt;
&lt;P&gt;The first part introduces the basics of Foundry Local and gets you up and running. You learn how to install the CLI, inspect the model catalogue, download a model, and run it locally. This part also introduces practical details such as model aliases and dynamic service ports, which are small but important pieces of real development work.&lt;/P&gt;
&lt;P&gt;For students, the value of this part is confidence. You prove that local inference works on your machine, you see how the service behaves, and you learn the operational basics before writing any application code. By the end of Part 1, you should understand what Foundry Local does, how to start it, and how local model serving fits into an application workflow.&lt;/P&gt;
&lt;H3&gt;Part 2: Foundry Local SDK Deep Dive&lt;/H3&gt;
&lt;P&gt;Once the CLI makes sense, the workshop moves into the SDK. This part explains why application developers often use the SDK instead of relying only on terminal commands. You learn how to manage the service programmatically, browse available models, control model download and loading, and understand model metadata such as aliases and hardware-aware selection.&lt;/P&gt;
&lt;P&gt;This is where learners start to move from using a tool to building with a platform. You begin to see the difference between running a model manually and integrating it into software. By the end of this section, you should understand the API surface you will use in your own projects and know how to bootstrap the SDK in Python, JavaScript, or C#.&lt;/P&gt;
&lt;H3&gt;Part 3: SDKs and APIs&lt;/H3&gt;
&lt;P&gt;Part 3 turns the SDK concepts into a working chat application. You connect code to the local inference server and use the OpenAI-compatible API for streaming chat completions. The lab includes examples in all three supported languages, which makes it especially useful if you are comparing ecosystems or learning how the same idea is expressed through different syntax and libraries.&lt;/P&gt;
&lt;P&gt;The key learning outcome here is not just that you can get a response from a model. It is that you understand the boundary between your application and the local model service. You learn how messages are structured, how streaming works, and how to write the sort of integration code that becomes the foundation for every later lab.&lt;/P&gt;
&lt;H3&gt;Part 4: Retrieval-Augmented Generation&lt;/H3&gt;
&lt;P&gt;This is where the workshop starts to feel like modern AI engineering rather than basic prompting. In the retrieval-augmented generation lab, you build a simple RAG pipeline that grounds answers in supplied data. You work with an in-memory knowledge base, apply retrieval logic, score matches, and compose prompts that include grounded context.&lt;/P&gt;
&lt;P&gt;For learners, this part is important because it demonstrates a core truth of AI app development: a model on its own is often not enough. Useful applications usually need access to documents, notes, or structured information. By the end of Part 4, you understand why retrieval matters, how to pass retrieved context into a prompt, and how a pipeline can make answers more relevant and reliable.&lt;/P&gt;
&lt;H3&gt;Part 5: Building AI Agents&lt;/H3&gt;
&lt;P&gt;Part 5 introduces the concept of an agent. Instead of a one-off prompt and response, you begin to define behaviour through system instructions, roles, and conversation state. The lab uses the ChatAgent pattern and the Microsoft Agent Framework to show how an agent can maintain a purpose, respond with a persona, and return structured output such as JSON.&lt;/P&gt;
&lt;P&gt;This part helps learners understand the difference between a raw model call and a reusable application component. You learn how to design instructions that shape behaviour, how multi-turn interaction differs from single prompts, and why structured output matters when an AI component has to work inside a broader system.&lt;/P&gt;
&lt;H3&gt;Part 6: Multi-Agent Workflows&lt;/H3&gt;
&lt;P&gt;Once a single agent makes sense, the workshop expands the idea into a multi-agent workflow. The example pipeline uses roles such as researcher, writer, and editor, with outputs passed from one stage to the next. You explore sequential orchestration, shared configuration, and feedback loops between specialised components.&lt;/P&gt;
&lt;P&gt;For students, this lab is a very clear introduction to decomposition. Instead of asking one model to do everything at once, you break a task into smaller responsibilities. That pattern is useful well beyond AI. By the end of Part 6, you should understand why teams build multi-agent systems, how hand-offs are structured, and what trade-offs appear when more components are added to a workflow.&lt;/P&gt;
&lt;H3&gt;Part 7: Zava Creative Writer Capstone Application&lt;/H3&gt;
&lt;P&gt;The Zava Creative Writer is the capstone project that brings the earlier ideas together into a more production-style application. It uses multiple specialised agents, structured JSON hand-offs, product catalogue search, streaming output, and evaluation-style feedback loops. Rather than showing an isolated feature, this part shows how separate patterns combine into a complete system.&lt;/P&gt;
&lt;P&gt;This is one of the most valuable parts of the workshop for learner developers because it narrows the gap between tutorial code and real application design. You can see how orchestration, agent roles, and practical interfaces fit together. By the end of Part 7, you should be able to recognise the architecture of a serious local AI app and understand how the earlier labs support it.&lt;/P&gt;
&lt;H3&gt;Part 8: Evaluation-Led Development&lt;/H3&gt;
&lt;P&gt;Many beginner AI projects stop once the output looks good once or twice. This lab teaches a much stronger habit: evaluation-led development. You work with golden datasets, rule-based checks, and LLM-as-judge scoring to compare prompt or agent variants systematically. The goal is to move from anecdotal testing to repeatable assessment.&lt;/P&gt;
&lt;P&gt;This matters enormously for students because evaluation is one of the clearest differences between a classroom demo and dependable software. By the end of Part 8, you should understand how to define success criteria, compare outputs at scale, and use evidence rather than intuition when improving an AI component.&lt;/P&gt;
&lt;H3&gt;Part 9: Voice Transcription with Whisper&lt;/H3&gt;
&lt;P&gt;Part 9 broadens the workshop beyond text generation by introducing speech-to-text with Whisper running locally. You use the Foundry Local SDK to download and load the model, then transcribe local audio files through the compatible API surface. The emphasis is on privacy-first processing, with audio kept on-device.&lt;/P&gt;
&lt;P&gt;This section is a useful reminder that local AI development is not limited to chatbots. Learners see how a different modality fits into the same ecosystem and how local execution supports sensitive workloads. By the end of this lab, you should understand the transcription flow, the relevant client methods, and how speech features can be integrated into broader applications.&lt;/P&gt;
&lt;H3&gt;Part 10: Using Custom or Hugging Face Models&lt;/H3&gt;
&lt;P&gt;After learning the standard path, the workshop shows how to work with custom or Hugging Face models. This includes compiling models into optimised ONNX format with ONNX Runtime GenAI, choosing hardware-specific options, applying quantisation strategies, creating configuration files, and adding compiled models to the Foundry Local cache.&lt;/P&gt;
&lt;P&gt;For learner developers, this part opens the door to model engineering rather than simple model consumption. You begin to understand that model choice, optimisation, and packaging affect performance and usability. By the end of Part 10, you should have a clearer picture of how models move from an external source into a runnable local setup and why deployment format matters.&lt;/P&gt;
&lt;H3&gt;Part 11: Tool Calling with Local Models&lt;/H3&gt;
&lt;P&gt;Tool calling is one of the most practical patterns in current AI development, and this lab covers it directly. You define tool schemas, allow the model to request function calls, handle the multi-turn interaction loop, execute the tools locally, and return results back to the model. The examples include practical scenarios such as weather and population tools.&lt;/P&gt;
&lt;P&gt;This lab teaches learners how to move beyond generation into action. A model is no longer limited to producing text. It can decide when external data or a function is needed and incorporate that result into a useful answer. By the end of Part 11, you should understand the tool-calling flow and how AI systems connect reasoning with deterministic software behaviour.&lt;/P&gt;
&lt;H3&gt;Part 12: Building a Web UI for the Zava Creative Writer&lt;/H3&gt;
&lt;P&gt;Part 12 adds a browser-based front end to the capstone application. You learn how to serve a shared interface from Python, JavaScript, or C#, stream updates to the browser, consume NDJSON with the Fetch API and ReadableStream, and show live agent status as content is produced in real time.&lt;/P&gt;
&lt;P&gt;This part is especially good for students who want to build portfolio projects. It turns backend orchestration into something visible and interactive. By the end of Part 12, you should understand how to connect a local AI backend to a web interface and how streaming changes the user experience compared with waiting for one final response.&lt;/P&gt;
&lt;H3&gt;Part 13: Workshop Complete&lt;/H3&gt;
&lt;P&gt;The final part is a summary and extension point. It reviews what you have built across the previous sections and suggests ways to continue. Although it is not a new technical lab in the same way as the earlier parts, it plays an important role in learning. It helps you consolidate the architecture, the terminology, and the development patterns you have encountered.&lt;/P&gt;
&lt;P&gt;For learners, reflection matters. By the end of Part 13, you should be able to describe the full stack of a local AI application, from model management to user interface, and identify which area you want to deepen next.&lt;/P&gt;
&lt;H2&gt;What Students Gain from the Full Workshop&lt;/H2&gt;
&lt;P&gt;Taken together, these labs do more than teach Foundry Local itself. They teach how AI applications are built. You learn operational basics such as model setup and service management. You learn application integration through SDKs and APIs. You learn system design through RAG, agents, multi-agent orchestration, and web interfaces. You learn engineering discipline through evaluation. You also see how text, speech, custom models, and tool calling all fit into one local-first development workflow.&lt;/P&gt;
&lt;P&gt;That breadth makes the workshop useful in several settings. A student can use it as a self-study path. A lecturer can use it as source material for practical sessions. A learner developer can use it to build portfolio pieces and to understand which AI patterns are worth learning next. Because the repository includes Python, JavaScript, and C#, it also works well for comparing how architectural ideas transfer across languages.&lt;/P&gt;
&lt;H2&gt;How to Approach the Lab as a Beginner&lt;/H2&gt;
&lt;P&gt;If you are starting from scratch, the best route is simple. Complete Parts 1 to 3 in your preferred language first. That gives you the essential setup and integration skills. Then move into Parts 4 to 6 to understand how AI application patterns are composed. After that, use Parts 7 and 8 to learn how larger systems and evaluation fit together. Finally, explore Parts 9 to 12 based on your interests, whether that is speech, tooling, model customisation, or front-end work.&lt;/P&gt;
&lt;P&gt;It is also worth keeping notes as you go. Record what each part adds to your understanding, what code files matter, and what assumptions each example makes. That habit will help you move from following the labs to adapting the patterns in your own projects.&lt;/P&gt;
&lt;H2&gt;Final Thoughts&lt;/H2&gt;
&lt;P&gt;The &lt;A class="lia-external-url" href="https://github.com/microsoft-foundry/foundry-local-lab" target="_blank" rel="noopener"&gt;Microsoft Foundry Local Lab&lt;/A&gt; is a strong introduction to local AI development because it treats learners like developers rather than spectators. You install, run, connect, orchestrate, evaluate, and present working systems. That makes it far more valuable than a short demo that only proves a model can answer a question.&lt;/P&gt;
&lt;P&gt;If you are a student or learner developer who wants to understand how AI applications are really built, this lab gives you a clear path. Start with the basics, pick one language, and work through the parts in order. By the time you finish, you will not just have used Foundry Local. You will have a practical foundation for building local AI applications with far more confidence and much better judgement.&lt;/P&gt;</description>
      <pubDate>Mon, 30 Mar 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/getting-started-with-foundry-local-a-student-guide-to-the/ba-p/4503604</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-03-30T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Langchain Multi-Agent Systems with Microsoft Agent Framework and Hosted Agents</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/langchain-multi-agent-systems-with-microsoft-agent-framework-and/ba-p/4504863</link>
      <description>&lt;P&gt;If you have been building AI agents with LangChain, you already know how powerful its tool and chain abstractions are. But when it comes to deploying those agents to production — with real infrastructure, managed identity, live web search, and container orchestration — you need something more.&lt;/P&gt;
&lt;P&gt;This post walks through how to combine &lt;STRONG&gt;LangChain&lt;/STRONG&gt; with the &lt;STRONG&gt;Microsoft Agent Framework&lt;/STRONG&gt; (&lt;CODE&gt;azure-ai-agents&lt;/CODE&gt;) and deploy the result as a &lt;STRONG&gt;Microsoft Foundry Hosted Agent&lt;/STRONG&gt;. We will build a multi-agent incident triage copilot that uses LangChain locally and seamlessly upgrades to cloud-hosted capabilities on Microsoft Foundry.&lt;/P&gt;
&lt;H2&gt;Why combine LangChain with Microsoft Agent Framework?&lt;/H2&gt;
&lt;P&gt;As a LangChain developer, you get excellent abstractions for building agents: the &lt;CODE&gt;@tool&lt;/CODE&gt; decorator, &lt;CODE&gt;RunnableLambda&lt;/CODE&gt; chains, and composable pipelines. But production deployment raises questions that LangChain alone does not answer:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Where do your agents run?&lt;/STRONG&gt; Containers, serverless, or managed infrastructure?&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;How do you add live web search or code execution?&lt;/STRONG&gt; Bing Grounding and Code Interpreter are not LangChain built-ins.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;How do you handle authentication?&lt;/STRONG&gt; Managed identity, API keys, or tokens?&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;How do you observe agents in production?&lt;/STRONG&gt; Distributed tracing across multiple agents?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The Microsoft Agent Framework fills these gaps. It provides &lt;CODE&gt;AgentsClient&lt;/CODE&gt; for creating and managing agents on Microsoft Foundry, built-in tools like &lt;CODE&gt;BingGroundingTool&lt;/CODE&gt; and &lt;CODE&gt;CodeInterpreterTool&lt;/CODE&gt;, and a thread-based conversation model. Combined with Hosted Agents, you get a fully managed container runtime with health probes, auto-scaling, and the OpenAI Responses API protocol.&lt;/P&gt;
&lt;P&gt;The key insight: &lt;STRONG&gt;LangChain handles local logic and chain composition; the Microsoft Agent Framework handles cloud-hosted orchestration and tooling.&lt;/STRONG&gt;&lt;/P&gt;
&lt;H2&gt;Architecture overview&lt;/H2&gt;
&lt;P&gt;The incident triage copilot uses a coordinator pattern with three specialist agents:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/hosted-agents-langchain-samples/main/screenshots/01-ui-homepage-foundry-connected.png" alt="UI Homepage showing Foundry connected status" /&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;User Query
    |
    v
Coordinator Agent
    |
    +--&amp;gt; LangChain Triage Chain    (routing decision)
    +--&amp;gt; LangChain Synthesis Chain  (combine results)
    |
    +---+---+---+
    |   |       |
    v   v       v
Research  Diagnostics  Remediation
 Agent      Agent        Agent&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each specialist agent has two execution modes:&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Mode&lt;/th&gt;&lt;th&gt;LangChain Role&lt;/th&gt;&lt;th&gt;Microsoft Agent Framework Role&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Local&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;@tool&lt;/CODE&gt; functions provide heuristic analysis&lt;/td&gt;&lt;td&gt;Not used&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;STRONG&gt;Foundry&lt;/STRONG&gt;&lt;/td&gt;&lt;td&gt;Chains handle routing and synthesis&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;AgentsClient&lt;/CODE&gt; with &lt;CODE&gt;BingGroundingTool&lt;/CODE&gt;, &lt;CODE&gt;CodeInterpreterTool&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;P&gt;This dual-mode design means you can develop and test locally with zero cloud dependencies, then deploy to Foundry for production capabilities.&lt;/P&gt;
&lt;H2&gt;Step 1: Define your LangChain tools&lt;/H2&gt;
&lt;P&gt;Start with what you know. Define typed, documented tools using LangChain’s &lt;CODE&gt;@tool&lt;/CODE&gt; decorator:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;from langchain_core.tools import tool

@tool
def classify_incident_severity(query: str) -&amp;gt; str:
    """Classify the severity and priority of an incident based on keywords.

    Args:
        query: The incident description text.

    Returns:
        Severity classification with priority level.
    """
    query_lower = query.lower()

    critical_keywords = [
        "production down", "all users", "outage", "breach",
    ]
    high_keywords = [
        "503", "500", "timeout", "latency", "slow",
    ]

    if any(kw in query_lower for kw in critical_keywords):
        return "severity=critical, priority=P1"
    if any(kw in query_lower for kw in high_keywords):
        return "severity=high, priority=P2"
    return "severity=low, priority=P4"&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;These tools work identically in local mode and serve as fallbacks when Foundry is unavailable.&lt;/P&gt;
&lt;H2&gt;Step 2: Build routing with LangChain chains&lt;/H2&gt;
&lt;P&gt;Use &lt;CODE&gt;RunnableLambda&lt;/CODE&gt; to create a routing chain that classifies the incident and selects which specialists to invoke:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;from langchain_core.runnables import RunnableLambda
from enum import Enum

class AgentRole(str, Enum):
    RESEARCH = "research"
    DIAGNOSTICS = "diagnostics"
    REMEDIATION = "remediation"

DIAGNOSTICS_KEYWORDS = {
    "log", "error", "exception", "timeout", "500", "503",
    "crash", "oom", "root cause",
}

REMEDIATION_KEYWORDS = {
    "fix", "remediate", "runbook", "rollback", "hotfix",
    "patch", "resolve", "action plan",
}

def _route(inputs: dict) -&amp;gt; dict:
    query = inputs["query"].lower()
    specialists = [AgentRole.RESEARCH]  # always included

    if any(kw in query for kw in DIAGNOSTICS_KEYWORDS):
        specialists.append(AgentRole.DIAGNOSTICS)

    if any(kw in query for kw in REMEDIATION_KEYWORDS):
        specialists.append(AgentRole.REMEDIATION)

    return {**inputs, "specialists": specialists}

triage_routing_chain = RunnableLambda(_route)&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;This is pure LangChain — no cloud dependency. The chain analyses the query and returns which specialists should handle it.&lt;/P&gt;
&lt;H2&gt;Step 3: Create specialist agents with dual-mode execution&lt;/H2&gt;
&lt;P&gt;Each specialist agent extends a base class. In local mode, it uses LangChain tools. In Foundry mode, it delegates to the Microsoft Agent Framework:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;from abc import ABC, abstractmethod
from pathlib import Path

class BaseSpecialistAgent(ABC):
    role: AgentRole
    prompt_file: str

    def __init__(self):
        prompt_path = Path(__file__).parent.parent / "prompts" / self.prompt_file
        self.system_prompt = prompt_path.read_text(encoding="utf-8")

    async def run(self, query, shared_context, correlation_id, client=None):
        if client is not None:
            return await self._run_on_foundry(query, shared_context, correlation_id, client)
        return await self._run_locally(query, shared_context, correlation_id)

    async def _run_on_foundry(self, query, shared_context, correlation_id, client):
        """Use Microsoft Agent Framework for cloud-hosted execution."""
        from azure.ai.agents.models import BingGroundingTool

        agent = await client.agents.create_agent(
            model=shared_context.get("model_deployment", "gpt-4o"),
            name=f"{self.role.value}-{correlation_id}",
            instructions=self.system_prompt,
            tools=self._get_foundry_tools(shared_context),
        )

        thread = await client.agents.threads.create()
        await client.agents.messages.create(
            thread_id=thread.id,
            role="user",
            content=self._build_prompt(query, shared_context),
        )

        run = await client.agents.runs.create_and_process(
            thread_id=thread.id,
            agent_id=agent.id,
        )
        # Extract and return the agent’s response...

    async def _run_locally(self, query, shared_context, correlation_id):
        """Use LangChain tools for local heuristic analysis."""
        # Each subclass implements this with its specific tools
        ...&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;The key pattern here:&amp;nbsp;&lt;STRONG&gt;same interface, different backends&lt;/STRONG&gt;. Your coordinator does not care whether a specialist ran locally or on Foundry.&lt;/P&gt;
&lt;H2&gt;Step 4: Wire it up with FastAPI&lt;/H2&gt;
&lt;P&gt;Expose the multi-agent pipeline through a FastAPI endpoint. The &lt;CODE&gt;/triage&lt;/CODE&gt; endpoint accepts incident descriptions and returns structured reports:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;from fastapi import FastAPI
from agents.coordinator import Coordinator
from models import TriageRequest

app = FastAPI(title="Incident Triage Copilot")
coordinator = Coordinator()

@app.post("/triage")
async def triage(request: TriageRequest):
    return await coordinator.triage(
        request=request,
        client=app.state.foundry_client,
        max_turns=10,
    )&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;The application also implements the&amp;nbsp;&lt;CODE&gt;/responses&lt;/CODE&gt; endpoint, which follows the OpenAI Responses API protocol. This is what Microsoft Foundry Hosted Agents expects when routing traffic to your container.&lt;/P&gt;
&lt;H2&gt;Step 5: Deploy as a Hosted Agent&lt;/H2&gt;
&lt;P&gt;This is where Microsoft Foundry Hosted Agents shines. Your multi-agent system becomes a managed, auto-scaling service with a single command:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;# Install the azd AI agent extension
azd extension install azure.ai.agents

# Provision infrastructure and deploy
azd up&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/hosted-agents-langchain-samples/main/screenshots/02-ui-triage-running.png" alt="Triage pipeline running with Research, Diagnostics, and Remediation agents" /&gt;&lt;/P&gt;
&lt;P&gt;The Azure Developer CLI (&lt;CODE&gt;azd&lt;/CODE&gt;) provisions everything:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Azure Container Registry&lt;/STRONG&gt; for your Docker image&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Container App&lt;/STRONG&gt; with health probes and auto-scaling&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;User-Assigned Managed Identity&lt;/STRONG&gt; for secure authentication&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Microsoft Foundry Hub and Project&lt;/STRONG&gt; with model deployments&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Application Insights&lt;/STRONG&gt; for distributed tracing&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Your &lt;CODE&gt;agent.yaml&lt;/CODE&gt; defines what tools the hosted agent has access to:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;name: incident-triage-copilot-langchain
kind: hosted
model:
  deployment: gpt-4o
identity:
  type: managed
tools:
  - type: bing_grounding
    enabled: true
  - type: code_interpreter
    enabled: true&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;H2&gt;What you gain over pure LangChain&lt;/H2&gt;
&lt;P&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/hosted-agents-langchain-samples/main/screenshots/03-ui-triage-report.png" alt="Triage report showing coordinator summary and specialist results" /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Capability&lt;/th&gt;&lt;th&gt;LangChain Only&lt;/th&gt;&lt;th&gt;LangChain + Microsoft Agent Framework&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Local development&lt;/td&gt;&lt;td&gt;Yes&lt;/td&gt;&lt;td&gt;Yes (identical experience)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Live web search&lt;/td&gt;&lt;td&gt;Requires custom integration&lt;/td&gt;&lt;td&gt;Built-in &lt;CODE&gt;BingGroundingTool&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Code execution&lt;/td&gt;&lt;td&gt;Requires sandboxing&lt;/td&gt;&lt;td&gt;Built-in &lt;CODE&gt;CodeInterpreterTool&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Managed hosting&lt;/td&gt;&lt;td&gt;DIY containers&lt;/td&gt;&lt;td&gt;Foundry Hosted Agents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Authentication&lt;/td&gt;&lt;td&gt;DIY&lt;/td&gt;&lt;td&gt;Managed Identity (zero secrets)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Observability&lt;/td&gt;&lt;td&gt;DIY&lt;/td&gt;&lt;td&gt;OpenTelemetry + Application Insights&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;One-command deploy&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;&lt;CODE&gt;azd up&lt;/CODE&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H2&gt;Testing locally&lt;/H2&gt;
&lt;P&gt;The dual-mode architecture means you can test the full pipeline without any cloud resources:&lt;/P&gt;
&lt;P&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/hosted-agents-langchain-samples/main/screenshots/04-ui-specialist-agents.png" alt="Research Agent with Bing Grounding and Diagnostics Agent with Code Interpreter" /&gt;&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;# Create virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (agents use LangChain tools)
python -m src&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;Then open &lt;CODE&gt;http://localhost:8080&lt;/CODE&gt; in your browser to use the built-in web UI, or call the API directly:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;curl -X POST http://localhost:8080/triage \
  -H "Content-Type: application/json" \
  -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;The response includes a coordinator summary, specialist results with confidence scores, and the tools each agent used.&lt;/P&gt;
&lt;H2&gt;Running the tests&lt;/H2&gt;
&lt;P&gt;The project includes a comprehensive test suite covering routing logic, tool behaviour, agent execution, and HTTP endpoints:&lt;/P&gt;
&lt;LI-CODE lang=""&gt;curl -X POST http://localhost:8080/triage \
  -H "Content-Type: application/json" \
  -d '{"message": "Getting 503 errors on /api/orders since 2pm"}'&lt;/LI-CODE&gt;
&lt;P&gt;Tests run entirely in local mode, so no cloud credentials are needed.&lt;/P&gt;
&lt;H2&gt;Key takeaways for LangChain developers&lt;/H2&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Keep your LangChain abstractions.&lt;/STRONG&gt; The &lt;CODE&gt;@tool&lt;/CODE&gt; decorator, &lt;CODE&gt;RunnableLambda&lt;/CODE&gt; chains, and composable pipelines all work exactly as you expect.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Add cloud capabilities incrementally.&lt;/STRONG&gt; Start local, then enable Bing Grounding, Code Interpreter, and managed hosting when you are ready.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Use the dual-mode pattern.&lt;/STRONG&gt; Every agent should work locally with LangChain tools and on Foundry with the Microsoft Agent Framework. This makes development fast and deployment seamless.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Let &lt;CODE&gt;azd&lt;/CODE&gt; handle infrastructure.&lt;/STRONG&gt; One command provisions everything: containers, identity, monitoring, and model deployments.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Security comes free.&lt;/STRONG&gt; Managed Identity means no API keys in your code. Non-root containers, RBAC, and disabled ACR admin are all configured by default.&lt;/LI&gt;
&lt;/OL&gt;
&lt;H2&gt;Get started&lt;/H2&gt;
&lt;P&gt;Clone the sample repository and try it yourself:&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;git clone https://github.com/leestott/hosted-agents-langchain-samples
cd hosted-agents-langchain-samples
python -m venv .venv &amp;amp;&amp;amp; source .venv/bin/activate
pip install -r requirements.txt
python -m src&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;Open&amp;nbsp;&lt;CODE&gt;http://localhost:8080&lt;/CODE&gt; to interact with the copilot through the web UI. When you are ready for production, run &lt;CODE&gt;azd up&lt;/CODE&gt; and your multi-agent system is live on Microsoft Foundry.&lt;/P&gt;
&lt;H2&gt;Resources&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/agents/" target="_blank"&gt;Microsoft Agent Framework for Python documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-services/agents/concepts/hosted-agents" target="_blank"&gt;Microsoft Foundry Hosted Agents&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/developer/azure-developer-cli/" target="_blank"&gt;Azure Developer CLI (azd)&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://python.langchain.com/" target="_blank"&gt;LangChain documentation&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A href="https://learn.microsoft.com/azure/ai-foundry/" target="_blank"&gt;Microsoft Foundry documentation&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 26 Mar 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/langchain-multi-agent-systems-with-microsoft-agent-framework-and/ba-p/4504863</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-03-26T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Build an Offline Hybrid RAG Stack with ONNX and Foundry Local</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-an-offline-hybrid-rag-stack-with-onnx-and-foundry-local/ba-p/4503589</link>
      <description>&lt;MAIN&gt;
&lt;ARTICLE&gt;&lt;HEADER&gt;
&lt;P class="lead"&gt;If you are building local AI applications, basic retrieval augmented generation is often only the starting point. This sample shows a more practical pattern: combine lexical retrieval, ONNX based semantic embeddings, and a Foundry Local chat model so the assistant stays grounded, remains offline, and degrades cleanly when the semantic path is unavailable.&lt;/P&gt;
&lt;/HEADER&gt;
&lt;SECTION&gt;
&lt;H2&gt;Why this sample is worth studying&lt;/H2&gt;
&lt;P&gt;Many local RAG samples rely on a single retrieval strategy. That is usually enough for a proof of concept, but it breaks down quickly in production. Exact keywords, acronyms, and document codes behave differently from natural language questions and paraphrased requests.&lt;/P&gt;
&lt;P&gt;This repository keeps the original lexical retrieval path, adds local ONNX embeddings for semantic search, and fuses both signals in a hybrid ranking mode. The generation step runs through Foundry Local, so the entire assistant can remain on device.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Lexical mode handles exact terms and structured vocabulary.&lt;/LI&gt;
&lt;LI&gt;Semantic mode handles paraphrases and more natural language phrasing.&lt;/LI&gt;
&lt;LI&gt;Hybrid mode combines both and is usually the best default.&lt;/LI&gt;
&lt;LI&gt;Lexical fallback protects the user experience if the embedding pipeline cannot start.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Architectural overview&lt;/H2&gt;
&lt;P&gt;The sample has two main flows: an offline ingestion pipeline and a local query pipeline.&lt;/P&gt;
&lt;FIGURE&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/local-hybrid-retrival-onnx/main/screenshots/07-architecture-diagram.png" alt="Architecture diagram showing the ingestion pipeline and local query pipeline" /&gt;
&lt;FIGCAPTION&gt;The architecture splits cleanly into offline ingestion at the top and runtime query handling at the bottom.&lt;/FIGCAPTION&gt;
&lt;/FIGURE&gt;
&lt;H3&gt;Offline ingestion pipeline&lt;/H3&gt;
&lt;OL&gt;
&lt;LI&gt;Read Markdown files from &lt;CODE&gt;docs/&lt;/CODE&gt;.&lt;/LI&gt;
&lt;LI&gt;Parse front matter and split each document into overlapping chunks.&lt;/LI&gt;
&lt;LI&gt;Generate dense embeddings when the ONNX model is available.&lt;/LI&gt;
&lt;LI&gt;Store chunks in SQLite with both sparse lexical features and optional dense vectors.&lt;/LI&gt;
&lt;/OL&gt;
&lt;H3&gt;Local query pipeline&lt;/H3&gt;
&lt;OL&gt;
&lt;LI&gt;The browser posts a question to the Express API.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;ChatEngine&lt;/CODE&gt; resolves the requested retrieval mode.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;VectorStore&lt;/CODE&gt; retrieves lexical, semantic, or hybrid results.&lt;/LI&gt;
&lt;LI&gt;The prompt is assembled with the retrieved context and sent to a Foundry Local chat model.&lt;/LI&gt;
&lt;LI&gt;The answer is returned with source references and retrieval metadata.&lt;/LI&gt;
&lt;/OL&gt;
&lt;FIGURE&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/local-hybrid-retrival-onnx/main/screenshots/08-rag-flow-sequence.png" alt="Sequence diagram showing lexical and hybrid retrieval flow" /&gt;
&lt;FIGCAPTION&gt;The sequence diagram shows the difference between lexical retrieval and hybrid retrieval. In hybrid mode, the query is embedded first, then lexical and semantic scores are fused before prompt assembly.&lt;/FIGCAPTION&gt;
&lt;/FIGURE&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Repository structure and core components&lt;/H2&gt;
&lt;P&gt;The implementation is compact and readable. The main files to understand are listed below.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;CODE&gt;src/config.js&lt;/CODE&gt;: retrieval defaults, paths, and model settings.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;src/embeddingEngine.js&lt;/CODE&gt;: local ONNX embedding generation through Transformers.js.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;src/vectorStore.js&lt;/CODE&gt;: SQLite storage plus lexical, semantic, and hybrid ranking.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;src/chatEngine.js&lt;/CODE&gt;: retrieval mode resolution, prompt assembly, and Foundry Local model execution.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;src/ingest.js&lt;/CODE&gt;: document ingestion and embedding generation during indexing.&lt;/LI&gt;
&lt;LI&gt;&lt;CODE&gt;src/server.js&lt;/CODE&gt;: REST endpoints, streaming endpoints, upload support, and health reporting.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Getting started&lt;/H2&gt;
&lt;P&gt;To run the sample, you need Node.js 20 or newer, Foundry Local, and a local ONNX embedding model. The default model path is &lt;CODE&gt;models/embeddings/bge-small-en-v1.5&lt;/CODE&gt;.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;cd c:\Users\leestott\local-hybrid-retrival-onnx 
npm install huggingface-cli 
download BAAI/bge-small-en-v1.5 --local-dir models/embeddings/bge-small-en-v1.5 
npm run ingest 
npm start&lt;/LI-CODE&gt;
&lt;P&gt;Ingestion writes the local SQLite database to &lt;CODE&gt;data/rag.db&lt;/CODE&gt;. If the embedding model is available, each chunk gets a dense vector as well as lexical features. If the embedding model is missing, ingestion still succeeds and the application remains usable in lexical mode.&lt;/P&gt;
&lt;DIV class="note"&gt;Best practice: local AI applications should treat model files, SQLite data, and native runtime compatibility as part of the deployable system, not as optional developer conveniences.&lt;/DIV&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Code walkthrough&lt;/H2&gt;
&lt;H3&gt;1. Retrieval configuration&lt;/H3&gt;
&lt;P&gt;The sample makes its retrieval behaviour explicit in configuration. That is useful for testing and for operator visibility.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;export const config = {
  model: "phi-3.5-mini",
  docsDir: path.join(ROOT, "docs"),
  dbPath: path.join(ROOT, "data", "rag.db"),
  chunkSize: 200,
  chunkOverlap: 25,
  topK: 3,
  retrievalMode: process.env.RETRIEVAL_MODE || "hybrid",
  retrievalModes: ["lexical", "semantic", "hybrid"],
  fallbackRetrievalMode: "lexical",
  retrievalWeights: {
    lexical: 0.45,
    semantic: 0.55,
  },
};&lt;/LI-CODE&gt;&lt;BR /&gt;
&lt;P&gt;Those defaults tell you a lot about the intended operating profile. Chunks are small, the number of returned chunks is low, and the fallback path is explicit.&lt;/P&gt;
&lt;H3&gt;2. Local ONNX embeddings&lt;/H3&gt;
&lt;P&gt;The embedding engine disables remote model loading and only uses local files. That matters for privacy, repeatability, and air gapped operation.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;env.allowLocalModels = true;
env.allowRemoteModels = false;

this.extractor = await pipeline("feature-extraction", resolvedPath, {
  local_files_only: true,
});

const output = await this.extractor(text, {
  pooling: "mean",
  normalize: true,
});&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;The mean pooling and normalisation step make the vectors suitable for cosine similarity based ranking.&lt;/P&gt;
&lt;H3&gt;3. Hybrid storage and ranking in SQLite&lt;/H3&gt;
&lt;P&gt;Instead of adding a separate vector database, the sample stores lexical and semantic representations in the same SQLite table. That keeps the local footprint low and the implementation easy to debug.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;searchHybrid(query, queryEmbedding, topK = 5, weights = { lexical: 0.45, semantic: 0.55 }) {
  const lexicalResults = this.searchLexical(query, topK * 3);
  const semanticResults = this.searchSemantic(queryEmbedding, topK * 3);

  if (semanticResults.length === 0) {
    return lexicalResults.slice(0, topK).map((row) =&amp;gt; ({
      ...row,
      retrievalMode: "lexical",
    }));
  }

  const fused = [...combined.values()].map((row) =&amp;gt; ({
    ...row,
    score: (row.lexicalScore * lexicalWeight) + (row.semanticScore * semanticWeight),
  }));

  fused.sort((a, b) =&amp;gt; b.score - a.score);
  return fused.slice(0, topK);
}&lt;/LI-CODE&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;The important point is not just the weighted fusion. It is the fallback behaviour. If semantic retrieval cannot provide results, the user still gets lexical grounding instead of an empty context window.&lt;/P&gt;
&lt;H3&gt;4. Retrieval mode resolution in ChatEngine&lt;/H3&gt;
&lt;P&gt;&lt;CODE&gt;ChatEngine&lt;/CODE&gt; keeps the runtime behaviour predictable. It validates the requested mode and falls back to lexical search when semantic retrieval is unavailable.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;resolveRetrievalMode(requestedMode) {
  const desiredMode = config.retrievalModes.includes(requestedMode)
    ? requestedMode
    : config.retrievalMode;

  if ((desiredMode === "semantic" || desiredMode === "hybrid") &amp;amp;&amp;amp; !this.semanticAvailable) {
    return config.fallbackRetrievalMode;
  }

  return desiredMode;
}&lt;/LI-CODE&gt;
&lt;P&gt;This is a sensible production design because local runtime failures are common. Missing model files or native dependency mismatches should reduce quality, not crash the entire assistant.&lt;/P&gt;
&lt;H3&gt;5. Foundry Local model management&lt;/H3&gt;
&lt;P&gt;The sample uses &lt;CODE&gt;FoundryLocalManager&lt;/CODE&gt; to discover, download, cache, and load the configured chat model.&lt;/P&gt;
&lt;PRE&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;LI-CODE lang=""&gt;const manager = FoundryLocalManager.create({ appName: "gas-field-local-rag" });
const catalog = manager.catalog;

this.model = await catalog.getModel(config.model);

if (!this.model.isCached) {
  await this.model.download((progress) =&amp;gt; {
    const pct = Math.round(progress * 100);
    this._emitStatus("download", `Downloading ${this.modelAlias}... ${pct}%`, progress);
  });
}

await this.model.load();
this.chatClient = this.model.createChatClient();
this.chatClient.settings.temperature = 0.1;&lt;/LI-CODE&gt;
&lt;P&gt;&lt;SPAN style="color: rgb(30, 30, 30);"&gt;This gives the app a better local startup experience. The server can expose a status stream while the model initialises in the background.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;User experience and screenshots&lt;/H2&gt;
&lt;P&gt;The client is intentionally simple, which makes it useful during evaluation. You can switch retrieval mode, test questions quickly, and inspect the retrieved sources.&lt;/P&gt;
&lt;FIGURE&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/local-hybrid-retrival-onnx/main/screenshots/01-landing-page.png" alt="Landing page showing the gas field support agent UI in hybrid mode" /&gt;
&lt;FIGCAPTION&gt;The landing page exposes retrieval mode directly in the UI. That makes it easy to compare lexical, semantic, and hybrid behaviour during testing.&lt;/FIGCAPTION&gt;
&lt;/FIGURE&gt;
&lt;FIGURE&gt;&lt;IMG src="https://raw.githubusercontent.com/leestott/local-hybrid-retrival-onnx/main/screenshots/04-sources-panel.png" alt="Chat response showing sources panel and hybrid retrieval scores" /&gt;
&lt;FIGCAPTION&gt;The sources panel shows grounding evidence and retrieval scores, which is useful when validating whether better answers are coming from better retrieval or just model phrasing.&lt;/FIGCAPTION&gt;
&lt;/FIGURE&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Best practices for ONNX RAG and Foundry Local&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Keep lexical fallback alive. Exact identifiers and runtime failures both make this necessary.&lt;/LI&gt;
&lt;LI&gt;Persist sparse and dense features together where possible. It simplifies debugging and operational reasoning.&lt;/LI&gt;
&lt;LI&gt;Use small chunks and conservative &lt;CODE&gt;topK&lt;/CODE&gt; values for local context budgets.&lt;/LI&gt;
&lt;LI&gt;Expose health and status endpoints so users can see when the model is still loading or embeddings are unavailable.&lt;/LI&gt;
&lt;LI&gt;Test retrieval quality separately from generation quality.&lt;/LI&gt;
&lt;LI&gt;Pin and validate native runtime dependencies, especially ONNX Runtime, before tuning prompts.&lt;/LI&gt;
&lt;/UL&gt;
&lt;DIV class="note"&gt;Practical warning: this repository already shows why runtime validation matters. A local app can ingest documents successfully and still fail at model initialisation if the native runtime stack is misaligned.&lt;/DIV&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;How this compares with RAG and CAG&lt;/H2&gt;
&lt;P&gt;The strongest value in this sample comes from where it sits between a basic local RAG baseline and a curated CAG design.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;DIV class="styles_lia-table-wrapper__h6Xo9 styles_table-responsive__MW0lN"&gt;&lt;table border="1" style="border-width: 1px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;Classic local RAG&lt;/th&gt;&lt;th&gt;This hybrid ONNX RAG sample&lt;/th&gt;&lt;th&gt;CAG&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Context assembly&lt;/td&gt;&lt;td&gt;Retrieve chunks at query time, often lexically, then inject them into the prompt.&lt;/td&gt;&lt;td&gt;Retrieve chunks at query time with lexical, semantic, or fused scoring, then inject the strongest results into the prompt.&lt;/td&gt;&lt;td&gt;Use a prepared or cached context pack instead of fresh retrieval for every request.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Main strength&lt;/td&gt;&lt;td&gt;Easy to implement and easy to explain.&lt;/td&gt;&lt;td&gt;Better recall for paraphrases without giving up exact match behaviour or offline execution.&lt;/td&gt;&lt;td&gt;Predictable prompts and low query time overhead.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Main weakness&lt;/td&gt;&lt;td&gt;Misses synonyms and natural language reformulations.&lt;/td&gt;&lt;td&gt;More moving parts, larger local asset footprint, and native runtime compatibility to manage.&lt;/td&gt;&lt;td&gt;Coverage depends on curation quality and goes stale more easily.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Failure behaviour&lt;/td&gt;&lt;td&gt;Weak retrieval leads to weak grounding.&lt;/td&gt;&lt;td&gt;Semantic failure can degrade to lexical retrieval if designed properly, which this sample does.&lt;/td&gt;&lt;td&gt;Prepared context can be too narrow for new or unexpected questions.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Best fit&lt;/td&gt;&lt;td&gt;Simple local assistants and proof of concept systems.&lt;/td&gt;&lt;td&gt;Offline copilots and technical assistants that need stronger recall across varied phrasing.&lt;/td&gt;&lt;td&gt;Stable workflows with tightly bounded, curated knowledge.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;colgroup&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;col style="width: 25.00%" /&gt;&lt;/colgroup&gt;&lt;/table&gt;&lt;/DIV&gt;
&lt;H3&gt;Samples&lt;/H3&gt;
&lt;P&gt;Related samples:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- Foundry Local RAG - &lt;A class="lia-external-url" href="https://github.com/leestott/local-rag" target="_blank"&gt;https://github.com/leestott/local-rag&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- Foundry Local CAG - &lt;A class="lia-external-url" href="https://github.com/leestott/local-cag" target="_blank"&gt;https://github.com/leestott/local-cag&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- Foundry Local hybrid-retrival-onnx &lt;A class="lia-external-url" href="https://github.com/leestott/local-hybrid-retrival-onnx" target="_blank"&gt;https://github.com/leestott/local-hybrid-retrival-onnx&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;H3&gt;Specific benefits of this hybrid approach over classic RAG&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;It captures paraphrased questions that lexical search would often miss.&lt;/LI&gt;
&lt;LI&gt;It still preserves exact match performance for codes, terms, and product names.&lt;/LI&gt;
&lt;LI&gt;It gives operators a controlled degradation path when the semantic stack is unavailable.&lt;/LI&gt;
&lt;LI&gt;It stays local and inspectable without introducing a separate hosted vector service.&lt;/LI&gt;
&lt;/UL&gt;
&lt;H3&gt;Specific differences from CAG&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;CAG shifts effort into context curation before the request. This sample retrieves evidence dynamically at runtime.&lt;/LI&gt;
&lt;LI&gt;CAG can be faster for fixed workflows, but it is usually less flexible when the document set changes.&lt;/LI&gt;
&lt;LI&gt;This hybrid RAG design is better suited to open ended knowledge search and growing document collections.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;What to validate before shipping&lt;/H2&gt;
&lt;UL&gt;
&lt;LI&gt;Measure retrieval quality in each mode using exact term, acronym, and paraphrase queries.&lt;/LI&gt;
&lt;LI&gt;Check that sources shown in the UI reflect genuinely distinct evidence, not repeated chunks.&lt;/LI&gt;
&lt;LI&gt;Confirm the application remains usable when semantic retrieval is unavailable.&lt;/LI&gt;
&lt;LI&gt;Verify ONNX Runtime compatibility on the real target machines, not only on the development laptop.&lt;/LI&gt;
&lt;LI&gt;Test model download, cache, and startup behaviour with a clean environment.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/SECTION&gt;
&lt;SECTION&gt;
&lt;H2&gt;Final take&lt;/H2&gt;
&lt;P&gt;For developers getting started with ONNX RAG and Foundry Local, this sample is a good technical reference because it demonstrates a realistic local architecture rather than a minimal demo. It shows how to build a grounded assistant that remains offline, supports multiple retrieval modes, and fails gracefully.&lt;/P&gt;
&lt;P&gt;Compared with classic local RAG, the hybrid design provides better recall and better resilience. Compared with CAG, it remains more flexible for changing document sets and less dependent on pre curated context packs. If you want a practical starting point for offline grounded AI on developer workstations or edge devices, this is the most balanced pattern in the repository set.&lt;/P&gt;
&lt;/SECTION&gt;
&lt;/ARTICLE&gt;
&lt;/MAIN&gt;</description>
      <pubDate>Thu, 26 Mar 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/build-an-offline-hybrid-rag-stack-with-onnx-and-foundry-local/ba-p/4503589</guid>
      <dc:creator>Lee_Stott</dc:creator>
      <dc:date>2026-03-26T07:00:00Z</dc:date>
    </item>
    <item>
      <title>Step-by-Step: Deploy the Architecture Review Agent Using AZD AI CLI</title>
      <link>https://techcommunity.microsoft.com/t5/educator-developer-blog/step-by-step-deploy-the-architecture-review-agent-using-azd-ai/ba-p/4504460</link>
      <description>&lt;P&gt;Hey everyone! I am &lt;A class="lia-external-url" href="https://linkedin.com/in/shivam2003" target="_blank" rel="noopener"&gt;Shivam Goyal&lt;/A&gt;, a Microsoft MVP, and I am super excited to share a project that will save you a massive amount of time.&lt;/P&gt;
&lt;P&gt;Have you ever built a brilliant AI agent in an afternoon, only to spend the next two weeks fighting with Docker containers, memory persistence, and cloud deployment scripts to get it running?&lt;/P&gt;
&lt;P&gt;In our &lt;A href="https://techcommunity.microsoft.com/blog/educatordeveloperblog/stop-drawing-architecture-diagrams-manually-meet-the-open-source-ai-architecture/4496271" target="_blank" rel="noopener"&gt;previous post&lt;/A&gt;, we introduced the &lt;STRONG&gt;Architecture Review Agent&lt;/STRONG&gt;, an open-source tool built on Microsoft Foundry that automatically converts messy architectural notes into structured risk assessments and interactive Excalidraw diagrams.&lt;/P&gt;
&lt;P&gt;But building an AI agent is only half the battle. &lt;EM&gt;Iterating&lt;/EM&gt; on one and actually getting it running in a production-grade environment without losing your mind over infrastructure is a completely different story.&lt;/P&gt;
&lt;H5&gt;&lt;STRONG&gt;The Problem with Agentic Development Loops&lt;/STRONG&gt;&lt;/H5&gt;
&lt;P&gt;The typical agent development loop is painful: you write your agent code, test it by copy-pasting inputs into a local REPL, manually build a container, push it to a registry, configure RBAC, deploy to your cloud target, realize you need to tweak three lines of logic, and start the whole cycle over again.&lt;/P&gt;
&lt;P&gt;You often end up with an agent that is 100 lines of clean Python, surrounded by 400 lines of Bicep and a 12-step deployment guide.&lt;/P&gt;
&lt;P&gt;The azd ai extension for the &lt;STRONG&gt;Azure Developer CLI (AZD)&lt;/STRONG&gt; completely changes this equation. For the Architecture Review Agent, the entire workflow, from zero infrastructure to a live hosted agent you can invoke from the command line, is just a few simple commands. And moving from local testing to a live cloud deployment is a single azd up.&lt;/P&gt;
&lt;P&gt;Here is how you can set up, invoke, and deploy your own Architecture Review Agent, and even publish it to Microsoft Teams, without needing a tenant admin.&lt;/P&gt;
&lt;H4&gt;&lt;STRONG&gt;Step 1: The Setup (No heavy lifting required)&lt;/STRONG&gt;&lt;/H4&gt;
&lt;P&gt;First, make sure you have the Azure Developer CLI installed and grab the AI Agents extension.&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;# Install AZD
winget install microsoft.azd 

# Install the AI Agents extension 
azd extension install azure.ai.agents&lt;/LI-CODE&gt;
&lt;P&gt;Next, clone the repository and set up your local Python environment:&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;git clone https://github.com/Azure-Samples/agent-architecture-review-sample
cd agent-architecture-review-sample

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt&lt;/LI-CODE&gt;
&lt;P&gt;Finally, authenticate and tell AZD where your Microsoft Foundry project lives:&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd auth login
azd env new arch-review-dev

# Point it to your Foundry Project and Model
azd env set AZURE_AI_PROJECT_ENDPOINT "https://&amp;lt;your-resource&amp;gt;.services.ai.azure.com/api/projects/&amp;lt;your-project&amp;gt;"
azd env set AZURE_AI_MODEL_DEPLOYMENT_NAME "gpt-4.1"&lt;/LI-CODE&gt;
&lt;H4&gt;&lt;STRONG&gt;Step 2: Run and Invoke Locally&lt;/STRONG&gt;&lt;/H4&gt;
&lt;P&gt;With the &lt;A class="lia-external-url" href="https://marketplace.visualstudio.com/items?itemName=ms-azuretools.azure-dev" target="_blank" rel="noopener"&gt;AZD AI extension&lt;/A&gt;, you get a local server that behaves &lt;EM&gt;identically&lt;/EM&gt; to a deployed Foundry-hosted agent. It uses the same localhost:8088 endpoint, the same OpenAI Responses API protocol, and the same conversation persistence.&lt;/P&gt;
&lt;P&gt;Open your first terminal and start the runtime:&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd ai agent run&lt;/LI-CODE&gt;
&lt;P&gt;Now, open a second terminal. This is where the magic happens. The agent is completely format-agnostic. There is no schema you have to memorize. You can pass it a file, or just type out a whiteboard brain-dump inline.&lt;/P&gt;
&lt;P&gt;Here is what the terminal experience looks like when running these commands and getting the structured report back:&lt;/P&gt;
&lt;img&gt;&lt;EM&gt;Styled terminal showing all 3 azd ai agent invoke commands + full structured report output&lt;/EM&gt;&lt;/img&gt;
&lt;P&gt;Here are the three ways you can invoke it using "azd ai agent invoke --local":&lt;/P&gt;
&lt;H5&gt;&lt;STRONG&gt;Pattern A: The Structured YAML&lt;/STRONG&gt;&lt;/H5&gt;
&lt;P&gt;If your team uses formal definitions, just point the agent to the file. The rule-based parser handles this instantly without an LLM call.&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd ai agent invoke --local "scenarios/ecommerce.yaml"&lt;/LI-CODE&gt;&lt;img&gt;&lt;EM&gt;12-component ecommerce architecture diagram generated from YAML input.&lt;/EM&gt;&lt;/img&gt;
&lt;H5&gt;&lt;STRONG&gt;Pattern B: The Whiteboard Brain-Dump (Inline Arrow Notation)&lt;/STRONG&gt;&lt;/H5&gt;
&lt;P&gt;Arrow notation (A -&amp;gt; B -&amp;gt; C) is how engineers actually communicate on whiteboards and in Slack. Before now, this wasn't a valid input for architecture tools.&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd ai agent invoke --local "LB -&amp;gt; 3 API servers -&amp;gt; PostgreSQL primary with read replica -&amp;gt; Redis cache"&lt;/LI-CODE&gt;
&lt;P&gt;The parser automatically extracts the replica count, infers the component types (LB becomes a Gateway), and builds a valid connection graph, surfacing single points of failure instantly.&lt;/P&gt;
&lt;H5&gt;&lt;STRONG&gt;Pattern C: The Markdown Design Doc&lt;/STRONG&gt;&lt;/H5&gt;
&lt;P&gt;Just point it to your existing READMEs or design docs.&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd ai agent invoke --local "scenarios/event_driven.md"&lt;/LI-CODE&gt;&lt;img&gt;&lt;EM&gt;8-component event-driven streaming architecture generated from Markdown input&lt;/EM&gt;&lt;/img&gt;
&lt;P&gt;For all three patterns, the agent returns a structured Markdown report in your terminal and generates an interactive architecture.excalidraw file and a high-res PNG right in your local /output folder.&lt;/P&gt;
&lt;H4&gt;&lt;STRONG&gt;Step 3: One Command to the Cloud&lt;/STRONG&gt;&lt;/H4&gt;
&lt;P&gt;When you are happy with how your agent performs locally, it's time to deploy. Forget manual Docker builds and complex credential management.&lt;/P&gt;
&lt;LI-CODE lang="powershell"&gt;azd up&lt;/LI-CODE&gt;
&lt;P&gt;This single command orchestrates everything:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;Provisions Infrastructure&lt;/STRONG&gt;: Creates your Foundry AI Services account, ACR, App Insights, and managed identities with proper RBAC.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Builds and Pushes&lt;/STRONG&gt;: Packages your Dockerfile and pushes the container image to ACR.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Deploys the Agent&lt;/STRONG&gt;: Registers the image and creates a hosted agent version in Foundry Agent Service.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The output will hand you a live Agent Playground URL and a production-ready API endpoint. Your agent now automatically scales from 0 to 5 replicas, manages its own conversation state, and authenticates securely via Managed Identity.&lt;/P&gt;
&lt;H4&gt;&lt;STRONG&gt;Step 4: Publish to Teams and M365 Copilot (Zero Admin Required!)&lt;/STRONG&gt;&lt;/H4&gt;
&lt;P&gt;Having an API is great, but agents are most powerful when they live where your users collaborate. You can publish this agent directly to Microsoft Teams and M365 Copilot natively from the Foundry portal.&lt;/P&gt;
&lt;P&gt;The best part? You can use the &lt;STRONG&gt;Individual Scope&lt;/STRONG&gt;.&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Go to the Microsoft Foundry portal and find your deployed agent.&lt;/LI&gt;
&lt;LI&gt;Click &lt;STRONG&gt;Publish to Teams and Microsoft 365 Copilot&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;Fill out the basic metadata (Name, Description).&lt;/LI&gt;
&lt;LI&gt;Select the &lt;STRONG&gt;Individual scope&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Because you are using an individual scope, &lt;STRONG&gt;no M365 admin approval is required&lt;/STRONG&gt;. The portal automatically provisions the Azure Bot Service, packages the metadata, and registers the app. Within minutes, your agent will appear in your Teams Copilot agent store. You can generate a share link and instantly send it to your team for a workshop or demo.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;What I Learned Building This Workflow&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Shifting from custom deployment scripts to the azd ai CLI taught me three things:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;STRONG&gt;The declarative contract is beautifully clean.&lt;/STRONG&gt; Our azure.yaml declares the agent and infrastructure in about 30 lines. azd up translates that into a fully secure, production-grade Foundry environment.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;The local-to-cloud gap is finally gone.&lt;/STRONG&gt; The azd ai agent run behaves exactly like the cloud. The invocation you write locally works identically against the deployed endpoint.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Teams publishing is remarkably simple.&lt;/STRONG&gt; I expected bot registration nightmares and tenant admin blockers. Instead, I filled out a form, waited two minutes, and was chatting with my architecture agent in Teams.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&lt;STRONG&gt;Resources &amp;amp; Next Steps&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Now that we have a streamlined, single-hosted agent deployment, the natural next step is &lt;STRONG&gt;multi-agent orchestration&lt;/STRONG&gt;. Imagine a triage agent that routes your design doc to a dedicated Security Reviewer Agent and a Scalability Reviewer Agent.&lt;/P&gt;
&lt;P&gt;Try it out yourself by cloning the repository, running azd up, and let me know what you build!&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;GitHub Repository:&lt;/STRONG&gt; &lt;A href="https://github.com/Azure-Samples/agent-architecture-review-sample" target="_blank" rel="noopener"&gt;Azure-Samples/agent-architecture-review-sample&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Previous Article:&lt;/STRONG&gt; &lt;A href="https://techcommunity.microsoft.com/blog/educatordeveloperblog/stop-drawing-architecture-diagrams-manually-meet-the-open-source-ai-architecture/4496271" target="_blank" rel="noopener" data-lia-auto-title-active="0" data-lia-auto-title="Stop Drawing Architecture Diagrams Manually: Meet the Open-Source AI Architecture Review Agents | Microsoft Community Hub"&gt;Stop Drawing Architecture Diagrams Manually: Meet the Open-Source AI Architecture Review Agents | Microsoft Community Hub&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Microsoft Learn:&lt;/STRONG&gt; &lt;A href="https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd" target="_blank" rel="noopener" data-lia-auto-title-active="0" data-lia-auto-title="Install the Azure Developer CLI"&gt;Install the Azure Developer CLI&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Microsoft Foundry Documentation:&lt;/STRONG&gt; &lt;A href="https://learn.microsoft.com/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry" target="_blank" rel="noopener" data-lia-auto-title-active="0" data-lia-auto-title="Hosted agents in Foundry Agent Service (preview) - Microsoft Foundry"&gt;Hosted agents in Foundry Agent Service (preview) - Microsoft Foundry&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 24 Mar 2026 07:00:00 GMT</pubDate>
      <guid>https://techcommunity.microsoft.com/t5/educator-developer-blog/step-by-step-deploy-the-architecture-review-agent-using-azd-ai/ba-p/4504460</guid>
      <dc:creator>ShivamGoyal</dc:creator>
      <dc:date>2026-03-24T07:00:00Z</dc:date>
    </item>
  </channel>
</rss>

