learning
766 TopicsBuilding AI Agents with Microsoft Foundry: A Progressive Lab from Hello World to Self-Hosted
AI agent development has a steep on-ramp. The combination of new SDKs, tool-calling patterns, model selection decisions, retrieval-augmented generation, and deployment concerns means most developers spend more time wiring things together than actually building anything useful. The Microsoft Foundry Agent Lab is a structured, open-source demo series designed to change that — nine self-contained demos, each adding exactly one new concept, all built on the same Microsoft Foundry SDK and a single model deployment. This post walks through what the lab contains, how each demo works under the hood, and the architectural decisions that make it a useful reference for AI engineers building production agents. Why a Progressive Lab? Agent frameworks can be overwhelming. A developer who opens a rich example with RAG, tool-calling, streaming, and a custom UI all at once has no clear line of sight to which parts are essential and which are embellishments. The Foundry Agent Lab takes the opposite approach: start with the absolute minimum and introduce one new primitive per demo. By the time you reach Demo 8, you have seen every major capability — not in one monolithic sample, but in a layered sequence where each addition is visible and understandable. # Demo New Concept Tool Used UX 0 hello-demo Agent creation, Responses API, conversations None Terminal 1 tools-demo Function calling, tool-calling loop, live API FunctionTool Terminal 2 desktop-demo UI decoupling — same agent, different surface None Desktop (Tkinter) 3 websearch-demo Server-side built-in tools, no client loop WebSearchTool Terminal 4 code-demo Code execution in sandbox, Gradio web UI CodeInterpreterTool Web (Gradio) 5 rag-demo Document upload, vector stores, RAG grounding FileSearchTool Terminal 6 mcp-demo MCP servers, human-in-the-loop approval MCPTool Terminal 7 toolbox-demo Centralized tool governance, Toolbox versioning Toolbox Terminal 8 hosted-demo Self-hosted agent with Responses protocol Custom server Terminal + Agent Inspector The Model Router: One Deployment to Rule Them All Before diving into the demos, it is worth understanding the one architectural decision that ties the entire lab together: every agent uses model-router as its model deployment. MODEL_DEPLOYMENT=model-router Model Router is a Microsoft Foundry capability that inspects each request at inference time and routes it to the optimal available model — weighing task complexity, cost, and latency. A simple factual question goes to a fast, cheap model. A complex tool-calling chain with code generation gets routed to a frontier model. You write zero routing logic. The lab's MODEL-ROUTER.md file contains empirical observations from running all nine demos. A sample of what the router selected: Demo Query Task Type Model Selected hello "What's the capital of WA state?" Factual recall grok-4-1-fast-reasoning hello "Summarize our conversation" Summarization gpt-5.2-chat-2025-12-11 tools "What's the weather in Seattle?" Tool-using gpt-5.4-mini-2026-03-17 code Data analysis with code generation Code generation + execution gpt-5.4-2026-03-05 rag HR policy document question Retrieval + synthesis gpt-5.3-chat-2026-03-03 This is the strongest signal in the lab: you do not need to reason about model selection. You declare what your agent needs to do; the router handles the rest, and it chooses correctly. Demo 0: The Minimum Viable Agent The hello-demo establishes the baseline pattern used by every subsequent demo. Two files: one to register the agent, one to chat with it. Registering the agent from azure.identity import DefaultAzureCredential from azure.ai.projects import AIProjectClient from azure.ai.projects.models import PromptAgentDefinition credential = DefaultAzureCredential() project = AIProjectClient(endpoint=PROJECT_ENDPOINT, credential=credential) agent = project.agents.create_version( agent_name=AGENT_NAME, definition=PromptAgentDefinition( model=MODEL_DEPLOYMENT, instructions="You are a helpful, friendly assistant.", ), ) Authentication uses DefaultAzureCredential , which works with az login locally and with managed identity in production — no API keys anywhere in the code. Chatting with the agent # Create a server-side conversation (persists history across turns) conversation = openai.conversations.create() # Each turn sends the user message; the agent sees full history response = openai.responses.create( input=user_input, conversation=conversation.id, extra_body={"agent_reference": {"name": AGENT_NAME, "type": "agent_reference"}}, ) print(response.output_text) The conversation object is server-side. You pass its ID on every turn; the history lives in Foundry, not in a local list. This is the Responses API pattern — distinct from the older Completions or Chat Completions APIs. Demo 1: Function Tools and the Tool-Calling Loop Demo 1 adds function calling against a real weather API. The key insight here is that the model does not execute the function — it requests the execution, and your code executes it locally, then feeds the result back. Declaring a function tool from azure.ai.projects.models import FunctionTool, PromptAgentDefinition func_tool = FunctionTool( name="get_weather", description="Get the current weather for a given city.", parameters={ "type": "object", "properties": {"city": {"type": "string", "description": "City name"}}, "required": ["city"], }, strict=True, ) agent = project.agents.create_version( agent_name=AGENT_NAME, definition=PromptAgentDefinition( model=MODEL_DEPLOYMENT, tools=[func_tool], instructions="You are a weather assistant...", ), ) The tool-calling loop response = openai.responses.create(input=user_input, conversation=conversation.id, ...) # Loop while the model is requesting tool calls while any(item.type == "function_call" for item in response.output): input_list = [] for item in response.output: if item.type == "function_call": args = json.loads(item.arguments) result = get_weather(args["city"]) # execute locally input_list.append(FunctionCallOutput(call_id=item.call_id, output=result)) # Send results back to the agent response = openai.responses.create(input=input_list, conversation=conversation.id, ...) print(response.output_text) The strict=True parameter on FunctionTool enforces structured outputs — the model must return arguments that match the declared JSON schema exactly. This eliminates argument parsing errors in production. Demo 2: UI Is Not Your Agent Demo 2 runs the exact same agent as Demo 1 but surfaces it in a Tkinter desktop window. The point is pedagogical: your agent definition, conversation management, and tool-calling logic are entirely independent of your UI layer. Swapping from terminal to desktop requires changing only the presentation code — nothing in the agent or conversation path changes. This is a principle worth internalising early: agent logic and UI logic should never be entangled. The lab enforces this separation structurally. Demo 3: Server-Side Built-In Tools The web search demo introduces a sharp contrast with Demo 1. With WebSearchTool , the tool-calling loop disappears entirely from client code: from azure.ai.projects.models import WebSearchTool agent = project.agents.create_version( agent_name="Search-Agent", definition=PromptAgentDefinition( model=MODEL_DEPLOYMENT, tools=[WebSearchTool()], instructions="You are a research assistant...", ), ) The agent decides when to search, executes the search server-side, and returns a grounded response with citations. Your client code looks identical to Demo 0 — a simple responses.create() call with no tool loop. The distinction matters architecturally: Function tools (Demo 1) — tool execution happens on your client; you control the code, the API call, the error handling. Built-in tools (Demo 3+) — tool execution happens inside Foundry; you get results without managing execution. Demo 4: Code Interpreter and the Gradio Web UI Demo 4 attaches CodeInterpreterTool , which gives the agent a sandboxed Python execution environment inside Foundry. The agent can write code, run it, observe output, and iterate — all server-side. Combined with a Gradio web interface, this demo shows an agent that can perform data analysis, generate charts, and explain results through a browser UI. Model Router is particularly interesting here: the empirical data shows it selects a more capable frontier model ( gpt-5.4-2026-03-05 ) for code-generation tasks, while simpler conversational turns stay on lighter models. Demo 5: Retrieval-Augmented Generation with FileSearchTool Demo 5 introduces RAG. The setup phase uploads a document, creates a vector store, and attaches it to the agent: # Upload document and create a vector store vector_store = openai.vector_stores.create(name="employee-handbook-store") with open("data/employee-handbook.md", "rb") as f: openai.vector_stores.files.upload_and_poll( vector_store_id=vector_store.id, file=f ) # Attach the vector store to the agent agent = project.agents.create_version( agent_name="RAG-Agent", definition=PromptAgentDefinition( model=MODEL_DEPLOYMENT, tools=[FileSearchTool(vector_store_ids=[vector_store.id])], instructions="Answer questions using only the provided documents...", ), ) At query time, the agent embeds the question, searches the vector store semantically, retrieves matching chunks, and generates an answer grounded in the retrieved content — entirely server-side. The client code remains a plain responses.create() call. An important detail: the .vector_store_id file is written to disk during setup and read back during the chat session, so the demo survives process restarts without re-uploading the document. The .gitignore excludes this file from source control. Demo 6: Model Context Protocol Demo 6 connects the agent to a GitHub MCP server, giving it access to repository and issue data via the open Model Context Protocol standard. MCP servers expose tools over a standardised wire protocol; the agent discovers and calls them without any client-side function declarations. The demo also demonstrates human-in-the-loop approval: before executing any MCP tool call, the agent surfaces the proposed action and waits for the user to confirm. This is an important safety pattern for agents that can trigger side effects on external systems. Demo 7: Toolbox — Centralised Tool Governance Where Demo 6 connects to a single MCP server directly, Demo 7 uses a Toolbox — a managed Microsoft Foundry resource that bundles multiple tools into a single, versioned, MCP-compatible endpoint. The Toolbox in this demo exposes both GitHub Issues and GitHub Repos tools, curated into an immutable versioned snapshot. This pattern is significant for production multi-agent systems: Centralised governance — one team owns the tool definitions; all agents consume them via a single endpoint. Versioned snapshots — promoting a new Toolbox version is explicit; agents pin to a version and upgrade intentionally. MCP compatibility — any MCP-capable agent or framework can connect, not just Foundry SDK agents. from azure.ai.projects.models import McpTool toolbox_tool = McpTool( server_label="toolbox", server_url=TOOLBOX_ENDPOINT, allowed_tools=[], # empty = all tools in the Toolbox version headers={"Authorization": f"Bearer {token}"}, ) Demo 8: Self-Hosted Agent with the Responses Protocol The final demo departs from the prompt-agent pattern. Instead of registering a declarative agent in Foundry, Demo 8 implements a custom agent server using the Responses protocol. The server exposes a streaming HTTP endpoint; Foundry's Agent Inspector can connect to it and route user turns to it just as it would to a hosted prompt agent. This demo includes a Dockerfile and an agent.yaml , enabling deployment to Foundry's container hosting service. It uses gpt-4.1-mini directly rather than the model router, because the custom server owns the entire inference path. When to consider this pattern: Your agent requires custom pre- or post-processing logic that cannot be expressed in a system prompt. You need to integrate with infrastructure that is not reachable through MCP or built-in tools. You want to own the inference call for cost control, A/B testing, or compliance reasons. You are building a multi-agent orchestrator that needs to expose itself as an agent to other orchestrators. Getting Started The lab requires Python 3.10 or higher, an Azure subscription with a Microsoft Foundry project, and the Azure CLI. 1. Clone and set up the virtual environment git clone https://github.com/microsoft-foundry/Foundry-Agent-Lab.git cd Foundry-Agent-Lab # Create and activate the virtual environment python -m venv .venv # Windows Command Prompt .venv\Scripts\activate.bat # Windows PowerShell .venv\Scripts\Activate.ps1 # macOS / Linux source .venv/bin/activate pip install -r requirements.txt 2. Configure a demo copy hello-demo\.env.sample hello-demo\.env # Edit hello-demo\.env and set PROJECT_ENDPOINT Your PROJECT_ENDPOINT is on the Overview page of your Foundry project in the Azure portal. It takes the form https://your-resource.ai.azure.com/api/projects/your-project . 3. Run the demo az login 0-hello-demo Each numbered batch file at the root activates the virtual environment, runs create_agent.py , and launches chat.py . Append log to capture the full session transcript: 0-hello-demo log Reset between runs hello-demo\reset.bat Every demo includes a reset.bat that deletes the registered agent and any associated resources (vector stores, uploaded files). Demos are fully repeatable. Architecture Principles Demonstrated Across the nine demos, the lab illustrates a set of design principles that apply directly to production agent systems: Keyless authentication throughout Every demo uses DefaultAzureCredential . No API keys appear anywhere in the code. Locally, az login provides credentials. In production, managed identity takes over automatically — same code, no secrets to rotate. Server-side conversation state The Responses API stores conversation history server-side. Your application passes a conversation ID; Foundry maintains the thread. This eliminates the common bug of truncating history due to local list management and makes multi-process or multi-instance deployments straightforward. Client-side vs server-side tool execution The lab makes the distinction explicit. Function tools execute in your process — you control the code, the external call, and the error handling. Built-in tools (WebSearch, CodeInterpreter, FileSearch) execute inside Foundry — you get results without managing execution infrastructure. MCP tools (Demo 6, 7) fall between these: they execute in a separately deployed server, with the protocol mediating the call. Progressive tool introduction Each demo's create_agent.py registers the agent once. The chat.py file handles the conversation loop. These two responsibilities are always separate, making it easy to update agent definitions without modifying conversation logic, and vice versa. Security Considerations When building agents for production, keep the following in mind: Never commit .env files. The .gitignore excludes them, but verify this before pushing. Use Azure Key Vault or environment variable injection in CI/CD pipelines. Use managed identity in production. DefaultAzureCredential automatically picks up managed identity when deployed to Azure, eliminating the need for any stored credentials. Apply human-in-the-loop for side-effecting tools. Demo 6 demonstrates this pattern for MCP tool calls. Any agent that can modify external state (create issues, send emails, write files) should surface proposed actions for confirmation. Validate tool outputs before use. Treat data returned by external tools (weather APIs, search results, document retrieval) as untrusted input. Prompt injection through tool results is a real attack surface; grounding instructions in your system prompt reduce but do not eliminate this risk. Scope Toolbox permissions narrowly. When using a Toolbox (Demo 7), use allowed_tools to restrict which tools the agent can call, rather than granting access to all tools in a Toolbox version. Key Takeaways Start with the minimum. A prompt agent with no tools requires fewer than 30 lines of code using the Foundry SDK. Add tools only when the use case demands them. Use model-router unless you have a specific reason not to. The empirical data in the lab shows the router selects appropriate models across all task types — factual, creative, tool-calling, RAG, and code generation. Understand the client/server tool boundary. Function tools give you control; built-in tools give you simplicity. MCP and Toolbox give you governance and interoperability. Choose based on where you need control and where you need scale. Conversation state belongs on the server. Do not maintain conversation history in application memory if you can avoid it. The Responses API conversation object is designed for this. The hosted-demo pattern is for when you need to own the inference path. For most use cases, a declarative prompt agent is sufficient and far simpler to operate. Next Steps Explore the repo: github.com/microsoft-foundry/Foundry-Agent-Lab Microsoft Foundry SDK documentation: learn.microsoft.com/azure/ai-studio/ Responses API quickstart: Prompt agent quickstart Model Router conceptual documentation: Model Router for Microsoft Foundry Model Context Protocol: modelcontextprotocol.io Azure Identity SDK (DefaultAzureCredential): azure-identity Python SDK The Foundry Agent Lab is open source under the MIT licence. Contributions, bug reports, and feature requests are welcome through GitHub Issues. See CONTRIBUTING.md for guidelines.Agents League: The Esports-Inspired Hackathon Where AI Agents Battle for Glory
Ready to put your AI skills to the ultimate test? Agents League is here, a dynamic, esports-inspired developer challenge that brings the thrill of live competition to the world of agentic AI. Whether you're a seasoned AI developer or just getting started, this is your chance to build, compete, and win. What is Agents League? Agents League is a week-long hackathon running as part of AI Skills Fest (June 4–14, 2026). Unlike traditional hackathons, Agents League combines live AI coding battles, asynchronous project submissions, and a thriving Discord community all competing for a total prize pool of $55,000 USD. This isn't just about building it's about showcasing what's possible with agentic AI in a format that's fast, competitive, and globally accessible. Three Challenge Tracks Pick One or Compete in All 1. Creative Apps Build innovative applications using GitHub Copilot for AI-assisted development. Show off your creativity and demonstrate how AI can accelerate app creation from concept to code. 2. Reasoning Agents Create intelligent agents using Microsoft Foundry that solve complex problems through multi-step reasoning. This track is all about building agents that can think, plan, and execute. 3. Enterprise Agents Build business-ready knowledge agents integrated with Microsoft 365 Copilot, authored in Copilot Studio. Perfect for developers focused on real-world enterprise solutions. Live Microsoft Reactor Events—Don't Miss the Battles! The heart of Agents League beats through live Microsoft Reactor events. Watch experts go head-to-head in live coding battles, learn cutting-edge techniques, and get inspired for your own submissions: Event What You'll Learn Creative Apps Battle See GitHub Copilot in action as experts build innovative apps live Reasoning Agents Battle Watch multi-step reasoning agents come to life with Microsoft Foundry Enterprise Agents Battle Learn to build M365-integrated agents with Copilot Studio 👉 View the full event series Key Dates Registration Deadline: June 12, 2026, 12:00 PM PT Hacking Period: June 4–14, 2026 Submission Deadline: June 14, 2026, 11:59 PM PT What You Get Live coding battles with expert demonstrations Curated technical experiences and on-demand content Learning resources on Microsoft Learn and AI Skills Navigator Community support through Discord GitHub-based submissions for transparent, collaborative judging Why Participate? Agents League isn't just another hackathon. It's designed as a streamlined, competitive format that: ✅ Fits into your schedule with focused, time-boxed challenges ✅ Provides real-world product innovation experience ✅ Offers global accessibility—participate from anywhere ✅ Demonstrates the latest capabilities of agentic AI, including new IQ tools ✅ Connects you with a passionate developer community Ready to Enter the Arena? Register Now for Agents League Before you register: Review the Hackathon Rules and Regulations for prize categories and judging criteria Join the Microsoft Reactor event series for live battles and learning Check out the Microsoft Event Code of Conduct Join the Conversation Have questions? Want to connect with fellow competitors? Join the Agents League community on Discord and start strategizing with developers from around the world. Whether you're building creative apps, reasoning agents, or enterprise solutions—the arena awaits. May the best agent win! 🏆 Agents League hackathon is open to the public and offered at no cost. Government employees should check with their employers to ensure participation is permitted in accordance with applicable policies. Related Links: Agents League Hackathon Registration Microsoft Reactor Series AI Skills FestJoin us on June 3rd for the MSLE AI Bootcamps - Copilot Chat Agents for Academics Special Session!
Hi TSP Community, We’re excited to invite you to a special session on June 3rd introducing the upcoming Microsoft Learn for Educators (MSLE) AI Copilot Chat Agents for Academics Bootcamp. This session will provide an early look at a new, scalable skilling experience designed specifically for higher education faculty and administrators—and how you can leverage it in your field engagements. What we’ll cover: Overview of the Copilot Agents for Academics Bootcamp and delivery model How faculty will be enabled to use, integrate, and build Copilot agents for teaching, learning, and operations A walkthrough of the curriculum progression (Agent Foundations → Integration & Autonomy) Examples of hands-on labs, live demos, and role-based scenarios The TSP opportunity: how this standardized, repeatable offering supports AI skilling conversations and drives customer value 📅 Date: June 3rd, 2026 at 7:00 AM PST | 7:30 PM IST 📣 Join Here: https://aka.ms/MSLEAgentsTSPSpecialSession Make sure to download the calendar invite attached! We hope you’ll join to learn how this offering can accelerate your engagements and expand AI skilling across your academic customers.Spec-Driven Development for AI-Enabled Enterprise Systems
Spec-Driven Development for AI-Enabled Enterprise Systems How to make specs the single source of truth for your React frontends, backend services, data, and AI agents. If you are building an enterprise system with a React frontend, backend APIs and services, a database layer, and shared libraries, moving to Spec-Driven Development (SDD) can feel like a big cultural shift. For AI developers and engineers, though, it is a gift: structured, machine-readable specifications are exactly what both humans and AI coding agents need to stay aligned and productive. This post walks through how to structure specs, version contracts, design workflows, and integrate AI agents in a way that scales. Along the way, it references Microsoft’s public guidance on microservices, APIs, DevOps, and architecture so you can go deeper where needed. 1. Structuring specifications for an enterprise system For a serious enterprise system, treat specs as layered and modular rather than a single monolithic document. A good mental model is Domain-Driven Design (DDD) and bounded contexts (see https://learn.microsoft.com/azure/architecture/microservices/model/domain-analysis Business and domain layer This layer is technology-agnostic and captures: Business capabilities and problem statements Domain language and key entities Business rules and workflows Non-functional requirements (performance, security, compliance, SLAs) Solution and architecture layer Here you define how the system is shaped: System context and C4-style diagrams Service boundaries and ownership Integration patterns and event flows Data ownership and high-level models Microsoft’s microservices guidance is a solid reference: https://learn.microsoft.com/azure/architecture/microservices/. Implementation-oriented specs per component For each concrete component, keep a focused spec: Frontend / UI (React): screen catalogue, UX flows, state contracts, API dependencies, validation rules, accessibility and performance requirements. APIs / services: OpenAPI or AsyncAPI contracts, error models, authentication and authorisation, rate limits, SLAs, observability requirements. Database / schema: logical data model, ownership per service, migration strategy, retention, indexing, partitioning. Shared libraries: responsibilities, versioning policy, supported runtimes, compatibility matrix. Integrations: protocols, payloads, sequencing, idempotency, retry and backoff, SLAs, failure modes. In practice, this usually means: One “master” business and architecture spec per domain or product Separate specs per service or module (frontend app, each backend service, shared library, integration) Everything linked via IDs (for example REQ-123, SVC-ORDER-001) so you can trace from requirement to spec, implementation, and tests 2. Templates and standards that scale To keep things consistent across teams, use a base template that all components share, then extend it with technology-specific sections. This works well for both human readers and AI agents consuming the specs. Base specification template Every spec, regardless of component type, should include: Purpose and scope Stakeholders and dependencies Requirements mapping (list of requirement IDs covered) Architecture and interaction overview Contracts (APIs, events, data) Non-functional requirements Risks and open questions Test and acceptance criteria Extended templates per component Frontend: UX flows, wireframes or Figma links, accessibility, performance budgets, offline behaviour, error states. API / service: OpenAPI or AsyncAPI link, auth and authorisation, throttling, logging and metrics, health endpoints. See logging and monitoring guidance at https://learn.microsoft.com/azure/architecture/microservices/logging-monitoring Database: schema definition, migration plan, backup and restore, data lifecycle, multi-tenant strategy. Integration: sequence diagrams, error handling, retry and idempotency, message contracts, security. 3. Contracts, versioning, and change management API contracts For SDD, API contracts are first-class citizens. Define them via OpenAPI or AsyncAPI and treat the spec as the source of truth. Use contract testing to keep providers and consumers aligned, and version APIs explicitly (for example v1, v2) rather than breaking changes in place. Microsoft’s API design guidance is a good starting point: https://learn.microsoft.com/azure/architecture/best-practices/api-design and Azure API Management at https://learn.microsoft.com/azure/api-management/. Database migrations Any spec change that affects data should include a migration plan. Use migration tooling such as EF Core migrations, Flyway, or Liquibase, and treat migration scripts as code. Document backward-compatibility windows so APIs can support both old and new fields for a defined period. Shared DTOs and models Prefer sharing contracts (OpenAPI, JSON Schema) over large shared code libraries. If you must share code, version the shared library independently and document compatibility (for example, “Service A supports SharedLib 2.x”). Keep DTOs at the edges and map to internal domain models inside each service. Cross-service dependencies Capture dependencies explicitly in specs, such as “Order Service depends on Customer v1.3+ for endpoint /customers/{id}”. Use consumer-driven contracts and CI checks to prevent breaking changes. For event-driven systems, document event contracts and evolution rules. See event-driven architecture guidance at https://learn.microsoft.com/azure/architecture/reference-architectures/event-driven/event-driven-architecture-overview. Spec versioning and change management Version specs semantically (for example OrderServiceSpec v1.2.0) and record what changed, why, impact, and migration steps. Link spec versions to releases or tags in Git and to work items in Azure DevOps or GitHub Issues. Azure Boards is useful here: https://learn.microsoft.com/azure/devops/boards/?view=azure-devops. 4. A mature Spec-Driven Development workflow A realistic SDD workflow for AI-enabled teams might look like this: Discovery and domain analysis: capture business capabilities, domain language, and high-level workflows. Business and architecture specs: define bounded contexts, service boundaries, integration patterns, and NFRs. Contract design: design API specs (OpenAPI or AsyncAPI), event schemas, data models, and validation rules. Task generation: derive work items from specs, such as “Implement endpoint X”, “Add migration Y”, “Add UI flow Z”. This is a great place to use AI agents to read specs and generate tasks. Implementation: code is generated or written to satisfy the spec; the spec remains the reference, not the code. Validation and testing: contract tests, unit tests, integration tests, and end-to-end tests all trace back to spec IDs. Use quality gates in CI and CD, as described in Https://learn.microsoft.com/azure/architecture/framework/devops/devops-quality Review and sign-off: architecture and product review against the spec; update the spec if reality diverges. Release and observability: dashboards and alerts tied to specified SLIs and SLOs. 5. Governance, traceability, and avoiding drift Traceability across the lifecycle Use IDs everywhere: requirements, spec sections, tasks, tests, and deployment artefacts. In Azure DevOps or GitHub, link: Requirement (for example Azure DevOps Feature) Spec (stored in the repo) User stories and tasks Pull requests Tests Releases For key decisions, adopt Architecture Decision Records (ADRs). Microsoft’s guidance on ADRs is here: Https://learn.microsoft.com/azure/architecture/framework/devops/adrs Keeping humans and AI agents aligned To avoid implementation drift: Make specs as machine-readable as possible (OpenAPI, JSON Schema, YAML, BPMN). Enforce spec checks in CI: API implementation must match OpenAPI, DB schema must match migration plan, generated clients must be up to date. For AI coding agents, always provide the relevant spec files as context and constrain them to files linked to specific spec IDs. Add automated checks that compare generated code to contracts and fail builds when they diverge. 6. Enterprise best practices for repos and governance Example repository structure /docs /business /architecture /decisions (ADRs) /specs /frontend /services /orders /customers /integrations /data /src /frontend /services /shared /tests /ops /pipelines /infra-as-code Governance practices An architecture review group that reviews spec changes, not just code changes. Definition of Done includes: spec updated, tests linked, contracts validated. Regular “spec health” reviews to identify what is out of date or drifting. For broader architectural guidance, see: Azure microservices and DDD: https://learn.microsoft.com/azure/architecture/microservices/ Cloud design patterns: https://learn.microsoft.com/azure/architecture/patterns/ Azure Well-Architected Framework: https://learn.microsoft.com/azure/well-architected/ 7. Integrating AI and agentic workflows into SDD Spec-Driven Development is a natural fit for AI and multi-agent systems because specs provide structured, reliable context. Here are some practical patterns. LangGraph and multi-agent orchestration using Microsoft Agent Framework You can design a graph where: A “spec agent” reads and validates specs. An “implementation agent” writes or updates code based on those specs. A “test agent” generates tests from contracts and acceptance criteria. The graph flow can mirror your SDD workflow: Spec → Contract → Code → Tests → Review, with each agent responsible for a stage. MCP (Model Context Protocol) Expose your spec repository, OpenAPI definitions, and ADRs as MCP tools so agents can query the true source of truth instead of hallucinating. For example, provide a tool that returns the OpenAPI for a given service and version, or a tool that returns the ADRs relevant to a particular domain. Learn more about MCP at https://aka.ms/mcp-for-beginners BPMN and process flows Store BPMN diagrams as part of the spec. Agents can read them to generate workflow code, state machines, or tests. For process-oriented integrations, see Azure Logic Apps guidance at https://learn.microsoft.com/azure/logic-apps/. CI/CD pipelines on Azure In your pipelines, validate that implementation matches the spec: Contract tests for APIs and events Schema checks for databases Linting and static analysis for spec conformance Use pipeline gates to block deployments if contracts or migrations are out of sync. Azure Pipelines https://learn.microsoft.com/azure/devops/pipelines/?view=azure-devops GitHub Agentic Workflow Patterns https://github.github.com/gh-aw/ Where to start The key is not to boil the ocean. Pick one domain, such as “Orders”, and design a thin but end-to-end SDD flow: spec → contract → tasks → code → tests. Run it with your AI agents in the loop, learn where the friction is, and iterate. Once that feels natural, you can roll the patterns out across the rest of your system. For AI developers and engineers, SDD is more than process hygiene. It is how you give your agents high-quality, unambiguous context so they can generate code, tests, and documentation that actually match what the business needs. `269Views1like0CommentsFrom Prompt to Production: Open in VS Code for Terraform in Azure Copilot
We’re excited to introduce a new step in the Terraform on Azure experience: Open in VS Code, now available directly from Azure Copilot in the Azure Portal. This capability helps you move seamlessly from AI‑generated Terraform code to real Azure deployments - within a connected, guided workflow designed for enterprise scenarios. Why This Matters Infrastructure as Code with Terraform is powerful, but moving from generated configuration to a deployed environment typically involves multiple tools and handoffs. Teams need to understand Terraform state, work with remote backends, and integrate their code into version‑controlled CI/CD pipelines - often backed by Terraform Cloud or Azure‑native backends in enterprise environments. Open in VS Code brings these steps together. It bridges the gap between AI‑assisted authoring in the Azure Portal and the real‑world workflows required to validate, manage state, and deploy infrastructure with confidence. Continue Your Workflow in VS Code With Azure Copilot, you can describe your infrastructure in natural language and generate Terraform configurations in seconds. For example: “Create an Azure Container App using Terraform with a managed environment, Log Analytics enabled, and a system‑assigned managed identity to securely pull images from Azure Container Registry.” Copilot generates the Terraform configuration for you. From there, you can select Open full view to enter a full‑screen Terraform editor, and then choose Open in VS Code to launch the configuration in an Azure‑hosted VS Code environment. There’s no need to download files or set up a local development environment. VS Code for the web opens with Azure authentication already configured, along with commonly used extensions, so you can immediately focus on refining, validating, and preparing your infrastructure for deployment. Built‑in Guidance for Real Deployments Beyond editing, the VS Code experience includes built‑in, step‑by‑step guidance to help you deploy your Terraform configuration into your own Azure environment - whether you’re experimenting or preparing for production. Because Terraform relies on state management, the workflow starts by helping you choose and configure a backend. Backend Options Option 1: Azure Storage Account as a remote backend A natural fit for Azure‑native and enterprise environments. The experience guides you through creating or selecting a storage account and configuring Terraform to store state securely in Azure. Option 2: HCP Terraform (Terraform Cloud) as a remote backend Ideal for teams already using Terraform Cloud. The guided flow helps you authenticate, connect to an existing organization and workspace, and generate the required backend configuration directly into your Terraform files. Option 3: Temporary workspace for quick validation Designed for learning and experimentation. You can run terraform plan and terraform apply directly in the Azure workspace with temporary state, without committing to a long‑term backend - ideal for quick validation, but not intended for production use. Each option includes an end‑to‑end walkthrough, so you can complete backend setup and run Terraform commands without leaving the VS Code environment or searching through external documentation. Connecting Code, State, and Deployment This experience connects three essential parts of the Terraform workflow: AI‑assisted code generation in Azure Portal Copilot Interactive editing and guided execution in VS Code for the web Flexible backend options for managing Terraform state Together, these pieces make it easier to move from idea to infrastructure in a structured, supported way—whether you’re new to Terraform or managing production workloads with established CI/CD pipelines. Available Now - and What’s Next The Open in VS Code experience for Terraform is now public preview in Azure Portal Copilot. We’re continuing to invest in this workflow, including clearer deployment guidance, future integration with GitHub Actions and other CI/CD pipelines, and deeper enhancements to the full‑screen Terraform editor experience. If you haven’t tried it yet, generate a Terraform configuration with Azure Copilot and open it in VS Code to go from prompt to production end to end in one connected workflow.817Views0likes0CommentsTurning GitHub Copilot into a “Best Practices Coach” with Copilot Spaces + a Markdown Knowledge Base
Why Copilot Spaces + Markdown repos work so well When you ask Copilot generic questions (“How should we log errors?” “What’s our API versioning approach?”), the model will often respond with reasonable defaults. But reasonable defaults are not the same as your standards. Copilot Spaces solve the context problem by allowing you to attach a curated set of sources (files, folders, repos, PRs/issues, uploads, free text) plus explicit instructions—so Copilot answers in the context of your team’s rules and artifacts. Spaces can be shared with your team and stay updated as the underlying GitHub content changes—so your “best practices coach” stays evergreen. The architecture (high level) Here’s the mental model: Engineering Knowledge Base Repo: A dedicated repo containing your standards as Markdown (coding style, architecture decisions, security rules, testing conventions, examples, templates). Copilot Space: “Engineering Standards Coach”: A Space that attaches the knowledge base repo (or key folders/files within it), optionally your main application repo(s), and a short set of “rules of engagement” (instructions). In-repo reinforcement (optional but powerful): Custom instruction files (repo-wide + path-specific) and prompt files (slash commands) inside your production repos to standardize behavior and workflows. Step 1 Create a Knowledge Base repo (Markdown-first) Create a repo such as: engineering-knowledge-base platform-playbook org-standards A practical starter structure: engineering-knowledge-base/ README.md standards/ coding-style.md logging.md error-handling.md performance.md security/ secure-coding.md secrets.md threat-modeling.md architecture/ overview.md adr/ 0001-service-boundaries.md 0002-api-versioning.md testing/ unit-testing.md integration-testing.md contract-testing.md templates/ pr-review-checklist.md api-design-checklist.md definition-of-done.md Tip: Keep these docs opinionated, concrete, and example-heavy—Copilot works best when it can point to specific patterns rather than abstract principles. Step 2 Create a Copilot Space and attach your sources Create a space, name it, choose an owner (personal or organization), then add sources and instructions. Inside the Space, add two types of context: Instructions (how Copilot should behave) and Sources (your actual code and docs). 2.1 Instructions (how Copilot should behave in this Space) Example instructions you can paste: You are the Engineering Standards Coach for this organization. Goals: - Recommend solutions that follow our standards in the attached knowledge base. - When proposing code, align with our logging, error-handling, security, and testing guidelines. - When uncertain, ask for the missing repo context or point to the exact standard that applies. Output format: - Start with the standard(s) you are applying (with a link or filename reference). - Then provide the recommended implementation. - Include a lightweight checklist for reviewers. 2.2 Sources (your real “knowledge base”) Attach: The knowledge base repo (or just the folders that matter) Your main code repo(s) (or select folders) PR checklist and Definition of Done templates Key architecture docs, runbooks, or troubleshooting guides Step 3 (Optional) Add instruction files to your production repos Spaces are excellent for curated context and team-wide “ask me anything about our standards.” But you can reinforce consistency directly inside each repo by adding custom instruction files. 3.1 Repo-wide instructions (.github/copilot-instructions.md) Create: your-app-repo/.github/copilot-instructions.md # Repository Copilot Instructions ## Tech stack - Language: TypeScript (strict) - Framework: Node.js + Express - Testing: Jest - Lint/format: ESLint + Prettier ## Engineering rules - Use structured logging as defined in /docs/logging.md - Never log secrets or raw tokens - Prefer small, composable functions - All new endpoints must include: input validation, authz checks, unit tests, and consistent error handling ## Build & test - Install: npm ci - Test: npm test - Lint: npm run lint 3.2 Path-specific instructions (.github/instructions/*.instructions.md) Create: your-app-repo/.github/instructions/security.instructions.md --- applyTo: "**/*.ts" --- # Security rules (TypeScript) - Never introduce dynamic SQL construction; use parameterized queries only. - Any new external HTTP call must enforce timeouts and retry policy. - Any auth logic must include negative tests. Step 4 (Optional) Turn your best practices into “slash commands” with prompt files To standardize repeatable workflows like code review, test scaffolding, or endpoint scaffolding, create prompt files (slash commands) as .prompt.md files—commonly in .github/prompts/. Engineers invoke them manually in chat by typing /. Create: your-app-repo/.github/prompts/standards-code-review.prompt.md --- description: Review code against our org standards (security, perf, style, tests) --- You are a senior engineer performing a standards-based review. Use these checks: 1) Security: input validation, authz, secrets, injection risks 2) Reliability: error handling, retries/timeouts, edge cases 3) Observability: structured logs, metrics, tracing where relevant 4) Testing: required coverage, negative tests, naming conventions 5) Style: follow repository rules in .github/copilot-instructions.md Output: - Summary (2-3 lines) - Issues (severity: BLOCKER/REQUIRED/SUGGESTION) - Suggested patch snippets (only where confident) - “Ready to merge?” verdict Now any engineer can type /standards-code-review and get the same structured output every time, without rewriting the prompt. How teams actually use this day-to-day Recipe A Onboarding a new engineer Ask inside the Space: “Summarize our service architecture and coding conventions for onboarding. Link the key docs.” Recipe B Writing a feature with best-practice guardrails Ask in the Space: “We’re adding endpoint X. Generate a plan that follows our API versioning ADR and error-handling standard.” Recipe C Enforcing review standards consistently In the repo, run the prompt file: /standards-code-review. Governance and best practices (what to do / what to avoid) Keep Spaces purpose-built. Avoid dumping an entire org into one Space if your goal is consistent, grounded output. Prefer linking the “golden source.” Keep standards in a single repo and update them via PR—treat it like code. Make instructions short but strict. Detailed rules belong in your Markdown standards. Avoid conflicting instruction files. If instructions contradict each other, results can be inconsistent. References (official docs for further reading) About GitHub Copilot Spaces: https://docs.github.com/en/copilot/concepts/context/spaces Creating GitHub Copilot Spaces: https://docs.github.com/en/copilot/how-tos/provide-context/use-copilot-spaces/create-copilot-spaces Adding custom instructions for GitHub Copilot: https://docs.github.com/en/copilot/how-tos/copilot-cli/customize-copilot/add-custom-instructions Use custom instructions in VS Code: https://code.visualstudio.com/docs/copilot/customization/custom-instructions Use prompt files in VS Code: https://code.visualstudio.com/docs/copilot/customization/prompt-files Closing: the “best practices” flywheel Once you implement this pattern, you get a virtuous cycle: teams encode standards as Markdown; Copilot Spaces ground answers in those standards; prompt files and instruction files standardize execution; and code reviews shift from style policing to design and correctness.Moving Beyond Prompts: A Practical Introduction to Spec-Driven Development
In the last year, many of us have started writing code differently. We describe what we want, let AI generate an answer, review it, tweak the prompt, and try again. This loop—prompt, retry, adjust—has quietly become part of our daily workflow. At first, it feels incredibly productive. But as the complexity of the task increases, something changes. The iteration cycle becomes longer, outputs become inconsistent, and the effort shifts from solving the problem to refining the prompt. This is where a subtle but important shift in approach can help: moving from prompt-driven development to spec-driven development. The Problem: Prompt → Retry → Guess Most AI-assisted workflows today look something like this: Write a prompt describing the task Review the generated output Adjust the prompt Repeat until it looks acceptable In practice, this often simplifies to: Prompt → Retry → Guess Figure: Prompt-driven vs spec-driven workflow comparison For simple tasks, this works well. But for anything involving multiple inputs, constraints, or edge cases, the process can become unpredictable. In my experience, the challenge is not the model—it is the lack of structure in how we describe the problem. A Shift in Thinking: From Prompts to Specifications Instead of asking AI to “figure it out,” spec-driven development introduces a simple idea: Define the problem clearly before asking for a solution. A specification (spec) is not a long document—it is a structured way of describing: Inputs Outputs Constraints Edge cases When this structure is provided upfront, the interaction changes significantly. Rather than iterating on vague prompts, you are guiding the system with a clear contract. What This Looks Like in Practice Let’s take a simple example: an order summary API (for example, a backend service hosted on Azure App Service). Without a Spec (Typical Prompt) “Write an API that returns order details for a user.” A model can generate something reasonable, but in practice, the responses often vary: Field names may be inconsistent Pagination may be missing Edge cases (no orders, large datasets) may not be handled Structure may change across iterations Example response (typical output): { "userId": 123, "orders": [ { "id": 1, "amount": 250 } ] } With a Spec (Structured Input) Now consider providing a simple specification: Specification: Input: userId page pageSize Output: userId orders[] orderId totalAmount orderDate pagination page pageSize totalRecords Constraints: Default pageSize = 10 Return empty list if no orders Handle large datasets efficiently Example response (based on the spec): { "userId": 123, "orders": [ { "orderId": 1, "totalAmount": 250, "orderDate": "2024-01-10" } ], "pagination": { "page": 1, "pageSize": 10, "totalRecords": 50 } } Why This Tends to Work The difference here is not just stylistic—it is structural. An unstructured prompt leaves room for interpretation. A spec reduces ambiguity by defining expectations explicitly. In practice, I have observed that providing structured inputs like this often leads to the following: More consistent field naming Better handling of edge cases Reduced need for repeated prompt refinement Rather than relying on trial-and-error, the interaction becomes more predictable and aligned with expectations. Applying This to Existing Code (Refactor Scenario) This approach becomes even more useful when applied to existing code. Instead of asking: “Fix the bug in the Auth controller” You can define expected behavior: Input validation rules Response formats Error handling Authorization behavior The task then becomes aligning the implementation with the defined spec. This shifts the interaction from guesswork to validation—comparing current behavior with intended behavior. Example Comparison (Auth Scenario) Without Spec (Typical Prompt) “Fix the login issue in Auth controller” Possible outcomes include: Partial validation added Inconsistent error responses No clear handling of repeated failed attempts With Spec (Defined Behavior) Spec defines: Validate username and password Return consistent error responses Lock account after 5 failed attempts Do not expose internal errors Resulting behavior: Input validation is consistently applied Error responses follow a defined structure Edge cases like account lockout are handled explicitly This mirrors the same pattern seen in the API example—moving from ambiguity to clearly defined behavior. A Practical Way to Start You do not need new tools or frameworks to try this. A simple workflow that has worked well in practice: Ask – Describe the problem (prompt, discussion, or notes) Write a spec – Define inputs, outputs, constraints Refine – Remove ambiguity Generate – Use the spec as input Validate – Compare output with the spec This adds a small upfront step, but it often reduces back-and-forth iterations later. The Practical Challenge One important point to note: Writing a good spec requires understanding the problem. Spec-driven development does not eliminate complexity—it surfaces it earlier. In many cases, the hardest part is not writing code, but clearly defining: What the system should do What it should not do How it should behave under edge conditions This is also why specs evolve over time. They do not need to be perfect upfront. They improve as your understanding improves. Where This Approach Helps From what I have seen, this approach is most useful in scenarios where the problem involves multiple inputs, defined contracts, or structured outputs such as APIs, schema-driven systems, or refactoring existing code where consistency matters. Where It May Not Be Necessary For simpler tasks such as small scripts, minor UI changes, or quick experiments, a detailed specification may not add much value. In those cases, a straightforward prompt is often sufficient. A Note on Tools Tools like GitHub Copilot, Azure AI Studio, and AI-assisted workflows in Visual Studio Code tend to be more effective when given clear, structured inputs. Spec-driven development is not tied to any specific tool. It is a way of thinking about how we interact with these systems more effectively. References https://github.com/features/copilot https://platform.openai.com/docs/guides/prompt-engineering https://github.com/github/spec-kit Amplifier - Modular AI Agent Framework - Amplifier Final Thoughts Many discussions around AI-assisted development focus on what tools can do. This approach focuses on something slightly different: How developers can structure problems more effectively before implementation. In my experience, moving from prompts to specs does not eliminate iteration, but it makes that iteration more predictable and purposeful.High Availability Testing for Azure Kubernetes Service in a Single Region with Availability Zones
Perform repeatable high availability testing for Azure Kubernetes Service workloads deployed in a single Azure region with Availability Zones. Validate pod recovery, node disruptions, and workload resiliency during infrastructure failures.