Building Reliable AI Coding Workflows Using Modular AI Agent Optimization
Artificial Intelligence is rapidly transforming the modern software development industry. AI-powered coding assistants such as GitHub Copilot, Claude Code, and other Large Language Model (LLM)-based systems are helping developers automate repetitive coding tasks, improve productivity, and accelerate software development processes. These tools can generate code, assist with debugging, provide recommendations, and support developers during implementation. However, despite their growing capabilities, many AI coding assistants still face challenges related to reliability, maintainability, project-specific conventions, and structured software engineering workflows. Most coding assistants perform well for generic programming tasks but often struggle when working with domain-specific development requirements, API integrations, project architectures, validation workflows, and coding standards. In real-world software engineering environments, developers require systems that not only generate code but also follow project conventions, maintain readability, support modular development, and improve long-term maintainability. The project “AI Agents Optimization” focuses on improving the reliability and effectiveness of AI coding agents by designing structured workflows, modular configurations, validation mechanisms, and optimized task execution strategies. The objective of the project is to investigate how AI agents can become dependable collaborators in practical software engineering tasks instead of functioning only as autocomplete systems. The project explores different approaches for organizing AI agent workflows using structured instruction handling, modular task division, context management, validation systems, and integration of external tools and documentation sources. Different agent configurations are analyzed and evaluated to understand how workflow optimization affects software development quality and performance. 
Why Existing AI Coding Workflows Often Fail

Most AI coding assistants perform well for isolated coding tasks but struggle in real-world engineering environments where projects involve multiple files, coding standards, APIs, validation requirements, and contextual dependencies. For example, a generic prompt such as “Build authentication middleware” may generate functional code, but the output often lacks:

- Project-specific architecture
- Error handling consistency
- Validation logic
- Security best practices
- Dependency awareness

This project approaches the problem differently by introducing a structured workflow pipeline where AI agents operate in defined stages rather than generating outputs in a single step. The workflow separates planning, generation, validation, and refinement into independent modules. This improves maintainability, reduces inconsistent outputs, and supports iterative refinement similar to real software engineering workflows.

Project Objectives

The primary objective of this project is to optimize AI coding agents for real-world software engineering workflows. The project aims to improve how AI systems handle development tasks such as code generation, debugging, testing, validation, feature implementation, and workflow management.

Another major objective is to design modular AI workflows where different stages of software development are managed systematically. The workflow focuses on task planning, instruction processing, validation, refinement, and output evaluation. This structured approach improves transparency, maintainability, and consistency in AI-generated outputs.

The project also aims to evaluate how AI coding agents perform under different configurations and development scenarios. By testing multiple workflows and structured instruction methods, the project analyzes how optimization techniques improve development reliability and coding quality.
Technologies and Tools Used

The project utilizes multiple modern technologies and development tools for experimentation and workflow optimization.

Technology / Tool      Purpose
Python                 Automation and scripting
GitHub Copilot         AI-assisted coding
Claude / LLM APIs      AI workflow experimentation
Visual Studio Code     Development environment
Git & GitHub           Version control and repository management
Structured Prompting   Workflow optimization
MCP Concepts           Tool and context integration

These tools collectively support the implementation and testing of optimized AI coding workflows.

Implementation Workflow

The system was implemented using a modular AI workflow pipeline where each stage performs a dedicated engineering task.

Step 1 — Task Parsing

The user submits a development task or coding requirement. The Instruction Processing Module extracts:

- Objective
- Constraints
- Project context
- Expected output format

Example structured prompt:

    Task: Create JWT authentication middleware
    Language: Node.js
    Constraints:
    - Use Express.js
    - Add token validation
    - Follow modular architecture
    - Include error handling

Step 2 — Planning & Reasoning

The Planning Module divides the task into subtasks such as:

- Route handling
- Token verification
- Error management
- Security validation

This improves reasoning consistency before generation begins.

Step 3 — Code Generation

The Code Generation Module produces outputs using structured prompts and contextual references instead of generic instructions.

Step 4 — Validation

Generated outputs are validated using:

- Syntax checks
- Logical consistency checks
- Formatting standards
- Dependency validation

Step 5 — Refinement

If validation fails, the workflow loops back into refinement where issues are corrected before final delivery.

System Workflow

The workflow of the AI Agents Optimization system is based on modular task execution and structured development processes. The workflow begins with task planning and requirement analysis.
The AI agent receives structured instructions along with coding constraints, project context, and validation requirements. The system processes the provided instructions and generates outputs according to defined workflows and development standards. Different configurations are tested to evaluate how instruction structures and modular task handling influence the quality of generated code.

The workflow also includes validation and refinement stages where generated outputs are analyzed for correctness, maintainability, and consistency. The project focuses not only on code generation but also on improving readability, workflow transparency, debugging support, and adherence to project conventions.

Key Features of the Project

- Structured AI workflow design
- Modular task execution
- AI-assisted software development
- Workflow optimization strategies
- Validation and refinement mechanisms
- Integration of development tools and documentation
- Improved maintainability and readability
- Support for practical software engineering workflows

Challenges Faced During Development

One of the major challenges encountered during the project was maintaining consistency and reliability in AI-generated outputs. Different AI models often produce different responses depending on prompts, context, and task structure. Designing workflows that improve output stability and maintain coding standards required careful experimentation and optimization.

Another challenge involved integrating structured workflows while ensuring flexibility in task execution. AI systems often require clear instructions and contextual information to produce accurate outputs. Balancing automation with maintainability and project-specific requirements was an important aspect of the project.

Managing validation and refinement processes was also challenging because generated outputs needed to be evaluated not only for correctness but also for readability, maintainability, and software engineering best practices.
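The generate, validate, and refine stages described in the implementation workflow can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the project's actual implementation: the names `Task`, `validate`, `run_pipeline`, and `stub_generate` are hypothetical, the generator is a stub standing in for a real LLM call, and the generated output is Python (rather than the Node.js example in the post) so that the standard library `ast` module can perform the syntax check.

```python
import ast
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Task:
    """Hypothetical structured task: an objective plus explicit constraints."""
    objective: str
    constraints: List[str] = field(default_factory=list)


def validate(code: str) -> List[str]:
    """Return a list of problems; an empty list means the output passed."""
    problems = []
    try:
        ast.parse(code)  # syntax check only, the code is never executed
    except SyntaxError as exc:
        problems.append(f"syntax: {exc.msg}")
    if "\t" in code:  # stand-in for a real formatting standard
        problems.append("formatting: tab characters found")
    return problems


def run_pipeline(
    task: Task,
    generate: Callable[[Task, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Generate, validate, and loop back into refinement until checks pass."""
    feedback: List[str] = []
    for round_number in range(1, max_rounds + 1):
        code = generate(task, feedback)  # feedback is empty on the first round
        feedback = validate(code)
        if not feedback:
            return code, round_number
    raise RuntimeError(f"still failing after {max_rounds} rounds: {feedback}")


def stub_generate(task: Task, feedback: List[str]) -> str:
    """Stub for an LLM call: broken first draft, corrected draft on feedback."""
    if not feedback:
        return "def check_token(token):\n    return token ==\n"
    return "def check_token(token):\n    return bool(token)\n"


task = Task("Create a token check", ["Include error handling"])
final_code, rounds = run_pipeline(task, stub_generate)
print(f"accepted after {rounds} round(s)")
```

Keeping validation as a plain function that returns a list of problems makes the refinement step easy to drive: the same list that fails the check is what gets passed back to the generator as feedback.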
Observations and Outcomes During experimentation, structured workflows produced more reliable and maintainable outputs compared to single-prompt generation approaches. Some important observations included: Reduced repetitive corrections during code refinement Improved consistency in generated outputs Better adherence to coding structure and formatting More stable workflow behavior for multi-step tasks Improved readability and maintainability of generated code The validation and refinement stages were particularly effective in reducing incomplete outputs and improving response quality. Although the project focuses primarily on workflow architecture and qualitative analysis rather than benchmark testing, the results demonstrate that modular AI pipelines can significantly improve practical software engineering workflows. Future Enhancements The project can be further enhanced by implementing advanced multi-agent collaboration systems where multiple AI agents work together on complex software development tasks. Future versions may also include real-time documentation integration, automated testing frameworks, cloud-based workflow management, and improved reasoning models. Additional enhancements may include IDE extensions, intelligent debugging systems, automated code review mechanisms, and adaptive workflow optimization based on project requirements. Conclusion The AI Agents Optimization project demonstrates how structured workflows and modular configurations can improve the effectiveness of AI-powered coding assistants in modern software engineering environments. By focusing on workflow optimization, validation mechanisms, modular task execution, and structured instruction handling, the project highlights the future potential of AI agents as reliable development collaborators capable of supporting real-world software engineering processes. 
The project represents an important step toward building dependable AI-assisted development systems that improve productivity, maintainability, and software quality while supporting modern engineering practices.

How to Try This Workflow

- Define a structured development task
- Provide project constraints and context
- Break the task into subtasks
- Generate output using structured prompts
- Validate output quality
- Refine based on validation feedback

Model Mondays S2E13: Open Source Models (Hugging Face)
1. Weekly Highlights

Here are the key updates we covered in the Season 2 finale:

- O1 Mini Reinforcement Fine-Tuning (GA): Fine-tune models with as few as ~100 samples using built-in Python code graders.
- Azure Live Interpreter API (Preview): Real-time speech-to-speech translation supporting 76 input languages and 143 locales with near human-level latency.
- Agent Factory – Part 5: Connecting agents using open standards like MCP (Model Context Protocol) and A2A (Agent-to-Agent protocol).
- Ask Ralph by Ralph Lauren: A retail example of agentic AI for conversational styling assistance, built on Azure OpenAI and Foundry’s agentic toolset.
- VS Code August Release: Brings auto-model selection, stronger safety guards for sensitive edits, and improved agent workflows through new agents.md support.

2. Spotlight – Open Source Models in Azure AI Foundry

Guest: Jeff Boudier, VP of Product at Hugging Face

Jeff showcased the deep integration between the Hugging Face community and Azure AI Foundry, where developers can access over 10,000 open-source models across multiple modalities: LLMs, speech recognition, computer vision, and even specialized domains like protein modeling and robotics.

Demo Highlights

- Discover models through Azure AI Foundry’s task-based catalog filters.
- Deploy directly from Hugging Face Hub to Azure with one-click deployment.
- Explore use cases such as multilingual speech recognition and vision-language-action models for robotics.

Jeff also highlighted notable models, including:

- SmolLM3: a 3B-parameter model with hybrid reasoning capabilities
- Qwen 3 Coder: a mixture-of-experts model optimized for coding tasks
- Parakeet ASR: multilingual speech recognition
- Microsoft Research protein-modeling collection
- MAGMA: a vision-language-action model for robotics

Integration extends beyond deployment to programmatic access through the Azure CLI and Python SDKs, plus local development via new VS Code extensions.

3.
Customer Story – DraftWise (BUILD 2025 Segment)

The finale featured a customer spotlight on DraftWise, where CEO James Ding shared how the company accelerates contract drafting with Azure AI Foundry.

Problem: Legal contract drafting is time-consuming and error-prone.

Solution: DraftWise uses Azure AI Foundry to fine-tune Hugging Face language models on legal data, generating contract drafts and redline suggestions.

Impact:

- Faster drafting cycles and higher consistency
- Easy model management and deployment with Foundry’s secure workflows
- Transparent evaluation for legal compliance

4. Community Story – Hugging Face & Microsoft

The episode also celebrated the ongoing collaboration between Hugging Face and Microsoft and the impact of open-source AI on the global developer ecosystem.

Community Benefits

- Access to State-of-the-Art Models without licensing barriers
- Transparent Performance through public leaderboards and benchmarks
- Rapid Innovation as improvements and bug fixes spread quickly
- Education & Empowerment via tutorials, docs, and active forums
- Responsible AI Practices encouraged through community oversight

5. Key Takeaways

- Open Source AI Is Here to Stay: Azure AI Foundry and Hugging Face make deploying, fine-tuning, and benchmarking open models easier than ever.
- Community Drives Innovation: Collaboration accelerates progress, improves transparency, and makes AI accessible to everyone.
- Responsible AI and Transparency: Open-source models come with clear documentation, licensing, and community-driven best practices.
- Easy Deployment & Customization: Azure AI Foundry lets you deploy, automate, and customize open models from a single, unified platform.
- Learn, Build, Share: The open-model ecosystem is a great place for students, developers, and researchers to learn, build, and share their work.
Sharda's Tips: How I Wrote This Blog

For this final recap, I focused on capturing the energy of the open source AI movement and the practical impact of the Hugging Face and Azure AI Foundry collaboration. I watched the livestream, took notes on the demos and interviews, and linked directly to official resources for models, docs, and community sites.

Here’s my Copilot prompt for this episode:

"Generate a technical blog post for Model Mondays S2E13 based on the transcript and episode details. Focus on open source models, Hugging Face, Azure AI Foundry, and community workflows. Include practical links and actionable insights for developers and students!"

Learn & Connect

- Explore Open Models in Azure AI Foundry
- Hugging Face Leaderboard
- Responsible AI in Azure Machine Learning
- Llama-3 by Meta
- Hugging Face Community
- Azure AI Documentation

About Model Mondays

Model Mondays is your weekly Azure AI learning series:

- 5-Minute Highlights: Latest AI news and product updates
- 15-Minute Spotlight: Demos and deep dives with product teams
- 30-Minute AMA Fridays: Ask anything in Discord or the forum

Start building:

- Watch Past Replays
- Register For AMA
- Recap Past AMAs

Join The Community

Don’t build alone! The Azure AI Developer Community is here for real-time chats, events, and support:

- Join the Discord
- Explore the Forum

About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador focused on cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I share takeaways from each week’s Model Mondays livestream.

Edge AI for Beginners: Getting Started with Foundry Local
In Module 08 of the EdgeAI for Beginners course, Microsoft introduces Foundry Local, a toolkit that helps you deploy and test Small Language Models (SLMs) completely offline. In this blog, I’ll share how I installed Foundry Local, ran the Phi-3.5-mini model on my Windows laptop, and what I learned through the process.

What Is Foundry Local?

Foundry Local allows developers to run AI models locally on their own hardware. It supports text generation, summarization, and code completion, all without sending data to the cloud. Unlike cloud-based systems, everything happens on your computer, so your data never leaves your device.

Prerequisites

Before starting, make sure you have:

- Windows 10 or 11
- Python 3.10 or newer
- Git
- Internet connection (for the first-time model download)
- Foundry Local installed

Step 1 — Verify Installation

After installing Foundry Local, open Command Prompt and type:

    foundry --version

If you see a version number, Foundry Local is installed correctly.

Step 2 — Start the Service

Start the Foundry Local service using:

    foundry service start

You should see a confirmation message that the service is running.

Step 3 — List Available Models

To view the models supported by your system, run:

    foundry model list

You’ll get a list of locally available SLMs. Here’s what I saw on my machine:

Note: Model availability depends on your device’s hardware. For most laptops, phi-3.5-mini works smoothly on CPU.

Step 4 — Run the Phi-3.5 Model

Now let’s start chatting with the model:

    foundry model run phi-3.5-mini-instruct-generic-cpu:1

Once it loads, you’ll enter an interactive chat mode. Try a simple prompt:

    Hello! What can you do?

The model replies instantly, right from your laptop, with no cloud needed. To exit, type:

    /exit

How It Works

Foundry Local loads the model weights from your device and performs inference locally. This means text generation happens using your CPU (or GPU, if available). The result: complete privacy, no internet dependency, and instant responses.
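If you would rather script the setup checks from Steps 1 to 3 than type them each time, a small Python wrapper around the same commands might look like the sketch below. It only uses the `foundry` commands shown in this post. The function names are my own, and it assumes the CLI returns a non-zero exit code on failure, which is the usual convention but is not something this post confirms.

```python
import shutil
import subprocess
from typing import List, Tuple

# The same commands as Steps 1 to 3 above, in order.
SETUP_COMMANDS = [
    ["foundry", "--version"],
    ["foundry", "service", "start"],
    ["foundry", "model", "list"],
]


def foundry_available() -> bool:
    """True when a `foundry` executable can be found on PATH."""
    return shutil.which("foundry") is not None


def run_setup_checks() -> List[Tuple[str, int]]:
    """Run each setup command and collect (command, exit code) pairs."""
    executable = shutil.which("foundry")
    if executable is None:
        raise FileNotFoundError("foundry CLI not found on PATH")
    results = []
    for command in SETUP_COMMANDS:
        # Use the resolved path so wrapper scripts on Windows are found too.
        completed = subprocess.run(
            [executable] + command[1:], capture_output=True, text=True
        )
        results.append((" ".join(command), completed.returncode))
    return results


if __name__ == "__main__":
    if not foundry_available():
        print("foundry CLI not found on PATH; install Foundry Local first")
    else:
        for command, exit_code in run_setup_checks():
            status = "ok" if exit_code == 0 else f"failed ({exit_code})"
            print(f"{command}: {status}")
```

The interactive `foundry model run` step is left out on purpose, since it opens a chat session that is easier to use directly in the terminal.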
Benefits for Students For students beginning their journey in AI, Foundry Local offers several key advantages: No need for high-end GPUs or expensive cloud subscriptions. Easy setup for experimenting with multiple models. Perfect for class assignments, AI workshops, and offline learning sessions. Promotes a deeper understanding of model behavior by allowing step-by-step local interaction. These factors make Foundry Local a practical choice for learning environments, especially in universities and research institutions where accessibility and affordability are important. Why Use Foundry Local Running models locally offers several practical benefits compared to using AI Foundry in the cloud. With Foundry Local, you do not need an internet connection, and all computations happen on your personal machine. This makes it faster for small models and more private since your data never leaves your device. In contrast, AI Foundry runs entirely on the cloud, requiring internet access and charging based on usage. For students and developers, Foundry Local is ideal for quick experiments, offline testing, and understanding how models behave in real-time. On the other hand, AI Foundry is better suited for large-scale or production-level scenarios where models need to be deployed at scale. In summary, Foundry Local provides a flexible and affordable environment for hands-on learning, especially when working with smaller models such as Phi-3, Qwen2.5, or TinyLlama. It allows you to experiment freely, learn efficiently, and better understand the fundamentals of Edge AI development. Optional: Restart Later Next time you open your laptop, you don’t have to reinstall anything. 
Just run these two commands again:

    foundry service start
    foundry model run phi-3.5-mini-instruct-generic-cpu:1

What I Learned

Following the EdgeAI for Beginners Study Guide helped me understand:

- How edge AI applications work
- How small models like Phi-3.5 can run on a local machine
- How to test prompts and build chat apps with zero cloud usage

Conclusion

Running the Phi-3.5-mini model locally with Foundry Local gave me hands-on insight into edge AI. It’s an easy, private, and cost-free way to explore generative AI development. If you’re new to Edge AI, start with the EdgeAI for Beginners course and follow its Study Guide to get comfortable with local inference and small language models.

Resources:

- EdgeAI for Beginners GitHub Repo
- Foundry Local Official Site
- Phi Model Link

Azure AI Model Inference API
The Azure AI Model Inference API provides a unified interface for developers to interact with various foundational models deployed in Azure AI Studio. This API allows developers to generate predictions from multiple models without changing their underlying code. By providing a consistent set of capabilities, the API simplifies the process of integrating and switching between different models, enabling seamless model selection based on task requirements.

Model Mondays S2:E7 · AI-Assisted Azure Development
Welcome to Episode 7! This week, we explore how AI is transforming Azure development. We’ll break down two key tools, Azure MCP Server and GitHub Copilot for Azure, and see how they make working with Azure resources easier for everyone. We’ll also look at a real customer story from SightMachine, showing how AI streamlines manufacturing operations.

Spec-Driven Development for AI-Enabled Enterprise Systems
How to make specs the single source of truth for your React frontends, backend services, data, and AI agents.

If you are building an enterprise system with a React frontend, backend APIs and services, a database layer, and shared libraries, moving to Spec-Driven Development (SDD) can feel like a big cultural shift. For AI developers and engineers, though, it is a gift: structured, machine-readable specifications are exactly what both humans and AI coding agents need to stay aligned and productive.

This post walks through how to structure specs, version contracts, design workflows, and integrate AI agents in a way that scales. Along the way, it references Microsoft’s public guidance on microservices, APIs, DevOps, and architecture so you can go deeper where needed.

1. Structuring specifications for an enterprise system

For a serious enterprise system, treat specs as layered and modular rather than a single monolithic document. A good mental model is Domain-Driven Design (DDD) and bounded contexts (see https://learn.microsoft.com/azure/architecture/microservices/model/domain-analysis).

Business and domain layer

This layer is technology-agnostic and captures:

- Business capabilities and problem statements
- Domain language and key entities
- Business rules and workflows
- Non-functional requirements (performance, security, compliance, SLAs)

Solution and architecture layer

Here you define how the system is shaped:

- System context and C4-style diagrams
- Service boundaries and ownership
- Integration patterns and event flows
- Data ownership and high-level models

Microsoft’s microservices guidance is a solid reference: https://learn.microsoft.com/azure/architecture/microservices/.

Implementation-oriented specs per component

For each concrete component, keep a focused spec:

Frontend / UI (React): screen catalogue, UX flows, state contracts, API dependencies, validation rules, accessibility and performance requirements.
APIs / services: OpenAPI or AsyncAPI contracts, error models, authentication and authorisation, rate limits, SLAs, observability requirements. Database / schema: logical data model, ownership per service, migration strategy, retention, indexing, partitioning. Shared libraries: responsibilities, versioning policy, supported runtimes, compatibility matrix. Integrations: protocols, payloads, sequencing, idempotency, retry and backoff, SLAs, failure modes. In practice, this usually means: One “master” business and architecture spec per domain or product Separate specs per service or module (frontend app, each backend service, shared library, integration) Everything linked via IDs (for example REQ-123, SVC-ORDER-001) so you can trace from requirement to spec, implementation, and tests 2. Templates and standards that scale To keep things consistent across teams, use a base template that all components share, then extend it with technology-specific sections. This works well for both human readers and AI agents consuming the specs. Base specification template Every spec, regardless of component type, should include: Purpose and scope Stakeholders and dependencies Requirements mapping (list of requirement IDs covered) Architecture and interaction overview Contracts (APIs, events, data) Non-functional requirements Risks and open questions Test and acceptance criteria Extended templates per component Frontend: UX flows, wireframes or Figma links, accessibility, performance budgets, offline behaviour, error states. API / service: OpenAPI or AsyncAPI link, auth and authorisation, throttling, logging and metrics, health endpoints. See logging and monitoring guidance at https://learn.microsoft.com/azure/architecture/microservices/logging-monitoring Database: schema definition, migration plan, backup and restore, data lifecycle, multi-tenant strategy. Integration: sequence diagrams, error handling, retry and idempotency, message contracts, security. 3. 
Contracts, versioning, and change management API contracts For SDD, API contracts are first-class citizens. Define them via OpenAPI or AsyncAPI and treat the spec as the source of truth. Use contract testing to keep providers and consumers aligned, and version APIs explicitly (for example v1, v2) rather than breaking changes in place. Microsoft’s API design guidance is a good starting point: https://learn.microsoft.com/azure/architecture/best-practices/api-design and Azure API Management at https://learn.microsoft.com/azure/api-management/. Database migrations Any spec change that affects data should include a migration plan. Use migration tooling such as EF Core migrations, Flyway, or Liquibase, and treat migration scripts as code. Document backward-compatibility windows so APIs can support both old and new fields for a defined period. Shared DTOs and models Prefer sharing contracts (OpenAPI, JSON Schema) over large shared code libraries. If you must share code, version the shared library independently and document compatibility (for example, “Service A supports SharedLib 2.x”). Keep DTOs at the edges and map to internal domain models inside each service. Cross-service dependencies Capture dependencies explicitly in specs, such as “Order Service depends on Customer v1.3+ for endpoint /customers/{id}”. Use consumer-driven contracts and CI checks to prevent breaking changes. For event-driven systems, document event contracts and evolution rules. See event-driven architecture guidance at https://learn.microsoft.com/azure/architecture/reference-architectures/event-driven/event-driven-architecture-overview. Spec versioning and change management Version specs semantically (for example OrderServiceSpec v1.2.0) and record what changed, why, impact, and migration steps. Link spec versions to releases or tags in Git and to work items in Azure DevOps or GitHub Issues. Azure Boards is useful here: https://learn.microsoft.com/azure/devops/boards/?view=azure-devops. 4. 
A mature Spec-Driven Development workflow

A realistic SDD workflow for AI-enabled teams might look like this:

- Discovery and domain analysis: capture business capabilities, domain language, and high-level workflows.
- Business and architecture specs: define bounded contexts, service boundaries, integration patterns, and NFRs.
- Contract design: design API specs (OpenAPI or AsyncAPI), event schemas, data models, and validation rules.
- Task generation: derive work items from specs, such as “Implement endpoint X”, “Add migration Y”, “Add UI flow Z”. This is a great place to use AI agents to read specs and generate tasks.
- Implementation: code is generated or written to satisfy the spec; the spec remains the reference, not the code.
- Validation and testing: contract tests, unit tests, integration tests, and end-to-end tests all trace back to spec IDs. Use quality gates in CI and CD, as described in https://learn.microsoft.com/azure/architecture/framework/devops/devops-quality.
- Review and sign-off: architecture and product review against the spec; update the spec if reality diverges.
- Release and observability: dashboards and alerts tied to specified SLIs and SLOs.

5. Governance, traceability, and avoiding drift

Traceability across the lifecycle

Use IDs everywhere: requirements, spec sections, tasks, tests, and deployment artefacts. In Azure DevOps or GitHub, link:

- Requirement (for example Azure DevOps Feature)
- Spec (stored in the repo)
- User stories and tasks
- Pull requests
- Tests
- Releases

For key decisions, adopt Architecture Decision Records (ADRs). Microsoft’s guidance on ADRs is here: https://learn.microsoft.com/azure/architecture/framework/devops/adrs.

Keeping humans and AI agents aligned

To avoid implementation drift:

Make specs as machine-readable as possible (OpenAPI, JSON Schema, YAML, BPMN).
Enforce spec checks in CI: API implementation must match OpenAPI, DB schema must match migration plan, generated clients must be up to date.
For AI coding agents, always provide the relevant spec files as context and constrain them to files linked to specific spec IDs. Add automated checks that compare generated code to contracts and fail builds when they diverge. 6. Enterprise best practices for repos and governance Example repository structure /docs /business /architecture /decisions (ADRs) /specs /frontend /services /orders /customers /integrations /data /src /frontend /services /shared /tests /ops /pipelines /infra-as-code Governance practices An architecture review group that reviews spec changes, not just code changes. Definition of Done includes: spec updated, tests linked, contracts validated. Regular “spec health” reviews to identify what is out of date or drifting. For broader architectural guidance, see: Azure microservices and DDD: https://learn.microsoft.com/azure/architecture/microservices/ Cloud design patterns: https://learn.microsoft.com/azure/architecture/patterns/ Azure Well-Architected Framework: https://learn.microsoft.com/azure/well-architected/ 7. Integrating AI and agentic workflows into SDD Spec-Driven Development is a natural fit for AI and multi-agent systems because specs provide structured, reliable context. Here are some practical patterns. LangGraph and multi-agent orchestration using Microsoft Agent Framework You can design a graph where: A “spec agent” reads and validates specs. An “implementation agent” writes or updates code based on those specs. A “test agent” generates tests from contracts and acceptance criteria. The graph flow can mirror your SDD workflow: Spec → Contract → Code → Tests → Review, with each agent responsible for a stage. MCP (Model Context Protocol) Expose your spec repository, OpenAPI definitions, and ADRs as MCP tools so agents can query the true source of truth instead of hallucinating. For example, provide a tool that returns the OpenAPI for a given service and version, or a tool that returns the ADRs relevant to a particular domain. 
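The kind of lookup such a tool could wrap can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than an MCP server: the registry contents, the file paths, and the names `SpecRecord`, `parse_version`, and `find_spec` are all hypothetical, and versions are simplified to major.minor. It resolves a request like “Customer v1.3+” to the newest compatible contract within the same major version.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SpecRecord:
    """One published contract version for a service (hypothetical shape)."""
    service: str
    version: Tuple[int, int]  # (major, minor)
    openapi_path: str


# Hypothetical registry; a real one would be built by scanning /specs in the repo.
REGISTRY: List[SpecRecord] = [
    SpecRecord("customers", (1, 2), "specs/services/customers/openapi-v1.2.yaml"),
    SpecRecord("customers", (1, 3), "specs/services/customers/openapi-v1.3.yaml"),
    SpecRecord("orders", (1, 0), "specs/services/orders/openapi-v1.0.yaml"),
]


def parse_version(text: str) -> Tuple[int, int]:
    """Turn 'v1.3' or '1.3' into (1, 3)."""
    major, minor = text.lstrip("v").split(".")
    return int(major), int(minor)


def find_spec(service: str, minimum: str) -> Optional[SpecRecord]:
    """Return the newest spec with the same major version and at least `minimum`."""
    wanted = parse_version(minimum)
    candidates = [
        record
        for record in REGISTRY
        if record.service == service
        and record.version[0] == wanted[0]  # a new major version may break consumers
        and record.version >= wanted
    ]
    return max(candidates, key=lambda record: record.version, default=None)


match = find_spec("customers", "v1.3")
print(match.openapi_path if match else "no compatible spec")
```

Returning `None` when nothing matches gives the agent an explicit “no compatible contract” signal instead of leaving it to guess at an endpoint shape.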
Learn more about MCP at https://aka.ms/mcp-for-beginners.

BPMN and process flows

Store BPMN diagrams as part of the spec. Agents can read them to generate workflow code, state machines, or tests. For process-oriented integrations, see Azure Logic Apps guidance at https://learn.microsoft.com/azure/logic-apps/.

CI/CD pipelines on Azure

In your pipelines, validate that implementation matches the spec:

- Contract tests for APIs and events
- Schema checks for databases
- Linting and static analysis for spec conformance

Use pipeline gates to block deployments if contracts or migrations are out of sync.

- Azure Pipelines: https://learn.microsoft.com/azure/devops/pipelines/?view=azure-devops
- GitHub Agentic Workflow Patterns: https://github.github.com/gh-aw/

Where to start

The key is not to boil the ocean. Pick one domain, such as “Orders”, and design a thin but end-to-end SDD flow: spec → contract → tasks → code → tests. Run it with your AI agents in the loop, learn where the friction is, and iterate. Once that feels natural, you can roll the patterns out across the rest of your system.

For AI developers and engineers, SDD is more than process hygiene. It is how you give your agents high-quality, unambiguous context so they can generate code, tests, and documentation that actually match what the business needs.

From Demo to Production: Building Microsoft Foundry Hosted Agents with .NET
The Gap Between a Demo and a Production Agent
Let's be honest. Getting an AI agent to work in a demo takes an afternoon. Getting it to work reliably in production (tested, containerised, deployed, observable, and maintainable by a team) is a different problem entirely. Most tutorials stop at the point where the agent prints a response in a terminal. They don't show you how to structure your code, cover your tools with tests, wire up CI, or deploy to a managed runtime with a proper lifecycle. That gap between prototype and production is where developer teams lose weeks.

Microsoft Foundry Hosted Agents close that gap with a managed container runtime for your own custom agent code. And the Hosted Agents Workshop for .NET gives you a complete, copy-paste-friendly path through the entire journey, from local run to deployed agent to chat UI, in six structured labs using .NET 10. This post walks you through what the workshop delivers, what you will build, and why the patterns it teaches matter far beyond the workshop itself.

What Is a Microsoft Foundry Hosted Agent?
Microsoft Foundry supports two distinct agent types, and understanding the difference is the first decision you will make as an agent developer.

Prompt agents are lightweight agents backed by a model deployment and a system prompt. No custom code is required. They are ideal for simple Q&A, summarisation, or chat scenarios where the model's built-in reasoning is sufficient.

Hosted agents are container-based agents that run your own code (.NET, Python, or any framework you choose) inside Foundry's managed runtime. You control the logic, the tools, the data access, and the orchestration.

When your scenario requires custom tool integrations, deterministic business logic, multi-step workflow orchestration, or private API access, a hosted agent is the right choice. The Foundry runtime handles the managed infrastructure; you own the code.
For the official deployment reference, see Deploy a hosted agent to Foundry Agent Service on Microsoft Learn. What the Workshop Delivers The Hosted Agents Workshop for .NET is a beginner-friendly, hands-on workshop that takes you through the full development and deployment path for a real hosted agent. It is structured around a concrete scenario: a Hosted Agent Readiness Coach that helps delivery teams answer questions like: Should this use case start as a prompt agent or a hosted agent? What should a pilot launch checklist include? How should a team troubleshoot common early setup problems? The scenario is purposefully practical. It is not a toy chatbot. It is the kind of tool a real team would build and hand to other engineers, which means it needs to be testable, deployable, and extensible. The workshop covers: Local development and validation with .NET 10 Copilot-assisted coding with repo-specific instructions Deterministic tool implementation with xUnit test coverage CI pipeline validation with GitHub Actions Secure deployment to Azure Container Registry and Microsoft Foundry Chat UI integration using Blazor What You Will Build By the end of the workshop, you will have a code-based hosted agent that exposes an OpenAI Responses-compatible /responses endpoint on port 8088 . The agent is backed by three deterministic local tools, implemented in WorkshopLab.Core : RecommendImplementationShape — analyses a scenario and recommends hosted or prompt agent based on its requirements BuildLaunchChecklist — generates a pilot launch checklist for a given use case TroubleshootHostedAgent — returns structured troubleshooting guidance for common setup problems These tools are deterministic by design, no LLM call required to return a result. That choice makes them fast, predictable, and fully testable, which is the right architecture for business logic in a production agent. 
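To make "deterministic by design" concrete, here is a minimal sketch of a tool in the spirit of RecommendImplementationShape. It is written in Python purely for illustration; the workshop's real tools are C# in WorkshopLab.Core, and the function name, parameters, and return strings below are assumptions rather than the workshop's code. The rule it encodes comes from the guidance earlier in this post: a need for custom code, tool integrations, or workflow orchestration points to a hosted agent.

```python
def recommend_implementation_shape(
    requires_code: bool,
    requires_tools: bool,
    requires_workflow: bool,
) -> str:
    """Pure function: the same inputs always produce the same recommendation."""
    if requires_code or requires_tools or requires_workflow:
        return "hosted agent"
    return "prompt agent"


if __name__ == "__main__":
    # No model call is involved, so these checks are fast and repeatable.
    assert recommend_implementation_shape(False, False, False) == "prompt agent"
    assert recommend_implementation_shape(False, True, False) == "hosted agent"
    print("all checks passed")
```

Because the function has no hidden state and no network dependency, a unit test can pin its behaviour exactly, which is what makes this style of tool safe to refactor.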
The end-to-end architecture looks like this: (architecture diagram)

The Hands-On Journey: Lab by Lab
The workshop follows a deliberate build → validate → ship progression. Each lab has a clear outcome. You do not move forward until the previous checkpoint passes.

Lab 0: Setup and Local Run
Open the repo in VS Code or a GitHub Codespace, configure your Microsoft Foundry project endpoint and model deployment name, then run the agent locally. By the end of Lab 0, your agent is listening on http://localhost:8088/responses and responding to test requests.

    dotnet restore
    dotnet build
    dotnet run --project src/WorkshopLab.AgentHost

Test it with a single PowerShell call:

    Invoke-RestMethod -Method Post `
      -Uri "http://localhost:8088/responses" `
      -ContentType "application/json" `
      -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'

Lab 0 instructions →

Lab 1: Copilot Customisation
Configure repo-specific GitHub Copilot instructions so that Copilot understands the hosted-agent patterns used in this project. You will also add a Copilot review skill tailored to hosted agent code reviews. This step means every code suggestion you receive from Copilot is contextualised to the workshop scenario rather than giving generic .NET advice.

Lab 1 instructions →

Lab 2: Tool Implementation
Extend one of the deterministic tools in WorkshopLab.Core with a real feature change. The suggested change adds a stronger recommendation path to RecommendImplementationShape for scenarios that require all three hosted-agent strengths simultaneously.
    // In RecommendImplementationShape: add before the final return
    if (requiresCode && requiresTools && requiresWorkflow)
    {
        return string.Join(Environment.NewLine,
        [
            $"Recommended implementation: Hosted agent (full-stack)",
            $"Scenario goal: {goal}",
            "Why: the scenario requires custom code, external tool access, and " +
            "multi-step orchestration — all three hosted-agent strengths.",
            "Suggested next step: start with a code-based hosted agent, register " +
            "local tools for each integration, and add a workflow layer."
        ]);
    }

You then write an xUnit test to cover it, run dotnet test, and validate the change against a live /responses call. This is the workshop's most important teaching moment: every tool change is covered by a test before it ships.

Lab 2 instructions →

Lab 3: CI Validation
Wire up a GitHub Actions workflow that builds the solution, runs the test suite, and validates that the agent container builds cleanly. There are no manual steps: if a change breaks the build or a test, CI catches it before any deployment happens.

Lab 3 instructions →

Lab 4: Deployment to Microsoft Foundry
Use the Azure Developer CLI (azd) to provision an Azure Container Registry, publish the agent image, and deploy the hosted agent to Microsoft Foundry. The workshop separates provisioning from deployment deliberately: azd owns the Azure resources; the Foundry control plane deployment is an explicit, intentional final step that depends on your real project endpoint and agent.yaml manifest values.

Lab 4 instructions →

Lab 5: Chat UI Integration
Connect a Blazor chat UI to the deployed hosted agent and validate end-to-end responses. By the end of Lab 5, you have a fully functioning agent accessible through a real UI, calling your deterministic tools via the Foundry control plane.

Lab 5 instructions →

Key Concepts to Take Away
The workshop teaches concrete patterns that apply well beyond this specific scenario.
Code-first agent design Prompt-only agents are fast to build but hard to test and reason about at scale. A hosted agent with code-backed tools gives you something you can unit test, refactor, and version-control like any other software. Deterministic tools and testability The workshop explicitly avoids LLM calls inside tool implementations. Deterministic tools return predictable outputs for a given input, which means you can write fast, reliable unit tests for them. This is the right pattern for business logic. Reserve LLM calls for the reasoning layer, not the execution layer. CI/CD for agent systems AI agents are software. They deserve the same build-test-deploy discipline as any other service. Lab 3 makes this concrete: you cannot ship without passing CI, and CI validates the container as well as the unit tests. Deployment separation The workshop's split between azd provisioning and Foundry control-plane deployment is not arbitrary. It reflects the real operational boundary: your Azure resources are long-lived infrastructure; your agent deployment is a lifecycle event tied to your project's specific endpoint and manifest. Keeping them separate reduces accidents and makes rollbacks easier. Observability and the validation mindset Every lab ends with an explicit checkpoint. The culture the workshop builds is: prove it works before moving on. That mindset is more valuable than any specific tool or command in the labs. Why Hosted Agents Are Worth the Investment The managed runtime in Microsoft Foundry removes the infrastructure overhead that makes custom agent deployment painful. You do not manage Kubernetes clusters, configure ingress rules, or handle TLS termination. Foundry handles the hosting; you handle the code. This matters most for teams making the transition from demo to production. A prompt agent is an afternoon's work. 
A hosted agent with proper CI, tested tools, and a deployment pipeline is a week's work done properly once, instead of several weeks of firefighting done poorly and repeatedly. The Foundry agent lifecycle (create, update, version, deploy) also gives you the controls you need to manage agents in a real environment: staged rollouts, rollback capability, and clear separation between agent versions. For the full deployment guide, see Deploy a hosted agent to Foundry Agent Service.

From Workshop to Real Project
This workshop is not just a learning exercise. The repository structure, the tooling choices, and the CI/CD patterns are a reference implementation. The patterns you can lift directly into a production project include:
- The WorkshopLab.Core / WorkshopLab.AgentHost separation between business logic and agent hosting
- The agent.yaml manifest pattern for declarative Foundry deployment
- The GitHub Actions workflow structure for build, test, and container validation
- The azd + ACR pattern for image publishing without requiring Docker Desktop locally
- The Blazor chat UI as a starting point for internal tooling or developer-facing applications

The scenario, a readiness coach for hosted agents, is also something teams evaluating Microsoft Foundry will find genuinely useful. It answers exactly the questions that come up when onboarding a new team to the platform.

Common Mistakes When Building Hosted Agents
Having run workshops and spoken with developer teams building on Foundry, we see a few patterns come up repeatedly:

Skipping local validation before containerising. Always validate the /responses endpoint locally first. Debugging inside a container is slower and harder than debugging locally.

Putting business logic inside the LLM call. If the answer to a user query can be determined by code, use code. Reserve the model for reasoning, synthesis, and natural language output.

Treating CI as optional. Agent code changes break things just like any other code change.
If you do not have CI catching regressions, you will ship them.

Conflating provisioning and deployment. Recreating Azure resources on every deploy is slow and error-prone. Provision once with azd; deploy agent versions as needed through the Foundry control plane.

Not having a rollback plan. The Foundry agent lifecycle supports versioning. Use it. Know how to roll back to a previous version before you deploy to production.

Get Started
The workshop is open source, beginner-friendly, and designed to be completed in a single day. You need a .NET 10 SDK, an Azure subscription, access to a Microsoft Foundry project, and a GitHub account. Clone the repository, follow the labs in order, and by the end you will have a production-ready reference implementation that your team can extend and adapt for real scenarios.

Clone the workshop repository →

Here is the quick start to prove the solution works locally before you begin the full lab sequence:

    git clone https://github.com/microsoft/Hosted_Agents_Workshop_dotNET.git
    cd Hosted_Agents_Workshop_dotNET

    # Set your Foundry project endpoint and model deployment
    $env:AZURE_AI_PROJECT_ENDPOINT = "https://<resource>.services.ai.azure.com/api/projects/<project>"
    $env:MODEL_DEPLOYMENT_NAME = "gpt-4.1-mini"

    # Build and run
    dotnet restore
    dotnet build
    dotnet run --project src/WorkshopLab.AgentHost

Then send your first request:

    Invoke-RestMethod -Method Post `
      -Uri "http://localhost:8088/responses" `
      -ContentType "application/json" `
      -Body '{"input":"Should we start with a hosted agent or a prompt agent?"}'

When the agent answers as a Hosted Agent Readiness Coach, you are ready to begin the labs.

Key Takeaways
Hosted agents in Microsoft Foundry let you run custom .NET code in a managed container runtime: you own the logic, Foundry owns the infrastructure.
Deterministic tools are the right pattern for business logic in production agents: fast, testable, and predictable.
CI/CD is not optional for agent systems.
Build it in from the start, not as an afterthought.
Separate your provisioning (azd) from your deployment (Foundry control plane): it reduces accidents and simplifies rollbacks.
The workshop is a reference implementation, not just a tutorial. The patterns are production-grade and ready to adapt.

References
- Hosted Agents Workshop for .NET (GitHub repository)
- Workshop Lab Guide
- Deploy a Hosted Agent to Foundry Agent Service (Microsoft Learn)
- Microsoft Foundry Portal
- Azure Developer CLI (azd) (Microsoft Learn)
- .NET 10 SDK Download

Level up your Python + AI skills with our complete series
We've just wrapped up our live series on Python + AI, a comprehensive nine-part journey diving deep into how to use generative AI models from Python. The series introduced multiple types of models, including LLMs, embedding models, and vision models. We dug into popular techniques like RAG, tool calling, and structured outputs. We assessed AI quality and safety using automated evaluations and red-teaming. Finally, we developed AI agents using popular Python agents frameworks and explored the new Model Context Protocol (MCP). To help you apply what you've learned, all of our code examples work with GitHub Models, a service that provides free models to every GitHub account holder for experimentation and education. Even if you missed the live series, you can still access all the material using the links below! If you're an instructor, feel free to use the slides and code examples in your own classes. If you're a Spanish speaker, check out the Spanish version of the series. Python + AI: Large Language Models 📺 Watch recording In this session, we explore Large Language Models (LLMs), the models that power ChatGPT and GitHub Copilot. We use Python to interact with LLMs using popular packages like the OpenAI SDK and LangChain. We experiment with prompt engineering and few-shot examples to improve outputs. We also demonstrate how to build a full-stack app powered by LLMs and explain the importance of concurrency and streaming for user-facing AI apps. Slides for this session Code repository with examples: python-openai-demos Python + AI: Vector embeddings 📺 Watch recording In our second session, we dive into a different type of model: the vector embedding model. A vector embedding is a way to encode text or images as an array of floating-point numbers. Vector embeddings enable similarity search across many types of content. In this session, we explore different vector embedding models, such as the OpenAI text-embedding-3 series, through both visualizations and Python code. 
We compare distance metrics, use quantization to reduce vector size, and experiment with multimodal embedding models. Slides for this session Code repository with examples: vector-embedding-demos Python + AI: Retrieval Augmented Generation 📺 Watch recording In our third session, we explore one of the most popular techniques used with LLMs: Retrieval Augmented Generation. RAG is an approach that provides context to the LLM, enabling it to deliver well-grounded answers for a particular domain. The RAG approach works with many types of data sources, including CSVs, webpages, documents, and databases. In this session, we walk through RAG flows in Python, starting with a simple flow and culminating in a full-stack RAG application based on Azure AI Search. Slides for this session Code repository with examples: python-openai-demos Python + AI: Vision models 📺 Watch recording Our fourth session is all about vision models! Vision models are LLMs that can accept both text and images, such as GPT-4o and GPT-4o mini. You can use these models for image captioning, data extraction, question answering, classification, and more! We use Python to send images to vision models, build a basic chat-with-images app, and create a multimodal search engine. Slides for this session Code repository with examples: openai-chat-vision-quickstart Python + AI: Structured outputs 📺 Watch recording In our fifth session, we discover how to get LLMs to output structured responses that adhere to a schema. In Python, all you need to do is define a Pydantic BaseModel to get validated output that perfectly meets your needs. We focus on the structured outputs mode available in OpenAI models, but you can use similar techniques with other model providers. Our examples demonstrate the many ways you can use structured responses, such as entity extraction, classification, and agentic workflows. 
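The idea behind schema-validated output can be shown without any model call. The sketch below is an assumption-laden stand-in: it uses only the standard library (`json` and `dataclasses`) instead of Pydantic, and the `CalendarEvent` schema and `parse_event` helper are hypothetical names invented for illustration. With Pydantic, a `BaseModel` subclass would replace the hand-written checks; the principle is the same, which is to reject any response that does not match the declared shape.

```python
import json
from dataclasses import dataclass


@dataclass
class CalendarEvent:
    """Hypothetical schema the model's JSON reply is expected to match."""
    name: str
    date: str
    participants: list[str]


def parse_event(raw_json: str) -> CalendarEvent:
    """Parse a JSON reply and validate field names and types before use."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    expected = {"name": str, "date": str, "participants": list}
    if set(data) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    for field_name, field_type in expected.items():
        if not isinstance(data[field_name], field_type):
            raise ValueError(f"field {field_name!r} must be {field_type.__name__}")
    if not all(isinstance(p, str) for p in data["participants"]):
        raise ValueError("every participant must be a string")

    return CalendarEvent(**data)
```

Downstream code then works with a typed object instead of free text, which is what makes structured output usable for entity extraction and classification.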
Slides for this session Code repository with examples: python-openai-demos Python + AI: Quality and safety 📺 Watch recording This session covers a crucial topic: how to use AI safely and how to evaluate the quality of AI outputs. There are multiple mitigation layers when working with LLMs: the model itself, a safety system on top, the prompting and context, and the application user experience. We focus on Azure tools that make it easier to deploy safe AI systems into production. We demonstrate how to configure the Azure AI Content Safety system when working with Azure AI models and how to handle errors in Python code. Then we use the Azure AI Evaluation SDK to evaluate the safety and quality of output from your LLM. Slides for this session Code repository with examples: ai-quality-safety-demos Python + AI: Tool calling 📺 Watch recording In the final part of the series, we focus on the technologies needed to build AI agents, starting with the foundation: tool calling (also known as function calling). We define tool call specifications using both JSON schema and Python function definitions, then send these definitions to the LLM. We demonstrate how to properly handle tool call responses from LLMs, enable parallel tool calling, and iterate over multiple tool calls. Understanding tool calling is absolutely essential before diving into agents, so don't skip over this foundational session. Slides for this session Code repository with examples: python-openai-demos Python + AI: Agents 📺 Watch recording In the penultimate session, we build AI agents! We use Python AI agent frameworks such as the new agent-framework from Microsoft and the popular LangGraph framework. Our agents start simple and then increase in complexity, demonstrating different architectures such as multiple tools, supervisor patterns, graphs, and human-in-the-loop workflows. 
Slides for this session Code repository with examples: python-ai-agent-frameworks-demos Python + AI: Model Context Protocol 📺 Watch recording In the final session, we dive into the hottest technology of 2025: MCP (Model Context Protocol). This open protocol makes it easy to extend AI agents and chatbots with custom functionality, making them more powerful and flexible. We demonstrate how to use the Python FastMCP SDK to build an MCP server running locally and consume that server from chatbots like GitHub Copilot. Then we build our own MCP client to consume the server. Finally, we discover how easy it is to connect AI agent frameworks like LangGraph and Microsoft agent-framework to MCP servers. With great power comes great responsibility, so we briefly discuss the security risks that come with MCP, both as a user and as a developer. Slides for this session Code repository with examples: python-mcp-demo
From Zero to 16 Games in 2 Hours: Teaching Prompt Engineering to Students with GitHub Copilot CLI Introduction What happens when you give a room full of 14-year-olds access to AI-powered development tools and challenge them to build games? You might expect chaos, confusion, or at best, a few half-working prototypes. Instead, we witnessed something remarkable: 16 fully functional HTML5 games created in under two hours, all from students with varying programming experience. This wasn't magic, it was the power of GitHub Copilot CLI combined with effective prompt engineering. By teaching students to communicate clearly with AI, we transformed a traditional coding workshop into a rapid prototyping session that exceeded everyone's expectations. The secret weapon? A technique called "one-shot prompting" that enables anyone to generate complete, working applications from a single, well-crafted prompt. In this article, we'll explore how we structured this workshop using CopilotCLI-OneShotPromptGameDev, a methodology designed to teach prompt engineering fundamentals while producing tangible, exciting results. Whether you're an educator planning STEM workshops, a developer exploring AI-assisted coding, or simply curious about how young people can leverage AI tools effectively, this guide provides a practical blueprint you can replicate. What is GitHub Copilot CLI? GitHub Copilot CLI extends the familiar Copilot experience beyond your code editor into the command line. While Copilot in VS Code suggests code completions as you type, Copilot CLI allows you to have conversational interactions with AI directly in your terminal. You describe what you want to accomplish in natural language, and the AI responds with shell commands, explanations, or in our case, complete code files. This terminal-based approach offers several advantages for learning and rapid prototyping. Students don't need to configure complex IDE settings or navigate unfamiliar interfaces. 
They simply type their request, review the AI's output, and iterate. The command line provides a transparent view of exactly what's happening, with no hidden abstractions or magical "autocomplete" that obscures the learning process.

For our workshop, Copilot CLI served as a bridge between students' creative ideas and working code. They could describe a game concept in plain English, watch the AI generate HTML, CSS, and JavaScript, then immediately test the result in a browser. This rapid feedback loop kept engagement high and made the connection between language and code tangible.

Installing GitHub Copilot CLI
Setting up Copilot CLI requires a few straightforward steps. Before the workshop, we ensured all machines were pre-configured, but students also learned the installation process as part of understanding how developer tools work.

First, you'll need Node.js installed on your system. Copilot CLI runs as a Node package, so this is a prerequisite:

    # Check if Node.js is installed
    node --version

    # If not installed, download from https://nodejs.org/
    # Or use a package manager:

    # Windows (winget)
    winget install OpenJS.NodeJS.LTS

    # macOS (Homebrew)
    brew install node

    # Linux (apt)
    sudo apt install nodejs npm

These commands verify your Node.js installation or guide you through installing it using your operating system's preferred package manager.

Next, install the GitHub CLI, which provides the foundation for Copilot CLI:

    # Windows
    winget install GitHub.cli

    # macOS
    brew install gh

    # Linux
    sudo apt install gh

This installs the GitHub command-line interface, which handles authentication and provides the framework for Copilot integration.

With GitHub CLI installed, authenticate with your GitHub account:

    gh auth login

This command initiates an interactive authentication flow that connects your terminal to your GitHub account, enabling access to Copilot features.
Finally, install the Copilot CLI extension:

    gh extension install github/gh-copilot

This adds Copilot capabilities to your GitHub CLI installation, enabling the conversational AI features we'll use for game development.

Verify the installation by running:

    gh copilot --help

If you see the help output with available commands, you're ready to start prompting. The entire setup takes about 5-10 minutes on a fresh machine, making it practical for classroom environments.

Understanding One-Shot Prompting
Traditional programming education follows an incremental approach: learn syntax, understand concepts, build small programs, gradually tackle larger projects. This method is thorough but slow. One-shot prompting inverts this model: you start with the complete vision and let AI handle the implementation details.

A one-shot prompt provides the AI with all the context it needs to generate a complete, working solution in a single response. Instead of iteratively refining code through multiple exchanges, you craft one comprehensive prompt that specifies requirements, constraints, styling preferences, and technical specifications. The AI then produces complete, functional code.

This approach teaches a crucial skill: clear communication of technical requirements. Students must think through their entire game concept before typing. What does the game look like? How does the player interact with it? What happens when they win or lose? By forcing this upfront thinking, one-shot prompting develops the same analytical skills that professional developers use when writing specifications or planning architectures.

The technique also demonstrates a powerful principle: with sufficient context, AI can handle implementation complexity while humans focus on creativity and design. Students learned they could create sophisticated games without memorizing JavaScript syntax; they just needed to describe their vision clearly enough for the AI to understand.
Crafting Effective Prompts for Game Development The difference between a vague prompt and an effective one-shot prompt is the difference between frustration and success. We taught students a structured approach to prompt construction that consistently produced working games. Start with the game type and core mechanic. Don't just say "make a game"—specify what kind: Create a complete HTML5 game where the player controls a spaceship that must dodge falling asteroids. This opening establishes the fundamental gameplay loop: control a spaceship, avoid obstacles. The AI now has a clear mental model to work from. Add visual and interaction details. Games are visual experiences, so specify how things should look and respond: Create a complete HTML5 game where the player controls a spaceship that must dodge falling asteroids. The spaceship should be a blue triangle at the bottom of the screen, controlled by left and right arrow keys. Asteroids are brown circles that fall from the top at random positions and increasing speeds. These additions provide concrete visual targets and define the input mechanism. The AI can now generate specific CSS colors and event handlers. Define win/lose conditions and scoring: Create a complete HTML5 game where the player controls a spaceship that must dodge falling asteroids. The spaceship should be a blue triangle at the bottom of the screen, controlled by left and right arrow keys. Asteroids are brown circles that fall from the top at random positions and increasing speeds. Display a score that increases every second the player survives. The game ends when an asteroid hits the spaceship, showing a "Game Over" screen with the final score and a "Play Again" button. This complete prompt now specifies the entire game loop: gameplay, scoring, losing, and restarting. The AI has everything needed to generate a fully playable game. 
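The layered build-up above (core mechanic, then visuals and controls, then rules, losing, and scoring) can also be treated as a checklist. The following sketch is one hypothetical way to do that; the `GamePrompt` class and its field names are illustrative inventions, not part of the workshop materials. It refuses to produce a prompt until every component has been filled in, which mirrors the discipline students practised by hand.

```python
from dataclasses import dataclass


@dataclass
class GamePrompt:
    """Hypothetical checklist for the parts of a one-shot game prompt."""
    game_type: str
    visuals: str
    controls: str
    rules: str
    end_condition: str
    scoring: str

    def missing_parts(self) -> list[str]:
        """Names of components that are still empty or whitespace-only."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self) -> str:
        """Join all components into a single prompt, one sentence per part."""
        missing = self.missing_parts()
        if missing:
            raise ValueError(f"prompt is incomplete, missing: {missing}")
        # Normalise trailing full stops so each part ends with exactly one.
        return " ".join(v.strip().rstrip(".") + "." for v in vars(self).values())


if __name__ == "__main__":
    prompt = GamePrompt(
        game_type="Create a complete HTML5 game where the player controls a spaceship that must dodge falling asteroids",
        visuals="The spaceship should be a blue triangle at the bottom of the screen and asteroids are brown circles",
        controls="The spaceship is controlled by left and right arrow keys",
        rules="Asteroids fall from the top at random positions and increasing speeds",
        end_condition="The game ends when an asteroid hits the spaceship, showing a Game Over screen with a Play Again button",
        scoring="Display a score that increases every second the player survives",
    )
    print(prompt.render())
```

A student who leaves the scoring field blank gets an error naming the gap, which is the same feedback an instructor would give when reviewing a vague prompt.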
The formula students learned: Game Type + Visual Description + Controls + Rules + Win/Lose + Score = Complete Game Prompt.

Running the Workshop: Structure and Approach
Our two-hour workshop followed a carefully designed structure that balanced instruction with hands-on creation. We partnered with University College London, and students used GitHub Education to access resources designed for classroom settings, including student accounts with Copilot access and tools such as VS Code, Azure for Students, and, for schools, VS Code for Education.

The first 20 minutes covered fundamentals: what is AI, how does Copilot work, and why does prompt quality matter? We demonstrated this with a live example, showing how "make a game" produces confused output while a detailed prompt generates playable code. This contrast immediately captured students' attention: they could see the direct relationship between their words and the AI's output.

The next 15 minutes focused on the prompt formula. We broke down several example prompts, highlighting each component: game type, visuals, controls, rules, scoring. Students practiced identifying these elements in prompts before writing their own. This analysis phase prepared them to construct effective prompts independently.

The remaining 85 minutes were dedicated to creation. Students worked individually or in pairs, brainstorming game concepts, writing prompts, generating code, testing in browsers, and iterating. Instructors circulated to help debug prompts (not code, an important distinction) and encourage experimentation.

We deliberately avoided teaching JavaScript syntax. When students encountered bugs, we guided them to refine their prompts rather than manually fix code. This maintained focus on the core skill: communicating with AI effectively. Surprisingly, this approach resulted in fewer bugs overall because students learned to be more precise in their initial descriptions.
Student Projects: The Games They Created
The diversity of games produced in 85 minutes of building time amazed everyone present. Students didn't just follow a template; they invented entirely new concepts and successfully communicated them to Copilot CLI.

One student created a "Fruit Ninja" clone where players clicked falling fruit to slice it before it hit the ground. Another built a typing speed game that challenged players to correctly type increasingly difficult words against a countdown timer. A pair of collaborators produced a two-player tank battle where each player controlled their tank with different keyboard keys.

Several students explored educational games: a math challenge where players solve equations to destroy incoming meteors, a geography quiz with animated maps, and a vocabulary builder where correct definitions unlock new levels. These projects demonstrated that one-shot prompting isn't limited to entertainment; students naturally gravitated toward useful applications.

The most complex project was a procedurally generated maze game with fog-of-war mechanics. The student spent extra time on their prompt, specifying exactly how visibility should work around the player character. Their detailed approach paid off with a surprisingly sophisticated result that would typically require hours of manual coding.

By the session's end, we had 16 complete, playable HTML5 games. Every student who participated produced something they could share with friends and family, a tangible achievement that transformed an abstract "coding workshop" into a genuine creative accomplishment.

Key Benefits of Copilot CLI for Rapid Prototyping
Our workshop revealed several advantages that make Copilot CLI particularly valuable for rapid prototyping scenarios, whether in educational settings or professional development.

Speed of iteration fundamentally changes what's possible. Traditional game development requires hours to produce even simple prototypes.
With Copilot CLI, students went from concept to playable game in minutes. This compressed timeline enables experimentation: if your first idea doesn't work, try another. This psychological freedom to fail fast and try again proved more valuable than any technical instruction.

Accessibility removes barriers to entry. Students with no prior coding experience produced results comparable to those who had taken programming classes. The playing field leveled because success depended on creativity and communication rather than memorized syntax. This democratization of development opens doors for students who might otherwise feel excluded from technical fields.

Focus on design over implementation teaches transferable skills. Whether students eventually become programmers, designers, product managers, or pursue entirely different careers, the ability to clearly specify requirements and think through complete systems applies universally. They learned to think like system designers, not just coders.

The feedback loop keeps engagement high. Seeing your words transform into working software within seconds creates an addictive cycle of creation and testing. Students who typically struggle with attention during lectures remained focused throughout the building session. The immediate gratification of seeing their games work motivated continuous refinement.

Debugging through prompts teaches root cause analysis. When games didn't work as expected, students had to analyze what they'd asked for versus what they received. This comparison exercise developed critical thinking about specifications, a skill that serves developers throughout their careers.

Tips for Educators: Running Your Own Workshop

If you're planning to replicate this workshop, several lessons from our experience will help ensure success.

Pre-configure machines whenever possible. While installation is straightforward, classroom time is precious.
Having Copilot CLI ready on all devices lets you dive into content immediately. If pre-configuration isn't possible, allocate the first 15-20 minutes specifically for setup and troubleshoot as a group.

Prepare example prompts across difficulty levels. Some students will grasp one-shot prompting immediately; others will need more scaffolding. Having templates ranging from simple ("Create Pong") to complex (the spaceship example above) lets you meet students where they are.

Emphasize that "prompt debugging" is the goal. When students ask for help fixing broken code, redirect them to examine their prompt. What did they ask for? What did they get? Where's the gap? This redirection reinforces the workshop's core learning objective and builds self-sufficiency.

Celebrate and share widely. Build in time at the end for students to demonstrate their games. This showcase moment validates their work and often inspires classmates to try new approaches in future sessions. Consider creating a shared folder or simple website where all games can be accessed after the workshop.

Access GitHub Education resources at education.github.com before your workshop. The GitHub Education program provides free access to developer tools for students and educators, including Copilot. The resources there include curriculum materials, teaching guides, and community support that can enhance your workshop.

Beyond Games: Where This Leads

The techniques students learned extend far beyond game development. One-shot prompting with Copilot CLI works for any development task: creating web pages, building utilities, generating data processing scripts, or prototyping application interfaces. The fundamental skill, communicating requirements clearly to AI, applies wherever AI-assisted development tools are used.

Several students have continued exploring after the workshop. Some discovered they enjoy the creative aspects of game design and are learning traditional programming to gain more control.
Others found that prompt engineering itself interests them; they're exploring how different phrasings affect AI outputs across various domains.

For professional developers, the workshop's lessons apply directly to working with Copilot, ChatGPT, and other AI coding assistants. The ability to craft precise, complete prompts determines whether these tools save time or create confusion. Investing in prompt engineering skills yields returns across every AI-assisted workflow.

Key Takeaways

- Clear prompts produce working code: The one-shot prompting formula (Game Type + Visuals + Controls + Rules + Win/Lose + Score) reliably generates playable games from single prompts
- Copilot CLI democratizes development: Students with no coding experience created functional applications by focusing on communication rather than syntax
- Rapid iteration enables experimentation: Minutes-per-prototype timelines encourage creative risk-taking and learning from failures
- Prompt debugging builds analytical skills: Comparing intended versus actual results teaches specification writing and root cause analysis
- Sixteen games in two hours is achievable: With proper structure and preparation, young students can produce impressive results using AI-assisted development

Conclusion and Next Steps

Our workshop demonstrated that AI-assisted development tools like GitHub Copilot CLI aren't just productivity boosters for experienced programmers; they're powerful educational instruments that make software creation accessible to beginners. By focusing on prompt engineering rather than traditional syntax instruction, we enabled 14-year-old students to produce complete, functional games in a fraction of the time traditional methods would require.

The sixteen games created during those two hours represent more than just workshop outputs. They represent a shift in how we might teach technical creativity: start with vision, communicate clearly, iterate quickly.
Whether students pursue programming careers or not, they've gained experience in thinking systematically about requirements and translating ideas into specifications that produce real results.

To explore this approach yourself, visit the CopilotCLI-OneShotPromptGameDev repository for prompt templates, workshop materials, and example games. For educational resources and student access to GitHub tools including Copilot, explore GitHub Education. And most importantly, start experimenting. Write a prompt, generate some code, and see what you can create in the next few minutes.

Resources

- CopilotCLI-OneShotPromptGameDev Repository - Workshop materials, prompt templates, and example games
- GitHub Education - Free developer tools and resources for students and educators
- GitHub Copilot CLI Documentation - Official installation and usage guide
- GitHub CLI - Foundation tool required for Copilot CLI
- GitHub Copilot - Overview of Copilot features and pricing
Build an AI-Powered Space Invaders Game: Integrating LLMs into HTML5 Games with Microsoft Foundry Local

Introduction

What if your game could talk back to you? Imagine playing Space Invaders while an AI commander taunts you during battle, delivers personalized mission briefings, and provides real-time feedback based on your performance. This isn't science fiction; it's something you can build today using HTML, JavaScript, and a locally-running AI model.

In this tutorial, we'll explore how to create an HTML5 game with integrated Large Language Model (LLM) features using Microsoft Foundry Local. You'll learn how to combine classic game development with modern AI capabilities, all running entirely on your own machine, with no cloud services, no API costs, and no internet connection required during gameplay.

We'll be working with the Space Invaders - AI Commander Edition project, which demonstrates exactly how to architect games that leverage local AI. Whether you're a student learning game development, exploring AI integration patterns, or building your portfolio, this guide provides practical, hands-on experience with technologies that are reshaping how we build interactive applications.

What You'll Learn

By the end of this tutorial, you'll understand how to combine traditional web development with local AI inference. These skills transfer directly to building chatbots, interactive tutorials, AI-enhanced productivity tools, and any application where you want intelligent, context-aware responses.

- Set up Microsoft Foundry Local for running AI models on your machine
- Understand the architecture of games that integrate LLM features
- Use GitHub Copilot CLI to accelerate your development workflow
- Implement AI-powered game features like dynamic commentary and adaptive feedback
- Extend the project with your own creative AI features

Why Local AI for Games?

Before diving into the code, let's understand why running AI locally matters for game development.
Traditional cloud-based AI services have limitations that make them impractical for real-time gaming experiences.

Latency is the first challenge. Cloud API calls typically take 500ms to several seconds, an eternity in a game running at 60 frames per second. Local inference can respond in tens to a couple hundred milliseconds, enabling AI responses that feel instantaneous and natural. When an enemy ship appears, your AI commander can taunt you immediately, not three seconds later.

Cost is another consideration. Cloud AI services charge per token, which adds up quickly when generating dynamic content during gameplay. Local models have zero per-use cost: once installed, they run entirely on your hardware. This frees you to experiment without worrying about API bills.

Privacy and offline capability complete the picture. Local AI keeps all data on your machine, perfect for games that might handle player information. And since nothing requires internet connectivity, your game works anywhere: on planes, in areas with poor connectivity, or simply when you want to play without network access.

Understanding Microsoft Foundry Local

Microsoft Foundry Local is a runtime that enables you to run small language models (SLMs) directly on your computer. It's designed for developers who want to integrate AI capabilities into applications without requiring cloud infrastructure. Think of it as having a miniature AI assistant living on your laptop.

Foundry Local handles the complex work of loading AI models, managing memory, and processing inference requests through a simple API. You send text prompts, and it returns AI-generated responses, all happening locally on your CPU or GPU. The models are optimized to run efficiently on consumer hardware, so you don't need a supercomputer.

For our Space Invaders game, Foundry Local powers the "AI Commander" feature.
During gameplay, the game sends context about what's happening (your score, accuracy, current level, enemies remaining) and receives back contextual commentary, taunts, and encouragement. The result feels like playing alongside an AI companion who actually understands the game.

Setting Up Your Development Environment

Let's get your machine ready for AI-powered game development. We'll install Foundry Local, clone the project, and verify everything works. The entire setup takes about 10-15 minutes.

Step 1: Install Microsoft Foundry Local

Foundry Local installation varies by operating system. Open your terminal and run the appropriate command:

    # Windows (using winget)
    winget install Microsoft.FoundryLocal

    # macOS (using Homebrew)
    brew install microsoft/foundrylocal/foundrylocal

These commands download and install the Foundry Local runtime along with a default small language model. The installation includes everything needed to run AI inference locally.

Verify the installation by running:

    foundry --version

If you see a version number, Foundry Local is ready. If you encounter errors, ensure you have administrator/sudo privileges and that your package manager is up to date.

Step 2: Install Node.js (If Not Already Installed)

Our game's AI features require a small Node.js server to communicate between the browser and Foundry Local. Check if Node.js is installed:

    node --version

If you see a version number (v16 or higher recommended), you're set. Otherwise, install Node.js:

    # Windows
    winget install OpenJS.NodeJS.LTS

    # macOS
    brew install node

    # Linux
    sudo apt install nodejs npm

Node.js provides the JavaScript runtime that powers our proxy server, bridging browser code with the local AI model.

Step 3: Clone the Project

Get the Space Invaders project onto your machine:

    git clone https://github.com/leestott/Spaceinvaders-FoundryLocal.git
    cd Spaceinvaders-FoundryLocal

This downloads all game files, including the HTML interface, game logic, AI integration module, and server code.
Step 4: Install Dependencies and Start the Server

Install the Node.js packages and launch the AI-enabled server:

    npm install
    npm start

The first command downloads required packages (primarily for the proxy server). The second starts the server, which listens for AI requests from the game. You should see output indicating the server is running on port 3001.

Step 5: Play the Game

Open your browser and navigate to:

    http://localhost:3001

You should see Space Invaders with "AI: ONLINE" displayed in the game HUD, indicating that AI features are active. Use arrow keys or A/D to move, SPACE to fire, and P to pause. The AI Commander will start providing commentary as you play!

Understanding the Project Architecture

Now that the game is running, let's explore how the different pieces fit together. Understanding this architecture will help you modify the game and apply these patterns to your own projects.

The project follows a clean separation of concerns, with each file handling a specific responsibility:

    Spaceinvaders-FoundryLocal/
    ├── index.html      # Main game page and UI structure
    ├── styles.css      # Retro arcade visual styling
    ├── game.js         # Core game logic and rendering
    ├── llm.js          # AI integration module
    ├── sound.js        # Web Audio API sound effects
    ├── server.js       # Node.js proxy for Foundry Local
    └── package.json    # Project configuration

- index.html: Defines the game canvas and UI elements. It's the entry point that loads all other modules.
- game.js: Contains the game loop, physics, collision detection, scoring, and rendering logic. This is the heart of the game.
- llm.js: Handles all communication with the AI backend. It formats game state into prompts and processes AI responses.
- server.js: A lightweight Express server that proxies requests between the browser and Foundry Local.
- sound.js: Synthesizes retro sound effects using the Web Audio API, so no audio files are needed!

How the AI Integration Works

The magic of the AI Commander happens through a simple but powerful pattern.
Let's trace the flow from gameplay event to AI response.

When something interesting happens in the game (you clear a wave, achieve a combo, or lose a life), the game logic in game.js triggers an AI request. This request includes context about the current game state: your score, accuracy percentage, current level, lives remaining, and what just happened.

The llm.js module formats this context into a prompt. For example, when you clear a wave with 85% accuracy, it might construct:

    You are an AI Commander in a Space Invaders game. The player just cleared wave 3 with 85% accuracy. Score: 12,500. Lives: 3. Provide a brief, enthusiastic comment (1-2 sentences).

This prompt travels to server.js, which forwards it to Foundry Local. The AI model processes the prompt and generates a response like: "Impressive accuracy, pilot! Wave 3 didn't stand a chance. Keep that trigger finger sharp!"

The response flows back through the server to the browser, where llm.js passes it to the game. The game displays the message in the HUD, creating the illusion of playing alongside an AI companion. This entire round trip typically completes in 50-200 milliseconds, fast enough to feel responsive without interrupting gameplay.

Using GitHub Copilot CLI to Explore and Modify the Code

GitHub Copilot CLI accelerates your development workflow by letting you ask questions and generate code directly in your terminal. Let's use it to understand and extend the Space Invaders project.

Installing Copilot CLI

If you haven't installed Copilot CLI yet, here's the quick setup:

    # Install GitHub CLI
    winget install GitHub.cli   # Windows
    brew install gh             # macOS

    # Authenticate with GitHub
    gh auth login

    # Add Copilot extension
    gh extension install github/gh-copilot

    # Verify installation
    gh copilot --help

With Copilot CLI ready, you can interact with AI directly from your terminal while working on the project.

Exploring Code with Copilot CLI

Use Copilot to understand unfamiliar code.
Navigate to the project directory and try:

    gh copilot explain "How does llm.js communicate with the server?"

Copilot analyzes the code and explains the communication pattern, helping you understand the architecture without reading every line manually.

You can also ask about specific functions:

    gh copilot explain "What does the generateEnemyTaunt function do?"

This accelerates onboarding to unfamiliar codebases, a valuable skill when working with open source projects or joining teams.

Generating New Features

Want to add a new AI feature? Ask Copilot to help generate the code:

    gh copilot suggest "Create a function that asks the AI to generate a mission briefing at the start of each level, including the level number and a random mission objective"

Copilot generates starter code that you can customize and integrate. This combination of AI-powered development tools and AI-integrated gameplay demonstrates how LLMs are transforming both how we build games and how games behave.

Customizing the AI Commander

The default AI Commander provides generic gaming commentary, but you can customize its personality and responses. Open llm.js to find the prompt templates that control AI behavior.

Changing the AI's Personality

The system prompt defines who the AI "is." Find the base prompt and modify it:

    // The three declarations below are alternatives: keep only one in llm.js,
    // because declaring the same const more than once is a syntax error.

    // Original
    const systemPrompt = "You are an AI Commander in a Space Invaders game.";

    // Customized - Drill Sergeant personality
    const systemPrompt = `You are Sergeant Blaster, a gruff but encouraging drill sergeant
    commanding space cadets. Use military terminology, call the player "cadet,"
    and be tough but fair.`;

    // Customized - Supportive Coach personality
    const systemPrompt = `You are Coach Nova, a supportive and enthusiastic gaming coach.
    Use encouraging language, celebrate small victories, and provide gentle
    guidance when players struggle.`;

These personality changes dramatically alter the game's feel without changing any gameplay code.
It's a powerful example of how AI can add variety to games with minimal development effort.

Adding New Commentary Triggers

Currently the AI responds to wave completions and game events. You can add new triggers in game.js:

    // Add AI commentary when player achieves a kill streak
    if (killStreak >= 5 && !streakCommentPending) {
      requestAIComment('killStreak', { count: killStreak });
      streakCommentPending = true;
    }

    // Add AI reaction when player narrowly avoids death
    if (nearMissOccurred) {
      requestAIComment('nearMiss', { livesRemaining: lives });
    }

Each new trigger point adds another opportunity for the AI to engage with the player, making the experience more dynamic and personalized.

Understanding the Game Features

Beyond AI integration, the Space Invaders project demonstrates solid game development patterns worth studying. Let's explore the key features.

Power-Up System

The game includes eight different power-ups, each with unique effects:

- SPREAD (Orange): Fires three projectiles in a spread pattern
- LASER (Red): Powerful beam with high damage
- RAPID (Yellow): Dramatically increased fire rate
- MISSILE (Purple): Homing projectiles that track enemies
- SHIELD (Blue): Grants an extra life
- EXTRA LIFE (Green): Grants two extra lives
- BOMB (Red): Destroys all enemies on screen
- BONUS (Gold): Random score bonus between 250-750 points

Power-ups demonstrate state management: tracking which power-up is active, applying its effects to player actions, and handling timeouts. Study the power-up code in game.js to understand how temporary state modifications work.

Leaderboard System

The game persists high scores using the browser's localStorage API:

    // Saving scores
    localStorage.setItem('spaceInvadersScores', JSON.stringify(scores));

    // Loading scores
    const savedScores = localStorage.getItem('spaceInvadersScores');
    const scores = savedScores ? JSON.parse(savedScores) : [];

This pattern works for any data you want to persist between sessions, such as game progress, user preferences, or accumulated statistics. It's a simple but powerful technique for web games.

Sound Synthesis

Rather than loading audio files, the game synthesizes retro sound effects using the Web Audio API in sound.js. This approach has several benefits: no external assets to load, smaller project size, and complete control over sound parameters.

Examine how oscillators and gain nodes combine to create laser sounds, explosions, and victory fanfares. This knowledge transfers directly to any web project requiring audio feedback.

Extending the Project: Ideas for Students

Ready to make the project your own? Here are ideas ranging from beginner-friendly to challenging, each teaching valuable skills.

Beginner: Customize Visual Theme

Modify styles.css to create a new visual theme. Try changing the color scheme from green to blue, or create a "sunset" theme with orange and purple gradients. This builds CSS skills while making the game feel fresh.

Intermediate: Add New Enemy Types

Create a new enemy class in game.js with different movement patterns. Perhaps enemies that move in sine waves, or boss enemies that take multiple hits. This teaches object-oriented programming and game physics.

Intermediate: Expand AI Interactions

Add new AI features like:

- Pre-game mission briefings that set up the story
- Dynamic difficulty hints when players struggle
- Post-game performance analysis and improvement suggestions
- AI-generated names for enemy waves

Advanced: Multiplayer Commentary

Modify the game for two-player support and have the AI provide play-by-play commentary comparing both players' performance. This combines game networking concepts with advanced AI prompting.

Advanced: Voice Integration

Use the Web Speech API to speak the AI Commander's responses aloud. This creates a more immersive experience and demonstrates browser speech synthesis capabilities.
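As a starting point for the voice integration idea, here is a minimal sketch built on the browser's standard `speechSynthesis` and `SpeechSynthesisUtterance` interfaces. The function name `speakCommentary`, its options object, and the default rate and pitch are assumptions made for illustration; the project itself may wire this up differently.

```javascript
// Sketch only: speak an AI Commander message aloud with the Web Speech API.
// speakCommentary and its options are hypothetical names, not part of the project.
function speakCommentary(text, options = {}) {
  // Allow the speech objects to be injected so the function can run outside a browser.
  const synth = options.synth ?? globalThis.speechSynthesis;
  const Utterance = options.Utterance ?? globalThis.SpeechSynthesisUtterance;

  // Do nothing where speech synthesis is unavailable or there is nothing to say.
  if (!synth || !Utterance || typeof text !== "string" || !text.trim()) {
    return false;
  }

  const utterance = new Utterance(text.trim());
  utterance.rate = options.rate ?? 1.1;   // slightly faster than the default of 1
  utterance.pitch = options.pitch ?? 0.9; // slightly lower than the default of 1

  // Drop anything still queued so the spoken line matches the latest game event.
  synth.cancel();
  synth.speak(utterance);
  return true;
}
```

In the game, a call such as `speakCommentary(message)` could sit next to the code that writes the message to the HUD. Cancelling queued speech first is a design choice for fast-paced play: an old taunt is less useful than the current one. Some browsers only allow speech after the player has interacted with the page, so the first line may need to wait for a key press.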
Troubleshooting Common Issues

If something isn't working, here are solutions to common problems.

"AI: OFFLINE" Displayed in Game

This means the game can't connect to the AI server. Check that:

- The server is running (npm start shows no errors)
- You're accessing the game via http://localhost:3001, not directly opening the HTML file
- Foundry Local is installed correctly (foundry --version works)

Server Won't Start

If npm start fails:

- Ensure you ran npm install first
- Check that port 3001 isn't already in use by another application
- Verify Node.js is installed (node --version)

AI Responses Are Slow

Local AI performance depends on your hardware. If responses feel sluggish:

- Close other resource-intensive applications
- Ensure your laptop is plugged in (battery mode may throttle CPU)
- Consider that first requests may be slower as the model loads

Key Takeaways

- Local AI enables real-time game features: Microsoft Foundry Local provides fast, free, private AI inference perfect for gaming applications
- Clean architecture matters: Separating game logic, AI integration, and server code makes projects maintainable and extensible
- AI personality is prompt-driven: Changing a few lines of prompt text completely transforms how the AI interacts with players
- Copilot CLI accelerates learning: Use it to explore unfamiliar code and generate new features quickly
- The patterns transfer everywhere: Skills from this project apply to chatbots, assistants, educational tools, and any AI-integrated application

Conclusion and Next Steps

You've now seen how to integrate AI capabilities into a browser-based game using Microsoft Foundry Local. The Space Invaders project demonstrates that modern AI features don't require cloud services or complex infrastructure; they can run entirely on your laptop, responding in milliseconds.

More importantly, you've learned patterns that extend far beyond gaming.
The architecture of sending context to an AI, receiving generated responses, and integrating them into user experiences applies to countless applications: customer support bots, educational tutors, creative writing tools, and accessibility features.

Your next step is experimentation. Clone the repository, modify the AI's personality, add new commentary triggers, or build an entirely new game using these patterns. The combination of GitHub Copilot CLI for development assistance and Foundry Local for runtime AI gives you powerful tools to bring intelligent applications to life.

Start playing, start coding, and discover what you can create when your games can think.

Resources

- Space Invaders - AI Commander Edition Repository - Full source code and documentation
- Play Space Invaders Online - Try the basic version without AI features
- Microsoft Foundry Local Documentation - Official installation and API guide
- GitHub Copilot CLI Documentation - Installation and usage guide
- GitHub Education - Free developer tools for students
- Web Audio API Documentation - Learn about browser sound synthesis
- Canvas API Documentation - Master HTML5 game rendering