ai foundry

31 Topics

From Concept to Code: Building Production-Ready Multi-Agent Systems with Microsoft Foundry
We have reached a critical inflection point in AI development. Within the Microsoft Foundry ecosystem, the core value proposition of "Agents" is shifting decisively—moving from passive content generation to active task execution and process automation. These are no longer just conversational interfaces. They are intelligent entities capable of connecting models, data, and tools to actively execute complex business logic. To support this evolution, Microsoft has introduced a powerful suite of capabilities: the Microsoft Agent Framework for sophisticated orchestration, the Agent V2 SDK, and integrated Microsoft Foundry VSCode Extensions. These innovations provide the tooling necessary to bridge the gap between theoretical research and secure, scalable enterprise landing. But how do you turn these separate components into a cohesive business solution? That is the challenge we address today. This post dives into the practical application of these tools, demonstrating how to connect the dots and transform complex multi-agent concepts into deployed reality. The Scenario: Recruitment through an "Agentic Lens" Let’s ground this theoretical discussion with a real-world scenario that perfectly models a multi-agent environment: The Recruitment Process. By examining recruitment through an agentic lens, we can identify distinct entities with specific mandates: The Recruiter Agent: Tasked with setting boundary conditions (job requirements) and preparing data retrieval mechanisms (interview questions). The Applicant Agent: Objective is to process incoming queries and synthesize the best possible output to meet the recruiter's acceptance criteria. Phase 1: Design Achieving Orchestration via Microsoft Foundry Workflows To bridge the gap between our scenario and technical reality, we start with Foundry Workflows. Workflows serves as the visual integration environment within Foundry. It allows you to build declarative pipelines that seamlessly combine deterministic business logic with the probabilistic nature of autonomous AI agents. By adopting this visual, low-code paradigm, you eliminate the need to write complex orchestration logic from scratch. Workflows empowers you to coordinate specialized agents intuitively, creating adaptive systems that solve complex business problems collaboratively. Visually Orchestrating the Cycle Microsoft Foundry provides an intuitive, web-based drag-and-drop interface. This canvas allows you to integrate specialized AI agents alongside standard procedural logic blocks, transforming abstract ideas into executable processes without writing extensive glue code. To translate our recruitment scenario into a functional workflow, we follow a structured approach: Agent Prerequisites: We pre-configure our specialized agents within Foundry. We create a Recruiter Agent (prompted to generate evaluation criteria) and an Applicant Agent (prompted to synthesize responses). Orchestrating the Interaction: We drag these nodes onto the board and define the data flow. The process begins with the Recruiter generating questions, piping that output directly as input for the Applicant agent. Adding Business Logic: A true workflow requires decision-making. We introduce control flow logic, such as IF/ELSE conditional blocks, to evaluate the recruiter's questions based on predefined criteria. This allows the workflow to branch dynamically—if satisfied, the candidate answers the questions; if not, the questions are regenerated. Alternative: YAML Configuration For developers who prefer a code-first approach or wish to rapidly replicate this logic across environments, Foundry allows you to export the underlying YAML. kind: workflow trigger: kind: OnConversationStart id: trigger_wf actions: - kind: SetVariable id: action-1763742724000 variable: Local.LatestMessage value: =UserMessage(System.LastMessageText) - kind: InvokeAzureAgent id: action-1763736666888 agent: name: HiringManager input: messages: =System.LastMessage output: autoSend: true messages: Local.LatestMessage - kind: Question variable: Local.Input id: action-1763737142539 entity: StringPrebuiltEntity skipQuestionMode: SkipOnFirstExecutionIfVariableHasValue prompt: Boss, can you confirm this ? - kind: ConditionGroup conditions: - condition: =Local.Input="Yes" actions: - kind: InvokeAzureAgent id: action-1763744279421 agent: name: ApplyAgent input: messages: =Local.LatestMessage output: autoSend: true messages: Local.LatestMessage - kind: EndConversation id: action-1763740066007 id: if-action-1763736954795-0 id: action-1763736954795 elseActions: - kind: GotoAction actionId: action-1763736666888 id: action-1763737425562 id: "" name: HRDemo description: "" Simulating the End-to-End Process Once constructed, Foundry provides a robust, built-in testing environment. You can trigger the workflow with sample input data to simulate the end-to-end cycle. This allows you to debug hand-offs and interactions in real-time before writing a single line of application code. Phase 2: Develop From Cloud Canvas to Local Code with VSCode Foundry Workflows excels at rapid prototyping. However, a visual UI is rarely sufficient for enterprise-grade production. The critical question becomes: How do we integrate these visual definitions into a rigorous Software Development Lifecycle (SDLC)? While the cloud portal is ideal for design, enterprise application delivery happens in the local IDE. The Microsoft Foundry VSCode Extension bridges this gap. This extension allows developers to: Sync: Pull down workflow definitions from the cloud to your local machine. Inspect: Review the underlying logic in your preferred environment. Scaffold: Rapidly generate the underlying code structures needed to run the flow. This accelerates the shift from "understanding" the flow to "implementing" it. Phase 3: Deploy Productionizing Intelligence with the Microsoft Agent Framework Once the multi-agent orchestration has been validated locally, the final step is transforming it into a shipping application. This is where the Microsoft Agent Framework shines as a runtime engine. It natively ingests the declarative Workflow definitions (YAML) exported from Foundry. This allows artifacts from the prototyping phase to be directly promoted to application deployment. By simply referencing the workflow configuration libraries, you can "hydrate" the entire multi-agent system with minimal boilerplate. Here is the code required to initialize and run the workflow within your application. Note - Check the source code https://github.com/microsoft/Agent-Framework-Samples/tree/main/09.Cases/MicrosoftFoundryWithAITKAndMAF Summary: The Journey from Conversation to Action Microsoft Foundry is more than just a toolbox; it is a comprehensive solution designed to bridge the chasm between theoretical AI research and secure, scalable enterprise applications. In this post, we walked through the three critical stages of modern AI development: Design (Low-Code): Leveraging Foundry Workflows to visually orchestrate specialized agents (Recruiter vs. Applicant) mixed with deterministic business rules. Develop (Local SDLC): Utilizing the VSCode Extension to break down the barriers between the cloud canvas and the local IDE, enabling seamless synchronization and debugging. Deploy (Native Runtime): Using the Microsoft Agent Framework to ingest declarative YAML, realizing the promise of "Configuration as Code" and eliminating tedious logic rewriting. By following this path, developers can move beyond simple content generation and build adaptive, multi-agent systems that drive real business value. Learning Resoures What's Microsoft Foundry (https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry?view=foundry) Work with Declarative (Low-code) Agent workflows in Visual Studio Code (preview) (https://learn.microsoft.com/azure/ai-foundry/agents/how-to/vs-code-agents-workflow-low-code?view=foundry) Microsoft Agent Framework(https://github.com/microsoft/agent-framework) Microsoft Foundry VSCode Extension(https://marketplace.visualstudio.com/items?itemName=TeamsDevApp.vscode-ai-foundry)
kinfey
Nov 25, 2025 Place Microsoft Developer Community Blog
7.4KViews
1like
0Comments
Azure Skilling at Microsoft Ignite 2025
The energy at Microsoft Ignite was unmistakable. Developers, architects, and technical decision-makers converged in San Francisco to explore the latest innovations in cloud technology, AI applications, and data platforms. Beyond the keynotes and product announcements was something even more valuable: an integrated skilling ecosystem designed to transform how you build with Azure. This year Azure Skilling at Microsoft Ignite 2025 brought together distinct learning experiences, over 150+ hands-on labs, and multiple pathways to industry-recognized credentials—all designed to help you master skills that matter most in today's AI-driven cloud landscape. Just Launched at Ignite Microsoft Ignite 2025 offered an exceptional array of learning opportunities, each designed to meet developers anywhere on the skilling journey. Whether you joined us in-person or on-demand in the virtual experience, multiple touchpoints are available to deepen your Azure expertise. Ignite 2025 is in the books, but you can still engage with the latest Microsoft skilling opportunities, including: The Azure Skills Challenge provides a gamified learning experience that lets you compete while completing task-based achievements across Azure's most critical technologies. These challenges aren't just about badges and bragging rights—they're carefully designed to help you advance technical skills and prepare for Microsoft role-based certifications. The competitive element adds urgency and motivation, turning learning into an engaging race against the clock and your peers. For those seeking structured guidance, Plans on Learn offer curated sets of content designed to help you achieve specific learning outcomes. These carefully assembled learning journeys include built-in milestones, progress tracking, and optional email reminders to keep you on track. Each plan represents 12-15 hours of focused learning, taking you from concept to capability in areas like AI application development, data platform modernization, or infrastructure optimization. The Microsoft Reactor Azure Skilling Series, running December 3-11, brings skilling to life through engaging video content, mixing regular programming with special Ignite-specific episodes. This series will deliver technical readiness and programming guidance in a livestream presentation that's more digestible than traditional documentation. Whether you're catching episodes live with interactive Q&A or watching on-demand later, you’ll get world-class instruction that makes complex topics approachable. Beyond Ignite: Your Continuous Learning Journey Here's the critical insight that separates Ignite attendees who transform their careers from those who simply collect swag: the real learning begins after the event ends. Microsoft Ignite is your launchpad, not your destination. Every module you start, every lab you complete, and every challenge you tackle connects to a comprehensive learning ecosystem on Microsoft Learn that's available 24/7, 365 days a year. Think of Ignite as your intensive immersion experience—the moment when you gain context, build momentum, and identify the skills that will have the biggest impact on your work. What you do in the weeks and months following determines whether that momentum compounds into career-defining expertise or dissipates into business as usual. For those targeting career advancement through formal credentials, Microsoft Certifications, Applied Skills and AI Skills Navigator, provide globally recognized validation of your expertise. Applied Skills focus on scenario-based competencies, demonstrating that you can build and deploy solutions, not simply answer theoretical questions. Certifications cover role-based scenarios for developers, data engineers, AI engineers, and solution architects. The assessment experiences include performance-based testing in dedicated Azure tenants where you complete real configuration and development tasks. And finally, the NEW AI Skills Navigator is an agentic learning space, bringing together AI-powered skilling experiences and credentials in a single, unified experience with Microsoft, LinkedIn Learning and GitHub – all in one spot Why This Matters: The Competitive Context The cloud skills race is intensifying. While our competitors offer robust training and content, Microsoft's differentiation comes not from having more content—though our 1.4 million module completions last fiscal year and 35,000+ certifications awarded speak to scale—but from integration of services to orchestrate workflows. Only Microsoft offers a truly unified ecosystem where GitHub Copilot accelerates your development, Azure AI services power your applications, and Azure platform services deploy and scale your solutions—all backed by integrated skilling content that teaches you to maximize this connected experience. When you continue your learning journey after Ignite, you're not just accumulating technical knowledge. You're developing fluency in an integrated development environment that no competitor can replicate. You're learning to leverage AI-powered development tools, cloud-native architectures, and enterprise-grade security in ways that compound each other's value. This unified expertise is what transforms individual developers into force-multipliers for their organizations. Start Now, Build Momentum, Never Stop Microsoft Ignite 2025 offered the chance to compress months of learning into days of intensive, hands-on experience, but you can still take part through the on-demand videos, the Global Ignite Skills Challenge, visiting the GitHub repos for the /Ignite25 labs, the Reactor Azure Skilling Series, and the curated Plans on Learn provide multiple entry points regardless of your current skill level or preferred learning style. But remember: the developers who extract the most value from Ignite are those who treat the event as the beginning, not the culmination, of their learning journey. They join hackathons, contribute to GitHub repositories, and engage with the Azure community on Discord and technical forums. The question isn't whether you'll learn something valuable from Microsoft Ignite 2025-that's guaranteed. The question is whether you'll convert that learning into sustained momentum that compounds over months and years into career-defining expertise. The ecosystem is here. The content is ready. Your skilling journey doesn't end when Ignite does—it accelerates.
AaronStark
Nov 26, 2025 Place Microsoft Developer Community Blog
3.2KViews
0likes
0Comments
AI Toolkit Extension Pack for Visual Studio Code: Ignite 2025 Update
Unlock the Latest Agentic App Capabilities The Ignite 2025 update delivers a major leap forward for the AI Toolkit extension pack in VS Code, introducing a unified, end-to-end environment for building, visualizing, and deploying agentic applications to Microsoft Foundry, and the addition of Anthropic’s frontier Claude models in the Model Catalog! This release enables developers to build and debug locally in VS Code, then deploy to the cloud with a single click. Seamlessly switch between VS Code and the Foundry portal for visualization, orchestration, and evaluation, creating a smooth roundtrip workflow that accelerates innovation and delivers a truly unified AI development experience. Download the http://aka.ms/aitoolkit today and start building next-generation agentic apps in VS Code! What Can You Do with the AI Toolkit Extension Pack? Access Anthropic models in the Model Catalog Following the Microsoft, NVIDIA and Anthropic strategic partnerships announcement today, we are excited to share that Anthropic’s frontier Claude models including Claude Sonnet 4.5, Claude Opus 4.1, and Claude Haiku 4.5, are now integrated into the AI Toolkit, providing even more choices and flexibility when building intelligent applications and AI agents. Build AI Agents Using GitHub Copilot Scaffold agent applications using best-practice patterns, tool-calling examples, tracing hooks, and test scaffolds, all powered by Copilot and aligned with the Microsoft Agent Framework. Generate agent code in Python or .NET, giving you flexibility to target your preferred runtime. Build and Customize YAML Workflows Design YAML-based workflows in the Foundry portal, then continue editing and testing directly in VS Code. To customize your YAML-based workflows, instantly convert it to Agent Framework code using GitHub Copilot. Upgrade from declarative design to code-first customization without starting from scratch. Visualize Multi-Agent Workflows Envision your code-based agent workflows with an interactive graph visualizer that reveals each component and how they connect Watch in real-time how each node lights up as you run your agent. Use the visualizer to understand and debug complex agent graphs, making iteration fast and intuitive. Experiment, Debug, and Evaluate Locally Use the Hosted Agents Playground to quickly interact with your agents on your development machine. Leverage local tracing support to debug reasoning steps, tool calls, and latency hotspots—so you can quickly diagnose and fix issues. Define metrics, tasks, and datasets for agent evaluation, then implement metrics using the Foundry Evaluation SDK and orchestrate evaluations runs with the help of Copilot. Seamless Integration Across Environments Jump from Foundry Portal to VS Code Web for a development environment in your preferred code editor setting. Open YAML workflows, playgrounds, and agent templates directly in VS Code for editing and deployment. How to Get Started Install the AI Toolkit extension pack from the VS Code marketplace. Check out documentation. Get started with building workflows with Microsoft Foundry in VS Code 1. Work with Hosted (Pro-code) Agent workflows in VS Code 2. Work with Declarative (Low-code) Agent workflows in VS Code Feedback & Support Try out the extensions and let us know what you think! File issues or feedback on our GitHub repo for Foundry extension and AI Toolkit extension. Your input helps us make continuous improvements.
leoyao
Nov 18, 2025 Place Microsoft Developer Community Blog
2.2KViews
4likes
0Comments
Serverless MCP Agent with LangChain.js v1 — Burgers, Tools, and Traces 🍔
AI agents that can actually do stuff (not just chat) are the fun part nowadays, but wiring them cleanly into real APIs, keeping things observable, and shipping them to the cloud can get... messy. So we built a fresh end‑to‑end sample to show how to do it right with the brand new LangChain.js v1 and Model Context Protocol (MCP). In case you missed it, MCP is a recent open standard that makes it easy for LLM agents to consume tools and APIs, and LangChain.js, a great framework for building GenAI apps and agents, has first-class support for it. You can quickly get up speed with the MCP for Beginners course and AI Agents for Beginners course. This new sample gives you: A LangChain.js v1 agent that streams its result, along reasoning + tool steps An MCP server exposing real tools (burger menu + ordering) from a business API A web interface with authentication, sessions history, and a debug panel (for developers) A production-ready multi-service architecture Serverless deployment on Azure in one command ( azd up ) Yes, it’s a burger ordering system. Who doesn't like burgers? Grab your favorite beverage ☕, and let’s dive in for a quick tour! TL;DR key takeaways New sample: full-stack Node.js AI agent using LangChain.js v1 + MCP tools Architecture: web app → agent API → MCP server → burger API Runs locally with a single npm start , deploys with azd up Uses streaming (NDJSON) with intermediate tool + LLM steps surfaced to the UI Ready to fork, extend, and plug into your own domain / tools What will you learn here? What this sample is about and its high-level architecture What LangChain.js v1 brings to the table for agents How to deploy and run the sample How MCP tools can expose real-world APIs Reference links for everything we use GitHub repo LangChain.js docs Model Context Protocol Azure Developer CLI MCP Inspector Use case You want an AI assistant that can take a natural language request like “Order two spicy burgers and show me my pending orders” and: Understand intent (query menu, then place order) Call the right MCP tools in sequence, calling in turn the necessary APIs Stream progress (LLM tokens + tool steps) Return a clean final answer Swap “burgers” for “inventory”, “bookings”, “support tickets”, or “IoT devices” and you’ve got a reusable pattern! Sample overview Before we play a bit with the sample, let's have a look at the main services implemented here: Service Role Tech Agent Web App ( agent-webapp ) Chat UI + streaming + session history Azure Static Web Apps, Lit web components Agent API ( agent-api ) LangChain.js v1 agent orchestration + auth + history Azure Functions, Node.js Burger MCP Server ( burger-mcp ) Exposes burger API as tools over MCP (Streamable HTTP + SSE) Azure Functions, Express, MCP SDK Burger API ( burger-api ) Business logic: burgers, toppings, orders lifecycle Azure Functions, Cosmos DB Here's a simplified view of how they interact: There are also other supporting components like databases and storage not shown here for clarity. For this quickstart we'll only interact with the Agent Web App and the Burger MCP Server, as they are the main stars of the show here. LangChain.js v1 agent features The recent release of LangChain.js v1 is a huge milestone for the JavaScript AI community! It marks a significant shift from experimental tools to a production-ready framework. The new version doubles down on what’s needed to build robust AI applications, with a strong focus on agents. This includes first-class support for streaming not just the final output, but also intermediate steps like tool calls and agent reasoning. This makes building transparent and interactive agent experiences (like the one in this sample) much more straightforward. Quickstart Requirements GitHub account Azure account (free signup, or if you're a student, get free credits here) Azure Developer CLI Deploy and run the sample We'll use GitHub Codespaces for a quick zero-install setup here, but if you prefer to run it locally, check the README. Click on the following link or open it in a new tab to launch a Codespace: Create Codespace This will open a VS Code environment in your browser with the repo already cloned and all the tools installed and ready to go. Provision and deploy to Azure Open a terminal and run these commands: # Install dependencies npm install # Login to Azure azd auth login # Provision and deploy all resources azd up Follow the prompts to select your Azure subscription and region. If you're unsure of which one to pick, choose East US 2 . The deployment will take about 15 minutes the first time, to create all the necessary resources (Functions, Static Web Apps, Cosmos DB, AI Models). If you're curious about what happens under the hood, you can take a look at the main.bicep file in the infra folder, which defines the infrastructure as code for this sample. Test the MCP server While the deployment is running, you can run the MCP server and API locally (even in Codespaces) to see how it works. Open another terminal and run: npm start This will start all services locally, including the Burger API and the MCP server, which will be available at http://localhost:3000/mcp . This may take a few seconds, wait until you see this message in the terminal: 🚀 All services ready 🚀 When these services are running without Azure resources provisioned, they will use in-memory data instead of Cosmos DB so you can experiment freely with the API and MCP server, though the agent won't be functional as it requires a LLM resource. MCP tools The MCP server exposes the following tools, which the agent can use to interact with the burger ordering system: Tool Name Description get_burgers Get a list of all burgers in the menu get_burger_by_id Get a specific burger by its ID get_toppings Get a list of all toppings in the menu get_topping_by_id Get a specific topping by its ID get_topping_categories Get a list of all topping categories get_orders Get a list of all orders in the system get_order_by_id Get a specific order by its ID place_order Place a new order with burgers (requires userId , optional nickname ) delete_order_by_id Cancel an order if it has not yet been started (status must be pending , requires userId ) You can test these tools using the MCP Inspector. Open another terminal and run: npx -y @modelcontextprotocol/inspector Then open the URL printed in the terminal in your browser and connect using these settings: Transport: Streamable HTTP URL: http://localhost:3000/mcp Connection Type: Via Proxy (should be default) Click on Connect, then try listing the tools first, and run get_burgers tool to get the menu info. Test the Agent Web App After the deployment is completed, you can run the command npm run env to print the URLs of the deployed services. Open the Agent Web App URL in your browser (it should look like https://<your-web-app>.azurestaticapps.net ). You'll first be greeted by an authentication page, you can sign in either with your GitHub or Microsoft account and then you should be able to access the chat interface. From there, you can start asking any question or use one of the suggested prompts, for example try asking: Recommend me an extra spicy burger . As the agent processes your request, you'll see the response streaming in real-time, along with the intermediate steps and tool calls. Once the response is complete, you can also unfold the debug panel to see the full reasoning chain and the tools that were invoked: Tip: Our agent service also sends detailed tracing data using OpenTelemetry. You can explore these either in Azure Monitor for the deployed service, or locally using an OpenTelemetry collector. We'll cover this in more detail in a future post. Wrap it up Congratulations, you just finished spinning up a full-stack serverless AI agent using LangChain.js v1, MCP tools, and Azure’s serverless platform. Now it's your turn to dive in the code and extend it for your use cases! 😎 And don't forget to azd down once you're done to avoid any unwanted costs. Going further This was just a quick introduction to this sample, and you can expect more in-depth posts and tutorials soon. Since we're in the era of AI agents, we've also made sure that this sample can be explored and extended easily with code agents like GitHub Copilot. We even built a custom chat mode to help you discover and understand the codebase faster! Check out the Copilot setup guide in the repo to get started. You can quickly get up speed with the MCP for Beginners course and AI Agents for Beginners course. If you like this sample, don't forget to star the repo ⭐️! You can also join us in the Azure AI community Discord to chat and ask any questions. Happy coding and burger ordering! 🍔
sinedied
Oct 22, 2025 Place Microsoft Developer Community Blog
2.2KViews
0likes
1Comment
Introducing the Microsoft Agent Framework
Introducing the Microsoft Agent Framework: A Unified Foundation for AI Agents and Workflows The landscape of AI development is evolving rapidly, and Microsoft is at the forefront with the release of the Microsoft Agent Framework an open-source SDK designed to empower developers to build intelligent, multi-agent systems with ease and precision. Whether you're working in .NET or Python, this framework offers a unified, extensible foundation that merges the best of Semantic Kernel and AutoGen, while introducing powerful new capabilities for agent orchestration and workflow design. Introducing Microsoft Agent Framework: The Open-Source Engine for Agentic AI Apps | Azure AI Foundry Blog Introducing Microsoft Agent Framework | Microsoft Azure Blog Why Another Agent Framework? Both Semantic Kernel and AutoGen have pioneered agentic development, Semantic Kernel with its enterprise-grade features and AutoGen with its research-driven abstractions. The Microsoft Agent Framework is the next generation of both, built by the same teams to unify their strengths: AutoGen’s simplicity in multi-agent orchestration. Semantic Kernel’s robustness in thread-based state management, telemetry, and type safety. New capabilities like graph-based workflows, checkpointing, and human-in-the-loop support This convergence means developers no longer have to choose between experimentation and production. The Agent Framework is designed to scale from single-agent prototypes to complex, enterprise-ready systems Core Capabilities AI Agents AI agents are autonomous entities powered by LLMs that can process user inputs, make decisions, call tools and MCP servers, and generate responses. They support providers like Azure OpenAI, OpenAI, and Azure AI, and can be enhanced with: Agent threads for state management. Context providers for memory. Middleware for action interception. MCP clients for tool integration Use cases include customer support, education, code generation, research assistance, and more—especially where tasks are dynamic and underspecified. Workflows Workflows are graph-based orchestrations that connect multiple agents and functions to perform complex, multi-step tasks. They support: Type-based routing Conditional logic Checkpointing Human-in-the-loop interactions Multi-agent orchestration patterns (sequential, concurrent, hand-off, Magentic) Workflows are ideal for structured, long-running processes that require reliability and modularity. Developer Experience The Agent Framework is designed to be intuitive and powerful: Installation: Python: pip install agent-framework .NET: dotnet add package Microsoft.Agents.AI Integration: Works with Foundry SDK, MCP SDK, A2A SDK, and M365 Copilot Agents Samples and Manifests: Explore declarative agent manifests and code samples Learning Resources: Microsoft Learn modules AI Agents for Beginners AI Show demos Azure AI Foundry Discord community Migration and Compatibility If you're currently using Semantic Kernel or AutoGen, migration guides are available to help you transition smoothly. The framework is designed to be backward-compatible where possible, and future updates will continue to support community contributions via the GitHub repository. Important Considerations The Agent Framework is in public preview. Feedback and issues are welcome on the GitHub repository. When integrating with third-party servers or agents, review data sharing practices and compliance boundaries carefully. The Microsoft Agent Framework marks a pivotal moment in AI development, bringing together research innovation and enterprise readiness into a single, open-source foundation. Whether you're building your first agent or orchestrating a fleet of them, this framework gives you the tools to do it safely, scalably, and intelligently. Ready to get started? Download the SDK, explore the documentation, and join the community shaping the future of AI agents.
Lee_Stott
Oct 01, 2025 Place Microsoft Developer Community Blog
1.7KViews
0likes
1Comment
GPT-5 Family of Models & GPT OSS Are Now Available in AI Toolkit for VS Code
AI Toolkit v0.18.3 is here—and it’s a major milestone. This release introduces full support for: The latest GPT-5 family of models OpenAI’s open-source models (GPT OSS) via Azure AI Foundry and ONNX Runtime Whether you're building in the cloud, on the edge, or experimenting locally, this update gives you more flexibility, performance, and choice than ever.
junjieli
Aug 10, 2025 Place Microsoft Developer Community Blog
1.7KViews
1like
0Comments
LangChain v1 is now generally available!
Today LangChain v1 officially launches and marks a new era for the popular AI agent library. The new version ushers in a more streamlined, and extensible foundation for building agentic LLM applications. In this post we'll breakdown what’s new, what changed, and what “general availability” means in practice. Join Microsoft Developer Advocates, Marlene Mhangami and Yohan Lasorsa, to see live demos of the new API and find out more about what JavaScript and Python developers need to know about v1. Register for this event here. Why v1? The Motivation Behind the Redesign The number of abstractions in LangChain had grown over the years to include chains, agents, tools, wrappers, prompt helpers and more, which, while powerful, introduced complexity and fragmentation. As model APIs evolve (multimodal inputs, richer structured output, tool-calling semantics), LangChain needed a cleaner, more consistent core to ensure production ready stability. In v1: All existing chains and agent abstractions in the old LangChain are deprecated; they are replaced by a single high-level agent abstraction built on LangGraph internals. LangGraph becomes the foundational runtime for durable, stateful, orchestrated execution. LangChain now emphasizes being the “fast path to agents” that doesn’t hide but builds upon LangGraph. The internal message format has been upgraded to support standard content blocks (e.g. text, reasoning, citations, tool calls) across model providers, decoupling “content” from raw strings. Namespace cleanup: the langchain package now focuses tightly on core abstractions (agents, models, messages, tools), while legacy patterns are moved into langchain-classic (or equivalents). What’s New & Noteworthy for Developers Here are key changes developers should pay attention to: 1. create_agent becomes the default API The create_agent function is now the idiomatic way to spin up agents in v1. It replaces older constructs (e.g. create_react_agent) with a clearer, more modular API. You can also now compose middleware around model calls, tool calls, before/after hooks, error handling, etc. 2. Standard content blocks & normalized message model One of LangChain's greatest stregnth's is it's model agnosticism. Content blocks move to standardize all outputs, so developers know exactly what to expect regardless of the model they are using. Responses from models are no longer opaque strings. Instead, they carry structured `content_blocks` which classify parts of the output (e.g. “text”, “reasoning”, “citation”, “tool_call”). 3. Multimodal and richer model inputs / outputs LangChain continues to support more than just text-based interactions, but in a more comprehensive way in v1. Models can accept and return files, images, video, etc., and the message format reflects this flexibility. This upgrade prepares us well for the next generation of models with mixed modalities (vision, audio, etc.). 4. Middleware hooks Because create_agent is designed as a pluggable pipeline, developers can now inject logic before/after model calls, before tool calls and more. New middleware such as 'human in the loop' and 'summarization' middleware have been added. This is a feature of the new package that I am most excited about it! Even with the simplified agents API, this option provides more room to customize workflows! Developers can try pre-built middleware or make their own. 5. Simplified, leaner namespace Many formerly top-level modules or helper classes have been removed or relocated to langchain-classic (or similarly stamped “legacy”) to declutter the main API surface. A migration guide is available to help projects transition from v0 to v1. While v1 is now the main line, older v0 is still documented and maintained for compatibility. What “General Availability” Means (and Doesn’t) v1 is production-ready, after testing the alpha version. The stable v0 release line remains supported for those unwilling or unable to migrate immediately. Breaking changes in public APIs will be accompanied by version bumps (i.e. minor version increments) and deprecation notices. The roadmap anticipates minor versions every 2–3 months (with patch releases more frequently). Because the field of LLM applications is evolving rapidly, the team expects continued iterations in v1—even in GA mode—with users encouraged to surface feedback, file issues, and adopt the migration path. (This is in line with the philosophy stated in docs.) Developer Callouts & Suggested Steps Some things we recommend for developers to do to get started with v1: Try the new API Now! LangChain Azure AI and Azure OpenAI have migrated to LangChain v1 and are ready to test! Learn more about using LangChain and Azure AI: Python: https://docs.langchain.com/oss/python/integrations/providers/azure_ai JavaScript: https://docs.langchain.com/oss/javascript/integrations/providers/microsoft Join us for a Live Stream on Wednesday 22 October 2025 Join Microsoft Developer Advocates Marlene Mhangami and Yohan Lasorsa for a livestream this Wednesday to see live demos and find out more about what JavaScript and Python developers need to know about v1. Register for this event here.
mmhangami
Oct 17, 2025 Place Microsoft Developer Community Blog
1.4KViews
0likes
0Comments
Build Multi‑Agent AI Systems with Microsoft
Like many of you I have been on a journey to build AI systems where multiple agents (AI models with tools and autonomy) collaborate to solve complex tasks. In this post, I want to share the engineering challenges we faced, the architecture we designed with Azure AI Foundry, and the lessons learned along the way. Our goal is to empower AI engineers and developers to leverage multi-agent systems for real-world applications, with the benefit of Microsoft’s tools, research insights, and enterprise-grade platform. Why Multi‑Agent Systems? The Need for AI Teamwork Building a single AI agent to perform a task is often straightforward. However, many real-world processes are too complex for one agent alone. Tasks like in-depth research, enterprise workflow automation, or multi-step customer service involve context switching and specialized knowledge that overwhelm a lone chatbot. Multi-agent systems address this by distributing work across specialized agents while maintaining coordination. This approach brings several advantages: Scalability: Workloads can be split among agents, enabling horizontal scaling as tasks or data increase. More agents can handle more subtasks in parallel, avoiding bottlenecks. Specialisation: Each agent can be fine-tuned for a specific role or domain (e.g. research, summarisation, data extraction), which improves performance and maintainability. No single model has to be a master of all trades. Flexibility: Modular agents can be reused in different workflows or recombined to create new capabilities. It’s easy to extend the system by adding or swapping an agent without redesigning everything. Robustness: If one agent fails or underperforms, others can pick up the slack. Decoupling tasks means the overall system can tolerate faults better than a monolithic agent. This mirrors how human teams work: we achieve more by dividing and conquering complex problems. In fact, internal experiments and industry reports have shown that groups of AI agents can significantly outperform a single powerful model on complex, open-ended tasks. For example, Anthropic found a multi-agent system (Claude agents working together) answered 90% more queries correctly than a single-agent approach in one evaluation. The ability to operate in parallel is key – our experience likewise showed that multiple agents exploring different aspects of a problem can cover far more ground, albeit with increased resource usage. Challenge: A downside of multi-agent setups is they consume more resources (more model calls, more tokens) than single-agent runs. In Anthropic’s research, multi-agent systems used ~15× the tokens of a single chat session. We’ve observed similarly that letting agents think and interact in depth pays off in better results, but at a cost. Ensuring the task’s value justifies the cost is important when choosing a multi-agent solution. Designing the Architecture: Orchestration via a Lead Agent To harness these benefits, we designed a multi-agent architecture built around an orchestrator-worker pattern – very similar to Anthropic’s “lead agent and subagents” approach. In Azure AI Foundry (our enterprise AI platform), this takes shape as Connected Agents: a mechanism where a main agent can spawn and coordinate child agents to handle sub-tasks. The main agent is the brain of the operation, responsible for understanding the user’s request, breaking it into parts, and delegating those parts to the appropriate specialist agents. Each agent in the system is defined with three core components: Instructions (prompt/policy): defining the agent’s goal, role, and constraints (its “game plan”). Model: an LLM that powers the agent’s reasoning and dialogue (e.g. GPT-4 or other models available in Foundry). Tools: external capabilities the agent can invoke to get information or take actions (e.g. web search, databases, APIs). By composing agents with different instructions and tools, we create a team where each agent has a clear role. The main agent’s role is orchestration; the sub-agents focus on specific tasks. This separation of concerns makes the system easier to understand and debug, and prevents any single context window from becoming overloaded. How it works (overview): When a user query comes in, the lead agent analyzes the request and devises a plan. It may decide that multiple pieces of information or steps are needed. The lead agent then spins up subordinate agents in parallel to gather or compute those pieces]. Each sub-agent operates with its own context window and tools, exploring one aspect of the task. They report their findings back to the lead agent, which integrates the results and decides if more exploration is required. The loop continues until the lead agent is satisfied that it can produce a final answer, at which point it consolidates everything and returns the result to the user. This orchestrator/sub-agent pattern is powerful because it lets complex tasks be solved through natural language delegation rather than hard-coded logic. Notably, the main agent doesn’t need an if/else tree written by us to decide which sub-agent handles what; it uses the language model’s reasoning to route tasks. In Azure AI Foundry’s Connected Agents, the primary agent simply says (in effect) “You, Agent A, do X; You, Agent B, do Y,” and the platform handles the rest—no custom orchestration code needed. This drastically simplified our development: we focus on crafting the right prompts and agent designs, and let the AI figure out the coordination. Example: Sales Assistant with Specialist Agents To make this concrete, imagine a Sales Preparation Assistant that helps a sales team research a client before a meeting. Instead of trying to cram all knowledge and skills into one model, we give the assistant a team of four sub-agents, each an expert in a different area. The main agent (“Sales Assistant”) will ask each specialist for input and then compile a briefing. Agent Role Purpose & Task Example Tools/Models Used Market Research Agent Gathers industry trends and news related to the client’s sector. Bing Web Search, internal news API Competitive Analysis Agent Finds information on the client’s competitors and market position. Web Search, Company DB Customer Insights Agent Summarises the client’s history and interactions (from CRM data). Azure Cognitive Search on CRM, GPT-4 Financial Analysis Agent Reviews the client’s financial data and recent performance. Finance database query tool, Excel APIs Main Sales Assistant Orchestrator that delegates to the above agents, then synthesises a final report for the sales team. GPT-4 (with instructions to compile and format results) In this scenario, the Main Sales Assistant agent would ask each sub-agent to report on their specialty (market news, competition, CRM insights, finances). Rather than one AI trying to do it all (and possibly missing nuances), we have focused mini-AIs each doing a thorough job in parallel. This approach was shown to reduce the overall time required and improve the quality of the final output. In early trials, such multi-agent setups often succeed where single agents fall short – for instance, finding all relevant facts across disparate sources and preparing a comprehensive briefing more quickly. Development is easier too: if tomorrow we need to add a “Regulatory Compliance Agent” for a new client requirement, we can plug it in without retraining or heavily modifying the others. Orchestration under the hood: Azure AI Foundry’s Agent Service provides the runtime that makes all this work reliably. It manages the message passing between the main agent and sub-agents, ensures each tool invocation is executed (with retries on failure), and keeps a structured log of the entire multi-agent conversation (we call it a thread). This means developers don’t have to manually implement how agents call each other or share data; the platform handles those mechanics. Foundry also supports true agent-to-agent messaging if agents need to talk directly, but often a hierarchical pattern (through the main agent) suffices for task delegation. Tools, Knowledge, and the Model Context Protocol (MCP) For agents to be effective, especially in enterprise scenarios, they must integrate with external knowledge sources and services – no single LLM knows everything or can perform all actions. Microsoft’s approach emphasizes a rich tool integration layer. In our system, tools range from web search and databases to APIs for taking real actions (sending emails, executing workflows, etc.) Equipping agents with the right tools extends their capabilities dramatically: an agent can retrieve up-to-date info, pull data behind corporate firewalls, or trigger business processes. One key innovation is the Model-Context Protocol (MCP), which Foundry uses to manage tools. MCP provides a structured way for agents to discover and use tools dynamically at runtime. Traditionally, if you wanted your AI agent to use a new tool, you might have to hard-code that tool’s API and update the agent’s code or prompt. With MCP, tools are defined on a central tool server (with descriptions and endpoints), and agents can query this server to see what tools are available. The agent’s SDK then generates the necessary code “stubs” to call the tool on the fly. This means: Easier maintenance: You can add, update, or remove tools in one place (the MCP registry) without changing the agent’s code. When the Finance database API updates, just update its MCP entry; all agents automatically get the new version next time they run. Dynamic adaptability: Agents can choose tools based on context. For example, a research agent might discover that a new MarketAnalysisAPI tool is available and start using it for a finance query, whereas previously it only had a generic web search. Separation of concerns: Those building AI agents can rely on domain experts to maintain the tool definitions, while they focus on the agent logic. Agents treat tools in a uniform way, as functions they can call. In practice, tool selection became a critical part of our agent design. A lesson we learned is that giving agents access to the right tool, with a clear description, can make or break their performance. If a tool’s description is vague or overlapping with another, the agent might choose the wrong approach and wander down a blind alley. For instance, we saw cases where an agent would stubbornly query an internal knowledge base for information that actually only existed on the web, simply because the tool prompt made the web search sound less relevant. We addressed this by carefully curating tool descriptions and even building an internal tool-testing agent that automatically tries out tools and suggests better descriptions for them. Ensuring each tool had a distinct purpose and clear usage guidance dramatically improved our agents’ success rate in choosing the optimal tool for a given job. Finally, multi-modal support is worth noting. Some tasks involve not just text, but images or other media. Our multi-agent architecture, especially with Azure AI Foundry, can incorporate vision-capable models as agents or tools. For example, an “Image Analysis Agent” could be part of a team, or an agent might call a vision API tool. The Telco customer service demo (using Foundry + OpenAI Agent SDK) featured an agent that could handle image uploads (like an ID document) by invoking an image-processing function. The orchestration framework doesn’t fundamentally change with multi-modality, it simply treats the vision model as another specialist agent or tool in the conversation. The ability to plug in different AI skills (text, vision, search, etc.) under a unified agent system is a big advantage of Microsoft’s approach: the agent team becomes cross-functional, each member with their own modality or expertise, collectively solving richer tasks than any single foundation model could. Reliability, Safety, and Enterprise-Grade Engineering While the basic idea of agents chatting and calling tools is elegant, productionizing this system for enterprise use brought serious engineering challenges. We needed our multi-agent system to be reliable, controllable, and secure. Here are the key areas we focused on and how we addressed them: Observability and Debugging Multi-agent chains can be complex and non-deterministic each run might involve different paths as agents make choices. Early on, we realized that treating the system as a black box was untenable. Developers and operators must be able to observe what each agent is “thinking” and doing, or else diagnosing issues would be impossible. Azure AI Foundry’s Agent Service was built with full conversation traceability in mind. Every message between agents (and to the user), every tool invocation and result, is captured in a structured thread log. We integrated this with Azure Application Insights telemetry, so one can monitor performance, latencies, errors, and even token consumption of agents in real time. This tracing proved invaluable. For example, when a complex workflow wasn’t producing the expected outcome, we could replay the entire agent conversation step by step to see where things went awry. In one instance, we found that two sub-agents were given slightly overlapping responsibilities, causing them to waste time retrieving nearly identical information. The logs and message transcripts made this immediately clear, guiding us to tighten the role definitions. Moreover, because the system logs are structured (not just free-form text), we could build automatic analysis tools like checking how often an agent hits a retry or how many cycles a conversation goes through before completion – to spot anomalies. This kind of observability was something the open-source community also highlighted as crucial; in fact, Sematic Kernel, AutoGen frameworks introduce metrics tracking and message tracing for exactly this reason. We also developed visual debugging tools. One example is the AutoGen Studio (a low-code interface from Microsoft Research) which allows developers to visually inspect agent interactions in real time, pause agents, or adjust their behavior on the fly. This interactive approach accelerates the prompt-engineering loop: one can watch agents argue or collaborate live, and intervene if needed. Such capabilities turned out to be vital for understanding emergent behaviors in multi-agent setups. Coordination Complexity and State Management As more agents come into play, keeping them coordinated and preserving shared context is hard. Early versions of agents would sometimes spawn excessive numbers of agents or get stuck in loops. For instance, one of our prototypes (before we applied strict limits) ended up in a degenerate state where two agents kept handing control back and forth without making progress. This taught us to implement guardrails and smarter orchestration policies. In Azure AI Foundry, beyond the simple connected-agent pattern, we introduced a more structured orchestration capability called Multi-Agent Workflows. This lets developers explicitly define states, transitions, and triggers in a workflow that involves multiple agents. It’s like flowcharting the high-level process that the agents should follow, including how they pass data around. We use this for long-running or highly critical processes where you want extra determinism for example, an onboarding workflow might have clearly defined phases (Verification → Provisioning → Notification) each handled by different agents, and you want to ensure the process doesn’t derail. The workflow engine enforces that the system moves to the next state only when all agents in the current state have completed and certain conditions (triggers) are met. It also provides persistence: if the process needs to wait (say, for an external event or simply because it’s a lengthy task), the state is saved and can be resumed later without losing context. These workflow features were a response to reliability needs, they give fine-grained control and error recovery in multi-agent systems that operate over extended periods. In practice, we learned to use the simpler Connected Agents approach for quick, on-the-fly delegations (it’s amazingly capable with minimal setup), and reserve Workflow Orchestration for scenarios where we must guarantee a robust sequence over minutes, hours, or days. By having both options, we can strike a balance between flexibility and control as needed. Trust, Safety, and Governance When you let AI agents act autonomously (especially if they can use tools that modify data or interact with the real world), safety is paramount. From day one, our design included enterprise-grade safety measures: Content Filtering and Policy Enforcement: All AI outputs go through content filters to catch disallowed content or potential prompt injection attacks. The Foundry Agent Service has integrated guardrails so that even if an agent tries something risky (e.g., a tool returns a sensitive info that should not be shown), policies can prevent misuse or leakage. For example, we configured financial analysis agents with rules not to output certain PII or to stop if they detect a regulatory compliance issue, handing off to a human instead. Identity and Access Control: Agents operate with identities managed via Microsoft Entra ID (Azure AD). This means every action an agent takes can be attributed and audited. Role-Based Access Control (RBAC) is enforced: an agent only has access to the data and APIs its role permits. If an agent’s credentials are compromised or misused, Azure’s standard auditing can alert us. Essentially, agents are first-class service principals in our cloud stack. Network Isolation and Compliance: For enterprise deployments, Azure AI Foundry allows agents to run in isolated networks (so they can’t arbitrarily call external services unless allowed) and to use customer-managed storage and search indices. This addresses the data governance aspect, we can ensure an agent looking up internal documents only sees what it’s supposed to, and all data stays within compliant boundaries. Auditability: As mentioned earlier, every decision an agent makes (every tool it calls, every answer it gives) is recorded. This is crucial for trust, if a multi-agent system is making business decisions, we need to be able to explain and justify those decisions later. By retaining the full reasoning trace and sources, we make the system’s work transparent and auditable. In fact, our “Deep Research” agents output not just answers but also a log of how they arrived at that answer, including citations to source material for each claim. This level of detail is a must-have in regulated industries or any high-stakes use case. Overall, baking in trust and safety by design was a non-negotiable requirement. It does introduce some overhead – e.g., being strict about content filtering can sometimes stop an agent from a creative solution until we refine its prompt or the filter thresholds, but it’s worth it for the confidence it gives to deploy these agents at scale. Performance and Cost Considerations We touched on the resource cost of multi-agent systems. Another challenge was ensuring the system runs efficiently. Without care, adding agents can linearly increase cost and latency. We mitigated this in a few ways: Parallelism: We make agents run concurrently wherever possible. Our lead agents typically fire off multiple sub-agents at once rather than sequentially waiting for one then starting the next. Also, our agents themselves can issue parallel tool calls. In fact, we enabled some of our retrieval agents to batch multiple search queries and send them all at the same time. Anthropic reported that this kind of parallelism cut their research task times by up to 90%, and we’ve observed similar dramatic speed-ups. By doing in 1 minute what a single agent might take 10 minutes to do step-by-step, we make the approach far more practical. Of course, the flip side is hitting many APIs and LLM endpoints concurrently can spike usage costs; we carefully monitor usage and recommend multi-agent mode only when needed for the problem complexity. Scaling rules and agent limits: One lesson learned was to prevent “agent sprawl.” We devised guidelines (and encoded some in prompts) about how many sub-agents to use for a given task complexity. For simple fact queries, the main agent is encouraged to handle it alone or with at most one helper; for moderately complex tasks, maybe spin up 2–3; only truly complex projects get a dozen specialists. This avoids the situation where an overzealous orchestrator might launch an army of agents and overkill the problem. These limits were informed by experimentation and echo the principle of scaling effort to the problem size. Model selection: Multi-agent systems don’t always need the largest model for every agent. We often use a mix of model sizes to optimize cost. For instance, a straightforward data extraction agent might be powered by a cheaper GPT-3.5, while the synthesis agent uses GPT-4 for the final answer quality. Foundry makes it easy to deploy a range of model endpoints (including open-source Llama-based models) and each agent can pick the one best suited. We learned that using an expensive model for a simple sub-task is wasteful; a smaller model with the right tools can do the job just as well. This mix-and-match approach helped keep our compute costs in check without sacrificing outcome quality. Lessons Learned and Best Practices Building these multi-agent systems was an iterative learning process. Here are some of the key lessons and best practices that emerged, which we believe will be useful to anyone developing their own: Let’s expand on a couple of these points: Prompt engineering for multi-agent is different: We quickly discovered that writing prompts for a team of agents is an order of magnitude more complex than for a single chatbot. Not only do you have to get each agent’s behavior right, you must shape how they interact. One principle that served us well was: “Think like your agents.” When debugging, we’d often step through the conversation from each agent’s perspective, almost role-playing as them, to see why they might be doing something silly. If an agent was repeating another’s results, maybe our instructions were too vague and they didn’t realise that sub-task was already covered. The fix would be to clarify the division of labour in the lead agent’s prompt or introduce an ordering (e.g., Agent B only runs after Agent A’s info is in, etc.). Another principle: teach the orchestrator to delegate effectively. The main agent’s prompt now includes explicit guidance on how to break down tasks and how to phrase sub-agent assignments with plenty of detail. We learned that if the lead just says “Research topic X” to two different agents, they might both do the same thing. Now, the lead agent provides distinct objectives and context to each sub-agent (e.g., focus one on recent news, another on historical data, etc.). This reduced redundancy and missed coverage dramatically. Let the AI help improve itself: One delightful surprise was that large models can be quite good at analyzing and refining their own strategies when asked. We sometimes gave an agent a chance to critique its output or plan, essentially a self-reflection step. In other cases we had a “judge” agent evaluate the final answers against criteria (accuracy, completeness, etc.) These evaluations not only gave us a score for benchmarking changes, but the judge’s feedback (being an LLM) often highlighted exactly where an agent went off track or missed something. In a sense, we used one AI to tell us how to make another AI better. This kind of meta-prompting and self-correction became a powerful tool in our development cycle, allowing faster iteration without full human-in-the-loop at every turn. Know when to simplify: Not every problem needs a fleet of agents. A big lesson was to use the simplest approach that works. If a single agent with a smart prompt can handle a task reliably, that’s fine! We reserved multi-agent mode for when there was clear added value e.g., problems requiring parallel exploration, different expertise, or lengthy reasoning that benefits from splitting into parts. This discipline kept our systems leaner and easier to maintain. It also helped us explain the value to stakeholders: we could justify the complexity by pointing to concrete gains (like a task that went from 2 hours by a single high-end model to 10 minutes by a team of agents with better results). Conclusion and Next Steps Multi-agent AI systems have moved from intriguing research demos to practical, production-ready solutions. Our journey involved close collaboration between teams such as those who built open-source frameworks like AutoGen to experiment with multi-agent interactions) and the Azure AI product teams (who turned these concepts into the robust Azure AI Foundry Agent Service). Along the way, we learned how to orchestrate LLMs at scale, how to keep them in check, and how to squeeze the most value out of agent collaboration. Today, Azure AI Foundry’s Agents platform provides a unified environment to develop, test, and deploy multi-agent systems, complete with the orchestration, observability, and safety features to make them enterprise-ready. The public preview of features like Connected Agents and Deep Research (which is essentially an advanced research agent that uses the web + analysis in a multi-step process) is already enabling customers to build “AI teams” that tackle complex workflows. This is just the beginning. We’re continuing to improve the platform with feedback from developers: upcoming releases will further tighten integration with the broader Azure ecosystem (for example, more seamless use of Azure Cognitive Search, Excel as a tool, etc.), expand the library of pre-built agent templates in the Agent Catalog (so you can start with a solid example for common scenarios), and introduce more advanced coordination patterns inspired by real-world use cases. If you’re an AI engineer or developer eager to explore multi-agent systems, now is a great time to dive in. Here are some resources to get you started: Microsoft AI Agents for Beginners - Learn all about AI Agents with this FREE curricula Azure AI Foundry Documentation – Learn more about the Agent Service and how to configure agents, tools, and workflows. Microsoft Learn Modules – step-by-step tutorial to build a connected multi-agent solution (for example, a ticket triage system) using Azure AI Foundry Agent Service. This will walk you through setting up agents and using the SDK. Microsoft MCP for Beginners: Integrating MCP Tools – Another tutorial focused on the Model Context Protocol, showing how to enable dynamic tool discovery for your agents. Azure AI Foundry Agent Catalog – Browse a growing collection of open-sourced agent examples contributed by Microsoft and partners, covering scenarios from content compliance to manufacturing optimization. These samples are great starting points to see how multi-agent code is structured in real projects. Multi-agent systems represent a significant shift in how we conceptualise AI solutions: from single brilliant assistants to teams of specialised agents working in concert. The engineering journey hasn’t been easy we navigated challenges in coordination, built new tooling for control, and refined prompts endlessly. But the end result is a new class of AI applications that are more powerful, resilient, and tunable. We hope the insights shared here help you in your own journey to build with AI agents. We’re excited to see what you will create with these technologies. As we continue to push the frontier of agentic AI (both in research and in Azure), one thing is clear: many minds – human or AI – are often better than one. Happy building! Userful References Introducing Multi-Agent Orchestration in Foundry Agent Service – Build ... Building a multimodal, multi-agent system using Azure AI Agent Service ... How we built our multi-agent research system \ Anthropic What is Azure AI Foundry Agent Service? - Azure AI Foundry Multi-Agent Systems and MCP Tools Integration with Azure AI Foundry ... Introducing Deep Research in Azure AI Foundry Agent Service AutoGen v0.4: Advancing the development of agentic AI systems
Lee_Stott
Sep 18, 2025 Place Microsoft Developer Community Blog
1.2KViews
1like
0Comments
Practical ways to use AI in your Data Science and ML journey
Are you ready to take your learning from “just reading” to “actively doing”? We are launching a series of interactive events designed to help you bridge the gap between theory and practice, connect with peers, and learn directly from experts, all in a vibrant, supportive Discord community. The series will guide you on practical ways to use AI in your learning journey with Data Science and Machine Learning. Join the series on Discord at: https://aka.ms/ds4beginners/discord Why You Should Join Guided Pathways: Move seamlessly from the popular Data Science for Beginners and Machine Learning for Beginners curriculums to interact with fellow learners in the Azure AI Foundry Discord community. We’ve designed study plans, office hours sessions, and showcases so you can learn, ask questions, and get feedback in real time. Practical AI Skills: See how Artificial Intelligence can supercharge your learning. . Community & Collaboration: Find peer learners with similar goals, collaborate on projects, and get support from experienced moderators and MVPs. Machine Learning AMA 1 Theme: Learning Machine Learning with AI, responsibly. Date: Tuesday, 23 September (10AM PST | 1PM ET | 5PM GMT | 8PM EAT) What to Expect: Curriculum overview, hands-on AI demos, critique and correct a simple classifier, and peer collaboration. Link: https://aka.ms/learnwithai/ml Data Science AMA 2 Theme: Learning Data Science with AI, responsibly. Date: Tuesday, 30 September (10AM PST | 1PM ET | 5PM GMT | 8PM EAT) What to Expect: Curriculum overview, integrating AI into your learning, live Q&A, and practical tips for responsible AI use. Link: https://aka.ms/learnwithai/ds Showcase 1: GitHub Copilot for Data Science Theme: Hands-on AI Agents for Data Scientists Date: Thursday, 25 September (10AM PST | 1PM ET | 5PM GMT | 8PM EAT) What to Expect: Learn to use Copilot’s advanced features in Python and Jupyter Notebooks, customize agents, and extend your analyses. Link: https://aka.ms/learnwithai/agents How to Get Involved Join the Discord Channels: https://aka.ms/ds4beginners/discord https://aka.ms/ml4beginners/discord Prepare: Make sure you have a GitHub account, VS Code, Data Wrangler, ready for hands-on sessions. What Makes This Program Different? Discord first learning: Experience higher engagement and deeper connections. Curriculum-anchored activities: Every event is designed to build on what you’ve learned in the open-source curriculums. AI-powered study plans and resources: Get starter notebooks, prompt cards, and recap templates to accelerate your progress. Resources Hands-on workshop: https://aka.ms/ds-agents Data Science for beginner's curriculum: https://aka.ms/ds4beginners Machine Learning for beginner's curriculum: https://aka.ms/ml4beginners Ready to transform your learning journey? Join us for these exciting events and join us in the Azure AI Foundry Discord community. Whether you’re just starting out or looking to deepen your skills, there’s a place for you here!
bethanyjep
Sep 17, 2025 Place Microsoft Developer Community Blog
1.2KViews
0likes
0Comments
Transform Your AI Applications with Local LLM Deployment
Introduction Are you tired of watching your AI application costs spiral out of control every time your user base grows? As AI Engineers and Developers, we've all felt the pain of cloud-dependent LLM deployments. Every API call adds up, latency becomes a bottleneck in real-time applications, and sensitive data must leave your infrastructure to get processed. Meanwhile, your users demand faster responses, better privacy, and more reliable service. What if there was a way to run powerful language models directly on your users' devices or your local infrastructure? Enter the world of Edge AI deployment with Microsoft's Foundry Local a game-changing approach that brings enterprise-grade LLM capabilities to local hardware while maintaining full OpenAI API compatibility. The Edge AI for Beginners https://aka.ms/edgeai-for-beginners curriculum provides AI Engineers and Developers with comprehensive, hands-on training to master local LLM deployment. This isn't just another theoretical course, it's a practical guide that will transform how you think about AI infrastructure, combining cutting-edge local deployment techniques with production-ready implementation patterns. In this post, we'll explore why Edge AI deployment represents the future of AI applications, dive deep into Foundry Local's capabilities across multiple frameworks, and show you exactly how to implement local LLM solutions that deliver both technical excellence and significant business value. Why Edge AI Deployment Changes Everything for Developers The shift from cloud-dependent to edge-deployed AI represents more than just a technical evolution, it's a fundamental reimagining of how we build intelligent applications. As AI Engineers, we're witnessing a transformation that addresses the most pressing challenges in modern AI deployment while opening up entirely new possibilities for innovation. Consider the current state of cloud-based LLM deployment. Every user interaction requires a round-trip to external servers, introducing latency that can kill user experience in real-time applications. Costs scale linearly (or worse) with usage, making successful applications expensive to operate. Sensitive data must traverse networks and live temporarily in external systems, creating compliance nightmares for enterprise applications. Edge AI deployment fundamentally changes this equation. By running models locally, we achieve several critical advantages: Data Sovereignty and Privacy Protection: Your sensitive data never leaves your infrastructure. For healthcare applications processing patient records, financial services handling transactions, or enterprise tools managing proprietary information, this represents a quantum leap in security posture. You maintain complete control over data flow, meeting even the strictest compliance requirements without architectural compromises. Real-Time Performance at Scale: Local inference eliminates network latency entirely. Instead of 200-500ms round-trips to cloud APIs, you get sub-10ms response times. This enables entirely new categories of applications—real-time code completion, interactive AI tutoring systems, voice assistants that respond instantly, and IoT devices that make intelligent decisions without connectivity. Predictable Cost Structure: Transform variable API costs into fixed infrastructure investments. Instead of paying per-token for potentially unlimited usage, you invest in local hardware that serves unlimited requests. This makes ROI calculations straightforward and removes the fear of viral success destroying your margins. Offline Capabilities and Resilience: Local deployment means your AI features work even when connectivity fails. Mobile applications can provide intelligent features in areas with poor network coverage. Critical systems maintain AI capabilities during network outages. Edge devices in remote locations operate autonomously. The technical implications extend beyond these obvious benefits. Local deployment enables new architectural patterns: AI-powered applications that work entirely client-side, edge computing nodes that make intelligent routing decisions, and distributed systems where intelligence lives close to data sources. Foundry Local: Multi-Framework Edge AI Deployment Made Simple Microsoft's Foundry Local https://www.foundrylocal.ai represents a breakthrough in local AI deployment, designed specifically for developers who need production-ready edge AI solutions. Unlike single-framework tools, Foundry Local provides a unified platform that works seamlessly across multiple programming languages and deployment scenarios while maintaining full compatibility with existing OpenAI-based workflows. The platform's approach to multi-framework support means you're not locked into a single technology stack. Whether you're building TypeScript applications, Python ML pipelines, Rust systems programming projects, or .NET enterprise applications, Foundry Local provides native SDKs and consistent APIs that integrate naturally with your existing codebase. Enterprise-Grade Model Catalog: Foundry Local comes with a curated selection of production-ready models optimized for edge deployment. The `phi-3.5-mini` model delivers impressive performance in a compact footprint, perfect for resource-constrained environments. For applications requiring more sophisticated reasoning, `qwen2.5-0.5b` provides enhanced capabilities while maintaining efficiency. When you need maximum capability and have sufficient hardware resources, `gpt-oss-20b` offers state-of-the-art performance with full local control. Intelligent Hardware Optimization: One of Foundry Local's most powerful features is its automatic hardware detection and optimization. The platform automatically identifies your available compute resources, NVIDIA CUDA GPUs, AMD GPUs, Intel NPUs, Qualcomm Snapdragon NPUs, or CPU-only environments and downloads the most appropriate model variant. This means the same application code delivers optimal performance across diverse hardware configurations without manual intervention. ONNX Runtime Acceleration: Under the hood, Foundry Local leverages Microsoft's ONNX Runtime for maximum performance. This provides significant advantages over generic inference engines, delivering optimized execution paths for different hardware architectures while maintaining model accuracy and compatibility. OpenAI SDK Compatibility: Perhaps most importantly for developers, Foundry Local maintains complete API compatibility with the OpenAI SDK. This means existing applications can migrate to local inference by changing only the endpoint configuration—no rewriting of application logic, no learning new APIs, no disruption to existing workflows. The platform handles the complex aspects of local AI deployment automatically: model downloading, hardware-specific optimization, memory management, and inference scheduling. This allows developers to focus on building intelligent applications rather than managing AI infrastructure. Framework-Agnostic Benefits: Foundry Local's multi-framework approach delivers consistent benefits regardless of your technology choices. Whether you're working in a Node.js microservices architecture, a Python data science environment, a Rust embedded system, or a C# enterprise application, you get the same advantages: reduced latency, eliminated API costs, enhanced privacy, and offline capabilities. This universal compatibility means teams can adopt edge AI deployment incrementally, starting with pilot projects in their preferred language and expanding across their technology stack as they see results. The learning curve is minimal because the API patterns remain familiar while the underlying infrastructure transforms to local deployment. Implementing Edge AI: From Code to Production Moving from cloud APIs to local AI deployment requires understanding the implementation patterns that make edge AI both powerful and practical. Let's explore how Foundry Local's SDKs enable seamless integration across different development environments, with real-world code examples that you can adapt for your production systems. Python Implementation for Data Science and ML Pipelines Python developers will find Foundry Local's integration particularly natural, especially in data science and machine learning contexts where local processing is often preferred for security and performance reasons. import openai from foundry_local import FoundryLocalManager # Initialize with automatic hardware optimization alias = "phi-3.5-mini" manager = FoundryLocalManager(alias) This simple initialization handles a remarkable amount of complexity automatically. The `FoundryLocalManager` detects your hardware configuration, downloads the most appropriate model variant for your system, and starts the local inference service. Behind the scenes, it's making intelligent decisions about memory allocation, selecting optimal execution providers, and preparing the model for efficient inference. # Configure OpenAI client for local deployment client = openai.OpenAI( base_url=manager.endpoint, api_key=manager.api_key # Not required for local, but maintains API compatibility ) # Production-ready inference with streaming def analyze_document(content: str): stream = client.chat.completions.create( model=manager.get_model_info(alias).id, messages=[{ "role": "system", "content": "You are an expert document analyzer. Provide structured analysis." }, { "role": "user", "content": f"Analyze this document: {content}" }], stream=True, temperature=0.7 ) result = "" for chunk in stream: if chunk.choices[0].delta.content: content_piece = chunk.choices[0].delta.content result += content_piece yield content_piece # Enable real-time UI updates return result Key implementation benefits here: • Automatic model management: The `FoundryLocalManager` handles model lifecycle, memory optimization, and hardware-specific acceleration without manual configuration. • Streaming interface compatibility: Maintains the familiar OpenAI streaming API while processing locally, enabling real-time user interfaces with zero latency overhead. • Production error handling: The manager includes built-in retry logic, graceful degradation, and resource management for reliable production deployment. JavaScript/TypeScript Implementation for Web Applications JavaScript and TypeScript developers can integrate local AI capabilities directly into web applications, enabling entirely new categories of client-side intelligent features. import { OpenAI } from "openai"; import { FoundryLocalManager } from "foundry-local-sdk"; class LocalAIService { constructor() { this.foundryManager = null; this.openaiClient = null; this.isInitialized = false; } async initialize(modelAlias = "phi-3.5-mini") { this.foundryManager = new FoundryLocalManager(); const modelInfo = await this.foundryManager.init(modelAlias); this.openaiClient = new OpenAI({ baseURL: this.foundryManager.endpoint, apiKey: this.foundryManager.apiKey, }); this.isInitialized = true; return modelInfo; } The initialization pattern establishes local AI capabilities with full error handling and resource management. This enables web applications to provide AI features without external API dependencies. async generateCodeCompletion(codeContext, userPrompt) { if (!this.isInitialized) { throw new Error("LocalAI service not initialized"); } try { const completion = await this.openaiClient.chat.completions.create({ model: this.foundryManager.getModelInfo().id, messages: [ { role: "system", content: "You are a code completion assistant. Provide accurate, efficient code suggestions." }, { role: "user", content: `Context: ${codeContext}\n\nComplete: ${userPrompt}` } ], max_tokens: 150, temperature: 0.2 }); return completion.choices[0].message.content; } catch (error) { console.error("Local AI completion failed:", error); throw new Error("Code completion unavailable"); } } } Implementation advantages for web applications • Zero-dependency AI features: Applications work entirely offline once models are downloaded, enabling AI capabilities in disconnected environments. • Instant response times: Eliminate network latency for real-time features like code completion, content generation, or intelligent search. • Client-side privacy: Sensitive code or content never leaves the user's device, meeting strict security requirements for enterprise development tools. Cross-Platform Production Deployment Patterns Both Python and JavaScript implementations share common production deployment patterns that make Foundry Local particularly suitable for enterprise applications: Automatic Hardware Optimization: The platform automatically detects and utilizes available acceleration hardware. On systems with NVIDIA GPUs, it leverages CUDA acceleration. On newer Intel systems, it uses NPU acceleration. On ARM-based systems like Apple Silicon or Qualcomm Snapdragon, it optimizes for those architectures. This means the same application code delivers optimal performance across diverse deployment environments. Graceful Resource Management: Foundry Local includes sophisticated memory management and resource allocation. Models are loaded efficiently, memory is recycled properly, and concurrent requests are handled intelligently to maintain system stability under load. Production Monitoring Integration: The platform provides comprehensive metrics and logging that integrate naturally with existing monitoring systems, enabling production observability for AI workloads running at the edge. These implementation patterns demonstrate how Foundry Local transforms edge AI from an experimental concept into a practical, production-ready deployment strategy that works consistently across different technology stacks and hardware environments. Measuring Success: Technical Performance and Business Impact The transition to edge AI deployment delivers measurable improvements across both technical and business metrics. Understanding these impacts helps justify the architectural shift and demonstrates the concrete value of local LLM deployment in production environments. Technical Performance Gains Latency Elimination: The most immediately visible benefit is the dramatic reduction in response times. Cloud API calls typically require 200-800ms round-trips, depending on geographic location and network conditions. Local inference with Foundry Local reduces this to sub-10ms response times—a 95-99% improvement that fundamentally changes user experience possibilities. Consider a code completion feature: cloud-based completion feels sluggish and interrupts developer flow, while local completion provides instant suggestions that enhance productivity. The same applies to real-time chat applications, interactive AI tutoring systems, and any application where response latency directly impacts usability. Automatic Hardware Utilization: Foundry Local's intelligent hardware detection and optimization delivers significant performance improvements without manual configuration. On systems with NVIDIA RTX 4000 series GPUs, inference speeds can be 10-50x faster than CPU-only processing. On newer Intel systems with NPUs, the platform automatically leverages neural processing units for efficient AI workloads. Apple Silicon systems benefit from Metal Performance Shaders optimization, delivering excellent performance per watt. ONNX Runtime Optimization: Microsoft's ONNX Runtime provides substantial performance advantages over generic inference engines. In benchmark testing, ONNX Runtime consistently delivers 2-5x performance improvements compared to standard PyTorch or TensorFlow inference, while maintaining full model accuracy and compatibility. Scalability Characteristics: Local deployment transforms scaling economics entirely. Instead of linear cost scaling with usage, you get horizontal scaling through hardware deployment. A single modern GPU can handle hundreds of concurrent inference requests, making per-request costs approach zero for high-volume applications. Business Impact Analysis Cost Structure Transformation: The financial implications of local deployment are profound. Consider an application processing 1 million tokens daily through OpenAI's API—this represents $20-60 in daily costs depending on the model. Over a year, this becomes $7,300-21,900 in recurring expenses. A comparable local deployment might require a $2,000-5,000 hardware investment with no ongoing API costs. For high-volume applications, the savings become dramatic. Applications processing 100 million tokens monthly face $60,000-180,000 annual API costs. Local deployment with appropriate hardware infrastructure could reduce this to electricity and maintenance costs—typically under $10,000 annually for equivalent processing capacity. Enhanced Privacy and Compliance: Local deployment eliminates data sovereignty concerns entirely. Healthcare applications processing patient records, financial services handling transaction data, and enterprise tools managing proprietary information can deploy AI capabilities without data leaving their infrastructure. This simplifies compliance with GDPR, HIPAA, SOX, and other regulatory frameworks while reducing legal and security risks. Operational Resilience: Local deployment provides significant business continuity advantages. Applications continue functioning during network outages, API service disruptions, or third-party provider issues. For mission-critical systems, this resilience can prevent costly downtime and maintain user productivity during external service failures. Development Velocity: Local deployment accelerates development cycles by eliminating API rate limits, usage quotas, and external dependencies during development and testing. Developers can iterate freely, run comprehensive test suites, and experiment with AI features without cost concerns or rate limiting delays. Enterprise Adoption Metrics Real-world enterprise deployments demonstrate measurable business value: Local Usage: Foundry Local for internal AI-powered tools, reporting 60-80% reduction in AI-related operational costs while improving developer productivity through instant AI responses in development environments. Manufacturing Applications: Industrial IoT deployments using edge AI for predictive maintenance show 40-60% reduction in unplanned downtime while eliminating cloud connectivity requirements in remote facilities. Financial Services: Trading firms deploying local LLMs for market analysis report sub-millisecond decision latencies while maintaining complete data isolation for competitive advantage and regulatory compliance. ROI Calculation Framework For AI Engineers evaluating edge deployment, consider these quantifiable factors: Direct Cost Savings: Compare monthly API costs against hardware amortization over 24-36 months. Most applications with >$1,000 monthly API costs achieve positive ROI within 12-18 months. Performance Value: Quantify the business impact of reduced latency. For customer-facing applications, each 100ms of latency reduction typically correlates with 1-3% conversion improvement. Risk Mitigation: Calculate the cost of downtime or compliance violations prevented by local deployment. For many enterprise applications, avoiding a single significant outage justifies the infrastructure investment. Development Efficiency: Measure developer productivity improvements from unlimited local AI access during development. Teams report 20-40% faster iteration cycles when AI features can be tested without external dependencies. These metrics demonstrate that edge AI deployment with Foundry Local delivers both immediate technical improvements and substantial long-term business value, making it a strategic investment in AI infrastructure that pays dividends across multiple dimensions. Your Edge AI Journey Starts Here The shift to edge AI represents more than just a technical evolution, it's an opportunity to fundamentally improve your applications while building valuable expertise in an emerging field. Whether you're looking to reduce costs, improve performance, or enhance privacy, the path forward involves both learning new concepts and connecting with a community of practitioners solving similar challenges. Master Edge AI with Comprehensive Training The Edge AI for Beginners https://aka.ms/edgeai-for-beginners curriculum provides the complete foundation you need to become proficient in local AI deployment. This isn't a superficial overview, it's a comprehensive, hands-on program designed specifically for developers who want to build production-ready edge AI applications. The curriculum takes you through hours of structured learning, progressing from fundamental concepts to advanced deployment scenarios. You'll start by understanding the principles of edge AI and local inference, then dive deep into practical implementation with Foundry Local across multiple programming languages. The program includes working examples and comprehensive sample applications that demonstrate real-world use cases. What sets this curriculum apart is its practical focus. Instead of theoretical discussions, you'll build actual applications: document analysis systems that work offline, real-time code completion tools, intelligent chatbots that protect user privacy, and IoT applications that make decisions locally. Each project teaches both the technical implementation and the architectural thinking needed for successful edge AI deployment. The curriculum covers multi-framework deployment patterns extensively, ensuring you can apply edge AI principles regardless of your preferred development stack. Whether you're working in Python data science environments, JavaScript web applications, C# enterprise systems, or Rust embedded projects, you'll learn the patterns and practices that make edge AI successful. Join a Community of AI Engineers Learning edge AI doesn't happen in isolation, it requires connection with other developers who are solving similar challenges and discovering new possibilities. The Foundry Local Discord community https://aka.ms/foundry-local-discord provides exactly this environment, connecting AI Engineers and Developers from around the world who are implementing local AI solutions. This community serves multiple crucial functions for your development as an edge AI practitioner. You'll find experienced developers sharing implementation patterns they've discovered, debugging complex deployment issues collaboratively, and discussing the architectural decisions that make edge AI successful in production environments. The Discord community includes dedicated channels for different programming languages, specific deployment scenarios, and technical discussions about optimization and performance. Whether you're implementing your first local AI feature or optimizing a complex multi-model deployment, you'll find peers and experts ready to help problem-solve and share insights. Beyond technical support, the community provides valuable career and business insights. Members share their experiences with edge AI adoption in different industries, discuss the business cases that have proven most successful, and collaborate on open-source projects that advance the entire ecosystem. Share Your Experience and Build Expertise One of the most effective ways to solidify your edge AI expertise is by sharing your implementation experiences with the community. As you build applications with Foundry Local and deploy edge AI solutions, documenting your process and sharing your learnings provides value both to others and to your own professional development. Consider sharing your deployment stories, whether they're successes or challenges you've overcome. The community benefits from real-world case studies that show how edge AI performs in different environments and use cases. Your experience implementing local AI in a healthcare application, financial services system, or manufacturing environment provides valuable insights that others can build upon. Technical contributions are equally valuable, whether it's sharing configuration patterns you've discovered, performance optimizations you've implemented, or integration approaches you've developed for specific frameworks or libraries. The edge AI field is evolving rapidly, and practical contributions from working developers drive much of the innovation. Sharing your work also builds your professional reputation as an edge AI expert. As organizations increasingly adopt local AI deployment strategies, developers with proven experience in this area become valuable resources for their teams and the broader industry. The combination of structured learning through the Edge AI curriculum, active participation in the community, and sharing your practical experiences creates a comprehensive path to edge AI expertise that serves both your immediate project needs and your long-term career development as AI deployment patterns continue evolving. Key Takeaways Local LLM deployment transforms application economics: Replace variable API costs with fixed infrastructure investments that scale to unlimited usage, typically achieving ROI within 12-18 months for applications with significant AI workloads. Foundry Local enables multi-framework edge AI: Consistent deployment patterns across Python, JavaScript, C#, and Rust environments with automatic hardware optimization and OpenAI API compatibility. Performance improvements are dramatic and measurable: Sub-10ms response times replace 200-800ms cloud API latency, while automatic hardware acceleration delivers 2-50x performance improvements depending on available compute resources. Privacy and compliance become architectural advantages: Local deployment eliminates data sovereignty concerns, simplifies regulatory compliance, and provides complete control over sensitive information processing. Edge AI expertise is a strategic career investment: As organizations increasingly adopt local AI deployment, developers with hands-on edge AI experience become valuable technical resources with unique skills in an emerging field. Conclusion Edge AI deployment represents the next evolution in intelligent application development, transforming both the technical possibilities and economic models of AI-powered systems. With Foundry Local and the comprehensive Edge AI for Beginners curriculum, you have access to production-ready tools and expert guidance to make this transition successfully. The path forward is clear: start with the Edge AI for Beginners curriculum to build solid foundations, connect with the Foundry Local Discord community to learn from practicing developers, and begin implementing local AI solutions in your projects. Each step builds valuable expertise while delivering immediate improvements to your applications. As cloud costs continue rising and privacy requirements become more stringent, organizations will increasingly rely on developers who can implement local AI solutions effectively. Your early adoption of edge AI deployment patterns positions you at the forefront of this technological shift, with skills that will become increasingly valuable as the industry evolves. The future of AI deployment is local, private, and performance-optimized. Start building that future today. Resources Edge AI for Beginners Curriculum: Comprehensive training with 36-45 hours of hands-on content examples, and production-ready deployment patterns https://aka.ms/edgeai-for-beginners Foundry Local GitHub Repository: Official documentation, samples, and community contributions for local AI deployment https://github.com/microsoft/foundry_local Foundry Local Discord Community: Connect with AI Engineers and Developers implementing edge AI solutions worldwide https://aka.ms/foundry/discord Foundry Local Documentation: Complete technical documentation and API references Foundry Local documentation | Microsoft Learn Foundry Local Model Catalog: Browse available models and deployment options for different hardware configurations Foundry Local Models - Browse AI Models
Lee_Stott
Oct 23, 2025 Place Microsoft Developer Community Blog
1KViews
0likes
1Comment