ai foundry
31 TopicsAI Toolkit Extension Pack for Visual Studio Code: Ignite 2025 Update
Unlock the Latest Agentic App Capabilities The Ignite 2025 update delivers a major leap forward for the AI Toolkit extension pack in VS Code, introducing a unified, end-to-end environment for building, visualizing, and deploying agentic applications to Microsoft Foundry, and the addition of Anthropic’s frontier Claude models in the Model Catalog! This release enables developers to build and debug locally in VS Code, then deploy to the cloud with a single click. Seamlessly switch between VS Code and the Foundry portal for visualization, orchestration, and evaluation, creating a smooth roundtrip workflow that accelerates innovation and delivers a truly unified AI development experience. Download the http://aka.ms/aitoolkit today and start building next-generation agentic apps in VS Code! What Can You Do with the AI Toolkit Extension Pack? Access Anthropic models in the Model Catalog Following the Microsoft, NVIDIA and Anthropic strategic partnerships announcement today, we are excited to share that Anthropic’s frontier Claude models including Claude Sonnet 4.5, Claude Opus 4.1, and Claude Haiku 4.5, are now integrated into the AI Toolkit, providing even more choices and flexibility when building intelligent applications and AI agents. Build AI Agents Using GitHub Copilot Scaffold agent applications using best-practice patterns, tool-calling examples, tracing hooks, and test scaffolds, all powered by Copilot and aligned with the Microsoft Agent Framework. Generate agent code in Python or .NET, giving you flexibility to target your preferred runtime. Build and Customize YAML Workflows Design YAML-based workflows in the Foundry portal, then continue editing and testing directly in VS Code. To customize your YAML-based workflows, instantly convert it to Agent Framework code using GitHub Copilot. Upgrade from declarative design to code-first customization without starting from scratch. Visualize Multi-Agent Workflows Envision your code-based agent workflows with an interactive graph visualizer that reveals each component and how they connect Watch in real-time how each node lights up as you run your agent. Use the visualizer to understand and debug complex agent graphs, making iteration fast and intuitive. Experiment, Debug, and Evaluate Locally Use the Hosted Agents Playground to quickly interact with your agents on your development machine. Leverage local tracing support to debug reasoning steps, tool calls, and latency hotspots—so you can quickly diagnose and fix issues. Define metrics, tasks, and datasets for agent evaluation, then implement metrics using the Foundry Evaluation SDK and orchestrate evaluations runs with the help of Copilot. Seamless Integration Across Environments Jump from Foundry Portal to VS Code Web for a development environment in your preferred code editor setting. Open YAML workflows, playgrounds, and agent templates directly in VS Code for editing and deployment. How to Get Started Install the AI Toolkit extension pack from the VS Code marketplace. Check out documentation. Get started with building workflows with Microsoft Foundry in VS Code 1. Work with Hosted (Pro-code) Agent workflows in VS Code 2. Work with Declarative (Low-code) Agent workflows in VS Code Feedback & Support Try out the extensions and let us know what you think! File issues or feedback on our GitHub repo for Foundry extension and AI Toolkit extension. Your input helps us make continuous improvements.2.4KViews4likes0CommentsUnderstanding Small Language Modes
Small Language Models (SLMs) bring AI from the cloud to your device. Unlike Large Language Models that require massive compute and energy, SLMs run locally, offering speed, privacy, and efficiency. They’re ideal for edge applications like mobile, robotics, and IoT.I want to show my agent a picture—Can I?
Welcome to Agent Support—a developer advice column for those head-scratching moments when you’re building an AI agent! Each post answers a question inspired by real conversations in the AI developer community, offering practical advice and tips. To kick things off, we’re tackling a common challenge for anyone experimenting with multimodal agents: working with image input. Let’s dive in! Dear Agent Support, I’m building an AI agent, and I’d like to include screenshots or product photos as part of the input. But I’m not sure if that’s even possible, or if I need to use a different kind of model altogether. Can I actually upload an image and have the agent process it? Great question, and one that trips up a lot of people early on! The short answer is: yes, some models can process images—but not all of them. Let’s break that down a bit. 🧠 Understanding Image Input When we talk about image input or image attachments, we’re talking about the ability to send a non-text file (like a .png, .jpg, or screenshot) into your prompt and have the model analyze or interpret it. That could mean describing what’s in the image, extracting text from it, answering questions about a chart, or giving feedback on a design layout. 🚫 Not All Models Support Image Input That said, this isn’t something every model can do. Most base language models are trained on text data only, they’re not designed to interpret non-text inputs like images. In most tools and interfaces, the option to upload an image only appears if the selected model supports it, since platforms typically hide or disable features that aren't compatible with a model's capabilities. So, if your current chat interface doesn’t mention anything about vision or image input, it’s likely because the model itself isn’t equipped to handle it. That’s where multimodal models come in. These are models that have been trained (or extended) to understand both text and images, and sometimes other data types too. Think of them as being fluent in more than one language, except in this case, one of those “languages” is visual. 🔎 How to Find Image-Supporting Models If you’re trying to figure out which models support images, the AI Toolkit is a great place to start! The extension includes a built-in Model Catalog where you can filter models by Feature—like Image Attachment—so you can skip the guesswork. Here’s how to do it: Open the Model Catalog from the AI Toolkit panel in Visual Studio Code. Click the Feature filter near the search bar. Select Image Attachment. Browse the filtered results to see which models can accept visual input. Once you've got your filtered list, you can check out the model details or try one in the Playground to test how it handles image-based prompts. 🧪 Test Before You Build Before you plug a model into your agent and start wiring things together, it’s a good idea to test how the model handles image input on its own. This gives you a quick feel for the model’s behavior and helps you catch any limitations before you're deep into building. You can do this in the Playground, where you can upload an image and pair it with a simple prompt like: “Describe the contents of this image.” OR “Summarize what’s happening in this screenshot.” If the model supports image input, you’ll be able to attach a file and get a response based on its visual analysis. If you don’t see the option to upload an image, double-check that the model you’ve selected has image capabilities—this is usually a model issue, not a UI bug. 🔁 Recap Here’s a quick rundown of what we covered: Not all models support image input—you’ll need a multimodal model specifically built to handle visual data. Most platforms won’t let you upload an image unless the model supports it, so if you don’t see that option, it’s probably a model limitation. You can use the AI Toolkit’s Model Catalog to filter models by capability—just check the box for Image Attachment. Test the model in the Playground before integrating it into your agent to make sure it behaves the way you expect. 📺 Want to Go Deeper? Check out my latest video on how to choose the right model for your agent—it’s part of the Build an Agent Series, where I walk through the building blocks of turning an idea into a working AI agent. And if you’re looking to sharpen your model instincts, don’t miss Model Mondays—a weekly series that helps developers like you build your Model IQ, one spotlight at a time. Whether you’re just starting out or already building AI-powered apps, it’s a great way to stay current and confident in your model choices. 👉 Explore the series and catch the next episode: aka.ms/model-mondays/rsvp If you're just getting started with building agents, check out our Agents for Beginners curriculum. And for all your general AI and AI agent questions, join us in the Azure AI Foundry Discord! You can find me hanging out there answering your questions about the AI Toolkit. I'm looking forward to chatting with you there! Whatever you're building, the right model is out there—and with the right tools, you'll know exactly how to find it.On‑Device AI with Windows AI Foundry and Foundry Local
From “waiting” to “instant”- without sending data away AI is everywhere, but speed, privacy, and reliability are critical. Users expect instant answers without compromise. On-device AI makes that possible: fast, private and available, even when the network isn’t - empowering apps to deliver seamless experiences. Imagine an intelligent assistant that works in seconds, without sending a text to the cloud. This approach brings speed and data control to the places that need it most; while still letting you tap into cloud power when it makes sense. Windows AI Foundry: A Local Home for Models Windows AI Foundry is a developer toolkit that makes it simple to run AI models directly on Windows devices. It uses ONNX Runtime under the hood and can leverage CPU, GPU (via DirectML), or NPU acceleration, without requiring you to manage those details. The principle is straightforward: Keep the model and the data on the same device. Inference becomes faster, and data stays local by default unless you explicitly choose to use the cloud. Foundry Local Foundry Local is the engine that powers this experience. Think of it as local AI runtime - fast, private, and easy to integrate into an app. Why Adopt On‑Device AI? Faster, more responsive apps: Local inference often reduces perceived latency and improves user experience. Privacy‑first by design: Keep sensitive data on the device; avoid cloud round trips unless the user opts in. Offline capability: An app can provide AI features even without a network connection. Cost control: Reduce cloud compute and data costs for common, high‑volume tasks. This approach is especially useful in regulated industries, field‑work tools, and any app where users expect quick, on‑device responses. Hybrid Pattern for Real Apps On-device AI doesn’t replace the cloud, it complements it. Here’s how: Standalone On‑Device: Quick, private actions like document summarization, local search, and offline assistants. Cloud‑Enhanced (Optional): Large-context models, up-to-date knowledge, or heavy multimodal workloads. Design an app to keep data local by default and surface cloud options transparently with user consent and clear disclosures. Windows AI Foundry supports hybrid workflows: Use Foundry Local for real-time inference. Sync with Azure AI services for model updates, telemetry, and advanced analytics. Implement fallback strategies for resource-intensive scenarios. Application Workflow Code Example using Foundry Local: 1. Only On-Device: Tries Foundry Local first, falls back to ONNX if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}") return "Error: No local AI available" 2. Hybrid approach: On-device first, cloud as last resort def get_answer(question, context): """ Priority order: 1. Foundry Local (best: advanced + private) 2. ONNX Runtime (good: fast + private) 3. Cloud API (fallback: requires internet, less private) # in case of Hybrid approach, based on real-time scenario """ if foundry_runtime.check_foundry_available(): # Use on-device Foundry Local models try: answer = foundry_runtime.run_inference(question, context) return answer, source="Foundry Local (On-Device)" except Exception as e: logger.warning(f"Foundry failed: {e}, trying ONNX...") if onnx_model.is_loaded(): # Fallback to local BERT ONNX model try: answer = bert_model.get_answer(question, context) return answer, source="BERT ONNX (On-Device)" except Exception as e: logger.warning(f"ONNX failed: {e}, trying cloud...") # Last resort: Cloud API (requires internet) if network_available(): try: import requests response = requests.post( '{BASE_URL_AI_CHAT_COMPLETION}', headers={'Authorization': f'Bearer {API_KEY}'}, json={ 'model': '{MODEL_NAME}', 'messages': [{ 'role': 'user', 'content': f'Context: {context}\n\nQuestion: {question}' }] }, timeout=10 ) answer = response.json()['choices'][0]['message']['content'] return answer, source="Cloud API (Online)" except Exception as e: return "Error: No AI runtime available", source="Failed" else: return "Error: No internet and no local AI available", source="Offline" Demo Project Output: Foundry Local answering context-based questions offline : The Foundry Local engine ran the Phi-4-mini model offline and retrieved context-based data. : The Foundry Local engine ran the Phi-4-mini model offline and mentioned that there is no answer. Practical Use Cases Privacy-First Reading Assistant: Summarize documents locally without sending text to the cloud. Healthcare Apps: Analyze medical data on-device for compliance. Financial Tools: Risk scoring without exposing sensitive financial data. IoT & Edge Devices: Real-time anomaly detection without network dependency. Conclusion On-device AI isn’t just a trend - it’s a shift toward smarter, faster, and more secure applications. With Windows AI Foundry and Foundry Local, developers can deliver experiences that respect user specific data, reduce latency, and work even when connectivity fails. By combining local inference with optional cloud enhancements, you get the best of both worlds: instant performance and scalable intelligence. Whether you’re creating document summarizers, offline assistants, or compliance-ready solutions, this approach ensures your apps stay responsive, reliable, and user-centric. References Get started with Foundry Local - Foundry Local | Microsoft Learn What is Windows AI Foundry? | Microsoft Learn https://devblogs.microsoft.com/foundry/unlock-instant-on-device-ai-with-foundry-local/Impariamo a conoscere MCP: Introduzione al Model Context Protocol (MCP)
Non perderti il prossimo evento “Let’s Learn – MCP” su Microsoft Reactor il 24 di Luglio, pensato per chiunque voglia conoscere meglio il nuovo standard per agenti intelligenti (il Model Context Protocol) e imparare a metterlo in pratica. La sessione è in Italiano e le demo sono in Python, ma fa parte di una serie di live-streaming disponibili in tantissime lingue.AI Repo of the Week: Generative AI for Beginners with JavaScript
Introduction Ready to explore the fascinating world of Generative AI using your JavaScript skills? This week’s featured repository, Generative AI for Beginners with JavaScript, is your launchpad into the future of application development. Whether you're just starting out or looking to expand your AI toolbox, this open-source GitHub resource offers a rich, hands-on journey. It includes interactive lessons, quizzes, and even time-travel storytelling featuring historical legends like Leonardo da Vinci and Ada Lovelace. Each chapter combines narrative-driven learning with practical exercises, helping you understand foundational AI concepts and apply them directly in code. It’s immersive, educational, and genuinely fun. What You'll Learn 1. 🧠 Foundations of Generative AI and LLMs Start with the basics: What is generative AI? How do large language models (LLMs) work? This chapter lays the groundwork for how these technologies are transforming JavaScript development. 2. 🚀 Build Your First AI-Powered App Walk through setting up your environment and creating your first AI app. Learn how to configure prompts and unlock the potential of AI in your own projects. 3. 🎯 Prompt Engineering Essentials Get hands-on with prompt engineering techniques that shape how AI models respond. Explore strategies for crafting prompts that are clear, targeted, and effective. 4. 📦 Structured Output with JSON Learn how to guide the model to return structured data formats like JSON—critical for integrating AI into real-world applications. 5. 🔍 Retrieval-Augmented Generation (RAG) Go beyond static prompts by combining LLMs with external data sources. Discover how RAG lets your app pull in live, contextual information for more intelligent results. 6. 🛠️ Function Calling and Tool Use Give your LLM new powers! Learn how to connect your own functions and tools to your app, enabling more dynamic and actionable AI interactions. 7. 📚 Model Context Protocol (MCP) Dive into MCP, a new standard for organizing prompts, tools, and resources. Learn how it simplifies AI app development and fosters consistency across projects. 8. ⚙️ Enhancing MCP Clients with LLMs Build on what you’ve learned by integrating LLMs directly into your MCP clients. See how to make them smarter, faster, and more helpful. ✨ More chapters coming soon—watch the repo for updates! Companion App: Interact with History Experience the power of generative AI in action through the companion web app—where you can chat with historical figures and witness how JavaScript brings AI to life in real time. Conclusion Generative AI for Beginners with JavaScript is more than a course—it’s an adventure into how storytelling, coding, and AI can come together to create something fun and educational. Whether you're here to upskill, experiment, or build the next big thing, this repository is your all-in-one resource to get started with confidence. 🔗 Jump into the future of development—check out the repo and start building with AI today!AI Upskilling Framework Level 3 Building
The Global AI Community is excited to bring you the latest updates on AI Upskilling Framework Level 3 Building, straight from Microsoft Ignite! This session dives deep into advanced concepts for building agentic workflows and showcases new announcements that will help developers accelerate their Agentic AI journey.AI Dev Days 2025: Your Gateway to the Future of AI Development
What’s in Store? Day 1 – 10 December: Video Link Building AI Applications with Azure, GitHub, and Foundry Explore cutting-edge topics like: Agentic DevOps Azure SRE Agent Microsoft Foundry MCP Models for AI innovation Day 2 – 11 December Agenda: Video Link Using AI to Boost Developer Productivity Get hands-on with: Agent HQ VS Code & Visual Studio 2026 GitHub Copilot Coding Agent App Modernisation Strategies Why Join? Hands-on Labs: Apply the latest product features immediately. Highlights from Microsoft Ignite & GitHub Universe 2025: Stay ahead of the curve. Global Reach: Local-language workshops for LATAM and EMEA coming soon. You’ll recognise plenty of familiar faces in the lineup – don’t miss the chance to connect and learn from the best! 👉 Register now and share widely across your networks – there’s truly something for everyone! https://aka.ms/ai-dev-daysFrom Concept to Code: Building Production-Ready Multi-Agent Systems with Microsoft Foundry
We have reached a critical inflection point in AI development. Within the Microsoft Foundry ecosystem, the core value proposition of "Agents" is shifting decisively—moving from passive content generation to active task execution and process automation. These are no longer just conversational interfaces. They are intelligent entities capable of connecting models, data, and tools to actively execute complex business logic. To support this evolution, Microsoft has introduced a powerful suite of capabilities: the Microsoft Agent Framework for sophisticated orchestration, the Agent V2 SDK, and integrated Microsoft Foundry VSCode Extensions. These innovations provide the tooling necessary to bridge the gap between theoretical research and secure, scalable enterprise landing. But how do you turn these separate components into a cohesive business solution? That is the challenge we address today. This post dives into the practical application of these tools, demonstrating how to connect the dots and transform complex multi-agent concepts into deployed reality. The Scenario: Recruitment through an "Agentic Lens" Let’s ground this theoretical discussion with a real-world scenario that perfectly models a multi-agent environment: The Recruitment Process. By examining recruitment through an agentic lens, we can identify distinct entities with specific mandates: The Recruiter Agent: Tasked with setting boundary conditions (job requirements) and preparing data retrieval mechanisms (interview questions). The Applicant Agent: Objective is to process incoming queries and synthesize the best possible output to meet the recruiter's acceptance criteria. Phase 1: Design Achieving Orchestration via Microsoft Foundry Workflows To bridge the gap between our scenario and technical reality, we start with Foundry Workflows. Workflows serves as the visual integration environment within Foundry. It allows you to build declarative pipelines that seamlessly combine deterministic business logic with the probabilistic nature of autonomous AI agents. By adopting this visual, low-code paradigm, you eliminate the need to write complex orchestration logic from scratch. Workflows empowers you to coordinate specialized agents intuitively, creating adaptive systems that solve complex business problems collaboratively. Visually Orchestrating the Cycle Microsoft Foundry provides an intuitive, web-based drag-and-drop interface. This canvas allows you to integrate specialized AI agents alongside standard procedural logic blocks, transforming abstract ideas into executable processes without writing extensive glue code. To translate our recruitment scenario into a functional workflow, we follow a structured approach: Agent Prerequisites: We pre-configure our specialized agents within Foundry. We create a Recruiter Agent (prompted to generate evaluation criteria) and an Applicant Agent (prompted to synthesize responses). Orchestrating the Interaction: We drag these nodes onto the board and define the data flow. The process begins with the Recruiter generating questions, piping that output directly as input for the Applicant agent. Adding Business Logic: A true workflow requires decision-making. We introduce control flow logic, such as IF/ELSE conditional blocks, to evaluate the recruiter's questions based on predefined criteria. This allows the workflow to branch dynamically—if satisfied, the candidate answers the questions; if not, the questions are regenerated. Alternative: YAML Configuration For developers who prefer a code-first approach or wish to rapidly replicate this logic across environments, Foundry allows you to export the underlying YAML. kind: workflow trigger: kind: OnConversationStart id: trigger_wf actions: - kind: SetVariable id: action-1763742724000 variable: Local.LatestMessage value: =UserMessage(System.LastMessageText) - kind: InvokeAzureAgent id: action-1763736666888 agent: name: HiringManager input: messages: =System.LastMessage output: autoSend: true messages: Local.LatestMessage - kind: Question variable: Local.Input id: action-1763737142539 entity: StringPrebuiltEntity skipQuestionMode: SkipOnFirstExecutionIfVariableHasValue prompt: Boss, can you confirm this ? - kind: ConditionGroup conditions: - condition: =Local.Input="Yes" actions: - kind: InvokeAzureAgent id: action-1763744279421 agent: name: ApplyAgent input: messages: =Local.LatestMessage output: autoSend: true messages: Local.LatestMessage - kind: EndConversation id: action-1763740066007 id: if-action-1763736954795-0 id: action-1763736954795 elseActions: - kind: GotoAction actionId: action-1763736666888 id: action-1763737425562 id: "" name: HRDemo description: "" Simulating the End-to-End Process Once constructed, Foundry provides a robust, built-in testing environment. You can trigger the workflow with sample input data to simulate the end-to-end cycle. This allows you to debug hand-offs and interactions in real-time before writing a single line of application code. Phase 2: Develop From Cloud Canvas to Local Code with VSCode Foundry Workflows excels at rapid prototyping. However, a visual UI is rarely sufficient for enterprise-grade production. The critical question becomes: How do we integrate these visual definitions into a rigorous Software Development Lifecycle (SDLC)? While the cloud portal is ideal for design, enterprise application delivery happens in the local IDE. The Microsoft Foundry VSCode Extension bridges this gap. This extension allows developers to: Sync: Pull down workflow definitions from the cloud to your local machine. Inspect: Review the underlying logic in your preferred environment. Scaffold: Rapidly generate the underlying code structures needed to run the flow. This accelerates the shift from "understanding" the flow to "implementing" it. Phase 3: Deploy Productionizing Intelligence with the Microsoft Agent Framework Once the multi-agent orchestration has been validated locally, the final step is transforming it into a shipping application. This is where the Microsoft Agent Framework shines as a runtime engine. It natively ingests the declarative Workflow definitions (YAML) exported from Foundry. This allows artifacts from the prototyping phase to be directly promoted to application deployment. By simply referencing the workflow configuration libraries, you can "hydrate" the entire multi-agent system with minimal boilerplate. Here is the code required to initialize and run the workflow within your application. Note - Check the source code https://github.com/microsoft/Agent-Framework-Samples/tree/main/09.Cases/MicrosoftFoundryWithAITKAndMAF Summary: The Journey from Conversation to Action Microsoft Foundry is more than just a toolbox; it is a comprehensive solution designed to bridge the chasm between theoretical AI research and secure, scalable enterprise applications. In this post, we walked through the three critical stages of modern AI development: Design (Low-Code): Leveraging Foundry Workflows to visually orchestrate specialized agents (Recruiter vs. Applicant) mixed with deterministic business rules. Develop (Local SDLC): Utilizing the VSCode Extension to break down the barriers between the cloud canvas and the local IDE, enabling seamless synchronization and debugging. Deploy (Native Runtime): Using the Microsoft Agent Framework to ingest declarative YAML, realizing the promise of "Configuration as Code" and eliminating tedious logic rewriting. By following this path, developers can move beyond simple content generation and build adaptive, multi-agent systems that drive real business value. Learning Resoures What's Microsoft Foundry (https://learn.microsoft.com/azure/ai-foundry/what-is-azure-ai-foundry?view=foundry) Work with Declarative (Low-code) Agent workflows in Visual Studio Code (preview) (https://learn.microsoft.com/azure/ai-foundry/agents/how-to/vs-code-agents-workflow-low-code?view=foundry) Microsoft Agent Framework(https://github.com/microsoft/agent-framework) Microsoft Foundry VSCode Extension(https://marketplace.visualstudio.com/items?itemName=TeamsDevApp.vscode-ai-foundry)7.6KViews1like0CommentsUnderstanding Small Language Modes
In Part 1, we discussed the differences between Large Language Models (LLMs) and Small Language Models (SLMs), highlighting how SLMs are transforming edge computing. Unlike LLMs, which rely on massive cloud infrastructure, SLMs can run directly on local devices like smartphones and IoT systems, offering faster performance, better privacy, and lower energy use. In this second part, we dive into how SLMs work, exploring their technical foundations and design principles that make them ideal for edge environments.