ai foundry
18 TopicsI want to show my agent a picture—Can I?
Welcome to Agent Support—a developer advice column for those head-scratching moments when you’re building an AI agent! Each post answers a question inspired by real conversations in the AI developer community, offering practical advice and tips. To kick things off, we’re tackling a common challenge for anyone experimenting with multimodal agents: working with image input. Let’s dive in! Dear Agent Support, I’m building an AI agent, and I’d like to include screenshots or product photos as part of the input. But I’m not sure if that’s even possible, or if I need to use a different kind of model altogether. Can I actually upload an image and have the agent process it? Great question, and one that trips up a lot of people early on! The short answer is: yes, some models can process images—but not all of them. Let’s break that down a bit. 🧠 Understanding Image Input When we talk about image input or image attachments, we’re talking about the ability to send a non-text file (like a .png, .jpg, or screenshot) into your prompt and have the model analyze or interpret it. That could mean describing what’s in the image, extracting text from it, answering questions about a chart, or giving feedback on a design layout. 🚫 Not All Models Support Image Input That said, this isn’t something every model can do. Most base language models are trained on text data only, they’re not designed to interpret non-text inputs like images. In most tools and interfaces, the option to upload an image only appears if the selected model supports it, since platforms typically hide or disable features that aren't compatible with a model's capabilities. So, if your current chat interface doesn’t mention anything about vision or image input, it’s likely because the model itself isn’t equipped to handle it. That’s where multimodal models come in. These are models that have been trained (or extended) to understand both text and images, and sometimes other data types too. Think of them as being fluent in more than one language, except in this case, one of those “languages” is visual. 🔎 How to Find Image-Supporting Models If you’re trying to figure out which models support images, the AI Toolkit is a great place to start! The extension includes a built-in Model Catalog where you can filter models by Feature—like Image Attachment—so you can skip the guesswork. Here’s how to do it: Open the Model Catalog from the AI Toolkit panel in Visual Studio Code. Click the Feature filter near the search bar. Select Image Attachment. Browse the filtered results to see which models can accept visual input. Once you've got your filtered list, you can check out the model details or try one in the Playground to test how it handles image-based prompts. 🧪 Test Before You Build Before you plug a model into your agent and start wiring things together, it’s a good idea to test how the model handles image input on its own. This gives you a quick feel for the model’s behavior and helps you catch any limitations before you're deep into building. You can do this in the Playground, where you can upload an image and pair it with a simple prompt like: “Describe the contents of this image.” OR “Summarize what’s happening in this screenshot.” If the model supports image input, you’ll be able to attach a file and get a response based on its visual analysis. If you don’t see the option to upload an image, double-check that the model you’ve selected has image capabilities—this is usually a model issue, not a UI bug. 🔁 Recap Here’s a quick rundown of what we covered: Not all models support image input—you’ll need a multimodal model specifically built to handle visual data. Most platforms won’t let you upload an image unless the model supports it, so if you don’t see that option, it’s probably a model limitation. You can use the AI Toolkit’s Model Catalog to filter models by capability—just check the box for Image Attachment. Test the model in the Playground before integrating it into your agent to make sure it behaves the way you expect. 📺 Want to Go Deeper? Check out my latest video on how to choose the right model for your agent—it’s part of the Build an Agent Series, where I walk through the building blocks of turning an idea into a working AI agent. And if you’re looking to sharpen your model instincts, don’t miss Model Mondays—a weekly series that helps developers like you build your Model IQ, one spotlight at a time. Whether you’re just starting out or already building AI-powered apps, it’s a great way to stay current and confident in your model choices. 👉 Explore the series and catch the next episode: aka.ms/model-mondays/rsvp If you're just getting started with building agents, check out our Agents for Beginners curriculum. And for all your general AI and AI agent questions, join us in the Azure AI Foundry Discord! You can find me hanging out there answering your questions about the AI Toolkit. I'm looking forward to chatting with you there! Whatever you're building, the right model is out there—and with the right tools, you'll know exactly how to find it.Impariamo a conoscere MCP: Introduzione al Model Context Protocol (MCP)
Non perderti il prossimo evento “Let’s Learn – MCP” su Microsoft Reactor il 24 di Luglio, pensato per chiunque voglia conoscere meglio il nuovo standard per agenti intelligenti (il Model Context Protocol) e imparare a metterlo in pratica. La sessione è in Italiano e le demo sono in Python, ma fa parte di una serie di live-streaming disponibili in tantissime lingue.AI Repo of the Week: Generative AI for Beginners with JavaScript
Introduction Ready to explore the fascinating world of Generative AI using your JavaScript skills? This week’s featured repository, Generative AI for Beginners with JavaScript, is your launchpad into the future of application development. Whether you're just starting out or looking to expand your AI toolbox, this open-source GitHub resource offers a rich, hands-on journey. It includes interactive lessons, quizzes, and even time-travel storytelling featuring historical legends like Leonardo da Vinci and Ada Lovelace. Each chapter combines narrative-driven learning with practical exercises, helping you understand foundational AI concepts and apply them directly in code. It’s immersive, educational, and genuinely fun. What You'll Learn 1. 🧠 Foundations of Generative AI and LLMs Start with the basics: What is generative AI? How do large language models (LLMs) work? This chapter lays the groundwork for how these technologies are transforming JavaScript development. 2. 🚀 Build Your First AI-Powered App Walk through setting up your environment and creating your first AI app. Learn how to configure prompts and unlock the potential of AI in your own projects. 3. 🎯 Prompt Engineering Essentials Get hands-on with prompt engineering techniques that shape how AI models respond. Explore strategies for crafting prompts that are clear, targeted, and effective. 4. 📦 Structured Output with JSON Learn how to guide the model to return structured data formats like JSON—critical for integrating AI into real-world applications. 5. 🔍 Retrieval-Augmented Generation (RAG) Go beyond static prompts by combining LLMs with external data sources. Discover how RAG lets your app pull in live, contextual information for more intelligent results. 6. 🛠️ Function Calling and Tool Use Give your LLM new powers! Learn how to connect your own functions and tools to your app, enabling more dynamic and actionable AI interactions. 7. 📚 Model Context Protocol (MCP) Dive into MCP, a new standard for organizing prompts, tools, and resources. Learn how it simplifies AI app development and fosters consistency across projects. 8. ⚙️ Enhancing MCP Clients with LLMs Build on what you’ve learned by integrating LLMs directly into your MCP clients. See how to make them smarter, faster, and more helpful. ✨ More chapters coming soon—watch the repo for updates! Companion App: Interact with History Experience the power of generative AI in action through the companion web app—where you can chat with historical figures and witness how JavaScript brings AI to life in real time. Conclusion Generative AI for Beginners with JavaScript is more than a course—it’s an adventure into how storytelling, coding, and AI can come together to create something fun and educational. Whether you're here to upskill, experiment, or build the next big thing, this repository is your all-in-one resource to get started with confidence. 🔗 Jump into the future of development—check out the repo and start building with AI today!Orchestrate multimodal AI insights within your healthcare data estate (Public Preview)
In today’s healthcare landscape, there is an increasing emphasis on leveraging artificial intelligence (AI) to extract meaningful insights from diverse datasets to improve patient care and drive clinical research. However, incorporating AI into your healthcare data estate often brings significant costs and challenges, especially when dealing with siloed and unstructured data. Healthcare organizations produce and consume data that is not only vast but also varied in format—ranging from structured EHR entries to unstructured clinical notes and imaging data. Traditional methods require manual effort to prepare and harmonize this data for AI, specify the AI output format, set up API calls, store the AI outputs, integrate the AI outputs, and analyze the AI outputs for each AI model or service you decide to use. Orchestrate multimodal AI insights is designed to streamline and scale healthcare AI within your data estate by building off of the data transformations in healthcare data solutions in Microsoft Fabric. This capability provides a framework to generate AI insights by connecting your multimodal healthcare data to an ecosystem of AI services and models and integrating structured AI-generated insights back into your data estate. When you combine these AI-generated insights with the existing healthcare data in your data estate, you can power advanced analytics scenarios for your organization and patient population. Key features: Metadata store lakehouse acts as a central repository for the metadata for AI orchestration to effectively capture and manage enrichment definitions, view definitions, and contextual information for traceability purposes. Execution notebooks define the enrichment view and enrichment definition based on the model configuration and input mappings. They also specify the model processor and transformer. The model processor calls the model API, and the transformer produces the standardized output while saving the output in the bronze lakehouse in the Ingest folder. Transformation pipeline to ingest AI-generated insights through the healthcare data solutions medallion lakehouse layers and persist the insights in an enrichment store within the silver layer. Conceptual architecture: The data transformations in healthcare data solutions in Microsoft Fabric allow you ingest, store, and analyze multimodal data. With the orchestrate multimodal AI insights capability, this standardized data serves as the input for healthcare AI models. The model results are stored in a standardized format and provide new insights from your data. The diagram below shows the flow of integrating AI generated insights into the data estate, starting as raw data in the bronze lakehouse and being transformed to delta tables in the silver lakehouse. This capability simplifies AI integration across modalities for data-driven research and care, currently supporting: Text Analytics for health in Azure AI Language to extract medical entities such as conditions and medications from unstructured clinical notes. This utilizes the data in the DocumentReference FHIR resource. MedImageInsight healthcare AI model in Azure AI Foundry to generate medical image embeddings from imaging data. This model leverages the data in the ImagingStudy FHIR resource. MedImageParse healthcare AI model in Azure AI Foundry to enable segmentation, detection, and recognition from imaging data across numerous object types and imaging modalities. This model uses the data in the ImagingStudy FHIR resource. By using orchestrate multimodal AI insights to leverage the data in healthcare data solutions for these models and integrate the results into the data estate, you can analyze your existing data alongside AI enrichments. This allows you to explore use cases such as creating image segmentations and combining with your existing imaging metadata and clinical data to enable quick insights and disease progression trends for clinical research at the patient level. Get started today! This capability is now available in public preview, and you can use the in-product sample data to test this feature with any of the three models listed above. For more information and to learn how to deploy the capability, please refer to the product documentation. We will dive deeper into more detailed aspects of the capability, such as the enrichment store and custom AI use cases, in upcoming blogs. Medical device disclaimer: Microsoft products and services (1) are not designed, intended or made available as a medical device, and (2) are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment. Customers/partners are responsible for ensuring solutions comply with applicable laws and regulations. FHIR® is the registered trademark of HL7 and is used with permission of HL7.1.2KViews2likes0CommentsUnleashing the Power of AI Agents: Transforming Business Operations
Let "Get Started with AI Agents," in this short blog I want explore the evolution, capabilities, and applications of AI agents, highlighting their potential to enhance productivity and efficiency. We take a peak into the challenges of developing AI agents and introduce powerful tools like Azure AI Foundry and Azure AI Agent Service that empower developers to build, deploy, and scale AI agents securely and efficiently. In today's rapidly evolving technological landscape, the integration of AI agents into business processes is becoming increasingly essential. Lets delve into the transformative potential of AI agents and how they can revolutionize various aspects of our operations. We begin by exploring the evolution of LLM-based solutions, tracing the journey from no agents to sophisticated multi-agent systems. This progression highlights the growing complexity and capabilities of AI agents, which are now poised to handle wide-scope, complex use cases requiring diverse skills. Lets now look at agentic AI capabilities. AI agents can significantly enhance employee productivity and process efficiency, making our operations faster and more effective. Lets examine the key applications of AI agents across industries, such as travel booking and expense management, employee onboarding, personalized customer support, and data analytics and reporting. However, developing AI agents is not without its challenges. Some of the primary considerations, including tool integration, interoperability, scalability, real-time processing, maintenance, flexibility, error handling, and security. These challenges underscore the need for robust platforms that enable rapid development and secure deployment of AI agents. To this end, we introduce Azure AI Foundry and Azure AI Agent Service. These tools empower developers to build, deploy, and scale AI agents securely and efficiently. Azure AI Foundry offers a comprehensive suite of tools, including model catalogs, content safety features, and machine learning capabilities. The Azure AI Agent Service, currently in public preview, provides flexible model selection, extensive data connections, enterprise-grade security, and rapid development and automation capabilities. When building multi agent or agentic based systems there is a huge importance of multi-agent orchestration. Tools like AutoGen and Semantic Kernel facilitate the orchestration of multi-agent systems, enabling seamless integration and collaboration between different AI agents. In conclusion, the transformative potential of AI agents in driving productivity, efficiency, and innovation. By leveraging the capabilities of Azure AI Foundry and Azure AI Agent Service, we can overcome the challenges of AI agent development and unlock new opportunities for growth and success. Resources Azure AI Discord - https://aka.ms/AzureAI/Discord Global AI community - https://globalai.community Generative AI for beginners – https://aka.ms/genai-beginners AI Agents for beginners - https://aka.ms/ai-agents-beginners Attend one of the Global AI Bootcamp near you - https://globalai.community/bootcamp/ Build AI Tour open content - https://aka.ms/aitour/repos Build your first Agent with Azure AI Agent Service - Slide deck and code - https://github.com/microsoft/aitour-build-your-first-agent-with-azure-ai-agent-service1.2KViews2likes0CommentsIs AI Foundry in new exam for DP-100
A 25-30% of the DP-100 exam is now dedicated to Optimizing Language Models for AI Applications - is this requiring Azure AI Foundry? It doesn't say specifically in the study guide: https://learn.microsoft.com/en-us/credentials/certifications/resources/study-guides/dp-100 Also, the videos could benefit from being updated to cover the changes as of 16 January 2025.344Views2likes1CommentTiny But Mighty: Unleashing the Power of Small Language Models 🚀
While Large Language Models (LLMs) like GPT-4 dominate headlines with their extensive capabilities, they often come at the cost of high computational requirements and complexity. For developers and organizations looking to implement AI solutions on edge devices or with limited resources, Small Language Models (SLMs) are emerging as a practical alternative. SLMs are not just "smaller" versions of their larger counterparts—they're designed to be faster, more efficient, and adaptable for specific tasks. With fewer parameters and lower computational needs, SLMs open the door to deploying AI on mobile devices, IoT systems, and edge environments without compromising performance. What You Stand to Learn 🧠 Introduction to Microsoft's AI Ecosystem Discover Microsoft's end-to-end AI development tools, from Azure AI Services to ONNX Runtime, enabling efficient and secure deployment of AI models across cloud and edge environments. The Advantages of SLMs over LLMs SLMs are game-changers for edge AI applications, providing faster training and inference times, reduced energy costs, and scalability across diverse devices. Hands-On with Phi-3 and ONNX Runtime Experience live demonstrations of SLMs in action with tools like Phi-3 and ONNX Runtime, showcasing how to fine-tune and deploy models on mobile devices, IoT, and hybrid cloud environments. Responsible AI Practices Understand how to safeguard your AI applications with Microsoft's Responsible AI toolkit, ensuring ethical and trustworthy deployments. Watch the Full Session 👨💻 📅 Date: December 12, 2024 ⏰ Time: 4 PM GMT | 5 PM CEST | 8 AM PT | 11 AM ET | 7 PM EAT A session packed with live demos, practical examples, and Q&A opportunities. Register NOW | Events | Microsoft Reactor Agenda 🔍 Introduction (5 min) A brief overview of the session and its focus on SLMs and LLMs. Microsoft AI Tooling (5 min) Explore the latest tools like Azure AI Services, Azure Machine Learning, and Responsible AI Tooling. How to Choose the Right Model (10 min) Key considerations such as performance, customizability, and ethical implications. Comparing SLMs vs LLMs (10 min) The strengths, weaknesses, and best use cases for both Small and Large Language Models. Deploying Models at the Edge (10 min) Insights into optimizing AI for mobile, IoT, and edge devices. Q&A Addressing participant questions about AI development and deployment.431Views2likes0CommentsHow do I give my agent access to tools?
Welcome back to Agent Support—a developer advice column for those head-scratching moments when you’re building an AI agent! Each post answers a real question from the community with simple, practical guidance to help you build smarter agents. Today’s question comes from someone trying to move beyond chat-only agents into more capable, action-driven ones: 💬 Dear Agent Support I want my agent to do more than just respond with text. Ideally, it could look up information, call APIs, or even run code—but I’m not sure where to start. How do I give my agent access to tools? This is exactly where agents start to get interesting! Giving your agent tools is one of the most powerful ways to expand what it can do. But before we get into the “how,” let’s talk about what tools actually mean in this context—and how Model Context Protocol (MCP) helps you use them in a standardized, agent-friendly way. 🛠️ What Do We Mean by “Tools”? In agent terms, a tool is any external function or capability your agent can use to complete a task. That might be: A web search function A weather lookup API A calculator A database query A custom Python script When you give an agent tools, you’re giving it a way to take action—not just generate text. Think of tools as buttons the agent can press to interact with the outside world. ⚡ Why Give Your Agent Tools? Without tools, an agent is limited to what it “knows” from its training data and prompt. It can guess, summarize, and predict, but it can’t do. Tools change that! With the right tools, your agent can: Pull live data from external systems Perform dynamic calculations Trigger workflows in real time Make decisions based on changing conditions It’s the difference between an assistant that can answer trivia questions vs. one that can book your travel or manage your calendar. 🧩 So… How Does This Work? Enter Model Context Protocol (MCP). MCP is a simple, open protocol that helps agents use tools in a consistent way—whether those tools live in your app, your cloud project, or on a server you built yourself. Here’s what MCP does: Describes tools in a standardized format that models can understand Wraps the function call + input + output into a predictable schema Lets agents request tools as needed (with reasoning behind their choices) This makes it much easier to plug tools into your agent workflow without reinventing the wheel every time! 🔌 How to Connect an Agent to Tools Wiring tools into your agent might sound complex, but it doesn’t have to be! If you’ve already got a MCP server in mind, there’s a straightforward way within the AI Toolkit to expose it as a tool your agent can use. Here’s how to do it: Open the Agent Builder from the AI Toolkit panel in Visual Studio Code. Click the + New Agent button and provide a name for your agent. Select a Model for your agent. Within the Tools section, click + MCP Server. In the wizard that appears, click + Add Server. From there, you can select one of the MCP servers built my Microsoft, connect to an existing server that’s running, or even create your own using a template! After giving the server a Server ID, you’ll be given the option to select which tools from the server to add for your agent. Once connected, your agent can call tools dynamically based on the task at hand. 🧪 Test Before You Build Once you’ve connected your agent to an MCP server and added tools, don’t jump straight into full integration. It’s worth taking time to test whether the agent is calling the right tool for the job. You can do this directly in the Agent Builder: enter a test prompt that should trigger a tool in the User Prompt field, click Run, and observe how the model responds. This gives you a quick read on tool call accuracy. If the agent selects the wrong tool, it’s a sign that your system prompt might need tweaking before you move forward. However, if the agent calls the correct tool but the output still doesn’t look right, take a step back and check both sides of the interaction. It might be that the system prompt isn’t clearly guiding the agent on how to use or interpret the tool’s response. But it could also be an issue with the tool itself—whether that’s a bug in the logic, unexpected behavior, or a mismatch between input and expected output. Testing both the tool and the prompt in isolation can help you pinpoint where things are going wrong before you move on to full integration. 🔁 Recap Here’s a quick rundown of what we covered: Tools = external functions your agent can use to take action MCP = a protocol that helps your agent discover and use those tools reliably If the agent calls the wrong tool—or uses the right tool incorrectly—check your system prompt and test the tool logic separately to isolate the issue. 📺 Want to Go Deeper? Check out my latest video on how to connect your agent to a MCP server—it’s part of the Build an Agent Series, where I walk through the building blocks of turning an idea into a working AI agent. The MCP for Beginners curriculum covers all the essentials—MCP architecture, creating and debugging servers, and best practices for developing, testing, and deploying MCP servers and features in production environments. It also includes several hands-on exercises across .NET, Java, TypeScript, JavaScript and Python. 👉 Explore the full curriculum: aka.ms/AITKmcp And for all your general AI and AI agent questions, join us in the Azure AI Foundry Discord! You can find me hanging out there answering your questions about the AI Toolkit. I'm looking forward to chatting with you there! Whether you’re building a productivity agent, a data assistant, or a game bot—tools are how you turn your agent from smart to useful.