genaiops
17 TopicsMicrosoft Foundry: An End-to-End Platform for Building, Governing, and Scaling AI
Microsoft Foundry: What It Is and How to Get Started As organizations accelerate their adoption of AI, one challenge consistently emerges: how to move from experimentation to production at scale in a secure, responsible, and efficient way. Microsoft Foundry exists to address exactly that challenge. What Is Microsoft Foundry? Microsoft Foundry is an end-to-end platform experience that brings together Microsoft’s AI development, deployment, and governance capabilities into a unified environment. It enables developers, data scientists, and enterprises to build, customize, deploy, and operate AI solutions, including generative AI, using Microsoft and partner models, tools, and services. Rather than being a single product, Foundry is a curated and integrated experience that spans model access, tooling, orchestration, evaluation, and enterprise-grade controls. Why Microsoft Foundry Exists AI innovation is moving fast, but enterprise adoption requires more than just access to models. Customers need: A consistent way to work with multiple models, including Microsoft, OpenAI, and open-source models Built-in security, compliance, and responsible AI capabilities Tooling that supports the full AI lifecycle, not just prototyping Seamless integration with existing data platforms, applications, and cloud operations Microsoft Foundry was created to reduce friction between experimentation and real-world deployment while aligning AI development with enterprise standards, governance, and scale. What’s Included in Microsoft Foundry? Microsoft Foundry brings together several key capabilities: 1. Model Choice and Flexibility Access to leading foundation models, including Azure OpenAI models and selected open-source models Ability to evaluate and select models based on performance, cost, and use case 2. AI Development and Orchestration Tools Prompt engineering, fine-tuning, and grounding with enterprise data Tools for building copilots, chat experiences, and AI-powered applications Orchestration across tools, APIs, and workflows 3. Evaluation, Safety, and Responsible AI Built-in evaluation for quality, latency, and cost Content safety, monitoring, and governance controls Alignment with Microsoft’s Responsible AI principles 4. Enterprise-Grade Platform Integration Native integration with Azure, Microsoft Fabric, Power Platform, and developer tools Identity, security, and compliance through Microsoft Entra and Azure controls Observability and lifecycle management for production workloads How to Get Started Getting started with Microsoft Foundry is straightforward: Start in Azure - Use Azure as the control plane to access Foundry experiences and services. Explore models and tools - Experiment with available models, build prompts, and prototype AI workflows. Ground with your data - Connect enterprise data securely to create more relevant and contextual AI experiences. Evaluate and deploy - Use built-in evaluation and safety tools, then deploy AI solutions into production with confidence. Scale responsibly - Apply governance, monitoring, and cost controls as adoption grows. Final Thoughts Microsoft Foundry represents Microsoft’s vision for enterprise AI done right. It offers flexible model choice, strong development tooling, and built-in trust. By unifying AI development and operations into a single experience, Foundry helps organizations move faster while staying secure, compliant, and future-ready. Whether you are just starting with generative AI or scaling existing solutions, Microsoft Foundry provides a practical foundation to build on.533Views0likes0CommentsWhat is trending in Hugging Face on Microsoft Foundry? Feb, 2, 2026
Open‑source AI is moving fast, with important breakthroughs in reasoning, agentic systems, multimodality, and efficiency emerging every day. Hugging Face has been a leading platform where researchers, startups, and developers share and discover new models. Microsoft Foundry brings these trending Hugging Face models into a production‑ready experience, where developers can explore, evaluate, and deploy them within their Azure environment. Our weekly Model Monday’s series highlights Hugging Face models available in Foundry, focusing on what matters most to developers: why a model is interesting, where it fits, and how to put it to work quickly. This week’s Model Mondays edition highlights three Hugging Face models, including a powerful Mixture-of-Experts model from Z. AI designed for lightweight deployment, Meta’s unified foundation model for image and video segmentation, and MiniMax’s latest open-source agentic model optimized for complex workflows. Models of the week Z.AI’s GLM-4.7-flash Model Basics Model name: zai-org/GLM-4.7-Flash Parameters / size: 30B total -3B active Default settings: 131,072 max new tokens Primary task: Agentic, Reasoning and Coding Why this model matters Why it’s interesting: It utilizes a Mixture-of-Experts (MoE) architecture (30B total parameters and 3B active parameters) to offer a new option for lightweight deployment. It demonstrates strong performance on logic and reasoning benchmarks, outperforming similar sized models like gpt-oss-20b on AIME 25 and GPQA benchmarks. It supports advanced inference features like "Preserved Thinking" mode for multi-turn agentic tasks. Best‑fit use cases: Lightweight local deployment, multi-turn agentic tasks, and logical reasoning applications. What’s notable: From the Foundry catalog, users can deploy on a A100 instance or unsloth/GLM-4.7-Flash-GGUF on a CPU. ource SOTA scores among models of comparable size. Additionally, compared to similarly sized models, GLM-4.7-Flash demonstrates superior frontend and backend development capabilities. Click to see more: https://docs.z.ai Try it Use case Best‑practice prompt pattern Agentic coding (multi‑step repo work, debugging, refactoring) Treat the model as an autonomous coding agent, not a snippet generator. Explicitly require task decomposition and step‑by‑step execution, then a single consolidated result. Long‑context agent workflows (local or low‑cost autonomous agents) Call out long‑horizon consistency and context preservation. Instruct the model to retain earlier assumptions and decisions across turns. Now that you know GLM‑4.7‑Flash works best when you give it a clear goal and let it reason through a bounded task, here’s an example prompt that a product or engineering team might use to identify risks and propose mitigations: You are a software reliability analyst for a mid‑scale SaaS platform. Review recent incident reports, production logs, and customer issues to uncover edge‑case failures outside normal usage (e.g., rare inputs, boundary conditions, timing/concurrency issues, config drift, or unexpected feature interactions). Prioritize low‑frequency, high‑impact risks that standard testing misses. Recommend minimal, low‑cost fixes (validation, guardrails, fallback logic, or documentation). Deliver a concise executive summary with sections: Observed Edge Cases, Root Causes, User Impact, Recommended Lightweight Fixes, and Validation Steps. Meta's Segment Anything 3 (SAM3) Model Basics Model name: facebook/sam3 Parameters / size: 0.9B Primary task: Mask Generation, Promptable Concept Segmentation (PCS) Why this model matters Why it’s interesting: It handles a vastly larger set of open-vocabulary prompts than SAM 2, and unifies image and video segmentation capabilities. It includes a "SAM 3 Tracker" mode that acts as a drop-in replacement for SAM 2 workflows with improved performance. Best‑fit use cases: Open-vocabulary object detection, video object tracking, and automatic mask generation What’s notable: Introduces Promptable Concept Segmentation (PCS), allowing users to find all matching objects (e.g., "dial") via text prompt rather than just single instances. Try it This model enables users to identify specific objects within video footage and isolate them over extended periods. With just one line of code, it is possible to detect multiple similar objects simultaneously. The accompanying GIF demonstrates how SAM3 efficiently highlights players wearing white on the field as they appear and disappear from view. Additional examples are available at the following repository: https://github.com/facebookresearch/sam3/blob/main/assets/player.gif Use case Best‑practice prompt pattern Agentic coding (multi‑step repo work, debugging, refactoring) Treat SAM 3 as a concept detector, not an interactive click tool. Use short, concrete noun‑phrase concept prompts instead of describing the scene or asking questions. Example prompt: “yellow school bus” or “shipping containers”. Avoid verbs or full sentences. Video segmentation + object tracking Specify the same concept prompt once, then apply it across the video sequence. Do not restate the prompt per frame. Let the model maintain identity continuity. Example: “person wearing a red jersey”. Hard‑to‑name or visually subtle objects Use exemplar‑based prompts (image region or box) when text alone is ambiguous. Optionally combine positive and negative exemplars to refine the concept. Avoid over‑constraining with long descriptions. Using the GIF above as a leading example, here is a prompt that shows how SAM 3 turns raw sports footage into structured, reusable data. By identifying and tracking players based on visual concepts like jersey color so that sports leagues can turn tracked data into interactive experiences where automated player identification can relay stats, fun facts, etc when built into a larger application. Here is a prompt that will allow you to start identifying specific players across video: Act as a sports analytics operator analyzing football match footage. Segment and track all football players wearing blue jerseys across the video. Generate pixel‑accurate segmentation masks for each player and assign persistent instance IDs that remain stable during camera movement, zoom, and player occlusion. Exclude referees, opposing team jerseys, sidelines, and crowd. Output frame‑level masks and tracking metadata suitable for overlays, player statistics, and downstream analytics pipelines. MiniMax AI's MiniMax-M2.1 Model Basics Model name: MiniMaxAI/MiniMax-M2.1 Parameters / size: 229B-10B Active Default settings: 200,000 max new tokens Primary task: Agentic and Coding Why this model matters Why it’s interesting: It is optimized for robustness in coding, tool use, and long-horizon planning, outperforming Claude Sonnet 4.5 in multilingual scenarios. It excels in full-stack application development, capable of architecting apps "from zero to one”. Previous coding models focused on Python optimization, M2.1 brings enhanced capabilities in Rust, Java, Golang, C++, Kotlin, Objective-C, TypeScript, JavaScript, and other languages. The model delivers exceptional stability across various coding agent frameworks. Best‑fit use cases: Lightweight local deployment, multi-turn agentic tasks, and logical reasoning applications. What’s notable: The release of open-source weights for M2.1 delivers a massive leap over M2 on software engineering leaderboards. https://www.minimax.io/ Try it Use case Best‑practice prompt pattern End‑to‑end agentic coding (multi‑file edits, run‑fix loops) Treat the model as an autonomous coding agent, not a snippet generator. Explicitly require task decomposition and step‑by‑step execution, then a single consolidated result. Long‑horizon tool‑using agents (shell, browser, Python) Explicitly request stepwise planning and sequential tool use. M2.1’s interleaved thinking and improved instruction‑constraint handling are designed for complex, multi‑step analytical tasks that require evidence tracking and coherent synthesis, not conversational back‑and‑forth. Long‑context reasoning & analysis (large documents / logs) Declare the scope and desired output structure up front. MiniMax‑M2.1 performs best when the objective and final artifact are clear, allowing it to manage long context and maintain coherence. Because MiniMax‑M2.1 is designed to act as a long‑horizon analytical agent, it shines when you give it a clear end goal and let it work through large volumes of information—here’s a prompt a risk or compliance team could use in practice: You are a financial risk analysis agent. Analyze the following transaction logs and compliance policy documents to identify potential regulatory violations and systemic risk patterns. Plan your approach before executing. Work through the data step by step, referencing evidence where relevant. Deliver a final report with the following sections: Key Risk Patterns Identified, Supporting Evidence, Potential Regulatory Impact, Recommended Mitigations. Your response should be a complete, executive-ready report, not a conversational draft. Getting started You can deploy open‑source Hugging Face models directly in Microsoft Foundry by browsing the Hugging Face collection in the Foundry model catalog and deploying to managed endpoints in just a few clicks. You can also start from the Hugging Face Hub. First, select any supported model and then choose "Deploy on Microsoft Foundry", which brings you straight into Azure with secure, scalable inference already configured. Learn how to discover models and deploy them using Microsoft Foundry documentation. Follow along the Model Mondays series and access the GitHub to stay up to date on the latest Read Hugging Face on Azure docs Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry Explore models in Microsoft Foundry985Views0likes0CommentsImplementing MCP Remote Servers with Azure Function App and GitHub Copilot Integration
Introduction In the evolving landscape of AI-driven applications, the ability to seamlessly connect large language models (LLMs) with external tools and data sources is becoming a cornerstone of intelligent system design. Model Context Protocol (MCP) — a specification that enables AI agents to discover and invoke tools dynamically, based on context. While MCP is powerful, implementing it from scratch can be daunting !!! That’s where Azure Functions comes in handy. With its event-driven, serverless architecture, Azure Functions now supports a preview extension for MCP, allowing developers to build remote MCP servers that are scalable, secure, and cloud-native. Further, In VS Code, GitHub Copilot Chat in Agent Mode can connect to your deployed Azure Function App acting as an MCP server. This connection allows Copilot to leverage the tools and services exposed by your function app. Why Use Azure Functions for MCP? Serverless Simplicity: Deploy MCP endpoints without managing infrastructure. Secure by Design: Leverage HTTPS, system keys, and OAuth via EasyAuth or API Management. Language Flexibility: Build in .NET, Python, or Node.js using QuickStart templates. AI Integration: Enable GitHub Copilot, VS Code, or other AI agents to invoke your tools via SSE endpoints. Prerequisites Python version 3.11 or higher Azure Functions Core Tools >= 4.0.7030 Azure Developer CLI To use Visual Studio Code to run and debug locally: Visual Studio Code Azure Functions extension An storage emulator is needed when developing azure function app in VScode. you can deploy Azurite extension in VScode to meet this requirement. Press enter or click to view image in full size You can run the Azurite in VS Code as shown below. C:\Program Files\Microsoft Visual Studio\2022\Enterprise\Common7\IDE\Extensions\Microsoft\Azure Storage Emulator> .\azurite.exe Press enter or click to view image in full size alternatively, you can also run Azurite in docker container as shown below. docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 \ mcr.microsoft.com/azure-storage/azurite For more information about setting up Azurite, visit Use Azurite emulator for local Azure Storage development | Microsoft Learn Github Repositories Following Github repos are needed to setup this PoC. Repository for MCP server using Azure Function App https://github.com/mafzal786/mcp-azure-functions-python.git Repository for AI Foundry agent as MCP Client https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git Clone the repository Run the following command to clone the repository to start building your MCP server using Azure function app. git clone https://github.com/mafzal786/mcp-azure-functions-python.git Run the MCP server in VS Code Once cloned. Open the folder in VS Code. Create a virtual environment in VS Code. Change directory to “src” in a new terminal window, install the python dependencies and start the function host locally as shown below. cd src pip install -r requirements.txt func start Note: by default this will use the webhooks route: /runtime/webhooks/mcp/sse. Later we will use this in Azure to set the key on client/host calls: /runtime/webhooks/mcp/sse?code=<system_key> Press enter or click to view image in full size MCP Inspector In a new terminal window, install and run MCP Inspector. npx @modelcontextprotocol/inspector Click to load the MCP inspector. Also provide the generated proxy session token. http://127.0.0.1:6274/#resources In the URL type and click “Connect”: http://localhost:7071/runtime/webhooks/mcp/sse Once connected, click List Tools under Tools and select “hello_mcp” tool and click “Run Tool” for testing as shown below. Press enter or click to view image in full size Select another tool such as get_stockprice and run it as shown below. Press enter or click to view image in full size Deploy Function App to Azure from VS Code For deploying function app to azure from vs code, make sure you have Azure Tools extension enabled in VS Code. To learn more about Azure Tools extension, visit the following Azure Extensions if your VS code environment is not setup for Azure development, follow Configure Visual Studio Code for Azure development with .NET — .NET | Microsoft Learn Once Azure Tools are setup, sign in to Azure account with Azure Tools Press enter or click to view image in full size Once Sign-in is completed, you should be able to see all of your existing resources in the Resources view. These resources can be managed directly in VS Code. Look for Function App in Resource, right click and click “Deploy to Function App”. Press enter or click to view image in full size If you already have it deployed, you will get the following pop-up. Click “Deploy” Press enter or click to view image in full size This will start deploying your function app to Azure. In VS Code, Azure tab will display the following. Press enter or click to view image in full size Once the deployment is completed, you can view the function app and all the tools in Azure portal under function app as shown below. Press enter or click to view image in full size Get the mcp_extension key from Functions → App Keys in Function App. Press enter or click to view image in full size This mcp_extension key would be needed in mcp.json file in VS code, if you would like to test the MCP server using Github Copilot in VS Code. Your entries in mcp.json file will look like as below for example. { "inputs": [ { "type": "promptString", "id": "functions-mcp-extension-system-key", "description": "Azure Functions MCP Extension System Key", "password": true }, { "type": "promptString", "id": "functionapp-name", "description": "Azure Functions App Name" } ], "servers": { "remote-mcp-function": { "type": "sse", "url": "https://${input:functionapp-name}.azurewebsites.net/runtime/webhooks/mcp/sse", "headers": { "x-functions-key": "${input:functions-mcp-extension-system-key}" } }, "local-mcp-function": { "type": "sse", "url": "http://0.0.0.0:7071/runtime/webhooks/mcp/sse" } } } Test Azure Function MCP Server in MCP Inspector Launch MCP Inspector and provide the Azure Function in MCP inspector URL. Provide authentication as shown below. Bearer token is mcp_extension key. Testing an MCP server with GitHub Copilot Testing an MCP server with GitHub Copilot involves configuring and utilizing the server within your development environment to provide enhanced context and capabilities to Copilot Chat. Steps to Test an MCP Server with GitHub Copilot: Ensure Agent Mode is Enabled: Open Copilot Chat in Visual Studio Code and select “Agent” mode. This mode allows Copilot to interact with external tools and services, including MCP servers. Add the MCP Server: Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and run the command MCP: Add Server. Press enter or click to view image in full size Follow the prompts to configure the server. You can choose to add it to your workspace settings (creating a .vscode/mcp.json file) . Select HTTP or Server-Sent events Press enter or click to view image in full size Specify the URL and click Enter Press enter or click to view image in full size Provide a name of your choice Press enter or click to view image in full size Select scope as Global or workspace. I selected Workspace Press enter or click to view image in full size This will generate mcp.json file in .vscode or create a new entry if mcp.json already exists as shown below. Click Start to “start” the server. Also make sure your Azure function app is locally running with func start command. Press enter or click to view image in full size Now Type the prompt as shown below. Press enter or click to view image in full size Try another tool as below. Press enter or click to view image in full size VS code terminal output for reference. Press enter or click to view image in full size Testing an MCP server with Claude Desktop Claude Desktop is a standalone AI application that allows users to interact with Claude AI models directly from their desktop, providing a seamless and efficient experience. you can download Claude desktop at Download Claude In this article, I have added another tool to utilize to test your MCP server running in Azure Function app. Modify claude_desktop_config.json with the following. you can find this file in window environment at C:\Users\<username>\AppData\Roaming\Claude { "mcpServers": { "my mcp": { "command": "npx", "args": [ "mcp-remote", "http://localhost:7071/runtime/webhooks/mcp/sse" ] } } } Note: If claude_desktop_config.json does not exists, click on setting in Claude desktop under user and visit developer tab. You will see you MCP server in Claude Desktop as shown below. Press enter or click to view image in full size Type the prompt such as “What is the stock price of Tesla” . After submitting, you will notice that it is invoking the tool “get_stockprice” from the MCP server running locally and configured in the .json earlier. Click Allow once or Allow always as shown below. Following output will be displayed. Press enter or click to view image in full size Now lets try weather related prompt. As you can see, it has invoked “get_weatheralerts” tool from MCP server. Press enter or click to view image in full size Azure AI Foundry agent as MCP Client Use the following Github repo to set up Azure AI Foundry agent as MCP client. git clone https://github.com/mafzal786/ai-foundry-agent-with-remote-mcp-using-azure-functionapp.git Open the code in VS code and follow the instructions mentioned in README.md file at Github repo. Once you execute the code, following output will show up in VS code. Press enter or click to view image in full size In this code, message is hard coded. Change the content to “what is weather advisory for Florida” and rerun the program. It will call get_weatheralerts tool and output will look like as below. Press enter or click to view image in full size Conclusion The integration of Model Context Protocol (MCP) with Azure Functions marks a pivotal step in democratizing AI agent development. By leveraging Azure’s serverless architecture, developers can now build remote MCP servers that scale automatically, integrate seamlessly with other Azure services, and expose modular tools to intelligent agents like GitHub Copilot. This setup not only simplifies the deployment and management of MCP servers but also enhances the developer experience — allowing tools to be invoked contextually by AI agents in environments like VS Code, GitHub Codespaces, or Copilot Studio[2]. Whether you’re building a tool to query logs, calculate metrics, or manage data, Azure Functions provides the flexibility, security, and scalability needed to bring your AI-powered workflows to life. As the MCP spec continues to evolve, and GitHub Copilot expands its agentic capabilities, this architecture positions you to stay ahead — offering a robust foundation for cloud-native AI tooling that’s both powerful and future-proof.1.5KViews1like1CommentThe Future of AI: The paradigm shifts in Generative AI Operations
Dive into the transformative world of Generative AI Operations (GenAIOps) with Microsoft Azure. Discover how businesses are overcoming the challenges of deploying and scaling generative AI applications. Learn about the innovative tools and services Azure AI offers, and how they empower developers to create high-quality, scalable AI solutions. Explore the paradigm shift from MLOps to GenAIOps and see how continuous improvement practices ensure your AI applications remain cutting-edge. Join us on this journey to harness the full potential of generative AI and drive operational excellence.7.6KViews1like1CommentThe Future of AI: Maximize your fine-tuned model performance with the new Azure AI Evaluation SDK
In this article, we will explore how to effectively evaluate fine-tuned AI models using the new Azure AI Evaluation SDK. This comprehensive guide is the fourth part of our series on making large language model distillation easier. We delve into the importance of model evaluation, outline a systematic process for assessing the performance of a distilled student model against a baseline model, and demonstrate the use of advanced metrics provided by Azure's SDK. Join us as we navigate the intricacies of AI evaluation and provide insights for continuous model improvement and operational efficiency.1.8KViews1like0CommentsThe Future Of AI: Deconstructing Contoso Chat - Learning GenAIOps in practice
How can AI engineers build applied knowledge for GenAIOps practices? By deconstructing working samples! In this multi-part series, we deconstruct Contoso Chat (a RAG-based retail copilot sample) and use it to learn the tools and workflows to streamline out end-to-end developer journey using Azure AI Foundry.1KViews0likes0CommentsThe Future of AI: Reduce AI Provisioning Effort - Jumpstart your solutions with AI App Templates
In the previous post, we introduced Contoso Chat – an open-source RAG-based retail chat sample for Azure AI Foundry, that serves as both an AI App template (for builders) and the basis for a hands-on workshop (for learners). And we briefly talked about five stages in the developer workflow (provision, setup, ideate, evaluate, deploy) that take them from the initial prompt to a deployed product. But how can that sample help you build your app? The answer lies in developer tools and AI App templates that jumpstart productivity by giving you a fast start and a solid foundation to build on. In this post, we answer that question with a closer look at Azure AI App templates - what they are, and how we can jumpstart our productivity with a reuse-and-extend approach that builds on open-source samples for core application architectures.534Views0likes0CommentsThe Future of AI: Developing Lacuna - an agent for Revealing Quiet Assumptions in Product Design
A conversational agent named Lacuna is helping product teams uncover hidden assumptions embedded in design decisions. Built with Copilot Studio and powered by Azure AI Foundry, Lacuna analyzes product documents to identify speculative beliefs and assess their risk using design analysis lenses: impact, confidence, and reversibility. By surfacing cognitive biases and prompting reflection, Lacuna encourages teams to validate assumptions through lightweight evidence-gathering methods. This experiment in human-AI collaboration explores how agents can foster epistemic humility and transform static documents into dynamic conversations.648Views1like1CommentThe Future of AI: Creating a Web Application with Vibe Coding
Discover how vibe coding with GPT-5 in Azure AI Foundry transforms web development. This post walks through building a Translator API-powered web app using natural language instructions in Visual Studio Code. Learn how adaptive translation, tone and gender customization, and Copilot agent collaboration redefine the developer experience.1.2KViews0likes0Comments