From Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs
As AI engineers, we've spent years optimizing models for the cloud: scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint; it's a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance.

Why Edge AI, Why Now?

Edge AI isn't just about running models locally; it's about rethinking the entire lifecycle:

- Latency: Decisions in milliseconds, not round-trips to the cloud.
- Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance.
- Resilience: Offline-first apps that don't break when the network does.
- Cost: Reduced cloud compute and bandwidth overhead.

With Windows AI PCs powered by Intel and Qualcomm NPUs, and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency.

What You'll Learn in Edge AI for Beginners

The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment.

Multi-Language Support

This content is available in over 48 languages, so you can read and study in your native language.

What You'll Master

This course takes you from fundamental concepts to production-ready implementations, covering:

- Small Language Models (SLMs) optimized for edge deployment
- Hardware-aware optimization across diverse platforms
- Real-time inference with privacy-preserving capabilities
- Production deployment strategies for enterprise applications

Why EdgeAI Matters

Edge AI represents a paradigm shift that addresses critical modern challenges:

- Privacy & Security: Process sensitive data locally without cloud exposure
- Real-time Performance: Eliminate network latency for time-critical applications
- Cost Efficiency: Reduce bandwidth and cloud computing expenses
- Resilient Operations: Maintain functionality during network outages
- Regulatory Compliance: Meet data sovereignty requirements

Edge AI

Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated, without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making.

Core Principles:

- On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs)
- Offline capability: Functions without persistent internet connectivity
- Low latency: Immediate responses suited for real-time systems
- Data sovereignty: Keeps sensitive data local, improving security and compliance

Small Language Models (SLMs)

SLMs like Phi-4, Mistral-7B, Qwen, and Gemma are optimized counterparts of larger LLMs, trained or distilled for:

- Reduced memory footprint: Efficient use of limited edge-device memory
- Lower compute demand: Optimized for CPU and edge-GPU performance
- Faster startup times: Quick initialization for responsive applications

They unlock powerful NLP capabilities while meeting the constraints of:

- Embedded systems: IoT devices and industrial controllers
- Mobile devices: Smartphones and tablets with offline capabilities
- IoT devices: Sensors and smart devices with limited resources
- Edge servers: Local processing units with limited GPU resources
- Personal computers: Desktop and laptop deployment scenarios

Course Modules & Navigation
Course duration: 10 hours of content.

| Module | Topic | Focus Area | Key Content | Level | Duration |
| --- | --- | --- | --- | --- | --- |
| 📖 00 | Introduction to EdgeAI | Foundation & Context | EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives | Beginner | 1-2 hrs |
| 📚 01 | EdgeAI Fundamentals | Cloud vs Edge AI comparison | EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment | Beginner | 3-4 hrs |
| 🧠 02 | SLM Model Foundations | Model families & architecture | Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica | Beginner | 4-5 hrs |
| 🚀 03 | SLM Deployment Practice | Local & cloud deployment | Advanced Learning • Local Environment • Cloud Deployment | Intermediate | 4-5 hrs |
| ⚙️ 04 | Model Optimization Toolkit | Cross-platform optimization | Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis | Intermediate | 5-6 hrs |
| 🔧 05 | SLMOps Production | Production operations | SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment | Advanced | 5-6 hrs |
| 🤖 06 | AI Agents & Function Calling | Agent frameworks & MCP | Agent Introduction • Function Calling • Model Context Protocol | Advanced | 4-5 hrs |
| 💻 07 | Platform Implementation | Cross-platform samples | AI Toolkit • Foundry Local • Windows Development | Advanced | 3-4 hrs |
| 🏭 08 | Foundry Local Toolkit | Production-ready samples | Sample applications (see details below) | Expert | 8-10 hrs |

Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing.

Developer Highlights

- 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration.
- 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU.
- 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps.
- 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference.

(A minimal ONNX Runtime sketch appears at the end of this post.)

Local AI: Beyond the Edge

Local AI isn't just about inference; it's about autonomy. Imagine agents that:

- Learn from local context
- Adapt to user behavior
- Respect privacy by design

With tools like Agent Framework, Azure AI Foundry, Windows Copilot Studio, and Foundry Local, developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency.

Try It Yourself

Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or an IoT device. Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.
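To make local inference concrete, here is a minimal sketch using the onnxruntime-node package. It is an illustration, not curriculum code: the model file name, input name, and tensor shape below are placeholders you would replace with your own model's values.

```typescript
// Minimal local-inference sketch with ONNX Runtime for Node.js.
// Assumptions: "model.onnx" exists locally, takes a float32 input named
// "input" of shape [1, 3, 224, 224], and produces one output tensor.
import * as ort from "onnxruntime-node";

async function main() {
  // Load the model into an inference session (CPU execution by default).
  const session = await ort.InferenceSession.create("model.onnx");

  // Build a dummy input tensor; replace with real preprocessed data.
  const data = new Float32Array(1 * 3 * 224 * 224);
  const input = new ort.Tensor("float32", data, [1, 3, 224, 224]);

  // Run inference entirely on-device; no network round-trip.
  const results = await session.run({ input });
  console.log(results);
}

main().catch(console.error);
```

The same session API works whether the execution provider is CPU, GPU, or an NPU, which is what makes it a natural fit for the heterogeneous hardware in Windows AI PCs.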
An Official, Free GenAI and Python Course! 🚀

Want to learn how to use generative AI models in your Python applications? We are organizing a series of nine livestreams, in English and Spanish, dedicated entirely to generative AI. We will cover language models (LLMs), embedding models, and vision models, as well as techniques like RAG, function calling, and structured outputs. We will also show you how to build agents and MCP servers, and we will talk about AI safety and evaluations to make sure your models and applications produce safe results.

🔗 Register for the full series.

Besides the livestreams, you can join our weekly office hours on the AI Foundry Discord to ask any questions that don't get answered during the chat. See you in the streams! 👋🏻

Language Models
📅 October 7, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

Join the first session of our Python + AI series! In this session we will cover language models (LLMs), the models that power ChatGPT and GitHub Copilot. We will use Python to interact with LLMs through packages like the OpenAI SDK and LangChain. We will experiment with prompt engineering and few-shot examples to improve results. We will also build a full-stack app powered by LLMs and explain why concurrency and streaming matter in user-facing AI apps.
👉 If you want to follow the examples live, make sure you have a GitHub account.

Vector Embeddings
📅 October 8, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the second Python + AI session, we will explore vector embeddings, a way to encode text or images as arrays of floating-point numbers. Embedding models enable similarity search across many kinds of content. We will use models like OpenAI's text-embedding-3 series, visualize results in Python, and compare distance metrics. We will also look at applying quantization and using multimodal embedding models.
👉 If you want to follow the examples live, make sure you have a GitHub account.

Retrieval-Augmented Generation (RAG)
📅 October 9, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the third session, we will explore RAG, a technique that sends context to the LLM so it can give more accurate answers within a specific domain. We will work with different data sources, including CSVs, web pages, documents, and databases, and build a full-stack RAG app with Azure AI Search.

Vision Models
📅 October 14, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

The fourth session is about vision models like GPT-4o and GPT-4o mini! These models can process both text and images, generating descriptions, extracting data, answering questions, or classifying content. We will use Python to send images to the models, build an image-based chat app, and integrate vision models into RAG flows.
👉 If you want to follow the examples live, make sure you have a GitHub account.

Structured Outputs
📅 October 15, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the fifth session, we will learn how to get LLMs to generate structured responses that follow a schema.
We will explore OpenAI's structured outputs mode and how to apply it to entity extraction, classification, and agent workflows.
👉 If you want to follow the examples live, make sure you have a GitHub account.

Quality and Safety
📅 October 16, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the sixth session, we will talk about using AI safely and evaluating the quality of its outputs. We will show how to set up Azure AI Content Safety, handle errors in Python code, and evaluate results with the Azure AI Evaluation SDK.

Tool Calling
📅 October 21, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the final week of the series, we focus on tool calling (function calling), the foundation for building AI agents. We will learn how to define tools in Python or JSON, handle model responses, and enable parallel calls and multiple iterations.
👉 If you want to follow the examples live, make sure you have a GitHub account.

AI Agents
📅 October 22, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

In the penultimate session, we will build AI agents! We will use frameworks like LangGraph, Semantic Kernel, AutoGen, and Pydantic AI. We will start with simple examples and work up to more complex architectures such as round-robin, supervisor, graphs, and ReAct.

Model Context Protocol (MCP)
📅 October 23, 2025 | 10:00 PM - 11:00 PM (UTC)
🔗 Register for the stream on Reactor

We close the series with the Model Context Protocol (MCP), the hottest open technology of 2025. You will learn how to use FastMCP to create a local MCP server and connect it to chatbots like GitHub Copilot. We will also look at integrating MCP with agent frameworks like LangGraph, Semantic Kernel, and Pydantic AI. And, of course, we will cover the security risks and best practices for developers.

How to Master GitHub Copilot: Build, Prompt, Deploy Smarter
Mastering GitHub Copilot: Build, Prompt, Deploy Smarter is a free, hands-on workshop designed to help developers go beyond autocomplete and unlock the true power of AI-assisted coding. Instead of toy examples, this course walks you through real-world software engineering challenges: messy codebases, multi-language projects, cloud deployments, and legacy system upgrades. You'll learn practical skills like prompt engineering, advanced Copilot features, and AI pair programming techniques that make you faster, sharper, and more creative.

Whether you're a junior developer or a seasoned architect, mastering GitHub Copilot will help you:

- Reduce cognitive load and focus on system design
- Accelerate onboarding for new engineers
- Write cleaner, more consistent code
- Automate repetitive tasks to free up time for innovation

AI coding tools like GitHub Copilot are no longer optional; they're essential. This workshop gives you the skills to collaborate with Copilot effectively and stay competitive in the age of AI-powered development.

Use Copilot and MCP to query Microsoft Learn Docs
Are you ready to take your Azure development workflow to the next level? In this post, we'll walk through how to use GitHub Copilot in Agent Mode, paired with MCP (Model Context Protocol) servers, to get trusted, grounded answers from Microsoft Learn Docs right inside your coding workspace. Whether you're tired of switching tabs to search documentation or want to ensure your AI assistant's answers are always accurate, this guide will show you how to streamline your workflow and boost your productivity.
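As a rough idea of what the setup looks like, here is a hedged sketch of a VS Code MCP configuration pointing at the Microsoft Learn Docs MCP server. Treat the file location, schema, and URL as assumptions to verify against the current VS Code and Microsoft Learn documentation.

```jsonc
// .vscode/mcp.json (sketch; verify the schema and URL in the official docs)
{
  "servers": {
    "microsoft-learn": {
      "type": "http",
      "url": "https://learn.microsoft.com/api/mcp"
    }
  }
}
```

With the server registered, Copilot's Agent Mode can call the Learn Docs tools whenever you ask documentation questions in chat, so answers stay grounded in official content.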
It's time to secure your MCP servers. Here's how.

The Model Context Protocol (MCP) provides a powerful, standardized way for LLMs to interact with external tools. But as soon as you move from a local demo to a real-world application, a critical question arises: how do you secure it? Exposing an MCP server without security is like leaving the front door of your house wide open: anyone could walk in, use your tools, access your data, or cause havoc.

This guide walks you through securing a Node.js MCP server from the ground up using JSON Web Tokens (JWT). We'll cover authentication (who are you?) and authorization (what are you allowed to do?), with practical code samples based on the project at Azure-Samples/mcp-container-ts.

The Goal: From Unprotected to Fully Secured

Our goal is to take a basic MCP server and add a robust security layer that:

- Authenticates every request to ensure it comes from a known user.
- Authorizes the user, granting them specific permissions based on their role (e.g., admin vs. readonly).
- Protects individual tools, so only authorized users can access them.

Why JWT Is Perfect for MCP Security

JWT is the industry standard for securing APIs, and it's an ideal fit for MCP servers for a few key reasons:

- Stateless: Each JWT contains all the information needed to verify a user. The server doesn't need to store session state, which makes it highly scalable and well suited to handling many concurrent requests from AI agents.
- Self-contained: A JWT can carry user details, their role, and specific permissions directly in its payload.
- Tamper-proof: JWTs are digitally signed. If a token is modified in any way, the signature becomes invalid and the server rejects it.
- Portable: A single JWT can be used to access multiple secured services, which is common in microservice architectures.

Visualizing the Security Flow

For visual learners, a sequence diagram in the sample repository illustrates the complete authentication and authorization flow.

A Note on MCP Specification Compliance

It's important to note that this guide provides a practical, real-world implementation for securing an MCP server, but it does not fully implement the official MCP authorization specification. This implementation focuses on a robust, stateless, and widely understood pattern using traditional JWTs and role-based access control (RBAC), which is sufficient for many use cases. Full compliance with the MCP specification would require additional features. In a future post, we may explore how to extend this JWT implementation to fully align with the specification. We recommend starring the GitHub repository to stay updated and receive notifications about future improvements.

Step 1: Defining Roles and Permissions

Before writing any code, we must define our security rules. What roles exist? What can each role do? This is the foundation of our authorization system. In our src/auth/authorization.ts file, we define UserRole and Permission enums. This makes the code clear, readable, and less prone to typos.
```typescript
// src/auth/authorization.ts
export enum UserRole {
  ADMIN = "admin",
  USER = "user",
  READONLY = "readonly",
}

export enum Permission {
  CREATE_TODOS = "create:todos",
  READ_TODOS = "read:todos",
  UPDATE_TODOS = "update:todos",
  DELETE_TODOS = "delete:todos",
  LIST_TOOLS = "list:tools",
}

// This interface defines the structure of our authenticated user
export interface AuthenticatedUser {
  id: string;
  role: UserRole;
  permissions: Permission[];
}

// A simple map to assign default permissions to each role
const rolePermissions: Record<UserRole, Permission[]> = {
  [UserRole.ADMIN]: Object.values(Permission), // Admin gets all permissions
  [UserRole.USER]: [
    Permission.CREATE_TODOS,
    Permission.READ_TODOS,
    Permission.UPDATE_TODOS,
    Permission.LIST_TOOLS,
  ],
  [UserRole.READONLY]: [Permission.READ_TODOS, Permission.LIST_TOOLS],
};

// Resolve the default permissions for a role (used by the JWT service below)
export function getPermissionsForRole(role: UserRole): Permission[] {
  return rolePermissions[role] ?? [];
}

// Check whether a user holds a specific permission (used by request handlers)
export function hasPermission(
  user: AuthenticatedUser,
  permission: Permission
): boolean {
  return user.permissions.includes(permission);
}
```

Step 2: Creating a JWT Service

Next, we need a centralized service to handle all JWT-related logic: creating new tokens for testing and, most importantly, verifying incoming tokens. This keeps our security logic clean and in one place. Here is the complete src/auth/jwt.ts file. It uses the jsonwebtoken library to do the heavy lifting.

```typescript
// src/auth/jwt.ts
import * as jwt from "jsonwebtoken";
import {
  AuthenticatedUser,
  getPermissionsForRole,
  UserRole,
} from "./authorization.js";

// These values should come from environment variables for security
const JWT_SECRET = process.env.JWT_SECRET!;
const JWT_AUDIENCE = process.env.JWT_AUDIENCE!;
const JWT_ISSUER = process.env.JWT_ISSUER!;
const JWT_EXPIRY = process.env.JWT_EXPIRY || "2h";

if (!JWT_SECRET || !JWT_AUDIENCE || !JWT_ISSUER) {
  throw new Error("JWT environment variables are not set!");
}

/**
 * Generates a new JWT for a given user payload.
 * Useful for testing or generating tokens on demand.
 */
export function generateToken(
  user: Partial<AuthenticatedUser> & { id: string }
): string {
  const payload = {
    id: user.id,
    role: user.role || UserRole.USER,
    permissions:
      user.permissions || getPermissionsForRole(user.role || UserRole.USER),
  };

  return jwt.sign(payload, JWT_SECRET, {
    algorithm: "HS256",
    expiresIn: JWT_EXPIRY,
    audience: JWT_AUDIENCE,
    issuer: JWT_ISSUER,
  });
}

/**
 * Verifies an incoming JWT and returns the authenticated user payload if valid.
 */
export function verifyToken(token: string): AuthenticatedUser {
  try {
    const decoded = jwt.verify(token, JWT_SECRET, {
      algorithms: ["HS256"],
      audience: JWT_AUDIENCE,
      issuer: JWT_ISSUER,
    }) as jwt.JwtPayload;

    // Ensure the decoded token has the fields we expect
    if (typeof decoded.id !== "string" || typeof decoded.role !== "string") {
      throw new Error("Token payload is missing required fields.");
    }

    return {
      id: decoded.id,
      role: decoded.role as UserRole,
      permissions: decoded.permissions || [],
    };
  } catch (error: any) {
    // Log the specific error for debugging, but return a generic message
    console.error("JWT verification failed:", error.message);
    if (error instanceof jwt.TokenExpiredError) {
      throw new Error("Token has expired.");
    }
    if (error instanceof jwt.JsonWebTokenError) {
      throw new Error("Invalid token.");
    }
    throw new Error("Could not verify token.");
  }
}
```
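Before wiring up the middleware, it helps to have a token to test with. Here is a minimal sketch of a throwaway script that mints an admin token using the service above; the script path is hypothetical and it assumes the JWT_* environment variables are already set.

```typescript
// scripts/make-test-token.ts (hypothetical helper, not part of the sample repo)
import { generateToken } from "../src/auth/jwt.js";
import { UserRole } from "../src/auth/authorization.js";

// Mint a token for a test admin user; paste it into an Authorization header.
const token = generateToken({ id: "local-admin", role: UserRole.ADMIN });
console.log(`Authorization: Bearer ${token}`);
```

You can then exercise the secured endpoint with any HTTP client by sending that header, for example: curl -H "Authorization: Bearer <token>" http://localhost:3000/mcp (the port depends on your server configuration).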
Step 3: Building the Authentication Middleware

A middleware is a function that runs before your main request handler, which makes it the perfect place for our security check. This middleware inspects every incoming request, looks for a JWT in the Authorization header, and verifies it. If the token is valid, it attaches the user's information to the request object for later use. If not, it immediately sends a 401 Unauthorized error and stops the request from proceeding. To make this type-safe, we also extend Express's Request interface to include our user object.

```typescript
// src/server-middlewares.ts
import { Request, Response, NextFunction } from "express";
import { verifyToken, AuthenticatedUser } from "./auth/jwt.js";

// Extend the global Express Request interface to add our custom 'user' property
declare global {
  namespace Express {
    interface Request {
      user?: AuthenticatedUser;
    }
  }
}

export function authenticateJWT(
  req: Request,
  res: Response,
  next: NextFunction
): void {
  const authHeader = req.headers.authorization;

  if (!authHeader || !authHeader.startsWith("Bearer ")) {
    res.status(401).json({
      error: "Authentication required",
      message: "Authorization header with 'Bearer' scheme must be provided.",
    });
    return;
  }

  const token = authHeader.substring(7); // Remove "Bearer "

  try {
    const userPayload = verifyToken(token);
    req.user = userPayload; // Attach user payload to the request
    next(); // Proceed to the next middleware or request handler
  } catch (error: any) {
    res.status(401).json({
      error: "Invalid token",
      message: error.message,
    });
  }
}
```

Step 4: Protecting the MCP Server

Now we have all the pieces; let's put them together to protect our server. First, we apply the authenticateJWT middleware to the main MCP endpoint in src/index.ts. This ensures every request to /mcp must carry a valid JWT.

```typescript
// src/index.ts
// ... other imports
import { authenticateJWT } from "./server-middlewares.js";
// ...

const MCP_ENDPOINT = "/mcp";
const app = express();

// Apply security middleware ONLY to the MCP endpoint
app.use(MCP_ENDPOINT, authenticateJWT);

// ... rest of the file
```

Next, we enforce our fine-grained permissions. Let's secure the ListTools handler in src/server.ts by checking that the authenticated user has Permission.LIST_TOOLS before returning the list of tools.

```typescript
// src/server.ts
// ... other imports
import { hasPermission, Permission } from "./auth/authorization.js";

// ... inside the StreamableHTTPServer class
private setupServerRequestHandlers() {
  this.server.setRequestHandler(ListToolsRequestSchema, async (request) => {
    // The user is attached to the request by our middleware
    const user = this.currentUser;

    // 1. Check for an authenticated user
    if (!user) {
      return this.createRPCErrorResponse("Authentication required.");
    }

    // 2. Check if the user has the specific permission to list tools
    if (!hasPermission(user, Permission.LIST_TOOLS)) {
      return this.createRPCErrorResponse(
        "Insufficient permissions to list tools."
      );
    }

    // 3. If checks pass, filter tools based on the user's permissions
    const allowedTools = TodoTools.filter((tool) => {
      const requiredPermissions = this.getToolRequiredPermissions(tool.name);
      // The user must have at least one of the permissions required for the tool
      return requiredPermissions.some((p) => hasPermission(user, p));
    });

    return {
      jsonrpc: "2.0",
      tools: allowedTools,
    };
  });

  // ... other request handlers
}
```

With this change, a user with the readonly role can list tools, but a user without the LIST_TOOLS permission would be denied access.
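The same pattern extends to individual tool invocations. As a hedged sketch (the executeTool dispatcher and the exact handler shape are assumptions, not code from the sample repo), a CallTool handler could gate each tool behind its required permissions:

```typescript
// Sketch: enforcing per-tool permissions on tool calls.
// Assumes the same helpers as above; executeTool is a hypothetical dispatcher.
this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const user = this.currentUser;
  if (!user) {
    return this.createRPCErrorResponse("Authentication required.");
  }

  const toolName = request.params.name;
  const required = this.getToolRequiredPermissions(toolName);

  // Deny the call unless the user holds at least one required permission
  if (!required.some((p) => hasPermission(user, p))) {
    return this.createRPCErrorResponse(
      `Insufficient permissions to call '${toolName}'.`
    );
  }

  // Authorized: dispatch to the actual tool implementation
  return this.executeTool(toolName, request.params.arguments);
});
```

This keeps authorization decisions next to the tool boundary, so adding a new tool only requires declaring its required permissions.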
Conclusion and Next Steps

Congratulations! You've successfully implemented a robust authentication and authorization layer for your MCP server. By following these steps, you have:

- Defined clear roles and permissions.
- Created a centralized service for handling JWTs.
- Built a middleware to protect all incoming requests.
- Enforced granular permissions at the tool level.

Your MCP server is no longer an open door; it's a secure service. From here, you can expand on these concepts by adding more roles, more permissions, and more complex business logic to your authorization system. Star our GitHub repository to stay updated and receive notifications about future improvements.
Model Mondays S2E01 Recap: Advanced Reasoning Session

About Model Mondays

Want to know what reasoning models are and how you can build advanced reasoning scenarios, like a Deep Research agent, using Azure AI Foundry? Check out this recap of Model Mondays Season 2, Episode 1.

Model Mondays is a weekly series to help you build your model IQ in three steps:

1. Catch the 5-min Highlights on Monday to get up to speed on model news
2. Catch the 15-min Spotlight on Monday for a deep dive into a model or tool
3. Catch the 30-min AMA on Friday for a Q&A session with subject matter experts

Want to follow along?

- Register Here to watch upcoming livestreams for Season 2
- Visit The Forum to see the full AMA schedule for Season 2
- Register Here to join the AMA on Friday, Jun 20

Spotlight On: Advanced Reasoning

This week, the Model Mondays spotlight was on Advanced Reasoning with subject matter expert Marlene Mhangami. In this blog post, I'll share my five takeaways from this episode:

1. Why are reasoning models important?
2. What does an advanced reasoning scenario involve?
3. How can I get started with reasoning models?
4. Spotlight: My aha moment
5. Highlights: What's new in Azure AI

1. Why Are Reasoning Models Important?

In today's fast-evolving AI landscape, it's no longer enough for models to just complete text or summarize content. We need AI that can:

- Understand multi-step tasks
- Make decisions based on logic
- Plan sequences of actions or queries
- Connect context across turns

Reasoning models are large language models (LLMs) trained with reinforcement learning techniques to "think" before they answer. Rather than simply generating a response based on probability, these models follow an internal thought process, producing a chain of reasoning before responding. This makes them ideal for complex problem-solving tasks, and they're the foundation of building intelligent, context-aware agents. They enable next-gen AI workflows in everything from customer support to legal research and healthcare diagnostics. In short, they allow AI to go beyond surface-level responses and deliver solutions that reflect understanding, not just language patterning.

2. What Does an Advanced Reasoning Scenario Involve?

An advanced reasoning scenario is one where a model:

- Breaks a complex prompt into smaller steps
- Retrieves relevant external data
- Uses logic to connect the dots
- Outputs a structured, reasoned answer

Example: A user asks, "What are the financial and operational risks of expanding a startup to Southeast Asia in 2025?" This is the kind of question that requires extensive research and analysis. A reasoning model might tackle it by:

- Retrieving reports on Southeast Asia market conditions
- Breaking down risks into financial, political, and operational buckets
- Cross-referencing data with recent trends
- Returning a reasoned, multi-part answer

3. How Can I Get Started with Reasoning Models?

To get started, visit a catalog that carries these models. Try the GitHub Models Marketplace and look for the reasoning category in the filter, or browse the Azure AI Foundry model catalog and look for reasoning models by name, for example:

- The o-series of models from Azure OpenAI
- The DeepSeek-R1 models
- The Grok 3 models
- The Phi-4 reasoning models

Next, you can use SDKs or the Playground to explore model capabilities (a minimal SDK sketch follows the list below).

1. Try Lab 331 for a beginner-friendly guide.
2. Try Lab 333 for an advanced project.
3. Try the GitHub Model Playground to compare reasoning and GPT models.
4. Try the Deep Research Agent using LangChain sample as a great starting project.
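Here is a hedged sketch of calling a reasoning model with the OpenAI SDK for Node.js. The endpoint and model name are assumptions; check the GitHub Models marketplace or the Azure AI Foundry catalog for the values that apply to your account.

```typescript
// Sketch: querying a reasoning model via the OpenAI SDK.
// Assumptions: the GitHub Models endpoint below and the "DeepSeek-R1"
// model name; verify both against the current catalog before using.
import OpenAI from "openai";

async function main() {
  const client = new OpenAI({
    baseURL: "https://models.inference.ai.azure.com", // GitHub Models endpoint (assumed)
    apiKey: process.env.GITHUB_TOKEN, // a GitHub token works for GitHub Models
  });

  const response = await client.chat.completions.create({
    model: "DeepSeek-R1", // reasoning model; name may differ in the catalog
    messages: [
      {
        role: "user",
        content:
          "What are the operational risks of expanding a startup to Southeast Asia? Think step by step.",
      },
    ],
  });

  // Reasoning models typically return their chain of reasoning plus an answer.
  console.log(response.choices[0].message.content);
}

main().catch(console.error);
```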
Have questions or comments? Join the Friday AMA on the Azure AI Foundry Discord.

4. Spotlight: My Aha Moment

Before this session, I thought reasoning meant longer or more detailed responses. But this session helped me realize that reasoning means structured thinking: models now plan, retrieve, and respond with logic. This inspired me to think about building AI agents that go beyond chat and actually assist users like a teammate. It also made me want to dive deeper into LangChain + Azure AI workflows to build mini-agents for real-world use.

5. Highlights: What's New in Azure AI

Here's what's new in Azure AI Foundry:

- Direct from Azure Models: Try hosted models like OpenAI GPT on PTU plans
- SORA Video Playground: Generate video from prompts via SORA models
- Grok 3 Models: Now available for secure, scalable LLM experiences
- DeepSeek R1-0528: A reasoning-optimized, Microsoft-tuned open-source model

These are all available in the Azure Model Catalog and can be tried with your Azure account.

Did You Know?

Your first step is to find the right model for your task. But what if the model could be selected for you automatically, based on the prompt you provide? That's the magic of Model Router, a deployable AI chat model that dynamically selects the best LLM based on your prompt. Instead of choosing one model manually, the Router makes that choice in real time. Currently, this works with a fixed set of Azure OpenAI models, including a reasoning model option. Keep an eye on the documentation for updates. Why it's powerful:

- Saves cost by switching between models based on complexity
- Optimizes performance by selecting the right model for the task
- Lets you test and compare model outputs quickly

Try it out in Azure AI Foundry or read more in the Model Catalog.

Coming Up Next

Next week, we dive into the Model Context Protocol, an open protocol that empowers agentic AI applications by making it easier to discover and integrate knowledge and action tools with your model choices. Register Here to get reminded, and join us live on Monday!

Join the Community

Great devs don't build alone! In a fast-paced developer ecosystem, there's no time to hunt for help. That's why we have the Azure AI Developer Community. Join us today and let's journey together!

- Join the Discord for real-time chats, events, and learning
- Explore the Forum for AMA recaps, Q&A, and help!

About Me

I'm Sharda, a Gold Microsoft Learn Student Ambassador interested in cloud and AI. Find me on GitHub, Dev.to, Tech Community, and LinkedIn. In this blog series, I summarize my takeaways from each week's Model Mondays livestream.
Model Mondays S2:E6 Understanding Research & Innovation with SeokJin Han and Saumil Shrivastava

In this week's blog post, we dive into the cutting-edge research happening at Azure AI Foundry Labs. From the MCP Server that makes it easy to experiment with new models and tools, to Magentic-UI, which brings human-centered agent workflows to life, there's a lot to unpack!

The fantastic duo: How to build your modern APIs
🧠 Core Concept

The article introduces a Chat Playground System designed to streamline AI development by managing multiple chat scenarios (e.g., technical support, creative writing) from a single dashboard.

🔧 Key Features

- Scenario-aware sessions: Launch pre-configured chat contexts with one click.
- Dual access architecture: FastAPI for RESTful web apps, and MCP (Model Context Protocol) for AI tool integration.
- Streamlit integration: Wrapped with MCP to allow seamless interaction with AI tools.
- Automatic resource management: Smart port allocation and process cleanup.
- Context passing: Uses environment variables and temporary JSON files to transfer session data.

🚧 Challenges & Solutions

- Bridging MCP and Streamlit: Created a wrapper to translate protocol calls and maintain session state.
- Process management: Built an async manager to handle multiple Streamlit sessions reliably.
- Context transfer: Developed a hybrid system for passing rich context between processes.
- User experience: Simplified the interface with real-time feedback and intuitive controls.

💡 Lessons Learned

- Innovation thrives at protocol boundaries.
- Supporting both REST and MCP broadens adoption.
- Start simple, scale gradually.
- Process lifecycle management is critical.
- Contextual awareness enhances AI utility.
- Developer experience drives product success.

🔮 Future Directions

Level Up Your Python Game with Generative AI Free Livestream Series This October!
If you've been itching to go beyond basic Python scripts and dive into the world of AI-powered applications, this is your moment. Pamela Fox and Gwyneth Peña-Siguenza are thrilled to announce a brand-new free livestream series running throughout October, focused on Python + Generative AI, and this time we're going even deeper with Agents and the Model Context Protocol (MCP). Whether you're just starting out with LLMs or you're refining your multi-agent workflows, this series is designed to meet you where you are and push your skills to the next level.

🧠 What You'll Learn

Each session is packed with live coding, hands-on demos, and real-world examples you can run in GitHub Codespaces.

🎥 Why Join?

- Live coding: No slides-only sessions; we build together, step by step.
- All code shared: Clone and run in GitHub Codespaces or your local setup.
- Community support: Join weekly office hours and our AI Discord for Q&A and deeper dives.
- Modular learning: Each session stands alone, so you can jump in anytime.

🔗 Register for the full series

🌍 ¿Hablas español? We've got you covered! Gwyneth Peña-Siguenza will be leading a parallel series in Spanish, covering the same topics with localized examples and demos.

🔗 Regístrese para la serie en español

Whether you're building your first AI app or architecting multi-agent systems, this series is your launchpad. Come for the code, stay for the community, and leave with a toolkit that scales. Let's build something brilliant together. 💡 Join the discussions and share your experience in the Azure AI Discord Community.