slm
23 TopicsEssential Microsoft Resources for MVPs & the Tech Community from the AI Tour
Unlock the power of Microsoft AI with redeliverable technical presentations, hands-on workshops, and open-source curriculum from the Microsoft AI Tour! Whether you’re a Microsoft MVP, Developer, or IT Professional, these expertly crafted resources empower you to teach, train, and lead AI adoption in your community. Explore top breakout sessions covering GitHub Copilot, Azure AI, Generative AI, and security best practices—designed to simplify AI integration and accelerate digital transformation. Dive into interactive workshops that provide real-world applications of AI technologies. Take it a step further with Microsoft’s Open-Source AI Curriculum, offering beginner-friendly courses on AI, Machine Learning, Data Science, Cybersecurity, and GitHub Copilot—perfect for upskilling teams and fostering innovation. Don’t just learn—lead. Access these resources, host impactful training sessions, and drive AI adoption in your organization. Start sharing today! Explore now: Microsoft AI Tour Resources.Getting Started with the AI Dev Gallery
March Update: The Gallery is now available on the Microsoft Store! The AI Dev Gallery is a new open-source project designed to inspire and support developers in integrating on-device AI functionality into their Windows apps. It offers an intuitive UX for exploring and testing interactive AI samples powered by local models. Key features include: Quickly explore and download models from well-known sources on GitHub and HuggingFace. Test different models with interactive samples over 25 different scenarios, including text, image, audio, and video use cases. See all relevant code and library references for every sample. Switch between models that run on CPU and GPU depending on your device capabilities. Quickly get started with your own projects by exporting any sample to a fresh Visual Studio project that references the same model cache, preventing duplicate downloads. Part of the motivation behind the Gallery was exposing developers to the host of benefits that come with on-device AI. Some of these benefits include improved data security and privacy, increased control and parameterization, and no dependence on an internet connection or third-party cloud provider. Requirements Device Requirements Minimum OS Version: Windows 10, version 1809 (10.0; Build 17763) Architecture: x64, ARM64 Memory: At least 16 GB is recommended Disk Space: At least 20GB free space is recommended GPU: 8GB of VRAM is recommended for running samples on the GPU Using the Gallery The AI Dev Gallery has can be navigated in two ways: The Samples View The Models View Navigating Samples In this view, samples are broken up into categories (Text, Code, Image, etc.) and then into more specific samples, like in the Translate Text pictured below: On clicking a sample, you will be prompted to choose a model to download if you haven’t run this sample before: Next to the model you can see the size of the model, whether it will run on CPU or GPU, and the associated license. Pick the model that makes the most sense for your machine. You can also download new models and change the model for a sample later from the sample view. Just click the model drop down at the top of the sample: The last thing you can do from the Sample pane is view the sample code and export the project to Visual Studio. Both buttons are found in the top right corner of the sample, and the code view will look like this: Navigating Models If you would rather navigate by models instead of samples, the Gallery also provides the model view: The model view contains a similar navigation menu on the right to navigate between models based on category. Clicking on a model will allow you to see a description of the model, the versions of it that are available to download, and the samples that use the model. Clicking on a sample will take back over to the samples view where you can see the model in action. Deleting and Managing Models If you need to clear up space or see download details for the models you are using, you can head over the Settings page to manage your downloads: From here, you can easily see every model you have downloaded and how much space on your drive they are taking up. You can clear your entire cache for a fresh start or delete individual models that you are no longer using. Any deleted model can be redownload through either the models or samples view. Next Steps for the Gallery The AI Dev Gallery is still a work in progress, and we plan on adding more samples, models, APIs, and features, and we are evaluating adding support for NPUs to take the experience even further If you have feedback, noticed a bug, or any ideas for features or samples, head over to the issue board and submit an issue. We also have a discussion board for any other topics relevant to the Gallery. The Gallery is an open-source project, and we would love contribution, feedback, and ideation! Happy modeling!6.3KViews5likes3CommentsAI Toolkit for Visual Studio Code: October 2024 Update Highlights
The AI Toolkit’s October 2024 update revolutionizes Visual Studio Code with game-changing features for developers, researchers, and enthusiasts. Explore multi-model integration, including GitHub Models, ONNX, and Google Gemini, alongside custom model support. Dive into multi-modal capabilities for richer AI testing and seamless multi-platform compatibility across Windows, macOS, and Linux. Tailored for productivity, the enhanced Model Catalog simplifies choosing the best tools for your projects. Try it now and share feedback to shape the future of AI in VS Code!2.9KViews4likes0CommentsGetting Started - Generative AI with Phi-3-mini: A Guide to Inference and Deployment
Getting started with Microsoft Phi-3-mini - Inference Phi-3-mini models, Discover how Phi-3-mini, a new series of models from Microsoft, enables deployment of Large Language Models (LLMs) on edge devices and IoT devices. Learn how to use Semantic Kernel, Ollama/LlamaEdge, and ONNX Runtime to access and infer phi3-mini models, and explore the possibilities of generative AI in various application scenarios50KViews4likes13CommentsAccelerate Phi-3 use on macOS: A Beginner's Guide to Using Apple MLX Framework
Learn how to use macOS and Apple Silicon to speed up machine learning models with this easy guide. We’ll cover the Apple MLX Framework, a tool that helps you run and fine-tune models like Phi-3-mini right on your Mac. First, install MLX by running pip install mlx-lm in your terminal. You can then use commands to run or fine-tune models. Apple's Metal Performance Shaders make this possible by using your Mac's GPU. We'll also show you how to use LoRA for better fine-tuning results and compare the performance of different models.13KViews2likes0CommentsGetting Started - Generative AI with Phi-3-mini: Running Phi-3-mini in Intel AI PC
In 2024, with the empowerment of AI, we will enter the era of AI PC. On May 20, Microsoft also released the concept of Copilot + PC, which means that PC can run SLM/LLM more efficiently with the support of NPU. We can use models from different Phi-3 family combined with the new AI PC to build a simple personalized Copilot application for individuals. This content will combine Intel's AI PC, use Intel's OpenVINO, NPU Acceleration Library, and Microsoft's DirectML to create a local Copilot.32KViews2likes2CommentsGetting started with Microsoft Phi-3-mini - Try running the Phi-3-mini on iPhone with ONNX Runtime
In this article, we explore how to deploy generative AI applications to mobile devices, specifically on iPhone, using ONNX Runtime. We cover the steps to compile ONNX Runtime for iOS and then create an App application in Xcode. We also show you how to copy the ONNX quantized INT4 model to the project and add the C++ API to generate text. This is a preliminary exploration of deploying generative AI on mobile devices, but it provides a good starting point for further development.19KViews2likes2CommentsFrom Cloud to Chip: Building Smarter AI at the Edge with Windows AI PCs
As AI engineers, we’ve spent years optimizing models for the cloud, scaling inference, wrangling latency, and chasing compute across clusters. But the frontier is shifting. With the rise of Windows AI PCs and powerful local accelerators, the edge is no longer a constraint it’s now a canvas. Whether you're deploying vision models to industrial cameras, optimizing speech interfaces for offline assistants, or building privacy-preserving apps for healthcare, Edge AI is where real-world intelligence meets real-time performance. Why Edge AI, Why Now? Edge AI isn’t just about running models locally, it’s about rethinking the entire lifecycle: - Latency: Decisions in milliseconds, not round-trips to the cloud. - Privacy: Sensitive data stays on-device, enabling HIPAA/GDPR compliance. - Resilience: Offline-first apps that don’t break when the network does. - Cost: Reduced cloud compute and bandwidth overhead. With Windows AI PCs powered by Intel and Qualcomm NPUs and tools like ONNX Runtime, DirectML, and Olive, developers can now optimize and deploy models with unprecedented efficiency. What You’ll Learn in Edge AI for Beginners The Edge AI for Beginners curriculum is a hands-on, open-source guide designed for engineers ready to move from theory to deployment. Multi-Language Support This content is available in over 48 languages, so you can read and study in your native language. What You'll Master This course takes you from fundamental concepts to production-ready implementations, covering: Small Language Models (SLMs) optimized for edge deployment Hardware-aware optimization across diverse platforms Real-time inference with privacy-preserving capabilities Production deployment strategies for enterprise applications Why EdgeAI Matters Edge AI represents a paradigm shift that addresses critical modern challenges: Privacy & Security: Process sensitive data locally without cloud exposure Real-time Performance: Eliminate network latency for time-critical applications Cost Efficiency: Reduce bandwidth and cloud computing expenses Resilient Operations: Maintain functionality during network outages Regulatory Compliance: Meet data sovereignty requirements Edge AI Edge AI refers to running AI algorithms and language models locally on hardware, close to where data is generated without relying on cloud resources for inference. It reduces latency, enhances privacy, and enables real-time decision-making. Core Principles: On-device inference: AI models run on edge devices (phones, routers, microcontrollers, industrial PCs) Offline capability: Functions without persistent internet connectivity Low latency: Immediate responses suited for real-time systems Data sovereignty: Keeps sensitive data local, improving security and compliance Small Language Models (SLMs) SLMs like Phi-4, Mistral-7B, Qwen and Gemma are optimized versions of larger LLMs, trained or distilled for: Reduced memory footprint: Efficient use of limited edge device memory Lower compute demand: Optimized for CPU and edge GPU performance Faster startup times: Quick initialization for responsive applications They unlock powerful NLP capabilities while meeting the constraints of: Embedded systems: IoT devices and industrial controllers Mobile devices: Smartphones and tablets with offline capabilities IoT Devices: Sensors and smart devices with limited resources Edge servers: Local processing units with limited GPU resources Personal Computers: Desktop and laptop deployment scenarios Course Modules & Navigation Course duration. 10 hours of content Module Topic Focus Area Key Content Level Duration 📖 00 Introduction to EdgeAI Foundation & Context EdgeAI Overview • Industry Applications • SLM Introduction • Learning Objectives Beginner 1-2 hrs 📚 01 EdgeAI Fundamentals Cloud vs Edge AI comparison EdgeAI Fundamentals • Real World Case Studies • Implementation Guide • Edge Deployment Beginner 3-4 hrs 🧠 02 SLM Model Foundations Model families & architecture Phi Family • Qwen Family • Gemma Family • BitNET • μModel • Phi-Silica Beginner 4-5 hrs 🚀 03 SLM Deployment Practice Local & cloud deployment Advanced Learning • Local Environment • Cloud Deployment Intermediate 4-5 hrs ⚙️ 04 Model Optimization Toolkit Cross-platform optimization Introduction • Llama.cpp • Microsoft Olive • OpenVINO • Apple MLX • Workflow Synthesis Intermediate 5-6 hrs 🔧 05 SLMOps Production Production operations SLMOps Introduction • Model Distillation • Fine-tuning • Production Deployment Advanced 5-6 hrs 🤖 06 AI Agents & Function Calling Agent frameworks & MCP Agent Introduction • Function Calling • Model Context Protocol Advanced 4-5 hrs 💻 07 Platform Implementation Cross-platform samples AI Toolkit • Foundry Local • Windows Development Advanced 3-4 hrs 🏭 08 Foundry Local Toolkit Production-ready samples Sample applications (see details below) Expert 8-10 hrs Each module includes Jupyter notebooks, code samples, and deployment walkthroughs, perfect for engineers who learn by doing. Developer Highlights - 🔧 Olive: Microsoft's optimization toolchain for quantization, pruning, and acceleration. - 🧩 ONNX Runtime: Cross-platform inference engine with support for CPU, GPU, and NPU. - 🎮 DirectML: GPU-accelerated ML API for Windows, ideal for gaming and real-time apps. - 🖥️ Windows AI PCs: Devices with built-in NPUs for low-power, high-performance inference. Local AI: Beyond the Edge Local AI isn’t just about inference, it’s about autonomy. Imagine agents that: - Learn from local context - Adapt to user behavior - Respect privacy by design With tools like Agent Framework, Azure AI Foundry and Windows Copilot Studio, and Foundry Local developers can orchestrate local agents that blend LLMs, sensors, and user preferences, all without cloud dependency. Try It Yourself Ready to get started? Clone the Edge AI for Beginners GitHub repo, run the notebooks, and deploy your first model to a Windows AI PC or IoT devices Whether you're building smart kiosks, offline assistants, or industrial monitors, this curriculum gives you the scaffolding to go from prototype to production.Make Phi-4-mini-reasoning more powerful with industry reasoning on edge devices
In situations with limited computing, Phi-4-mini-reasoning will is an excellent model choice. We can use Microsoft Olive or Apple MLX Framework to quantize Phi-4-mini-reasoning and deploy it on edge terminals such as IoT, Laotop and mobile devices. Quantization In order to solve the problem that the model is difficult to deploy directly to specific hardware, we need to reduce the complexity of the model through model quantization. Undertaking the quantization process will inevitably cause precision loss. Quantize Phi-4-mini-reasoning using Microsoft Olive Microsoft Olive is an AI model optimization toolkit for ONNX Runtime. Given a model and target hardware, Olive (short for Onnx LIVE) will combine the most appropriate optimization techniques to output the most efficient ONNX model for inference in the cloud or on the edge. We can combine Microsoft Olive and Phi-4-mini-reasoning on Azure AI Foundry's Model Catalog to quantize Phi-4-mini-reasoning to an ONNX format model. Create your Notebook on Azure ML Install Microsoft Olive pip install git+https://github.com/Microsoft/Olive.git Quantize using Microsoft Olive olive auto-opt --model_name_or_path {Azure Model Catalog path ,such as azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1 }--device cpu --provider CPUExecutionProvider --use_model_builder --precision int4 --output_path ./phi-4-14b-reasoninig-onnx --log_level 1 Register your quantized Model ! python -m mlx_lm.generate --model ./phi-4-mini-reasoning --adapter-path ./adapters --max-token 4096 --prompt "A 54-year-old construction worker with a long history of smoking presents with swelling in his upper extremity and face, along with dilated veins in this region. After conducting a CT scan and venogram of the neck, what is the most likely diagnosis for the cause of these symptoms?" --extra-eos-token "<|end|>" Download to local and run Download the onnx model to local device ml_client.models.download("phi-4-mini-onnx-int4-cpu", 1) Running onnx model with onnxruntime-genai Install onnxruntime-genai (This is CPU version) pip install onnxruntime-genai Run it import onnxruntime_genai as og model_folder = "Your ONNX Model Path" model = og.Model(model_folder) tokenizer = og.Tokenizer(model) tokenizer_stream = tokenizer.create_stream() search_options = {} search_options['max_length'] = 32768 chat_template = "<|user|>{input}<|end|><|assistant|>" text = 'A school arranges dormitories for students. If each dormitory accommodates 5 people, 4 people cannot live there; if each dormitory accommodates 6 people, one dormitory only has 4 people, and two dormitories are empty. Find the number of students in this grade and the number of dormitories.' prompt = f'{chat_template.format(input=text)}' input_tokens = tokenizer.encode(prompt) params = og.GeneratorParams(model) params.set_search_options(**search_options) generator = og.Generator(model, params) generator.append_tokens(input_tokens) while not generator.is_done(): generator.generate_next_token() new_token = generator.get_next_tokens()[0] print(tokenizer_stream.decode(new_token), end='', flush=True) Get Notebook from Phi Cookbook : https://aka.ms/phicookbook Quantize Phi-4-mini-reasoning model using Apple MLX Install Apple MLX Framework pip install -U mlx-lm Convert Phi-4-mini-reasoning model through Apple MLX quantization python -m mlx_lm.convert --hf-path {Phi-4-mini-reasoning Hugging face id} -q Run Phi-4-mini-reasoning with Apple MLX in terminal python -m mlx_lm.generate --model ./mlx_model --max-token 2048 --prompt "A school arranges dormitories for students. If each dormitory accommodates 5 people, 4 people cannot live there; if each dormitory accommodates 6 people, one dormitory only has 4 people, and two dormitories are empty. Find the number of students in this grade and the number of dormitories." --extra-eos-token "<|end|>" --temp 0.0 Fine-tuning We can fine-tune the CoT data of different scenarios to enable Phi-4-mini-reasoning to have reasoning capabilities for different scenarios. Here we use the Medical CoT data from a public Huggingface datasets as our example (this is just an example. If you need rigorous medical reasoning, please seek more professional data support) We can fine-tune our CoT data in Azure ML Fine-tune Phi-4-mini-reasoning using Microsoft Olive in Azure ML Note- Please use Standard_NC24ads_A100_v4 to run this sample Get Data from Hugging face datasets pip install datasets run this script to get train data from datasets import load_dataset def formatting_prompts_func(examples): inputs = examples["Question"] cots = examples["Complex_CoT"] outputs = examples["Response"] texts = [] for input, cot, output in zip(inputs, cots, outputs): text = prompt_template.format(input, cot, output) + "<|end|>" # text = prompt_template.format(input, cot, output) + "<|endoftext|>" texts.append(text) return { "text": texts, } # Create the English dataset dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train",trust_remote_code=True) dataset = dataset.map(formatting_prompts_func, batched = True,remove_columns=["Question", "Complex_CoT", "Response"]) dataset.to_json("en_dataset.jsonl") Fine-tuning with Microsoft Olive olive finetune \ --method lora \ --model_name_or_path {Azure Model Catalog path , azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \ --trust_remote_code \ --data_name json \ --data_files ./en_dataset.jsonl \ --train_split "train[:16000]" \ --eval_split "train[16000:19700]" \ --text_field "text" \ --max_steps 100 \ --logging_steps 10 \ --output_path {Your fine-tuning save path} \ --log_level 1 Convert the model to ONNX with Microsoft Olive olive capture-onnx-graph \ --model_name_or_path {Azure Model Catalog path , azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \ --adapter_path {Your fine-tuning adapter path} \ --use_model_builder \ --output_path {Your save onnx path} \ --log_level 1 olive generate-adapter \ --model_name_or_path {Your save onnx path} \ --output_path {Your save onnx adapter path} \ --log_level 1 Run the model with onnxruntime-genai-cuda Install onnxruntime-genai-cuda SDK import onnxruntime_genai as og import numpy as np import os model_folder = "./models/phi-4-mini-reasoning/adapter-onnx/model/" model = og.Model(model_folder) adapters = og.Adapters(model) adapters.load('./models/phi-4-mini-reasoning/adapter-onnx/model/adapter_weights.onnx_adapter', "en_medical_reasoning") tokenizer = og.Tokenizer(model) tokenizer_stream = tokenizer.create_stream() search_options = {} search_options['max_length'] = 200 search_options['past_present_share_buffer'] = False search_options['temperature'] = 1 search_options['top_k'] = 1 prompt_template = """<|user|>{}<|end|><|assistant|><think>""" question = """ A 33-year-old woman is brought to the emergency department 15 minutes after being stabbed in the chest with a screwdriver. Given her vital signs of pulse 110\/min, respirations 22\/min, and blood pressure 90\/65 mm Hg, along with the presence of a 5-cm deep stab wound at the upper border of the 8th rib in the left midaxillary line, which anatomical structure in her chest is most likely to be injured? """ prompt = prompt_template.format(question, "") input_tokens = tokenizer.encode(prompt) params = og.GeneratorParams(model) params.set_search_options(**search_options) generator = og.Generator(model, params) generator.set_active_adapter(adapters, "en_medical_reasoning") generator.append_tokens(input_tokens) while not generator.is_done(): generator.generate_next_token() new_token = generator.get_next_tokens()[0] print(tokenizer_stream.decode(new_token), end='', flush=True) inference model with onnxruntime-genai cuda olive finetune \ --method lora \ --model_name_or_path {Azure Model Catalog path , azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \ --trust_remote_code \ --data_name json \ --data_files ./en_dataset.jsonl \ --train_split "train[:16000]" \ --eval_split "train[16000:19700]" \ --text_field "text" \ --max_steps 100 \ --logging_steps 10 \ --output_path {Your fine-tuning save path} \ --log_level 1 Fine-tune Phi-4-mini-reasoning using Apple MLX locally on MacOS Note- we recommend that you use devices with a minimum of 64GB Memory and Apple Silicon devices Get the DataSet from Hugging face datasets pip install datasets run this script to get train and valid data from datasets import load_dataset prompt_template = """<|user|>{}<|end|><|assistant|><think>{}</think>{}<|end|>""" def formatting_prompts_func(examples): inputs = examples["Question"] cots = examples["Complex_CoT"] outputs = examples["Response"] texts = [] for input, cot, output in zip(inputs, cots, outputs): # text = prompt_template.format(input, cot, output) + "<|end|>" text = prompt_template.format(input, cot, output) + "<|endoftext|>" texts.append(text) return { "text": texts, } dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", trust_remote_code=True) split_dataset = dataset["train"].train_test_split(test_size=0.2, seed=200) train_dataset = split_dataset['train'] validation_dataset = split_dataset['test'] train_dataset = train_dataset.map(formatting_prompts_func, batched = True,remove_columns=["Question", "Complex_CoT", "Response"]) train_dataset.to_json("./data/train.jsonl") validation_dataset = validation_dataset.map(formatting_prompts_func, batched = True,remove_columns=["Question", "Complex_CoT", "Response"]) validation_dataset.to_json("./data/valid.jsonl") Fine-tuning with Apple MLX python -m mlx_lm.lora --model ./phi-4-mini-reasoning --train --data ./data --iters 100 Running the model ! python -m mlx_lm.generate --model ./phi-4-mini-reasoning --adapter-path ./adapters --max-token 4096 --prompt "A 54-year-old construction worker with a long history of smoking presents with swelling in his upper extremity and face, along with dilated veins in this region. After conducting a CT scan and venogram of the neck, what is the most likely diagnosis for the cause of these symptoms?" --extra-eos-token "<|end|>" Get Notebook from Phi Cookbook : https://aka.ms/phicookbook We hope this sample has inspired you to use Phi-4-mini-reasoning and Phi-4-reasoning to complete industry reasoning for our own scenarios. Related resources Phi4-mini-reasoning Tech Report https://aka.ms/phi4-mini-reasoning/techreport Phi-4-Mini-Reasoning technical Report· microsoft/Phi-4-mini-reasoning Phi-4-mini-reasoning on Azure AI Foundry https://aka.ms/phi4-mini-reasoning/azure Phi-4 Reasoning Blog https://aka.ms/phi4-mini-reasoning/blog Phi Cookbook https://aka.ms/phicookbook Showcasing Phi-4-Reasoning: A Game-Changer for AI Developers | Microsoft Community Hub Models Phi-4 Reasoning https://huggingface.co/microsoft/Phi-4-reasoning Phi-4 Reasoning Plus https://huggingface.co/microsoft/Phi-4-reasoning-plus Phi-4-mini-reasoning Hugging Face https://aka.ms/phi4-mini-reasoning/hf Phi-4-mini-reasoning on Azure AI Foundry https://aka.ms/phi4-mini-reasoning/azure Microsoft (Microsoft) Models on Hugging Face Phi-4 Reasoning Models Azure AI Foundry Models Access Phi-4-reasoning models Phi Models at Azure AI Foundry Models Phi Models on Hugging Face Phi Models on GitHub Marketplace ModelsBuild AI Agents with MCP Tool Use in Minutes with AI Toolkit for VSCode
We’re excited to announce Agent Builder, the newest evolution of what was formerly known as Prompt Builder, now reimagined and supercharged for intelligent app development. This powerful tool in AI Toolkit enables you to create, iterate, and optimize agents—from prompt engineering to tool integration—all in one seamless workflow. Whether you're designing simple chat interactions or complex task-performing agents with tool access, Agent Builder simplifies the journey from idea to integration. Why Agent Builder? Agent Builder is designed to empower developers and prompt engineers to: 🚀 Generate starter prompts with natural language 🔁 Iterate and refine prompts based on model responses 🧩 Break down tasks with prompt chaining and structured outputs 🧪 Test integrations with real-time runs and tool use such as MCP servers 💻 Generate production-ready code for rapid app development And a lot of features are coming soon, stay tuned for: 📝 Use variables in prompts �� Run agent with test cases to test your agent easily 📊 Evaluate the accuracy and performance of your agent with built-in or your custom metrics ☁️ Deploy your agent to cloud Build Smart Agents with Tool Use (MCP Servers) Agents can now connect to external tools through MCP (Model Control Protocol) servers, enabling them to perform real-world actions like querying a database, accessing APIs, or executing custom logic. Connect to an Existing MCP Server To use an existing MCP server in Agent Builder: In the Tools section, select + MCP Server. Choose a connection type: Command (stdio) – run a local command that implements the MCP protocol HTTP (server-sent events) – connect to a remote server implementing the MCP protocol If the MCP server supports multiple tools, select the specific tool you want to use. Enter your prompts and click Run to test the agent's interaction with the tool. This integration allows your agents to fetch live data or trigger custom backend services as part of the conversation flow. Build and Scaffold a New MCP Server Want to create your own tool? Agent Builder helps you scaffold a new MCP server project: In the Tools section, select + MCP Server. Choose MCP server project. Select your preferred programming language: Python or TypeScript. Pick a folder to create your server project. Name your project and click Create. Agent Builder generates a scaffolded implementation of the MCP protocol that you can extend. Use the built-in VS Code debugger: Press F5 or click Debug in Agent Builder Test with prompts like: System: You are a weather forecast professional that can tell weather information based on given location. User: What is the weather in Shanghai? Agent Builder will automatically connect to your running server and show the response, making it easy to test and refine the tool-agent interaction. AI Sparks from Prototype to Production with AI Toolkit Building AI-powered applications from scratch or infusing intelligence into existing systems? AI Sparks is your go-to webinar series for mastering the AI Toolkit (AITK) from foundational concepts to cutting-edge techniques. In this bi-weekly, hands-on series, we’ll cover: 🚀SLMs & Local Models – Test and deploy AI models and applications efficiently on your own terms locally, to edge devices or to the cloud 🔍 Embedding Models & RAG – Supercharge retrieval for smarter applications using existing data. 🎨 Multi-Modal AI – Work with images, text, and beyond. 🤖 Agentic Frameworks – Build autonomous, decision-making AI systems. Watch on Demand Share your feedback Get started with the latest version, share your feedback, and let us know how these new features help you in your AI development journey. As always, we’re here to listen, collaborate, and grow alongside our amazing user community. Thank you for being a part of this journey—let’s build the future of AI together! Join our Microsoft Azure AI Foundry Discord channel to continue the discussion 🚀