## Announcing MSGraph Provider Public Preview and the Microsoft Terraform VSCode Extension
We are thrilled to announce two exciting developments in the Microsoft ecosystem for Terraform infrastructure-as-code (IaC) practitioners: the public preview of the Terraform Microsoft Graph (MSGraph) provider and the release of the Microsoft Terraform Visual Studio Code (VSCode) extension. These innovations are designed to streamline your workflow, empower your automation, and make managing Microsoft cloud resources easier than ever.

### Public Preview: Terraform Microsoft Graph (MSGraph) Provider

The Terraform MSGraph provider empowers you to manage Entra APIs, like privileged identity management, as well as M365 Graph APIs, like SharePoint sites, from day 0 by leveraging the power and flexibility of HashiCorp Configuration Language (HCL) in Terraform.

```hcl
resource "msgraph_resource" "application" {
  url = "applications"
  body = {
    displayName = "My Application"
  }
  response_export_values = {
    all    = "@"
    app_id = "appId"
  }
}

output "app_id" {
  value = msgraph_resource.application.output.app_id
}

output "all" {
  // Outputs the whole response
  value = msgraph_resource.application.output.all
}
```

Historically, Terraform users could use the `azuread` provider to manage Entra features like users, groups, service principals, and applications. The new `msgraph` provider also supports these features and extends functionality to all beta and v1 Microsoft Graph endpoints.

### Querying role assignments for a service principal

The example below shows how to use the `msgraph` provider to grant app permissions to a service principal:

```hcl
locals {
  MicrosoftGraphAppId = "00000003-0000-0000-c000-000000000000"

  # AppRoleAssignment
  userReadAllAppRoleId = one([for role in data.msgraph_resource.servicePrincipal_msgraph.output.all.value[0].appRoles : role.id if role.value == "User.Read.All"])
  userReadWriteRoleId  = one([for role in data.msgraph_resource.servicePrincipal_msgraph.output.all.value[0].oauth2PermissionScopes : role.id if role.value == "User.ReadWrite"])

  # ServicePrincipal
  MSGraphServicePrincipalId         = data.msgraph_resource.servicePrincipal_msgraph.output.all.value[0].id
  TestApplicationServicePrincipalId = msgraph_resource.servicePrincipal_application.output.all.id
}

data "msgraph_resource" "servicePrincipal_msgraph" {
  url = "servicePrincipals"
  query_parameters = {
    "$filter" = ["appId eq '${local.MicrosoftGraphAppId}'"]
  }
  response_export_values = {
    all = "@"
  }
}

resource "msgraph_resource" "application" {
  url = "applications"
  body = {
    displayName = "My Application"
    requiredResourceAccess = [
      {
        resourceAppId = local.MicrosoftGraphAppId
        resourceAccess = [
          {
            id   = local.userReadAllAppRoleId
            type = "Scope"
          },
          {
            id   = local.userReadWriteRoleId
            type = "Scope"
          }
        ]
      }
    ]
  }
  response_export_values = {
    appId = "appId"
  }
}

resource "msgraph_resource" "servicePrincipal_application" {
  url = "servicePrincipals"
  body = {
    appId = msgraph_resource.application.output.appId
  }
  response_export_values = {
    all = "@"
  }
}

resource "msgraph_resource" "appRoleAssignment" {
  url = "servicePrincipals/${local.MSGraphServicePrincipalId}/appRoleAssignments"
  body = {
    appRoleId   = local.userReadAllAppRoleId
    principalId = local.TestApplicationServicePrincipalId
    resourceId  = local.MSGraphServicePrincipalId
  }
}
```

### SharePoint & Outlook Notifications

With your service principals properly configured, you can set up M365 endpoint workflows such as an Outlook notification template list, as shown below.
The actual service principal setup has been omitted from this code sample for brevity, but you will need the Sites.Manage.All, Sites.ReadWrite.All, User.Read, and User.Read.All permissions for this example to work:

```hcl
data "msgraph_resource" "sharepoint_site_by_path" {
  url = "sites/microsoft.sharepoint.com:/sites/msgraphtest:"
  response_export_values = {
    full_response = "@"
    site_id       = "id || ''"
  }
}

resource "msgraph_resource" "notification_templates_list" {
  url = "sites/${msgraph_resource.sharepoint_site_by_path.output.site_id}/lists"
  body = {
    displayName = "DevOps Notification Templates"
    description = "Centrally managed email templates for DevOps automation"
    template    = "genericList"
    columns = [
      {
        name = "TemplateName"
        text = {
          allowMultipleLines          = false
          appendChangesToExistingText = false
          linesForEditing             = 1
          maxLength                   = 255
        }
      },
      {
        name = "Subject"
        text = {
          allowMultipleLines          = false
          appendChangesToExistingText = false
          linesForEditing             = 1
          maxLength                   = 500
        }
      },
      {
        name = "HtmlBody"
        text = {
          allowMultipleLines          = true
          appendChangesToExistingText = false
          linesForEditing             = 10
          maxLength                   = 10000
        }
      },
      {
        name = "Recipients"
        text = {
          allowMultipleLines          = true
          appendChangesToExistingText = false
          linesForEditing             = 3
          maxLength                   = 1000
        }
      },
      {
        name = "TriggerConditions"
        text = {
          allowMultipleLines          = true
          appendChangesToExistingText = false
          linesForEditing             = 5
          maxLength                   = 2000
        }
      }
    ]
  }
  response_export_values = {
    list_id   = "id"
    list_name = "displayName"
    web_url   = "webUrl"
  }
}
```

The MSGraph provider is to AzureAD as the AzAPI provider is to AzureRM. Because support for resource types is automatic, you can access the latest features and functionality as soon as they're released via the provider. AzureAD will continue to serve as the convenience-layer implementation of a subset of Entra APIs.

We invite you to try the new provider today:

- Deploy your first msgraph resources
- Check out the registry page
- Visit the provider GitHub

### Introducing the Microsoft Terraform VSCode Extension

The new official Microsoft Terraform extension for Visual Studio Code consolidates AzureRM, AzAPI, and MSGraph VSCode support into a single powerful extension. The extension supports exporting Azure resources as Terraform code, as well as IntelliSense, syntax highlighting, and code sample generation. It replaces the Azure Terraform and AzAPI VSCode extensions and adds some new features.

### Installation & Migration

New users can install the extension by searching for "Microsoft Terraform" in the Visual Studio Marketplace or in the "Extensions" tab of VSCode. Users can also follow this link to the Visual Studio Marketplace.

Users of the "Azure Terraform" extension can navigate to the "Extensions" tab, select the old extension, and then select the "Migrate" button to move to the new extension. Users of the "Terraform AzAPI Provider" extension will be directed to the new extension.

### New Features

#### Export Azure Resources as Terraform

This feature allows you to export existing Azure resources as Terraform configuration blocks using Azure Export for Terraform. This helps you migrate existing Azure resources to Terraform-managed infrastructure.

1. Open the Command Palette (Command+Shift+P on macOS, Ctrl+Shift+P on Windows/Linux).
2. Search for and select the command Microsoft Terraform: Export Azure Resource as Terraform.
3. Follow the prompts to select the Azure subscription and resource group containing the resources you want to export.
4. Select the azurerm provider or the azapi provider to export the resources.
The extension will generate the Terraform configuration blocks for the selected resources and display them in a new editor tab.

#### Support for MSGraph

The new extension comes fully equipped with IntelliSense, code completion, and code samples, just like the AzAPI provider. See the next section for recorded examples of these features with the AzureRM and AzAPI providers.

### Preexisting Features

- Intelligent Code Completion: Benefit from context-aware suggestions, like property names or resource types.
- Code Samples: Quickly insert code samples for your resources.
- Paste as AzAPI: Copy your existing resource JSON or ARM templates into VSCode with the Microsoft Terraform extension, and it will automatically convert your code into AzAPI. For example, you can take a resource JSON from the Azure Portal and paste it into VSCode as AzAPI.
- Migrate AzureRM to AzAPI: Move existing AzureRM code to the AzAPI provider whenever you wish. Read more in the Guide to migrate AzureRM resources to AzAPI.

### Feedback

We value your feedback! You can share your experience with the Microsoft Terraform extension by running the command Microsoft Terraform: Show Survey from the Command Palette. Your input helps us improve the extension and better serve your needs.

### Conclusion

Whether you are managing traditional Azure resources, modern Microsoft Graph environments, or a combination of both, the new MSGraph provider and Microsoft Terraform VSCode extension are designed to help you deliver robust, reliable infrastructure faster and with greater confidence. Stay tuned for further updates, workshops, and community events as we continue to evolve these offerings. Your feedback and participation are invaluable as we build the next generation of infrastructure automation together.

## Strategic Solutions for Seamless Integration of Third-Party SaaS
Modern systems must be modular and interoperable by design. Integration is no longer a feature; it's a requirement. Developers are expected to build architectures that connect easily with third-party platforms, but too often, core systems are designed in isolation. This disconnect creates friction for downstream teams and slows delivery.

At Microsoft, SaaS platforms like SAP SuccessFactors and Eightfold support Talent Acquisition by handling functions such as requisition tracking, application workflows, and interview coordination. These tools help reduce costs and free up engineering focus for high-priority areas like Azure and AI. The real challenge is integrating them with internal systems such as Demand Planning, Offer Management, and Employee Central.

This blog post outlines a strategy centered on two foundational components: an Integration and Orchestration Layer, and a Messaging Platform. Together, these enable real-time communication, consistent data models, and scalable integration. While Talent Acquisition is the use case here, the architectural patterns apply broadly across domains. Whether you're embedding AI pipelines, managing edge deployments, or building platform services, thoughtful integration needs to be built into the foundation, not bolted on later. A minimal sketch of this pattern follows.
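To make the pattern concrete, here is a minimal TypeScript sketch of the two components working together: a canonical event contract published through a messaging platform, with an integration-layer adapter translating a vendor payload into that contract. All names here (the event shape, `MessageBus`, the SaaS payload fields) are illustrative assumptions, not any Microsoft system's actual API.

```typescript
// Canonical event contract shared by all internal consumers (illustrative).
interface CandidateStageChanged {
  eventType: "CandidateStageChanged";
  candidateId: string;
  requisitionId: string;
  newStage: string;
  occurredAt: string; // ISO 8601
  source: string;     // which SaaS platform emitted the original event
}

// In-memory stand-in for a real messaging platform (e.g. a topic-based broker).
class MessageBus {
  private handlers = new Map<string, ((event: unknown) => void)[]>();

  subscribe(topic: string, handler: (event: unknown) => void): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  publish(topic: string, event: unknown): void {
    for (const handler of this.handlers.get(topic) ?? []) {
      handler(event);
    }
  }
}

// Integration layer: adapts a hypothetical SaaS webhook payload to the
// canonical model, so downstream systems never see vendor-specific shapes.
function adaptSaasWebhook(payload: { cand_id: string; req: string; stage: string }): CandidateStageChanged {
  return {
    eventType: "CandidateStageChanged",
    candidateId: payload.cand_id,
    requisitionId: payload.req,
    newStage: payload.stage,
    occurredAt: new Date().toISOString(),
    source: "example-ats",
  };
}

// Wire it together: a downstream system consumes only the canonical event.
const bus = new MessageBus();
bus.subscribe("talent.candidate-stage-changed", (e) =>
  console.log("Offer Management received:", e)
);
bus.publish(
  "talent.candidate-stage-changed",
  adaptSaasWebhook({ cand_id: "c-42", req: "r-7", stage: "Offer" })
);
```

The key design choice is that downstream consumers such as Offer Management depend only on the canonical contract, so swapping a SaaS vendor means changing one adapter rather than every consumer.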
## JS AI Build-a-thon: Wrapping Up an Epic June 2025!

After weeks of building, testing, and learning, we're officially wrapping up the first-ever JS AI Build-a-thon 🎉. This wasn't your average coding challenge. This was a hands-on journey where JavaScript and TypeScript developers dove deep into real-world AI concepts, from local GenAI prototyping to building intelligent agents and deploying production-ready apps. Whether you joined from the start or hopped on midway, you built something that matters, and that's worth celebrating.

### Replay the Journey

No worries if you joined late or want to revisit any part of the journey. The JS AI Build-a-thon was designed to let you learn at your own pace, so whether you're starting now or polishing up your final project, here's your complete quest map:

- Build-a-thon setup guide: https://aka.ms/JSAIBuildathonSetup
- Quest 1: 🔧 Build your first GenAI app locally with GitHub Models 👉🏽 https://aka.ms/JSAIBuildathonQuest1
- Quest 2: ☁️ Move your AI prototype to Azure AI Foundry 👉🏽 https://aka.ms/JSAIBuildathonQuest
- Quest 3: 🎨 Add a chat UI using Vite + Lit 👉🏽 https://aka.ms/JSAIBuildathonQuest3
- Quest 4: 📄 Enhance your app with RAG (Chat with Your Data) 👉🏽 https://aka.ms/JSAIBuildathonQuest4
- Quest 5: 🧠 Add memory and context to your AI app 👉🏽 https://aka.ms/JSAIBuildathonQuest5
- Quest 6: ⚙️ Build your first AI Agent using AI Foundry 👉🏽 https://aka.ms/JSAIBuildathonQuest6
- Quest 7: 🧩 Equip your agent with tools from an MCP server 👉🏽 https://aka.ms/JSAIBuildathonQuest7
- Quest 8: 💬 Ground your agent with real-time search using Bing 👉🏽 https://aka.ms/JSAIBuildathonQuest8
- Quest 9: 🚀 Build a real-world AI project with full-stack templates 👉🏽 https://aka.ms/JSAIBuildathonQuest9
- Link to our space in the AI Discord Community: https://aka.ms/JSAIonDiscord

### Project Submission Guidelines

📌 Quest 9 is where it all comes together. Participants chose a problem, picked a template, customized it, submitted it, and rallied their community for support!

### 🏅 Claim Your Badge!

Whether you completed select quests or went all the way, we celebrate your learning. If you participated in the June 2025 JS AI Build-a-thon, make sure to submit the Participation Form to receive your participation badge recognizing your commitment to upskilling in AI with JavaScript/TypeScript.

### What's Next?

We're not done. In fact, we're just getting started. We're already cooking up JS AI Build-a-thon v2, which will introduce:

- Running everything locally with Foundry Local
- Real-world RAG with vector databases
- Advanced agent patterns with remote MCPs
- And much more, based on your feedback

Want to shape what comes next? Drop your ideas in the participation form and in our Discord. In the meantime, add these resources to your JavaScript + AI dev pack:

- 🔗 Microsoft for JavaScript developers
- 📚 Generative AI for Beginners with JavaScript

### Wrap-Up

This build-a-thon showed what's possible when developers are empowered to learn by doing. You didn't just follow tutorials; you shipped features, connected services, and created working AI experiences. We can't wait to see what you build next.

👉 Bookmark the repo
👉 Join the community on the Azure AI Foundry Discord Server!
👉 Stay building

Until next time: keep coding, keep shipping!

## Quest 4 - I want to connect my AI prototype to external data using RAG
In Quest 4 of the JS AI Build-a-thon, you'll integrate Retrieval-Augmented Generation (RAG) to give your AI apps access to external data like PDFs. You'll explore embeddings, vector stores, and how to use the pdf-parse library in JavaScript to build more context-aware apps, with challenges to push you even further. A minimal retrieval sketch follows below.
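As a taste of what the quest covers, here is a minimal, self-contained TypeScript sketch of the RAG retrieval flow: parse a PDF with pdf-parse, chunk the text, embed the chunks into an in-memory vector store, and retrieve the most relevant chunks for a question. The `embed()` function is a toy hashing embedder standing in for a real embedding model, and `handbook.pdf` is a hypothetical input file; only the pdf-parse call reflects the actual library.

```typescript
import { readFileSync } from "node:fs";
import pdf from "pdf-parse"; // requires esModuleInterop; npm install pdf-parse

// Toy embedding: hash each word into a fixed-size vector. Real apps should
// use a hosted embedding model; this just keeps the sketch self-contained.
function embed(text: string, dims = 128): number[] {
  const vec = new Array(dims).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (const ch of word) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    vec[h % dims] += 1;
  }
  return vec;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

async function main() {
  // 1. Extract text from the PDF and split it into fixed-size chunks.
  const { text } = await pdf(readFileSync("handbook.pdf")); // hypothetical file
  const chunks = text.match(/[\s\S]{1,800}/g) ?? [];

  // 2. Build an in-memory vector store: one embedding per chunk.
  const store = chunks.map((chunk) => ({ chunk, vector: embed(chunk) }));

  // 3. Retrieve the chunks most similar to the user's question...
  const question = "What is the vacation policy?";
  const qVec = embed(question);
  const top = [...store]
    .sort((a, b) => cosine(b.vector, qVec) - cosine(a.vector, qVec))
    .slice(0, 3);

  // 4. ...and pass them to the LLM as grounding context.
  const context = top.map((t) => t.chunk).join("\n---\n");
  console.log(`Answer using this context:\n${context}\n\nQuestion: ${question}`);
}

main().catch(console.error);
```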
## Introducing Azure AI Travel Agents: A Flagship MCP-Powered Sample for AI Travel Solutions

We are excited to introduce AI Travel Agents, a sample application with enterprise functionality that demonstrates how developers can coordinate multiple AI agents (written in multiple languages) to explore travel planning scenarios. It's built with LlamaIndex.TS for agent orchestration, Model Context Protocol (MCP) for structured tool interactions, and Azure Container Apps for scalable deployment.

TL;DR: Experience the power of MCP and Azure Container Apps with the AI Travel Agents! Try out the live demo locally on your computer for free to see real-time agent collaboration in action. Share your feedback on our community forum. We're already planning enhancements, like new MCP-integrated agents, secure communication between the AI agents and MCP servers, and more.

NOTE: This example uses mock data and is intended for demonstration purposes rather than production use.

### The Challenge: Scaling Personalized Travel Planning

Travel agencies grapple with complex tasks: analyzing diverse customer needs, recommending destinations, and crafting itineraries, all while integrating real-time data like trending spots or logistics. Traditional systems falter on latency, scalability, and coordination, leading to delays and frustrated clients. The AI Travel Agents sample tackles these issues with a technical trifecta:

- LlamaIndex.TS orchestrates six AI agents for efficient task handling.
- MCP equips agents with travel-specific data and tools.
- Azure Container Apps ensures scalable, serverless deployment.

This architecture delivers operational efficiency and personalized service at scale, transforming chaos into opportunity.

### LlamaIndex.TS: Orchestrating AI Agents

The heart of the AI Travel Agents is LlamaIndex.TS, a powerful agentic framework that orchestrates multiple AI agents to handle travel planning tasks. Built on a Node.js backend, LlamaIndex.TS manages agent interactions in a seamless and intelligent manner:

- Task Delegation: The Triage Agent analyzes queries and routes them to specialized agents, like the Itinerary Planning Agent, ensuring efficient workflows (see the sketch after this section).
- Agent Coordination: LlamaIndex.TS maintains context across interactions, enabling coherent responses for complex queries, such as multi-city trip plans.
- LLM Integration: Connects to Azure OpenAI, GitHub Models, or any local LLM using Foundry Local for advanced AI capabilities.

LlamaIndex.TS's modular design supports extensibility, allowing new agents to be added with ease. LlamaIndex.TS is the conductor, ensuring agents work in sync to deliver accurate, timely results. Its lightweight orchestration minimizes latency, making it ideal for real-time applications.

### MCP: Fueling Agents with Data and Tools

The Model Context Protocol (MCP) empowers AI agents by providing travel-specific data and tools, enhancing their functionality. MCP acts as a data and tool hub:

- Real-Time Data: Supplies up-to-date travel information, such as trending destinations or seasonal events, via the Web Search Agent using Bing Search.
- Tool Access: Connects agents to external tools, like the .NET-based customer query analyzer for sentiment analysis, the Python-based itinerary planner for trip schedules, or destination recommendation tools written in Java.

For example, when the Destination Recommendation Agent needs current travel trends, MCP delivers them via the Web Search Agent. This modularity allows new tools to be integrated seamlessly, future-proofing the platform. MCP's role is to enrich agent capabilities, leaving orchestration to LlamaIndex.TS.
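The sample's real orchestration is done by LlamaIndex.TS; the framework-agnostic TypeScript sketch below only illustrates the triage pattern described above. The agent names mirror the article, but every interface and routing rule here is an illustrative assumption, not the sample's actual code.

```typescript
// Minimal triage pattern: a triage step inspects the query and delegates it
// to the first specialized agent that claims it.
interface Agent {
  name: string;
  canHandle(query: string): boolean;
  handle(query: string): Promise<string>;
}

// A real specialist would call an LLM plus MCP tools; here each one returns
// a canned response so the sketch runs anywhere.
function makeAgent(name: string, keywords: string[]): Agent {
  return {
    name,
    canHandle: (q) => keywords.some((k) => q.toLowerCase().includes(k)),
    handle: async (q) => `[${name}] handling: "${q}"`,
  };
}

const specialists: Agent[] = [
  makeAgent("DestinationRecommendationAgent", ["where", "destination", "recommend"]),
  makeAgent("ItineraryPlanningAgent", ["itinerary", "plan", "schedule"]),
  makeAgent("WebSearchAgent", ["trending", "news", "current"]),
];

// The triage step delegates to the first matching specialist and falls back
// to web search when no specialist claims the query.
async function triage(query: string): Promise<string> {
  const agent = specialists.find((a) => a.canHandle(query)) ?? specialists[2];
  return agent.handle(query);
}

triage("Plan a 5-day itinerary for Tokyo").then(console.log);
triage("Where should I go for a beach holiday?").then(console.log);
```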
### Azure Container Apps: Scalability and Resilience

Azure Container Apps powers the AI Travel Agents sample application with a serverless, scalable platform for deploying microservices. It ensures the application handles varying workloads with ease:

- Dynamic Scaling: Automatically adjusts container instances based on demand, managing booking surges without downtime.
- Polyglot Microservices: Supports .NET (Customer Query), Python (Itinerary Planning), Java (Destination Recommendation), and Node.js services in isolated containers.
- Observability: Integrates tracing, metrics, and logging, enabling real-time monitoring.
- Serverless Efficiency: Abstracts infrastructure, reducing costs and accelerating deployment.

Azure Container Apps' global infrastructure delivers low-latency performance, critical for travel agencies serving clients worldwide.

### The AI Agents: A Quick Look

While MCP and Azure Container Apps are the stars, they support a team of AI agents that drive the application's functionality. Built and orchestrated with LlamaIndex.TS via MCP, these agents collaborate to handle travel planning tasks:

- Triage Agent: Directs queries to the right agent, leveraging MCP for task delegation.
- Customer Query Agent: Analyzes customer needs (emotions, intents), using .NET tools.
- Destination Recommendation Agent: Suggests tailored destinations, using Java.
- Itinerary Planning Agent: Crafts efficient itineraries, powered by Python.
- Web Search Agent: Fetches real-time data via Bing Search.

These agents rely on MCP's real-time communication and Azure Container Apps' scalability to deliver responsive, accurate results. Note that this sample application uses mock data for demonstration purposes; in a real-world scenario, the application would communicate with an MCP server plugged into a real production travel API.

### Key Features and Benefits

The AI Travel Agents sample offers features that showcase the power of MCP and Azure Container Apps:

- Real-Time Chat: A responsive Angular UI streams agent responses via MCP's SSE, ensuring fluid interactions.
- Modular Tools: MCP enables tools like `analyze_customer_query` to integrate seamlessly, supporting future additions.
- Scalable Performance: Azure Container Apps ensures the UI, backend, and MCP servers handle high traffic effortlessly.
- Transparent Debugging: An accordion UI displays agent reasoning, providing backend insights.

Benefits:

- Efficiency: LlamaIndex.TS streamlines operations.
- Personalization: MCP's data drives tailored recommendations.
- Scalability: Azure ensures reliability at scale.

### Thank You to Our Contributors!

The AI Travel Agents wouldn't exist without the incredible work of our contributors. Their expertise in MCP development, Azure deployment, and AI orchestration brought this project to life. A special shoutout to:

- Pamela Fox – Leading the development of the Python MCP server.
- Aaron Powell and Justin Yoo – Leading the development of the .NET MCP server.
- Rory Preddy – Leading the development of the Java MCP server.
- Lee Stott and Kinfey Lo – Leading the development of the local AI Foundry.
- Anthony Chu and Vyom Nagrani – Leading the Azure Container Apps roadmap.
- Matt Soucoup and Julien Dubois – Leading the ACA DevRel strategy.
- Wassim Chegham – Architected MCP and backend orchestration.

And many more! See the GitHub repository for all contributors. Thank you for your dedication to pushing the boundaries of AI and cloud technology!

### Try It Out

Experience the power of MCP and Azure Container Apps with the AI Travel Agents!
Try out the live demo locally on your computer for free to see real-time agent collaboration in action.

### Conclusion

Developers can explore the open-source project on GitHub today, with setup and deployment instructions included. Share your feedback on our community forum. We're already planning enhancements, like new MCP-integrated agents, secure communication between the AI agents and MCP servers, and more. This is still a work in progress, and we welcome all kinds of contributions. Please fork and star the repo to stay tuned for updates!

◾️ We would love your feedback; continue the discussion in the Azure AI Foundry Discord: aka.ms/foundry/discord

On behalf of the Microsoft DevRel Team.

## JS AI Build-a-thon Setup in 5 Easy Steps
🔥 TL;DR: You're 5 steps away from an AI adventure. Set up your project repo, follow the quests, build cool stuff, and level up. Everything's automated, community-backed, and designed to help you actually learn AI using the skills you already have. Let's build the future, one quest at a time.

👉 Join the Build-a-thon | Chat on Discord

## Getting Started with Azure MCP Server: A Guide for Developers
The world of cloud computing is growing rapidly, and Azure is at the forefront of this innovation. If you're a student developer eager to dive into Azure and learn about the Model Context Protocol (MCP), the Azure MCP Server is your perfect starting point. This tool, currently in Public Preview, empowers AI agents to seamlessly interact with Azure services like Azure Storage, Cosmos DB, and more. Let's explore how you can get started!

### 🎯 Why Use the Azure MCP Server?

The Azure MCP Server revolutionizes how AI agents and developers interact with Azure services. Here's a glimpse of what it offers:

- Exploration Made Easy: List storage accounts, databases, resource groups, tables, and more with natural language commands.
- Advanced Operations: Manage configurations, query analytics, and execute complex tasks like building Azure applications.
- Streamlined Integration: From JSON communication to intelligent auto-completion, the Azure MCP Server ensures smooth operations.

Whether you're querying log analytics or setting up app configurations, this server simplifies everything.

### ✨ Installation Guide: One-Click and Manual Methods

Prerequisites: before you begin, ensure the following:

- Install either the Stable or Insiders release of VS Code.
- Add the GitHub Copilot and GitHub Copilot Chat extensions.

#### Option 1: One-Click Install

You can install the Azure MCP Server in VS Code or VS Code Insiders using NPX. Simply run:

```bash
npx -y @azure/mcp@latest server start
```

#### Option 2: Manual Install

If you'd prefer a manual setup, follow these steps:

1. Create a .vscode/mcp.json file in your VS Code project directory.
2. Add the following configuration:

```json
{
  "servers": {
    "Azure MCP Server": {
      "command": "npx",
      "args": ["-y", "@azure/mcp@latest", "server", "start"]
    }
  }
}
```

Now, launch GitHub Copilot in Agent Mode to activate the Azure MCP Server.

### 🚀 Supercharging Azure Development

Once installed, the Azure MCP Server unlocks an array of capabilities:

- Azure Cosmos DB: List, query, and manage databases and containers.
- Azure Storage: Query blob containers, metadata, and tables.
- Azure Monitor: Use KQL to analyze logs and monitor your resources.
- App Configuration: Handle key-value pairs and labeled configurations.

Test prompts like:

- "List my Azure Storage containers"
- "Query my Log Analytics workspace"
- "Show my key-value pairs in App Config"

These commands let your agents harness the power of Azure services effortlessly.

### 🛡️ Security & Authentication

The Azure MCP Server simplifies authentication using Azure Identity. Your login credentials are handled securely, with support for mechanisms like:

- Visual Studio credentials
- Azure CLI login
- Interactive browser login

For advanced scenarios, enable production credentials with:

```bash
export AZURE_MCP_INCLUDE_PRODUCTION_CREDENTIALS=true
```

Always perform a security review when integrating MCP servers to ensure compliance with regulations and standards.

### 🌟 Why Join the Azure MCP Community?

As a developer, you're invited to contribute to the Azure MCP Server project. Whether it's fixing bugs, adding features, or enhancing documentation, your contributions are valued. Explore the Contributing Guide for details on getting involved.

The Azure MCP Server is your gateway to leveraging Azure services with cutting-edge technology. Dive in, experiment, and bring your projects to life! What Azure project are you excited to build with the MCP Server? Let's brainstorm ideas together!

## Make Phi-4-mini-reasoning more powerful with industry reasoning on edge devices
In situations with limited compute, Phi-4-mini-reasoning is an excellent model choice. We can use Microsoft Olive or the Apple MLX framework to quantize Phi-4-mini-reasoning and deploy it on edge devices such as IoT devices, laptops, and mobile phones.

### Quantization

Because the full model is difficult to deploy directly to specific hardware, we reduce its complexity through model quantization. The quantization process inevitably causes some precision loss.

#### Quantize Phi-4-mini-reasoning using Microsoft Olive

Microsoft Olive is an AI model optimization toolkit for ONNX Runtime. Given a model and target hardware, Olive (short for Onnx LIVE) combines the most appropriate optimization techniques to output the most efficient ONNX model for inference in the cloud or on the edge. We can combine Microsoft Olive and Phi-4-mini-reasoning on Azure AI Foundry's Model Catalog to quantize Phi-4-mini-reasoning to an ONNX-format model.

First, create your notebook on Azure ML and install Microsoft Olive:

```bash
pip install git+https://github.com/Microsoft/Olive.git
```

Quantize using Microsoft Olive:

```bash
olive auto-opt --model_name_or_path {Azure Model Catalog path, such as azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} --device cpu --provider CPUExecutionProvider --use_model_builder --precision int4 --output_path ./phi-4-mini-reasoning-onnx --log_level 1
```

Register your quantized model in your Azure ML workspace (for example, under the name phi-4-mini-onnx-int4-cpu, which the download step below references), then download the ONNX model to your local device:

```python
ml_client.models.download("phi-4-mini-onnx-int4-cpu", 1)
```

To run the ONNX model with onnxruntime-genai, first install it (this is the CPU version):

```bash
pip install onnxruntime-genai
```

Then run it:

```python
import onnxruntime_genai as og

model_folder = "Your ONNX Model Path"
model = og.Model(model_folder)
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

search_options = {}
search_options['max_length'] = 32768

chat_template = "<|user|>{input}<|end|><|assistant|>"
text = 'A school arranges dormitories for students. If each dormitory accommodates 5 people, 4 people cannot live there; if each dormitory accommodates 6 people, one dormitory only has 4 people, and two dormitories are empty. Find the number of students in this grade and the number of dormitories.'
prompt = f'{chat_template.format(input=text)}'
input_tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(**search_options)
generator = og.Generator(model, params)
generator.append_tokens(input_tokens)

while not generator.is_done():
    generator.generate_next_token()
    new_token = generator.get_next_tokens()[0]
    print(tokenizer_stream.decode(new_token), end='', flush=True)
```

Get the notebook from the Phi Cookbook: https://aka.ms/phicookbook

#### Quantize the Phi-4-mini-reasoning model using Apple MLX

Install the Apple MLX framework:

```bash
pip install -U mlx-lm
```

Convert the Phi-4-mini-reasoning model with Apple MLX quantization:

```bash
python -m mlx_lm.convert --hf-path {Phi-4-mini-reasoning Hugging Face id} -q
```

Run Phi-4-mini-reasoning with Apple MLX in the terminal:

```bash
python -m mlx_lm.generate --model ./mlx_model --max-token 2048 --prompt "A school arranges dormitories for students. If each dormitory accommodates 5 people, 4 people cannot live there; if each dormitory accommodates 6 people, one dormitory only has 4 people, and two dormitories are empty. Find the number of students in this grade and the number of dormitories." --extra-eos-token "<|end|>" --temp 0.0
```

### Fine-tuning

We can fine-tune on CoT data for different scenarios to give Phi-4-mini-reasoning reasoning capabilities for those scenarios. Here we use the medical CoT data from a public Hugging Face dataset as our example (this is just an example; if you need rigorous medical reasoning, please seek more professional data support). We can fine-tune our CoT data in Azure ML.

#### Fine-tune Phi-4-mini-reasoning using Microsoft Olive in Azure ML

Note: please use Standard_NC24ads_A100_v4 to run this sample.

Get the data from Hugging Face datasets:

```bash
pip install datasets
```

Run this script to get the training data (the prompt_template definition, shared with the MLX example below, is included here for completeness):

```python
from datasets import load_dataset

prompt_template = """<|user|>{}<|end|><|assistant|><think>{}</think>{}<|end|>"""

def formatting_prompts_func(examples):
    inputs = examples["Question"]
    cots = examples["Complex_CoT"]
    outputs = examples["Response"]
    texts = []
    for input, cot, output in zip(inputs, cots, outputs):
        text = prompt_template.format(input, cot, output) + "<|end|>"
        # text = prompt_template.format(input, cot, output) + "<|endoftext|>"
        texts.append(text)
    return {
        "text": texts,
    }

# Create the English dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train", trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched=True, remove_columns=["Question", "Complex_CoT", "Response"])
dataset.to_json("en_dataset.jsonl")
```

Fine-tune with Microsoft Olive:

```bash
olive finetune \
    --method lora \
    --model_name_or_path {Azure Model Catalog path, azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \
    --trust_remote_code \
    --data_name json \
    --data_files ./en_dataset.jsonl \
    --train_split "train[:16000]" \
    --eval_split "train[16000:19700]" \
    --text_field "text" \
    --max_steps 100 \
    --logging_steps 10 \
    --output_path {Your fine-tuning save path} \
    --log_level 1
```

Convert the model to ONNX with Microsoft Olive:

```bash
olive capture-onnx-graph \
    --model_name_or_path {Azure Model Catalog path, azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \
    --adapter_path {Your fine-tuning adapter path} \
    --use_model_builder \
    --output_path {Your save onnx path} \
    --log_level 1

olive generate-adapter \
    --model_name_or_path {Your save onnx path} \
    --output_path {Your save onnx adapter path} \
    --log_level 1
```

Run the model with onnxruntime-genai-cuda. Install the onnxruntime-genai-cuda SDK, then run:

```python
import onnxruntime_genai as og
import numpy as np
import os

"./models/phi-4-mini-reasoning/adapter-onnx/model/" model = og.Model(model_folder) adapters = og.Adapters(model) adapters.load('./models/phi-4-mini-reasoning/adapter-onnx/model/adapter_weights.onnx_adapter', "en_medical_reasoning") tokenizer = og.Tokenizer(model) tokenizer_stream = tokenizer.create_stream() search_options = {} search_options['max_length'] = 200 search_options['past_present_share_buffer'] = False search_options['temperature'] = 1 search_options['top_k'] = 1 prompt_template = """<|user|>{}<|end|><|assistant|><think>""" question = """ A 33-year-old woman is brought to the emergency department 15 minutes after being stabbed in the chest with a screwdriver. Given her vital signs of pulse 110\/min, respirations 22\/min, and blood pressure 90\/65 mm Hg, along with the presence of a 5-cm deep stab wound at the upper border of the 8th rib in the left midaxillary line, which anatomical structure in her chest is most likely to be injured? """ prompt = prompt_template.format(question, "") input_tokens = tokenizer.encode(prompt) params = og.GeneratorParams(model) params.set_search_options(**search_options) generator = og.Generator(model, params) generator.set_active_adapter(adapters, "en_medical_reasoning") generator.append_tokens(input_tokens) while not generator.is_done(): generator.generate_next_token() new_token = generator.get_next_tokens()[0] print(tokenizer_stream.decode(new_token), end='', flush=True) inference model with onnxruntime-genai cuda olive finetune \ --method lora \ --model_name_or_path {Azure Model Catalog path , azureml://registries/azureml/models/Phi-4-mini-reasoning/versions/1} \ --trust_remote_code \ --data_name json \ --data_files ./en_dataset.jsonl \ --train_split "train[:16000]" \ --eval_split "train[16000:19700]" \ --text_field "text" \ --max_steps 100 \ --logging_steps 10 \ --output_path {Your fine-tuning save path} \ --log_level 1 Fine-tune Phi-4-mini-reasoning using Apple MLX locally on MacOS Note- we recommend that you use devices with a minimum of 64GB Memory and Apple Silicon devices Get the DataSet from Hugging face datasets pip install datasets run this script to get train and valid data from datasets import load_dataset prompt_template = """<|user|>{}<|end|><|assistant|><think>{}</think>{}<|end|>""" def formatting_prompts_func(examples): inputs = examples["Question"] cots = examples["Complex_CoT"] outputs = examples["Response"] texts = [] for input, cot, output in zip(inputs, cots, outputs): # text = prompt_template.format(input, cot, output) + "<|end|>" text = prompt_template.format(input, cot, output) + "<|endoftext|>" texts.append(text) return { "text": texts, } dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", trust_remote_code=True) split_dataset = dataset["train"].train_test_split(test_size=0.2, seed=200) train_dataset = split_dataset['train'] validation_dataset = split_dataset['test'] train_dataset = train_dataset.map(formatting_prompts_func, batched = True,remove_columns=["Question", "Complex_CoT", "Response"]) train_dataset.to_json("./data/train.jsonl") validation_dataset = validation_dataset.map(formatting_prompts_func, batched = True,remove_columns=["Question", "Complex_CoT", "Response"]) validation_dataset.to_json("./data/valid.jsonl") Fine-tuning with Apple MLX python -m mlx_lm.lora --model ./phi-4-mini-reasoning --train --data ./data --iters 100 Running the model ! 
python -m mlx_lm.generate --model ./phi-4-mini-reasoning --adapter-path ./adapters --max-token 4096 --prompt "A 54-year-old construction worker with a long history of smoking presents with swelling in his upper extremity and face, along with dilated veins in this region. After conducting a CT scan and venogram of the neck, what is the most likely diagnosis for the cause of these symptoms?" --extra-eos-token "<|end|>"
```

Get the notebook from the Phi Cookbook: https://aka.ms/phicookbook

We hope this sample has inspired you to use Phi-4-mini-reasoning and Phi-4-reasoning for industry reasoning in your own scenarios.

### Related resources

- Phi-4-mini-reasoning tech report: https://aka.ms/phi4-mini-reasoning/techreport
- Phi-4-Mini-Reasoning technical report: microsoft/Phi-4-mini-reasoning
- Phi-4-mini-reasoning on Azure AI Foundry: https://aka.ms/phi4-mini-reasoning/azure
- Phi-4 Reasoning blog: https://aka.ms/phi4-mini-reasoning/blog
- Phi Cookbook: https://aka.ms/phicookbook
- Showcasing Phi-4-Reasoning: A Game-Changer for AI Developers | Microsoft Community Hub

### Models

- Phi-4 Reasoning: https://huggingface.co/microsoft/Phi-4-reasoning
- Phi-4 Reasoning Plus: https://huggingface.co/microsoft/Phi-4-reasoning-plus
- Phi-4-mini-reasoning on Hugging Face: https://aka.ms/phi4-mini-reasoning/hf
- Phi-4-mini-reasoning on Azure AI Foundry: https://aka.ms/phi4-mini-reasoning/azure
- Microsoft models on Hugging Face
- Phi-4 Reasoning models on Azure AI Foundry Models
- Access Phi-4-reasoning models
- Phi models at Azure AI Foundry Models
- Phi models on Hugging Face
- Phi models on GitHub Marketplace Models

## Week 4: Microsoft Agents Hack Online Events and Readiness Resources
Readiness and skilling events for Week 4 of the Microsoft AI Agents Hack. Register now at https://aka.ms/agentshack

2025 is the year of AI agents! But what exactly is an agent, and how can you build one? Whether you're a seasoned developer or just starting out, this FREE three-week virtual hackathon is your chance to dive deep into AI agent development. 🔥 Learn from expert-led sessions streamed live on YouTube, covering top frameworks like Semantic Kernel, Autogen, the new Azure AI Agents SDK, and the Microsoft 365 Agents SDK.

### Week 4: April 28th-30th, Live and On Demand

| Topic | Track |
| --- | --- |
| Irresponsible AI Agents | Java |
| Securing AI agents on Azure | Python |
| The Art of AI Embodiment: Real-Time Interactive Experiences with Azure OpenAI GPT-4o and 3D Avatars | Python |
| Evaluating Agents | Python |

🌟 Join the Conversation on Azure AI Foundry Discussions! 🌟

Have ideas, questions, or insights about AI? Don't keep them to yourself! Share your thoughts, engage with experts, and connect with a community that's shaping the future of artificial intelligence. 🧠✨

👉 Click here to join the discussion!