# Edge AI for Beginners: Getting Started with Foundry Local
In Module 08 of the EdgeAI for Beginners course, Microsoft introduces Foundry Local, a toolkit that helps you deploy and test Small Language Models (SLMs) completely offline. In this blog, I'll share how I installed Foundry Local, ran the Phi-3.5-mini model on my Windows laptop, and what I learned through the process.

## What Is Foundry Local?

Foundry Local allows developers to run AI models locally on their own hardware. It supports text generation, summarization, and code completion, all without sending data to the cloud. Unlike cloud-based systems, everything happens on your computer, so your data never leaves your device.

## Prerequisites

Before starting, make sure you have:

- Windows 10 or 11
- Python 3.10 or newer
- Git
- An internet connection (for the first-time model download)
- Foundry Local installed

## Step 1: Verify Installation

After installing Foundry Local, open Command Prompt and type:

```
foundry --version
```

If you see a version number, Foundry Local is installed correctly.

## Step 2: Start the Service

Start the Foundry Local service using:

```
foundry service start
```

You should see a confirmation message that the service is running.

## Step 3: List Available Models

To view the models supported by your system, run:

```
foundry model list
```

You'll get a list of locally available SLMs.

Note: Model availability depends on your device's hardware. For most laptops, phi-3.5-mini works smoothly on CPU.

## Step 4: Run the Phi-3.5 Model

Now let's start chatting with the model:

```
foundry model run phi-3.5-mini-instruct-generic-cpu:1
```

Once it loads, you'll enter an interactive chat mode. Try a simple prompt:

```
Hello! What can you do?
```

The model replies instantly, right from your laptop, no cloud needed. To exit, type:

```
/exit
```

## How It Works

Foundry Local loads the model weights from your device and performs inference locally. This means text generation happens using your CPU (or GPU, if available). The result: complete privacy, no internet dependency, and instant responses. The sketch below shows what this looks like from code.
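Because the local service speaks an OpenAI-compatible API, you can also script against it with the standard `openai` Python client instead of the interactive chat. The snippet below is a minimal sketch, not official sample code: the port is an assumption (the CLI prints the real endpoint when the service starts), and the model id should match an entry from your own `foundry model list` output.

```python
# Minimal sketch: chat with the locally running model over Foundry Local's
# OpenAI-compatible endpoint. The port and model id below are assumptions;
# substitute the values your own service and model list report.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5273/v1",  # assumed port; check the service output
    api_key="not-needed-locally",         # the local service ignores the key
)

response = client.chat.completions.create(
    model="phi-3.5-mini-instruct-generic-cpu",  # an id from `foundry model list`
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
)
print(response.choices[0].message.content)
```

Nothing in this script leaves your machine: the request goes to localhost, where the Foundry Local service runs inference on your CPU or GPU.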
## Benefits for Students

For students beginning their journey in AI, Foundry Local offers several key advantages:

- No need for high-end GPUs or expensive cloud subscriptions.
- Easy setup for experimenting with multiple models.
- Perfect for class assignments, AI workshops, and offline learning sessions.
- Promotes a deeper understanding of model behavior by allowing step-by-step local interaction.

These factors make Foundry Local a practical choice for learning environments, especially in universities and research institutions where accessibility and affordability are important.

## Why Use Foundry Local

Running models locally offers several practical benefits compared to using AI Foundry in the cloud. With Foundry Local, you do not need an internet connection, and all computation happens on your personal machine. This makes it faster for small models and more private, since your data never leaves your device. In contrast, AI Foundry runs entirely in the cloud, requiring internet access and charging based on usage. For students and developers, Foundry Local is ideal for quick experiments, offline testing, and understanding how models behave in real time. AI Foundry, on the other hand, is better suited for large-scale or production-level scenarios where models need to be deployed at scale.

In summary, Foundry Local provides a flexible and affordable environment for hands-on learning, especially when working with smaller models such as Phi-3, Qwen2.5, or TinyLlama. It allows you to experiment freely, learn efficiently, and better understand the fundamentals of Edge AI development.

## Optional: Restart Later

Next time you open your laptop, you don't have to reinstall anything. Just run these two commands again:

```
foundry service start
foundry model run phi-3.5-mini-instruct-generic-cpu:1
```

## What I Learned

Following the EdgeAI for Beginners Study Guide helped me understand:

- How edge AI applications work
- How small models like Phi-3.5 can run on a local machine
- How to test prompts and build chat apps with zero cloud usage

## Conclusion

Running the Phi-3.5-mini model locally with Foundry Local gave me hands-on insight into edge AI. It's an easy, private, and cost-free way to explore generative AI development. If you're new to Edge AI, start with the EdgeAI for Beginners course and follow its Study Guide to get comfortable with local inference and small language models.

Resources:

- EdgeAI for Beginners GitHub Repo
- Foundry Local Official Site
- Phi Model Link

# Transforming Android Development: Unveiling MediaTek's latest chipset with Microsoft's Phi models

Imagine running advanced AI applications, like intelligent copilots and Retrieval-Augmented Generation (RAG), directly on Android devices, completely offline. With the rapid evolution of Neural Processing Units (NPUs), this is no longer a future vision; it's happening now.

## Optimized AI at the Edge: Phi-4-mini on MediaTek

Thanks to MediaTek's conversion and quantization tools, Microsoft's Phi-4-mini and Phi-4-mini-reasoning models are now optimized for MediaTek NPUs. This collaboration empowers developers to build fast, responsive, and privacy-preserving AI experiences on Android, without needing cloud connectivity.

MediaTek's flagship Dimensity 9400 and 9400+ platforms with the Dimensity GenAI Toolkit 2.0 deliver excellent performance with the Phi-4-mini (3.8B) model: prefill speed exceeds 800 tokens/sec and decode speed exceeds 21 tokens/sec. To see what those figures mean in practice, consider the rough estimate sketched below.
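As a quick sanity check on those throughput numbers, here is a back-of-the-envelope latency calculation. Only the tokens-per-second figures come from the text above; the prompt and response lengths are invented purely for illustration.

```python
# Back-of-the-envelope check on the Dimensity 9400 figures quoted above
# (prefill > 800 tokens/sec, decode > 21 tokens/sec). The prompt and
# response sizes below are illustrative assumptions, not benchmark data.
PREFILL_TPS = 800   # tokens/sec while ingesting the prompt
DECODE_TPS = 21     # tokens/sec while generating the response

prompt_tokens, response_tokens = 1_000, 200
time_to_first_token = prompt_tokens / PREFILL_TPS   # 1000 / 800 = 1.25 s
generation_time = response_tokens / DECODE_TPS      # 200 / 21 = ~9.5 s
print(f"first token after ~{time_to_first_token:.2f}s, "
      f"full reply after ~{time_to_first_token + generation_time:.1f}s")
```

The takeaway: even a long prompt is ingested in about a second, and most of the wall-clock time goes to token-by-token decoding, which is typical of on-device generation.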
## Unlock Enhanced Performance: Introducing MediaTek's NeuroPilot SDK

The MediaTek NeuroPilot SDK is a robust software development toolkit designed to accelerate AI application development and deployment across MediaTek's hardware ecosystem. It provides developers with advanced optimization tools and cross-platform compatibility, enabling efficient implementation of neural networks while balancing performance, power efficiency, and resource utilization.

### Comprehensive toolchain and documentation support

The NeuroPilot platform offers a complete toolchain, including SDKs, APIs, and documentation, for model quantization, conversion, compilation, and integration. Developers can leverage these tools to optimize neural networks, significantly improving on-device performance while reducing power consumption and memory usage. MediaTek's Dimensity GenAI Toolkit 2.0 now supports the Phi-4 series and provides best practices: users can convert and quantize Phi-4-mini models in just a few steps, enabling seamless deployment on Dimensity series platforms. A key advantage is that developers do not need specialized hardware expertise to rapidly prototype and deploy customized AI solutions.

### One-time coding, cross-platform deployment

The MediaTek NeuroPilot SDK supports all AI-capable MediaTek hardware, empowering developers to adopt a "code once, deploy everywhere" strategy across smartphones, tablets, automotive, smart home devices, IoT products, and future platforms. This aligns with MediaTek's corporate philosophy of bringing AI to everyone. The unified approach streamlines development, reduces costs, and accelerates time-to-market. The SDK integrates with the Android and Linux ecosystems, providing complete compiler suites, analyzers, and application libraries to ensure compatibility and optimize performance.

## Demo 1: Deploying Phi-4-mini-reasoning with NeuroPilot SDK

In this demo, developers are shown how to use the NeuroPilot SDK to deploy the Phi-4-mini-reasoning model on edge devices. The SDK enables efficient conversion and optimization, making it possible to bring advanced reasoning capabilities to smartphones and other local hardware. The Phi-4-mini-reasoning model brings logical and problem-solving capabilities to the edge. With MediaTek's advanced conversion tools, the model can be transformed for MediaTek's DLA, enabling a new class of intelligent applications on mobile devices. Bringing reasoning capabilities to the edge allows developers to build faster, more responsive AI experiences, without relying on cloud access.

## Demo 2: Deploying Phi-4-mini with NeuroPilot SDK

This video demonstrates how to convert and run the Phi-4-mini model using the NeuroPilot SDK. With a focus on instruction-following tasks, this deployment empowers developers to build responsive, embedded AI assistants that understand and execute user commands locally. Whether it's productivity tools or context-sensitive automation, Phi-4-mini brings natural interaction and reliability directly to the device.

## Imagine the possibilities: Real-world scenarios

### Intelligent information access with on-device RAG

Picture this: your application intelligently accesses and reasons over on-device documents, like PDFs or internal knowledge bases, using an advanced embedding model paired with the MediaTek-optimized Phi-4-mini. This enables developers to create:

- Personalized Assistants: Apps that understand user context from their own documents.
- Offline Knowledge Hubs: Providing instant access to relevant information without needing cloud connectivity.
- Enhanced Productivity Tools: Smart summarization, Q&A, and content generation based on local data.

## Demo 3: Private RAG chatbot on device

People are on their mobile devices every day, saving new documents, sending messages, taking notes, and more. With how much we store on our phones and laptops, it can get hard to find specific files or pieces of information when we need them most. What if you could implement a personal assistant that understands your question and fetches exactly what you're looking for, without you needing to dig through your device?

This demo showcases a Retrieval-Augmented Generation (RAG) implementation of the Phi model embedded directly on a smartphone. The chatbot lets users ask natural-language questions and instantly retrieve relevant information from local files. Because the model runs on-device, there is no need for a cloud connection, ensuring your data stays private while still offering intelligent, context-aware results. The RAG-based Phi-4-mini solution parses every document on your device so that a search surfaces the exact document you are looking for. The sketch after this paragraph shows the basic retrieve-then-generate pattern behind such a chatbot.
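To make the retrieve-then-generate pattern concrete, here is a minimal sketch. This is an illustrative stand-in, not MediaTek's demo code: the real demo pairs an NPU-optimized embedding model with Phi-4-mini, whereas here the open sentence-transformers library plays the embedding role and the documents are made-up examples.

```python
# Minimal RAG sketch: embed local documents, retrieve the best match for a
# question by cosine similarity, and assemble a grounded prompt for the model.
# Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Expense reports must be filed within 30 days of travel.",
    "The office VPN resets its certificates every January.",
    "Meeting rooms are booked through the facilities portal.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model
doc_vecs = embedder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents whose embeddings are closest to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity (vectors are normalized)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

question = "When do I need to submit my travel expenses?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# On a real device, `prompt` would now go to the on-device Phi-4-mini model.
print(prompt)
```

Because both the embeddings and the generation step run locally, no document content ever leaves the device, which is the privacy property the demo highlights.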
## Stay ahead of the curve

If you're eager to explore the Phi-4 family of models on edge devices and master building next-gen apps with MediaTek's powerful NPU, don't miss the key sessions at Microsoft Build and Computex Taipei happening this week. This is your chance to get direct insights from the experts.

Microsoft Build 2025:

- Uncover the latest on Azure AI Foundry on May 20 during "Unveiling Latest Innovations in Azure AI Foundry Model Catalog".
- If you are attending in person on May 20, catch the second lab, "Fine-Tune End-to-End Distillation Models with Azure AI Foundry Models".
- Learn about Phi on Windows devices on May 20 in "Enable seamless deployment across Intel Copilot+ AI PCs and Azure".

Computex 2025:

- MediaTek Booth (M0806), May 20-23. See MediaTek's AI vision and hardware innovations firsthand.

## Resources

- Explore the Phi-4 Model Family on Azure AI Foundry and HuggingFace
- Get access to the Phi Cookbook: your practical guide and code repository for building with Phi models.
- Learn more about MediaTek NeuroPilot
- Connect with the MediaTek Developer Application