Imagine running advanced AI applications—like intelligent copilots and Retrieval-Augmented Generation (RAG)—directly on Android devices, completely offline. With the rapid evolution of Neural Processing Units (NPUs), this is no longer a future vision—it’s happening now.
Optimized AI at the Edge: Phi-4-mini on MediaTek
Thanks to MediaTek’s conversion and quantization tools, Microsoft’s Phi-4-mini and Phi-4-mini-reasoning models are now optimized for MediaTek NPUs. This collaboration empowers developers to build fast, responsive, and privacy-preserving AI experiences on Android—without needing cloud connectivity.
MediaTek’s flagship Dimensity 9400 and 9400+ platform with Dimensity GenAI Toolkit 2.0 delivers excellent performance with the Phi-4 mini (3.8B) model where prefill speed is >800 tokens/sec and decode speed is >21 tokens/sec.
Unlock Enhanced Performance: Introducing MediaTek's NeuroPilot SDK
The MediaTek NeuroPilot SDK is a robust software development toolkit designed to accelerate AI application development and deployment across MediaTek’s hardware ecosystem. It provides developers with advanced optimization tools and cross-platform compatibility, enabling efficient implementation of neural networks while balancing performance, power efficiency, and resource utilization.
Comprehensive toolchain and documentation support
The NeuroPilot platform offers a complete toolchain, including SDKs, APIs, and documentation, for model quantization/conversion, compilation, and integration. Developers can leverage these tools to optimize neural networks, significantly improving on-device performance while reducing power consumption and memory usage.
MediaTek’s Dimensity GenAI Toolkit 2.0 now supports the Phi-4 series and provides best practices. Users can convert and quantize Phi-4 mini models in just a few steps, enabling seamless deployment on Dimensity series platforms. A key advantage is that developers do not require specialized hardware expertise to rapidly prototype and deploy customized AI solutions.
One-time coding, cross-platform deployment
The MediaTek NeuroPilot SDK supports all AI-capable MediaTek hardware, empowering developers to adopt a "code once, deploy everywhere" strategy across smartphones, tablets, automotive, smart home devices, IoT products, and future platforms. This aligns with MediaTek’s corporate philosophy of bringing AI to everyone. This unified approach streamlines development, reduces costs, and accelerates time-to-market. The SDK integrates with Android and Linux ecosystems, providing complete compiler suites, analyzers, and application libraries to ensure compatibility and optimize performance.
Demo 1: Deploying Phi-4-mini-reasoning with NeuroPilot SDK
In this demo, developers are shown how to use the NeuroPilot SDK to deploy the Phi-4-mini-reasoning model on edge devices. The SDK enables efficient conversion and optimization, making it possible to bring advanced reasoning capabilities to smartphones and other local hardware.
The Phi-4-mini-reasoning model brings logical and problem-solving capabilities to the edge. With MediaTek’s advanced conversion tools, this new model can be transformed for MediaTek’s DLA, enabling a new class of intelligent applications on mobile devices. Bringing reasoning capabilities to the edge allows developers to build faster, more responsive AI experiences—without relying on cloud access.
Demo 2: Deploying Phi-4-mini with NeuroPilot SDK
This video demonstrates how to convert and run the Phi-4-mini model using the NeuroPilot SDK. With a focus on instruction-following tasks, this deployment empowers developers to build responsive, embedded AI assistants that understand and execute user commands locally. Whether it’s productivity tools or context-sensitive automation, Phi-4-mini brings natural interaction and reliability directly to the device.
Imagine the possibilities: Real world scenarios
Intelligent information access with on-device RAG
Picture this: your application intelligently accesses and reasons over on-device documents, like PDFs or internal knowledge bases, using an advanced embedding model paired with the MediaTek optimized Phi-4-mini. This enables developers to create:
- Personalized Assistants: Apps that understand user context from their own documents.
- Offline Knowledge Hubs: Providing instant access to relevant information without needing cloud connectivity.
- Enhanced Productivity Tools: Smart summarization, Q&A, and content generation based on local data.
Demo 3: Private RAG chatbot on device
People are on their mobile devices every day—saving new documents, sending messages, taking notes, and more. With how much we’re able to store on our phones and laptops, it can get hard to find specific files or pieces of information when we need them most. What if you could implement a personal assistant that understands your question and fetches exactly what you’re looking for, without you needing to dig through your device? This demo showcases a Retrieval-Augmented Generation (RAG) implementation of the Phi model embedded directly on a smartphone. The chatbot allows users to ask natural language questions and instantly retrieve relevant information from local files. Because the model runs on-device, there's no need for a cloud connection—ensuring your data stays private while still offering intelligent, context-aware result RAG based Phi-4-mini solution, so that when you searched your device, it parsed through every document to help you find the exact document you are looking for.
Stay ahead of the curve:
If you're eager to explore the Phi-4 family of models on edge devices and master building next-gen apps with MediaTek's powerful NPU, don't miss the key sessions at Microsoft Build and Computex Taipei happening this week. This is your chance to get direct insights from the experts.
- Microsoft Build 2025:
- Uncover the latest on Azure AI Foundry on May 20 during “Unveiling Latest Innovations in Azure AI Foundry Model Catalog”
- If you are in person on May 20th, catch the second lab “Fine-Tune End-to-End Distillation Models with Azure AI Foundry Models”
- Learn about Phi on Windows devices in on May 20th for “Enable seamless deployment across Intel Copilot+ AI PCs and Azure”
 
- Computex 2025 :
- MediaTek Booth (M0806) on May 20-23. See MediaTek 's AI vision and hardware innovations firsthand.
 
Resources
- Explore the Phi-4 Model Family on Azure AI Foundry and HuggingFace
- Get access to the Phi Cookbook: Your practical guide and code repository for building with Phi models.
- Learn more about Mediatek NeuroPilot
- Connect with the MediaTek Developer Application