onnx
28 Topics

Getting Started - Generative AI with Phi-3-mini: Running Phi-3-mini in Intel AI PC
In 2024, powered by AI, we are entering the era of the AI PC. On May 20, Microsoft also introduced the Copilot+ PC concept, which means PCs can run SLMs/LLMs more efficiently with the support of an NPU. We can combine different models from the Phi-3 family with the new AI PCs to build a simple personalized Copilot application. This article combines an Intel AI PC with Intel's OpenVINO and NPU Acceleration Library and Microsoft's DirectML to create a local Copilot.
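As a taste of the approach, here is a minimal sketch of running Phi-3-mini through OpenVINO using the optimum-intel package. The model ID and device selection below are assumptions for illustration; check the article and your OpenVINO/driver versions before relying on them.

```python
# Minimal sketch: Phi-3-mini via OpenVINO with optimum-intel.
# Assumes `pip install optimum[openvino]`; model ID and device are placeholders.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "microsoft/Phi-3-mini-4k-instruct"

# export=True converts the checkpoint to OpenVINO IR on the fly (compiles for CPU
# by default; model.to("GPU") or "NPU" may work on supported drivers - verify).
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("What is an AI PC?", max_new_tokens=128)[0]["generated_text"])
```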
Building Retrieval Augmented Generation on VSCode & AI Toolkit

LLMs usually have limited knowledge of specific domains. Retrieval Augmented Generation (RAG) helps LLMs produce more accurate, relevant output for specific domains and datasets. In this post, we will see how to do this for local models using the AI Toolkit.
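To make the pattern concrete, here is a minimal RAG sketch against a locally served model. The assumptions are loud: AI Toolkit exposes an OpenAI-compatible local endpoint, but the port (5272) and model name below are placeholders taken from a typical setup (check your AI Toolkit panel), and retrieval is simplified to naive keyword overlap instead of a real vector index.

```python
# Minimal RAG sketch against a local model served by AI Toolkit.
# Assumptions: the local OpenAI-compatible server, its port (5272), and the
# model name are placeholders; retrieval is naive keyword overlap.
from openai import OpenAI

docs = [
    "Contoso widgets ship with a 2-year warranty.",
    "The Contoso API rate limit is 100 requests per minute.",
    "Support tickets are answered within one business day.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Score documents by word overlap with the query; return the top k."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

client = OpenAI(base_url="http://127.0.0.1:5272/v1/", api_key="not-needed")

question = "How long is the widget warranty?"
context = "\n".join(retrieve(question))
response = client.chat.completions.create(
    model="Phi-3-mini-4k-instruct",  # placeholder: use the name AI Toolkit shows
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```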
Getting Started Using Phi-3-mini-4k-instruct-onnx for Text Generation with NLP Techniques

In this tutorial, we'll cover how to use the Phi-3 mini models for text generation using NLP techniques. Whether you're a beginner or an experienced AI developer, you'll learn how to download and run these powerful tools on your own computer. From setting up the Python environment to generating responses with the generate() API, we'll provide clear instructions and code examples throughout. So, let's get started and see what the Phi-3 mini models can do!
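For a sense of what the generate() API looks like, here is a sketch of the flow with onnxruntime-genai. The model path is a placeholder for the downloaded Phi-3-mini-4k-instruct-onnx folder, and the API has shifted slightly across onnxruntime-genai releases, so treat this as illustrative rather than definitive.

```python
# Sketch of the generate() flow with onnxruntime-genai.
# Assumes `pip install onnxruntime-genai`; the model path is a placeholder
# pointing at a locally downloaded Phi-3-mini-4k-instruct-onnx variant.
import onnxruntime_genai as og

model = og.Model("./Phi-3-mini-4k-instruct-onnx/cpu-int4-rtn-block-32")
tokenizer = og.Tokenizer(model)

prompt = "<|user|>\nExplain tokenization in one paragraph.<|end|>\n<|assistant|>\n"
params = og.GeneratorParams(model)
params.set_search_options(max_length=256, temperature=0.7)
params.input_ids = tokenizer.encode(prompt)

output_tokens = model.generate(params)  # runs the full decode loop in one call
print(tokenizer.decode(output_tokens[0]))
```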
Use WebGPU + ONNX Runtime Web + Transformer.js to build RAG applications by Phi-3-mini

Learn how to harness the power of WebGPU, ONNX Runtime Web, and Transformers.js to create cutting-edge Retrieval-Augmented Generation (RAG) applications. Dive into this technical guide and build intelligent applications that combine retrieval and generation seamlessly.
Running Phi-3-vision via ONNX on Jetson Platform

Unlock the potential of NVIDIA's Jetson platform by running the Phi-3-vision model in ONNX format. Dive into the process of compiling onnxruntime-genai, setting up the environment, and executing high-performance inference tasks on low-power devices like the Jetson Orin Nano. Discover how to use quantized models efficiently, enabling robust image-and-text dialogue tasks while keeping your GPU workload optimized. Whether you're working with FP16 or INT4 models, this guide walks you through each step, ensuring you harness the full capabilities of edge AI on Jetson.
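Below is a sketch of multimodal inference with onnxruntime-genai, modeled on the project's phi3v example. The model and image paths are placeholders, and API details (for example, the compute_logits step) differ between onnxruntime-genai releases, so verify against the version you compile.

```python
# Sketch of Phi-3-vision inference with onnxruntime-genai (phi3v-style loop).
# Paths are placeholders; API details vary between onnxruntime-genai releases.
import onnxruntime_genai as og

model = og.Model("./Phi-3-vision-128k-instruct-onnx")  # placeholder path
processor = model.create_multimodal_processor()
stream = processor.create_stream()

image = og.Images.open("./test.jpg")  # placeholder image
prompt = "<|user|>\n<|image_1|>\nWhat is shown in this image?<|end|>\n<|assistant|>\n"

params = og.GeneratorParams(model)
params.set_inputs(processor(prompt, images=image))
params.set_search_options(max_length=3072)

generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    # Stream-decode each new token as it is produced
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```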
Getting Started with the AI Dev Gallery

March Update: The Gallery is now available on the Microsoft Store! The AI Dev Gallery is a new open-source project designed to inspire and support developers in integrating on-device AI functionality into their Windows apps. It offers an intuitive UX for exploring and testing interactive AI samples powered by local models. Key features include:

- Quickly explore and download models from well-known sources on GitHub and HuggingFace.
- Test different models with interactive samples across over 25 different scenarios, including text, image, audio, and video use cases.
- See all relevant code and library references for every sample.
- Switch between models that run on CPU and GPU depending on your device capabilities.
- Quickly get started with your own projects by exporting any sample to a fresh Visual Studio project that references the same model cache, preventing duplicate downloads.

Part of the motivation behind the Gallery was exposing developers to the host of benefits that come with on-device AI, including improved data security and privacy, increased control and parameterization, and no dependence on an internet connection or third-party cloud provider.

Requirements

- Minimum OS version: Windows 10, version 1809 (10.0; Build 17763)
- Architecture: x64, ARM64
- Memory: at least 16 GB is recommended
- Disk space: at least 20 GB free space is recommended
- GPU: 8 GB of VRAM is recommended for running samples on the GPU

Using the Gallery

The AI Dev Gallery can be navigated in two ways: the Samples view and the Models view.

Navigating Samples

In this view, samples are broken up into categories (Text, Code, Image, etc.) and then into more specific samples, such as Translate Text. On clicking a sample, you will be prompted to choose a model to download if you haven't run that sample before. Next to each model you can see its size, whether it runs on CPU or GPU, and the associated license. Pick the model that makes the most sense for your machine. You can also download new models and change the model for a sample later from the sample view; just click the model drop-down at the top of the sample. The last thing you can do from the sample pane is view the sample code and export the project to Visual Studio; both buttons are found in the top-right corner of the sample.

Navigating Models

If you would rather navigate by models instead of samples, the Gallery also provides the Models view. It contains a similar navigation menu for browsing models by category. Clicking on a model shows a description of the model, the versions available to download, and the samples that use it. Clicking on a sample takes you back over to the Samples view, where you can see the model in action.

Deleting and Managing Models

If you need to clear up space or see download details for the models you are using, head over to the Settings page to manage your downloads. From there, you can easily see every model you have downloaded and how much space it takes up on your drive. You can clear your entire cache for a fresh start or delete individual models that you are no longer using. Any deleted model can be redownloaded through either the Models or Samples view.
Next Steps for the Gallery

The AI Dev Gallery is still a work in progress, and we plan on adding more samples, models, APIs, and features; we are also evaluating adding support for NPUs to take the experience even further. If you have feedback, noticed a bug, or have ideas for features or samples, head over to the issue board and submit an issue. We also have a discussion board for any other topics relevant to the Gallery. The Gallery is an open-source project, and we would love contribution, feedback, and ideation! Happy modeling!
Train your Model on Spark/Databricks, score it on ADX

Are you using Spark/Databricks to build machine learning models? Do you need to score new data that is streamed into Azure Data Explorer (ADX)? If this is your scenario, please read on! In this blog we show how to train an ML model on Azure Databricks, export it to ADX, and score new samples directly on ADX, in near real time, using inline Python code embedded in a KQL query.
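To illustrate the export half of the workflow, here is a hedged sketch of training a scikit-learn model and converting it to ONNX with skl2onnx so the serialized bytes can be stored in (and later scored from) an ADX table. The model, feature count, and data are stand-ins, not the blog's actual pipeline.

```python
# Sketch: fit a model (e.g. on Databricks), convert to ONNX with skl2onnx,
# and serialize the bytes for storage in ADX. Data and shapes are stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X = np.random.rand(100, 4).astype(np.float32)  # stand-in training data
y = (X[:, 0] > 0.5).astype(int)

clf = LogisticRegression().fit(X, y)

# The ONNX input signature must match what the scoring query will send.
onnx_model = convert_sklearn(
    clf, initial_types=[("input", FloatTensorType([None, 4]))]
)
model_bytes = onnx_model.SerializeToString()  # e.g. hex-encode and ingest into ADX
```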
GPU compute within Windows Subsystem for Linux 2 supports AI and ML workloads

Adding GPU compute support to WSL has been our #1 most requested feature since the first release. Over the last few years, the WSL, Virtualization, DirectX, Windows Driver, and Windows AI teams, together with our silicon partners, have been working hard to deliver this capability.