small language model
AI Agents in Production: From Prototype to Reality - Part 10
This blog post, the tenth and final installment in a series on AI agents, focuses on deploying AI agents to production. It covers evaluating agent performance, addressing common issues, and managing costs. The post emphasizes the importance of a robust evaluation system, offers potential solutions for performance issues, and outlines cost-management strategies such as response caching, using smaller models, and implementing router models, as sketched below.
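Of those cost levers, response caching is the simplest to prototype. Here is a minimal sketch in Python, assuming an in-memory dict keyed by a hash of the normalized prompt; the `call_model` function is a hypothetical placeholder, not something from the original post:

```python
import hashlib

_cache: dict[str, str] = {}

def cached_generate(prompt: str, call_model) -> str:
    """Serve repeated prompts from memory instead of paying for a model call.

    call_model is a hypothetical placeholder for whatever client invokes the
    agent's LLM; any function from str to str works here.
    """
    key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt)  # cache miss: one real model call
    return _cache[key]
```

A production version would add an eviction policy and a TTL, but even this shape avoids repeat charges for identical requests.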
Tiny But Mighty: Unleashing the Power of Small Language Models 🚀

While Large Language Models (LLMs) like GPT-4 dominate headlines with their extensive capabilities, they often come at the cost of high computational requirements and complexity. For developers and organizations looking to implement AI solutions on edge devices or with limited resources, Small Language Models (SLMs) are emerging as a practical alternative. SLMs are not just "smaller" versions of their larger counterparts; they are designed to be faster, more efficient, and adaptable to specific tasks. With fewer parameters and lower computational needs, SLMs open the door to deploying AI on mobile devices, IoT systems, and edge environments without compromising performance.

What You Stand to Learn 🧠

- Introduction to Microsoft's AI Ecosystem: Discover Microsoft's end-to-end AI development tools, from Azure AI Services to ONNX Runtime, enabling efficient and secure deployment of AI models across cloud and edge environments.
- The Advantages of SLMs over LLMs: SLMs are game-changers for edge AI applications, providing faster training and inference times, reduced energy costs, and scalability across diverse devices.
- Hands-On with Phi-3 and ONNX Runtime: Watch live demonstrations of SLMs in action with tools like Phi-3 and ONNX Runtime, showing how to fine-tune and deploy models on mobile devices, IoT, and hybrid cloud environments.
- Responsible AI Practices: Understand how to safeguard your AI applications with Microsoft's Responsible AI toolkit, ensuring ethical and trustworthy deployments.

Watch the Full Session 👨💻

📅 Date: December 12, 2024
⏰ Time: 4 PM GMT | 5 PM CET | 8 AM PT | 11 AM ET | 7 PM EAT

A session packed with live demos, practical examples, and Q&A opportunities. Register NOW | Events | Microsoft Reactor

Agenda 🔍

- Introduction (5 min): A brief overview of the session and its focus on SLMs and LLMs.
- Microsoft AI Tooling (5 min): Explore the latest tools, including Azure AI Services, Azure Machine Learning, and Responsible AI Tooling.
- How to Choose the Right Model (10 min): Key considerations such as performance, customizability, and ethical implications.
- Comparing SLMs vs LLMs (10 min): The strengths, weaknesses, and best use cases for Small and Large Language Models.
- Deploying Models at the Edge (10 min): Insights into optimizing AI for mobile, IoT, and edge devices.
- Q&A: Addressing participant questions about AI development and deployment.

A better Phi Family is coming - multi-language support, better vision, intelligence MOEs
Since the release of Phi-3 at Microsoft Build 2024, the family has attracted a great deal of attention, especially for applications of Phi-3-mini and Phi-3-vision on edge devices. In the June update, we improved benchmark performance and system-role support by adjusting the high-quality training data. In the August update, based on community and customer feedback, we brought multi-language support to Phi-3.5-mini-128k-instruct, multi-frame image input to Phi-3.5-vision-128k, and added the new Phi-3.5-MoE model for AI agents. Let's take a look.

Multi-language support

In previous versions, Phi-3-mini had good English corpus support but weak support for non-English languages. When we asked questions in Chinese, it often gave obviously wrong answers. In the new version, the added Chinese training corpus gives the model much better understanding and language coverage. You can try the improvements in other languages as well; even without fine-tuning or RAG, it is a capable model.

Code Sample: https://github.com/microsoft/Phi-3CookBook/blob/main/code/09.UpdateSamples/Aug/phi3-instruct-demo.ipynb

Better vision

Phi-3.5-vision enables Phi-3 not only to understand text and hold dialogues but also to handle visual tasks such as OCR, object recognition, and image analysis. In real application scenarios, however, we often need to analyze multiple images together to find their associations, as with videos, PPTs, and books. The new Phi-3.5-vision supports multi-frame and multi-image input, so we can better perform this kind of inductive analysis in visual scenes.

As shown in the sample video, we can use OpenCV to extract key frames. Here we extract 21 key-frame images, load them into an array, and build the placeholder string expected by the chat template:

```python
from PIL import Image

images = []
placeholder = ""
for i in range(1, 22):
    # Load each extracted key frame and append its chat-template image placeholder.
    images.append(Image.open(f"../output/keyframe_{i}.jpg"))
    placeholder += f"<|image_{i}|>\n"
```

Combined with Phi-3.5-vision's chat template, we can then run a comprehensive analysis across all frames. This makes dynamic, vision-based work much more efficient, especially in edge scenarios.

Code Sample: https://github.com/microsoft/Phi-3CookBook/blob/main/code/09.UpdateSamples/Aug/phi3-vision-demo.ipynb
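For reference, the key-frame extraction step itself might look like the following with OpenCV. This is a sketch rather than the notebook's exact code: the video path and the one-frame-per-second sampling interval are assumptions, while the output filenames match the loading loop above.

```python
import cv2

# Sample roughly one frame per second of video and save the frames as
# ../output/keyframe_1.jpg, keyframe_2.jpg, ... for the loading loop above.
video = cv2.VideoCapture("../input/demo.mp4")  # hypothetical video path
fps = int(video.get(cv2.CAP_PROP_FPS)) or 30   # fall back if FPS metadata is missing
saved, index = 0, 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    if index % fps == 0:  # keep one frame per second of footage
        saved += 1
        cv2.imwrite(f"../output/keyframe_{saved}.jpg", frame)
    index += 1
video.release()
print(f"extracted {saved} key frames")
```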
""" sys_msg = """You are a helpful AI assistant, you are an agent capable of using a variety of tools to answer a question. Here are a few of the tools available to you: - Blog: This tool helps you describe a certain knowledge point and content, and finally write it into Twitter or Facebook style content - Translate: This is a tool that helps you translate into any language, using plain language as required - Final Answer: the final answer tool must be used to respond to the user. You must use this when you have decided on an answer. To use these tools you must always respond in JSON format containing `"tool_name"` and `"input"` key-value pairs. For example, to answer the question, "Build Muliti Agents with MOE models" you must use the calculator tool like so: { "tool_name": "Blog", "input": "Build Muliti Agents with MOE models" } Or to translate the question "can you introduce yourself in Chinese" you must respond: { "tool_name": "Search", "input": "can you introduce yourself in Chinese" } Remember just output the final result, ouput in JSON format containing `"agentid"`,`"tool_name"` , `"input"` and `"output"` key-value pairs .: [ { "agentid": "step1", "tool_name": "Blog", "input": "Build Muliti Agents with MOE models", "output": "........." }, { "agentid": "step2", "tool_name": "Search", "input": "can you introduce yourself in Chinese", "output": "........." }, { "agentid": "final" "tool_name": "Result", "output": "........." } ] The users answer is as follows. """ We can see that by telling the model what skills it needs to have and the arrangement of tasks, we can assign them to different tasks through Phi-3.5 MOEs to complete related work. Write something about Generative AI with MOEs , translate it to Chinese The result : [ { "agentid": "step1", "tool_name": "Blog", "input": "Generative AI with MOE", "output": "Generative AI with MOE (Mixture of Experts) is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more." }, { "agentid": "step2", "tool_name": "Translate", "input": "Generative AI with MOE is a powerful approach that combines the strengths of generative models and the flexibility of MOE architecture. This hybrid model can generate high-quality, diverse, and contextually relevant content, making it suitable for various applications such as content creation, data augmentation, and more.", "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。" }, { "agentid": "final", "tool_name": "Result", "output": "基于生成AI的MOE(Mixture of Experts)是一种强大的方法,它结合了生成模型的优势和MOE架构的灵活性。这种混合模型可以生成高质量、多样化且上下文相关的内容,使其适用于各种应用,如内容创建、数据增强等。" } ] If conditions permit, we can more smoothly integrate the Phi-3 MOEs model into frameworks such as AutoGen, Semantic Kernel, and Langchain. Code Sample: https://github.com/microsoft/Phi-3CookBook/blob/main/code/09.UpdateSamples/Aug/phi3_moe_demo.ipynb Thoughts on SLMs SLMs do not replace LLMs but give GenAI a broader scenario. The update of Phi-3 allows more edge devices to have better support, including text, chat, and vision. In modern AI Agents application scenarios, we hope to have more efficient task execution efficiency. In addition to computing power, MoEs are the key to solving problems. 
Thoughts on SLMs

SLMs do not replace LLMs; rather, they bring generative AI to a broader range of scenarios. The Phi-3 update gives more edge devices better support across text, chat, and vision. In modern AI-agent scenarios we want more efficient task execution, and beyond raw computing power, MoE architectures are a key part of the answer. Phi-3 is still iterating, and we hope you will keep following along and give us your feedback.

Resources

1. Download the Microsoft Phi-3 family: https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3
2. Read the Phi-3 Cookbook: https://aka.ms/phi-3cookbook
3. Learn about MoEs: https://huggingface.co/blog/moe
Unlocking the Potential of Phi-3 and C# in AI Development: A Must-Attend Session for Technical Students

Are you a technical student eager to dive into the world of AI and software development? Look no further! We're excited to invite you to an enlightening session that explores the integration of Phi-3 models with C#, presented by the cloud advocates team at Microsoft and featuring Bruno Capuano and Kinfey Lo.