GenRec Direct Learning: Moving Ranking from Feature Pipelines to Token-Native Sequence Modeling
Authors: Chunlong Yu, Han Zheng, Jie Zhu, I-Hong Jhuo, Li Xia, Lin Zhu, Sawyer Shen

TL;DR

Most modern ranking stacks rely on large generative models as feature extractors, flattening their outputs into vectors that are then fed into downstream rankers. While effective, this pattern introduces additional pipeline complexity and often dilutes token-level semantics. GenRec Direct Learning (DirL) explores a different direction: using a generative, token-native sequential model as the ranking engine itself. In this formulation, ranking becomes an end-to-end sequence modeling problem over user behavior, context, and candidate items—without an explicit feature-extraction stage.

Why revisit the classic L2 ranker design?

Large-scale recommender systems have historically evolved as layered pipelines: more signals lead to more feature plumbing, which in turn introduces more special cases. In our previous L2 ranking architecture, signals were split into dense and sparse branches and merged late in the stack (Fig. 1). As the system matured, three recurring issues became increasingly apparent.

Figure 1: traditional ranking DNN

1) Growing pipeline surface area

Each new signal expands the surrounding ecosystem—feature definitions, joins, normalization logic, validation, and offline/online parity checks. Over time, this ballooning surface area slows iteration, raises operational overhead, and increases the risk of subtle production inconsistencies.

2) Semantics diluted by flattening

Generative models naturally capture rich structure: token-level interactions, compositional meaning, and contextual dependencies. However, when these representations are flattened into sparse or dense feature vectors, much of that structure is lost—undermining the very semantics that make generative representations powerful.
3) Sequence modeling is treated as an add-on

While traditional rankers can ingest history features, modeling long behavioral sequences and fine-grained temporal interactions typically requires extensive manual feature engineering. As a result, sequence modeling is often bolted on rather than treated as a first-class concern.

DirL goal: treat ranking as native sequence learning, not as "MLP over engineered features."

What "Direct Learning" means in DirL

The core shift behind Direct Learning (DirL) is simple but fundamental. Instead of the conventional pipeline:

generative model → embeddings → downstream ranker,

DirL adopts an end-to-end formulation:

tokenized sequence → generative sequential model → ranking score(s).

In DirL, user context, long-term behavioral history, and candidate item information are all represented within a single, unified token sequence. Ranking is then performed directly by a generative, token-native sequential model. This design enables several key capabilities:

Long-term behavior modeling beyond short summary windows. The model operates over extended user histories, allowing it to capture long-range dependencies and evolving interests that are difficult to represent with fixed-size aggregates.

Fine-grained user–content interaction learning. By modeling interactions at the token level, DirL learns detailed behavioral and content patterns rather than relying on coarse, pre-engineered features.

Preserved cross-token semantics within the ranking model. Semantic structure is maintained throughout the ranking process, instead of being collapsed into handcrafted dense or sparse vectors before scoring.

Architecture overview (from signals to ranking)

1) Unified Tokenization

All inputs in DirL are converted into a shared token embedding space, allowing heterogeneous signals to be modeled within a single sequential backbone.
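As a rough illustration, the shared-token-space idea can be sketched as follows. The dimensions, feature splits, and single-layer ReLU projection here are illustrative assumptions, not the production configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # shared token embedding dimension (illustrative)

def mlp_project(features: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Project a raw feature vector into the shared D-dim token space."""
    return np.maximum(w @ features + b, 0.0)  # single ReLU layer for brevity

# Hypothetical raw inputs: one user/context feature vector, N history
# interactions, and one candidate's concatenated document + cross features.
user_feats = rng.normal(size=16)
history_feats = [rng.normal(size=32) for _ in range(5)]  # N = 5
candidate_feats = rng.normal(size=48)

w_u, b_u = rng.normal(size=(D, 16)), np.zeros(D)
w_h, b_h = rng.normal(size=(D, 32)), np.zeros(D)
w_c, b_c = rng.normal(size=(D, 48)), np.zeros(D)

# [1 user/context token] + [N history tokens] + [1 candidate token]
sequence = np.stack(
    [mlp_project(user_feats, w_u, b_u)]
    + [mlp_project(h, w_h, b_h) for h in history_feats]
    + [mlp_project(candidate_feats, w_c, b_c)]
)
print(sequence.shape)  # (7, 64): 1 + N + 1 tokens, all in the shared space
```

The key property is that heterogeneous signals end up as same-shaped vectors, so a single sequential backbone can attend across all of them.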
Conceptually, each input sequence consists of three token types:

User / context tokens. These encode user- or request-level information, such as age or cohort-like attributes, request or canvas context, temporal signals (e.g., day or time), and user-level statistics like historical CTR.

History tokens. These represent prior user interactions over time, including signals such as engaged document IDs, semantic or embedding IDs, and topic-like attributes. Each interaction is mapped to a token, preserving temporal order and enabling long-range behavior modeling.

Candidate tokens. Each candidate item to be scored is represented as a token constructed from document features and user–item interaction features. These features are concatenated and projected into a fixed-dimensional vector via an MLP, yielding a token compatible with the shared embedding space.

Categorical features are embedded directly, while dense numerical signals are passed through MLP layers before being fused into their corresponding tokens. As a result, the model backbone consumes a sequence of the form:

[1 user/context token] + [N history tokens] + [1 candidate token]

2) Long-sequence modeling backbone (HSTU)

To model long input sequences, DirL adopts a sequential backbone designed to scale beyond naïve full attention. In the current setup, the backbone consists of stacked HSTU layers with multi-head attention and dropout for regularization. The hidden state of the candidate token from the final HSTU layer is then fed into an MMoE module for scoring.

3) Multi-task prediction head (MMoE)

Ranking typically optimizes multiple objectives (e.g., engagement-related proxies). DirL employs a multi-gate mixture-of-experts (MMoE) layer to support multi-task prediction while sharing representation learning. The MMoE layer consists of N shared experts and one task-specific expert per task. For each task, a gating network produces a weighted combination of the shared experts and the task-specific expert.
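The per-task gating can be sketched as follows. The expert count, hidden size, and linear experts are illustrative assumptions; in practice each expert would be a small MLP:

```python
import numpy as np

rng = np.random.default_rng(1)
D, N_SHARED = 64, 4  # hidden size and number of shared experts (illustrative)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def mmoe_task_output(h, shared_experts, task_expert, gate_w):
    """Combine N shared experts + 1 task-specific expert with a learned gate.

    h: candidate-token hidden state from the final backbone layer, shape (D,)
    shared_experts / task_expert: linear expert weights, each of shape (D, D)
    gate_w: per-task gating network weights, shape (N_SHARED + 1, D)
    """
    expert_outs = [w @ h for w in shared_experts] + [task_expert @ h]
    gate = softmax(gate_w @ h)  # one non-negative weight per expert, summing to 1
    return sum(g * out for g, out in zip(gate, expert_outs)), gate

h = rng.normal(size=D)
shared = [rng.normal(size=(D, D)) * 0.1 for _ in range(N_SHARED)]
task = rng.normal(size=(D, D)) * 0.1
gate_w = rng.normal(size=(N_SHARED + 1, D)) * 0.1

agg, gate = mmoe_task_output(h, shared, task, gate_w)
print(agg.shape, gate.shape)  # (64,) (5,)
```

Each task owns its own gate (and its own task-specific expert), so tasks can weight the shared experts differently while still sharing most parameters.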
The aggregated representation is then fed into a task-specific MLP head to produce the final prediction.

Figure 2: DirL structure

Early experiments: what worked and what didn't

What looked promising

Early results indicate that a token-native setup improves both in-house evaluation metrics and online engagement (time spent per UU), suggesting that modeling long behavior sequences in a unified token space is directionally beneficial.

The hard part: efficiency and scale

The same design choices that improve expressiveness also raise practical hurdles:

Training velocity slows down: long-sequence modeling and larger components can turn iteration cycles from hours into days, making ablations expensive.

Serving and training costs increase: large sparse embedding tables plus deep sequence stacks can dominate memory and compute.

Capacity constraints limit rollout speed: hardware availability and cost ceilings become a gating factor for expanding traffic and experimentation.

In short: DirL's main challenge isn't "can it learn the right dependencies?"—it's "can we make it cheap and fast enough to be a production workhorse?"

Path to production viability: exploratory directions

Our current work focuses on understanding how to keep the semantic benefits of token-native modeling while exploring options that could help reduce overall cost.

1) Embedding tables

Consolidate and prune oversized sparse tables; rely more on shared token representations where possible.

2) Right-size the sequence model

Reduce backbone depth where marginal gains flatten; evaluate minimal effective token sets to identify which tokens actually move metrics; explore sequence length vs.
performance curves to find the "knee."

3) Inference and systems optimization

Dynamic batching tuned for token-native inference; kernel fusion and graph optimizations; quantization strategies that preserve ranking model behavior.

Why this direction matters

DirL explores a broader shift in recommender systems—from feature-heavy pipelines with shallow rankers toward foundation-style sequential models that learn directly from user trajectories. If token-native ranking can be made efficient, it unlocks several advantages:

Simpler modeling interfaces, with fewer feature-plumbing layers.

Stronger semantic utilization, reducing information loss from aggressive flattening.

A more natural path to long-term behavior and intent modeling.

Early signals are encouraging. The next phase is about translating this promise into practice—making the approach scalable, cost-efficient, and fast enough to iterate as a production system.

Using Microsoft Services to Enable Token-Native Ranking Research

This work was developed and validated within Microsoft's internal machine learning and experimentation ecosystem. Training data was derived from seven days of MSN production logs and user behavior labels, encompassing thousands of features, including numerical, ID-based, cross, and sequential features. Model training was performed using a PyTorch-based deep learning framework built by the MSN infrastructure team and executed on Azure Machine Learning with a single A100 GPU. For online serving, the trained model was deployed on DLIS, Microsoft's internal inference platform. Evaluation was conducted through controlled online experiments on the Azure Exp platform, enabling validation of user engagement signals under real production traffic. Although the implementation leverages Microsoft's internal platforms, the core ideas behind DirL are broadly applicable.
Practitioners interested in exploring similar approaches may consider the following high-level steps:

Construct a unified token space that captures user context, long-term behavior sequences, and candidate items.

Apply a long-sequence modeling backbone to learn directly from extended user trajectories.

Formulate ranking as a native sequence modeling problem, scoring candidates from token-level representations.

Evaluate both model effectiveness and system efficiency, balancing gains in expressiveness against training and serving cost.

Call to action

We encourage practitioners and researchers working on large-scale recommender systems to experiment with token-native ranking architectures alongside traditional feature-heavy pipelines, compare trade-offs in modeling power and system efficiency, and share insights on when direct sequence learning provides practical advantages in production environments.

Acknowledgement: We would like to acknowledge the support and contributions from several colleagues who helped make this work possible. We thank Gaoyuan Jiang and Lightning Huang for their assistance with model deployment, Jianfei Wang for support with the training platform, Gong Cheng for ranker monitoring, Peiyuan Xu for sequential feature logging, and Chunhui Han and Peng Hu for valuable discussions on model design.

Essential Microsoft Resources for MVPs & the Tech Community from the AI Tour
Unlock the power of Microsoft AI with redeliverable technical presentations, hands-on workshops, and open-source curriculum from the Microsoft AI Tour! Whether you're a Microsoft MVP, Developer, or IT Professional, these expertly crafted resources empower you to teach, train, and lead AI adoption in your community.

Explore top breakout sessions covering GitHub Copilot, Azure AI, Generative AI, and security best practices—designed to simplify AI integration and accelerate digital transformation. Dive into interactive workshops that provide real-world applications of AI technologies. Take it a step further with Microsoft's Open-Source AI Curriculum, offering beginner-friendly courses on AI, Machine Learning, Data Science, Cybersecurity, and GitHub Copilot—perfect for upskilling teams and fostering innovation.

Don't just learn—lead. Access these resources, host impactful training sessions, and drive AI adoption in your organization. Start sharing today! Explore now: Microsoft AI Tour Resources.

Model Mondays Season 2: Learn to Choose & Use the Right AI Models with Azure AI
Skill Up on the Latest AI Models & Tools with Model Mondays – Season 2

The world of AI is evolving at lightning speed. With over 11,000 models now available in the Azure AI Foundry catalog—including frontier models from top providers and thousands of open-source variants—developers face a new challenge: how do you choose the right model for your task? That's where Model Mondays comes in.

What Is Model Mondays?

Model Mondays is a weekly livestream and AMA series hosted on https://developer.microsoft.com/en-us/reactor/ and the Azure AI Foundry Discord. It's designed to help developers like you build your Model IQ one spotlight at a time. Each 30-minute episode includes:

5-min Highlights: Catch up on the latest model-related news.
15-min Spotlight: Deep dive into a specific model, model family, or tool.
Live Q&A: Ask questions during the stream or join the Friday AMA on Discord.

Whether you're just starting out or already building AI-powered apps, this series will help you stay current and confident in your model choices.

Season 2 Starts June 16 – Register Now!

We're kicking off Season 2 with three powerful episodes:

🔹 EP1: Advanced Reasoning Models 🗓️ https://developer.microsoft.com/en-us/reactor/events/25905/
🔹 EP2: Model Context Protocol (MCP) 🗓️ https://developer.microsoft.com/en-us/reactor/events/25906/
🔹 EP3: SLMs and Reasoning (Phi-4 Ecosystem) 🗓️ https://developer.microsoft.com/en-us/reactor/events/25907/

Why Should You Join?

Stay Ahead: Learn about the latest models, tools, and trends in AI.
Get Hands-On: Explore real-world use cases and demos.
Build Smarter: Discover how to evaluate, fine-tune, and deploy models effectively.
Connect: Join the community on Discord and get your questions answered.

Quick Links

📚 https://aka.ms/model-mondays
🎥 https://aka.ms/model-mondays/playlist
💬 https://aka.ms/model-mondays/discord

Bonus: Learn from Microsoft Build 2025

If you missed Microsoft Build, now's the time to catch up.
Azure AI Foundry is expanding fast—with new tools like Model Router, AI Evaluations SDK, and Foundry Portal making it easier than ever to build, test, and deploy AI apps. Check out http://aka.ms/learnatbuild for the top 10 things you need to know.

Ready to Build?

Whether you're exploring edge models, open-source AI, or fine-tuning GPTs, Model Mondays will help you level up your skills and build confidently on Azure. Let's build our model IQ together. See you on June 16!

Exploring Azure AI Model Inference: A Comprehensive Guide
Azure AI model inference provides access to a wide range of flagship models from leading providers such as AI21 Labs, Azure OpenAI, Cohere, Core42, DeepSeek, Meta, Microsoft, Mistral AI, and NTT Data (see https://learn.microsoft.com/azure/ai-foundry/model-inference/concepts/models). These models can be consumed as APIs, allowing you to integrate advanced AI capabilities into your applications seamlessly.

Model Families and Their Capabilities

Azure AI Foundry categorises its models into several families, each offering unique capabilities:

AI21 Labs: Known for the Jamba family models, which are production-grade large language models (LLMs) using AI21's hybrid Mamba-Transformer architecture. These models support chat completions, tool calling, and multiple languages including English, French, Spanish, Portuguese, German, Arabic, and Hebrew.

Azure OpenAI: Offers diverse models designed for tasks such as reasoning, problem-solving, natural language understanding, and code generation.
These models support text and image inputs, multiple languages, and tool calling.

Cohere: Provides models for embedding and command tasks, supporting multilingual capabilities and various response formats.

Core42: Features the Jais-30B-chat model, designed for chat completions.

DeepSeek: Includes models like DeepSeek-V3 and DeepSeek-R1, focusing on advanced AI tasks.

Meta: Offers the Llama series models, which are instruction-tuned for various AI tasks.

Microsoft: Provides the Phi series models, supporting multimodal instructions and vision tasks.

Mistral AI: Features models like Ministral-3B and Mistral-large, designed for high-performance AI tasks.

NTT Data: Offers the Tsuzumi-7b model, focusing on specific AI capabilities.

Deployment and Integration

Azure AI model inference supports global standard deployment, ensuring consistent throughput and performance. Models can be deployed in various configurations, including regional deployments and sovereign clouds such as Azure Government, Azure Germany, and Azure China.

To integrate these models into your applications, you can use the Azure AI model inference API, which supports multiple programming languages including Python, C#, JavaScript, and Java.
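In Python, for example, a chat-completions call can be sketched roughly as below. The payload shape follows the common chat-completions convention; the environment-variable names are placeholders of our choosing, and the guarded SDK call assumes the `azure-ai-inference` package, so treat this as an illustrative sketch rather than a definitive integration:

```python
import json
import os

def build_chat_request(user_prompt: str,
                       system_prompt: str = "You are a helpful assistant.") -> dict:
    """Build a chat-completions payload in the shape the inference API expects."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": 256,
        "temperature": 0.7,
    }

payload = build_chat_request("Summarize what model inference means in one sentence.")
print(json.dumps(payload, indent=2))

# With the azure-ai-inference package installed and an endpoint + key configured
# (env var names here are placeholders), the payload maps onto the SDK call:
if os.environ.get("AZURE_INFERENCE_ENDPOINT"):
    from azure.ai.inference import ChatCompletionsClient
    from azure.core.credentials import AzureKeyCredential

    client = ChatCompletionsClient(
        endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_KEY"]),
    )
    response = client.complete(**payload)
    print(response.choices[0].message.content)
```

Because every provider's model is exposed through the same API surface, swapping models is largely a matter of changing the deployment you point the client at.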
This flexibility allows you to deploy models multiple times under different configurations, providing a robust and scalable solution for your AI needs (see https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/overview).

Conclusion

Azure AI model inference in Azure AI Foundry offers a comprehensive solution for integrating advanced AI models into your applications. With a wide range of models from leading providers, flexible deployment options, and robust API support, Azure AI Foundry empowers you to leverage cutting-edge AI capabilities without the complexity of hosting and managing the infrastructure. Explore the Azure AI model catalog today and unlock the potential of AI for your business.

Join the Conversation on Azure AI Foundry Discussions!

Have ideas, questions, or insights about AI? Don't keep them to yourself! Share your thoughts, engage with experts, and connect with a community that's shaping the future of artificial intelligence. 👉 Click here to join the discussion!

Week 2: Microsoft Agents Hack Online Events and Readiness Resources
https://aka.ms/agentshack

2025 is the year of AI agents! But what exactly is an agent, and how can you build one? Whether you're a seasoned developer or just starting out, this FREE three-week virtual hackathon is your chance to dive deep into AI agent development. Register Now: https://aka.ms/agentshack

🔥 Learn from expert-led sessions streamed live on YouTube, covering top frameworks like Semantic Kernel, Autogen, the new Azure AI Agents SDK and the Microsoft 365 Agents SDK.

Week 2 Events: April 14th-18th

Day/Time | Topic | Track
4/14 08:00 AM PT | Building custom engine agents with Azure AI Foundry and Visual Studio Code | Copilots
4/15 07:00 AM PT | Your first AI Agent in JS with Azure AI Agent Service | JS
4/15 09:00 AM PT | Building Agentic Applications with AutoGen v0.4 | Python
4/15 12:00 PM PT | AI Agents + .NET Aspire | C#
4/15 03:00 PM PT | Prototyping AI Agents with GitHub Models | Python
4/16 04:00 AM PT | Multi-agent AI apps with Semantic Kernel and Azure Cosmos DB | C#
4/16 06:00 AM PT | Building declarative agents with Microsoft Copilot Studio & Teams Toolkit | Copilots
4/16 07:00 AM PT | Prompting is the New Scripting: Meet GenAIScript | JS
4/16 09:00 AM PT | Building agents with an army of models from the Azure AI model catalog | Python
4/16 12:00 PM PT | Multi-Agent API with LangGraph and Azure Cosmos DB | Python
4/16 03:00 PM PT | Mastering Agentic RAG | Python
4/17 06:00 AM PT | Build your own agent with OpenAI, .NET, and Copilot Studio | C#
4/17 09:00 AM PT | Building smarter Python AI agents with code interpreters | Python
4/17 12:00 PM PT | Building Java AI Agents using LangChain4j and Dynamic Sessions | Java
4/17 03:00 PM PT | Agentic Voice Mode Unplugged | Python

AI Toolkit for Visual Studio Code Now Supports NVIDIA NIM Microservices for RTX AI PCs
AI Toolkit now supports NVIDIA NIM™ microservice-based foundation models for inference testing in the model playground and advanced features like bulk run, evaluation, and prompt building. This collaboration helps AI engineers streamline development processes with foundational AI models.

About AI Toolkit

AI Toolkit is a VS Code extension for AI engineers to build, deploy, and manage AI solutions. It includes model- and prompt-centric features that allow users to explore and test different AI models, create and evaluate prompts, and perform model fine-tuning, all from within VS Code. Since its preview launch in 2024, AI Toolkit has helped developers worldwide learn about generative AI models and start building AI solutions.

NVIDIA NIM Microservices

This January, NVIDIA announced that state-of-the-art AI models spanning language, speech, animation, and vision capabilities—offered as NVIDIA NIM microservices—can now run locally on NVIDIA RTX AI PCs. These microservices prepackage optimized AI models with all the necessary runtime components for deployment across NVIDIA GPUs. Developers can now develop and deploy anywhere with the same unified experience and software stack, from RTX AI PCs and workstations to the cloud. Developers can jumpstart their AI development journey by downloading and running NIM containers quickly on Windows 11 PCs with GeForce RTX GPUs using Windows Subsystem for Linux (WSL2).

The Power of Collaboration

Integrating AI Toolkit with NIM provides AI engineers with a more cohesive and efficient workflow:

Seamlessly integrate AI Toolkit with NIM to create a unified development environment without the need to switch context. Users can access any NIM-supported model from AI Toolkit.

Leverage the combined capabilities of both tools to streamline workflows and accelerate the AI solution development process around foundation AI models, from within VS Code.
How to get started

Follow these steps to begin leveraging the power of NIM in AI Toolkit:

1. Download and install the latest version of AI Toolkit for VS Code.
2. Install NIM pre-requisites on RTX PCs using the instructions here.
3. Select a NIM model from the model catalog in AI Toolkit and load it in Playground. Optionally, you can also add a NIM model hosted in the cloud to AI Toolkit by URL.
4. Explore NIM models from the playground.
5. Start developing prompts with new NIM models in AI Toolkit!

Looking Forward

We invite you to explore the possibilities of this integration and take your development projects to new heights! Try AI Toolkit today – and please continue sharing your feedback. Stay tuned for more updates and detailed tutorials on how to maximize the benefits of this exciting new collaboration. Together, we are shaping the future of AI development!

Learn how to develop innovative AI solutions with updated Azure skilling paths
The rapid evolution of generative AI is reshaping how organizations operate, innovate, and deliver value. Professionals who develop expertise in generative AI development, prompt engineering, and AI lifecycle management are increasingly valuable to organizations looking to harness these powerful capabilities while ensuring responsible and effective implementation. In this blog, we're excited to share our newly refreshed series of Plans on Microsoft Learn that aim to supply your team with the tools and knowledge to leverage the latest AI technologies, including:

Find the best model for your generative AI solution with Azure AI Foundry
Create agentic AI solutions by using Azure AI Foundry
Build secure and responsible AI solutions and manage generative AI lifecycles

From sophisticated AI agents that can autonomously perform complex tasks to advanced chat models that enable natural human-AI collaboration, these technologies are becoming essential business tools rather than optional enhancements. Let's take a look at the latest developments and unlock their full potential with our curated training resources from Microsoft Learn.

Simplify the process of choosing an AI model with Azure AI Foundry

Choosing the optimal generative AI model is essential for any solution, requiring careful evaluation of task complexity, data requirements, and computational constraints. Azure AI Foundry streamlines this decision-making process by offering diverse pre-trained models, fine-tuning capabilities, and comprehensive MLOps tools that enable businesses to test, optimize, and scale their AI applications while maintaining enterprise-grade security and compliance.
Our Plan on Microsoft Learn titled Find the best model for your generative AI solution with Azure AI Foundry will guide you through the process of discovering and deploying the best models for creating generative AI solutions with Azure AI Foundry, including:

Learn about the differences and strengths of various language models.
Find out how to integrate and use AI models in your applications to enhance functionality and user experience.
Rapidly create intelligent, market-ready multimodal applications with Azure models, and explore industry-specific models.

In addition, you'll have the chance to take part in a Microsoft Azure Virtual Training Day, with interactive sessions and expert guidance to help you skill up on Azure AI features and capabilities. By engaging with this Plan on Microsoft Learn, you'll also have the chance to prove your skills and earn a Microsoft Certification.

Leap into the future of agentic AI solutions with Azure

After choosing the right model for your generative AI purposes, our next Plan on Microsoft Learn goes a step further by introducing agentic AI solutions. A significant evolution in generative AI, agentic AI solutions enable autonomous decision-making, problem-solving, and task execution without constant human intervention. These AI agents can perceive their environment, adapt to new inputs, and take proactive actions, making them valuable across various industries. In the Create agentic AI solutions by using Azure AI Foundry Plan on Microsoft Learn, you'll find out how developing agentic AI solutions requires a platform that provides scalability, adaptability, and security. With pre-built AI models, MLOps tools, and deep integrations with Azure services, Azure AI Foundry simplifies the development of custom AI agents that can interact with data, make real-time decisions, and continuously learn from new information.
You'll also:

Learn how to describe the core features and capabilities of Azure AI Foundry, provision and manage Azure AI resources, create and manage AI projects, and determine when to use Azure AI Foundry.
Discover how to customize with RAG in Azure AI Foundry, the Azure AI Foundry SDK, or Azure OpenAI Service to look for answers in documents.
Learn how to use Azure AI Agent Service, a comprehensive suite of feature-rich, managed capabilities, to bring together the models, data, tools, and services your enterprise needs to automate business processes.

There's also a Microsoft Virtual Training Day featuring interactive sessions and expert guidance, and you can validate your skills by earning a Microsoft Certification.

Safeguard your AI systems for security and fairness

Widespread AI adoption demands rigorous security, fairness, and transparency safeguards to prevent bias, privacy breaches, and vulnerabilities that lead to unethical outcomes or non-compliance. Organizations must implement responsible AI through robust data governance, explainability, bias mitigation, and user safety protocols, while protecting sensitive data and ensuring outputs align with ethical standards.

Our third Plan on Microsoft Learn, Build secure and responsible AI solutions and manage generative AI lifecycles, is designed to introduce the basics of AI security and responsible AI to help increase the security posture of AI environments. You'll not only learn how to evaluate and improve generative AI outputs for quality and safety, but you'll also:

Gain an understanding of the basic concepts of AI security and responsible AI to help increase the security posture of AI environments.
Learn how to assess and improve generative AI outputs for quality and safety.
Discover how to help reduce risks by using Azure AI Content Safety to detect, moderate, and manage harmful content.
Learn more by taking part in an interactive, expert-guided Microsoft Virtual Training Day to deepen your understanding of core AI concepts.

Got a skilling question? Our new Ask Learn AI assistant is here to help

Beyond our comprehensive Plans on Microsoft Learn, we're also excited to introduce Ask Learn, our newest skilling innovation! Ask Learn is an AI assistant that can answer questions, clarify concepts, and define terms throughout your training experience. Ask Learn is your Copilot for getting skilled in AI, helping to answer your questions within the Microsoft Learn interface, so you don't have to search elsewhere for the information. Simply click the Ask Learn icon at the top corner of the page to activate!

Begin your generative AI skilling journey with curated Azure skilling Plans

Azure AI Foundry provides the necessary platform to train, test, and deploy AI solutions at scale, and with the expert-curated skilling resources available in our newly refreshed Plans on Microsoft Learn, your teams can accelerate the creation of intelligent, self-improving AI agents tailored to your business needs. Get started today!

Find the best model for your generative AI solution with Azure AI Foundry
Create agentic AI solutions by using Azure AI Foundry
Build secure and responsible AI solutions and manage generative AI lifecycles

AI Toolkit for VS Code January Update
AI Toolkit is a VS Code extension aiming to empower AI engineers in transforming their curiosity into advanced generative AI applications. This toolkit, featuring both local-enabled and cloud-accelerated inner-loop capabilities, is set to ease model exploration, prompt engineering, and the creation and evaluation of generative applications. We are pleased to announce the January update to the toolkit, with support for OpenAI's o1 model and enhancements to the Model Playground and Bulk Run features.

What's New?

January's update brings several exciting new features to boost your productivity in AI development. Here's a closer look at what's included:

Support for OpenAI's new o1 model: We've added access to the GitHub-hosted OpenAI o1 model. This new model replaces o1-preview and offers even better performance in handling complex tasks. You can start interacting with the o1 model within VS Code for free by using the latest AI Toolkit update.

Chat history support in Model Playground: We have heard your feedback that tracking past model interactions is crucial. The Model Playground has been updated to include support for chat history. This feature saves chat history as individual files stored entirely on your local machine, ensuring privacy and security.

Bulk Run with prompt templating: The Bulk Run feature, introduced in the AI Toolkit December release, now supports prompt templating with variables. This allows users to create templates for prompts, insert variables, and run them in bulk, simplifying the process of testing multiple scenarios and models.

Stay tuned for more updates and enhancements as we continue to innovate and support your journey in AI development. Try out the AI Toolkit for Visual Studio Code, share your thoughts, and file issues and suggest features in our GitHub repo. Thank you for being a part of this journey with us!

AI Toolkit for Visual Studio Code: October 2024 Update Highlights
The AI Toolkit's October 2024 update revolutionizes Visual Studio Code with game-changing features for developers, researchers, and enthusiasts. Explore multi-model integration, including GitHub Models, ONNX, and Google Gemini, alongside custom model support. Dive into multi-modal capabilities for richer AI testing and seamless multi-platform compatibility across Windows, macOS, and Linux. Tailored for productivity, the enhanced Model Catalog simplifies choosing the best tools for your projects. Try it now and share feedback to shape the future of AI in VS Code!