AI Toolkit adds support for remote models, including Google, Anthropic, and GitHub-hosted models
As covered in an earlier blog post, the AI Toolkit extension for Visual Studio Code (VS Code) enables users to run AI models directly within the VS Code interface. This can be particularly valuable if you are new to working with AI models but already familiar with VS Code. Alternatively, if you're already using VS Code to develop applications for personal or organizational projects and wish to integrate AI capabilities, the AI Toolkit offers seamless integration. With the playground, you can experiment with various models, while the port forwarding feature exposes locally running models through a REST endpoint, allowing you to build advanced applications or infuse existing applications with AI functionality.
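For example, once a local model is running and exposed through port forwarding, an application can call it over an OpenAI-compatible REST endpoint. The snippet below is a minimal sketch: the port (5272) and the model identifier are assumptions, so check the AI Toolkit output window in VS Code for the actual values on your machine.

```python
# Minimal sketch: call a model exposed by the AI Toolkit's port forwarding feature.
# The port (5272) and model name are assumptions; check the AI Toolkit output
# window in VS Code for the actual values on your machine.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:5272/v1",  # local AI Toolkit endpoint (assumed port)
    api_key="not-needed-for-local",       # the local server does not validate keys
)

response = client.chat.completions.create(
    model="Phi-3.5-mini-instruct",  # hypothetical local model identifier
    messages=[{"role": "user", "content": "Summarize what the AI Toolkit does."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```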
A month ago, the AI Toolkit's 0.5 release added support for models from the GitHub model catalog. These include popular models such as Meta's Llama family, models from Mistral AI and OpenAI, and the Phi family (both the Phi-3 and the latest Phi-3.5 releases). The team has been adding features at a fast pace and has now released version 0.6 with support for an expanded model catalog. This release marks a significant milestone, with robust new features and resources offering support for many of today's most popular generative AI models. Users can now access GitHub-hosted models, ONNX-optimized models that run smoothly on local CPUs (including Windows-optimized small language models like the Phi family), Anthropic's Claude models, Google's Gemini models, Meta's Llama models, and more.
The enhanced model catalog offers intuitive filters and detailed model cards, making model discovery effortless. Models are categorized by host, with examples including:
- GitHub: Hosts models from AI21 Labs, Cohere, OpenAI, Mistral AI, and Microsoft
- Anthropic: Offers the Claude 3 Sonnet, Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku models
- Google: Provides the Gemini 1.0 Pro, Gemini 1.5 Pro, Gemini 1.5 Flash, and Gemini 1.5 Flash-8B models
- OpenAI: Hosts models such as GPT-3.5 Turbo, GPT-4, GPT-4o, GPT-4o mini, and o1 (mini and preview)
Additionally, ONNX models, including Microsoft’s Phi and Mistral models, are supported and can be downloaded for local use.
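If you prefer to run one of the downloaded ONNX models outside the toolkit, packages such as onnxruntime-genai can load them directly. The following is a rough sketch, assuming a Phi-3.5 ONNX model has already been downloaded to a local folder; the folder path is a placeholder and the package API may differ between versions.

```python
# Rough sketch: run a downloaded ONNX model locally with onnxruntime-genai.
# The model folder path is a placeholder; point it at the folder the AI Toolkit
# downloaded. The generation loop shown here follows the 0.4-era package API
# and may differ in newer releases.
import onnxruntime_genai as og

model = og.Model("./models/Phi-3.5-mini-instruct-onnx")  # hypothetical local path
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

prompt = "<|user|>\nWhat is ONNX Runtime?<|end|>\n<|assistant|>\n"
params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = tokenizer.encode(prompt)

generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```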
Important Notes:
- GitHub Model Access: To access models hosted by GitHub, you'll need to sign up for GitHub Models on the GitHub Marketplace. If you don't already have access, consider joining the waitlist. Once approved, you can authenticate with your GitHub credentials to use these models seamlessly (a code sketch for calling these models directly follows these notes).
- API Integration for Hosted Models: For models hosted by Google, Anthropic, and OpenAI, you can now connect your accounts directly through the AI Toolkit UI. Simply enter your API keys, as highlighted in the provided screenshot, to start using these models immediately.
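As a companion to the GitHub note above, here is a minimal sketch of calling a GitHub-hosted model from your own code once you have access. It assumes the OpenAI-compatible endpoint documented for GitHub Models at the time of writing (https://models.inference.ai.azure.com), a personal access token stored in the GITHUB_TOKEN environment variable, and an example model name from the catalog.

```python
# Minimal sketch: call a GitHub-hosted model with a GitHub personal access token.
# The endpoint URL reflects GitHub Models documentation at the time of writing
# and may change; the model name is just an example from the catalog.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"],  # PAT with access to GitHub Models
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello from GitHub Models."}],
)
print(response.choices[0].message.content)
```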
Once you have added the API key, you can start using the model in the playground and in your application. Note that you can also change the API key later by right-clicking on the model, which is useful if you use different API keys and endpoints for different apps.
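For instance, the same Google API key you enter in the AI Toolkit UI can be reused from application code. The sketch below assumes the google-generativeai Python package and an API key stored in the GOOGLE_API_KEY environment variable; the model name is one of the Gemini variants listed above.

```python
# Minimal sketch: reuse the Google API key from the AI Toolkit in your own app.
# Assumes the google-generativeai package; the model name is an example.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Give me one sentence about VS Code.")
print(response.text)
```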
Additionally, with this release the playground provides an improved, interactive testing experience, allowing users to chat with models and, for multimodal models, even include attachments in conversations, as shown with Claude 3.5 Sonnet below.
Note that the attach icon is disabled for models that don't support attachments.
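The programmatic equivalent of attaching an image in the playground looks roughly like the sketch below, which uses the Anthropic Python SDK to send a base64-encoded image to Claude 3.5 Sonnet. The image path and model identifier are illustrative, and an ANTHROPIC_API_KEY environment variable is assumed.

```python
# Rough sketch: send an image attachment to Claude 3.5 Sonnet via the Anthropic SDK,
# mirroring what the playground attachment feature does. The image path and model
# identifier are illustrative; ANTHROPIC_API_KEY must be set in the environment.
import base64
import anthropic

with open("chart.png", "rb") as f:  # hypothetical local image
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "Describe what this chart shows."},
        ],
    }],
)
print(message.content[0].text)
```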
Resources
- Release notes for AI Toolkit on GitHub - https://github.com/microsoft/vscode-ai-toolkit/blob/main/WHATS_NEW.md
- Deep dive on building a RAG application with AI Toolkit - Building Retrieval Augmented Generation on VSCode & AI Toolkit
- AI Toolkit GitHub page - https://github.com/microsoft/vscode-ai-toolkit
Updated Nov 13, 2024