model catalog
63 TopicsRAFT: A new way to teach LLMs to be better at RAG
In this article, we will look at the limitations of RAG and domain-specific Fine-tuning to adapt LLMs to existing knowledge and how a team of UC Berkeley researchers, Tianjun Zhang and Shishir G. Patil, may have just discovered a better approach.109KViews7likes5CommentsMeta’s next generation model, Llama 3.1 405B is now available on Azure AI
Microsoft, in collaboration with Meta, is launching Llama 3.1 405B, now available via Azure AI’s Models as a Service. Also introducing fine-tuned versions of Llama 3.1 8B and 70B. Leverage powerful AI for synthetic data generation and distillation. Access these models and more through Azure AI Studio and popular developer tools like prompt flow, OpenAI, LangChain, LiteLLM, and more. Streamline development and enhance efficiency with Azure AI.48KViews3likes7CommentsNew Hugging Face Models on Azure AI: Multilingual, SLM and BioMed- July 2024 Update
In May, we announced a deepened partnership with Hugging Face and we continue to add more leading-edge Hugging Face models to the Azure AI model catalog on a monthly basis. We have 1550+ models in the Hugging Face collection and we added 20+ models in July including highly-ranked multilingual models, tuned for Asian languages like Mandarin, Japanese, Indonesian, Thai, Malay, Vietnamese.28KViews2likes0CommentsGPT‑5.1 in Foundry: A Workhorse for Reasoning, Coding, and Chat
The pace of AI innovation is accelerating, and developers—across startups and global enterprises—are at the heart of this transformation. Today marks a significant moment for enterprise AI innovation: Azure AI Foundry is unveiling OpenAI’s GPT-5.1 series, the next generation of reasoning, analytics, and conversational intelligence. The following models will be rolling out in Foundry today: GPT-5.1: adaptive, more efficient reasoning GPT-5.1-chat: chat with new chain-of-thought for end-users GPT-5.1-codex: optimized for long-running conversations with enhanced tools and agentic workflows GPT-5.1-codex-mini: a compact variant for resource-constrained environments What’s new with GPT-5.1 series The GPT-5.1 series is built to respond faster to users in a variety of situations with adaptive reasoning, improving latency and cost efficiency across the series by varying thinking time more significantly. This, combined with other tooling improvements, enhanced stepwise reasoning visibility, multimodal intelligence, and enterprise-grade compliance. GPT-5.1: Adaptive and Efficient Reasoning GPT-5.1 is the mainline model engineered to deliver adaptive, stepwise reasoning that adjusts its approach based on the complexity of each task. Core capabilities included: Adaptive reasoning for nuanced, context-aware thinking time Multimodal intelligence: supporting text, image, and audio inputs/outputs Enterprise-grade performance, security, and compliance This model’s flexibility empowers developers to tackle a wide spectrum of tasks—from simple queries to deep, multi-step workflows for enterprise-grade solutions. With its ability to intelligently balance speed, cost, and intelligence, GPT-5.1 sets a new standard for both performance and efficiency in AI-powered development. GPT-5.1-chat: Elevating Interactive Experiences with Smart, Safe Conversations GPT-5.1-chat powers fast, context-aware chat experiences with adaptive reasoning and robust safety guardrails. With chain-of-thought added in the chat for the first time, it brings an interactive experience to the next level. It’s tuned for safety and instruction-following, making it ideal for customer support, IT helpdesk, HR, and sales enablement. Multimodal chat (text, image, and audio) improves long-turn consistency for real problem solving, delivering brand-aligned, safe conversations, and supporting next-best-action recommendations. GPT-5.1-codex and GPT-5.1-codex-mini: Frontier Models for Agentic Coding GPT-5.1-codex builds on the foundation set by GPT-5-codex, advancing developer tooling with: Enhanced reasoning frameworks for stepwise, context-aware code analysis and generation; plus Enhanced tool handling for certain development scenario's Multimodal intelligence for richer developer experiences when coding With Foundry’s enterprise-grade security and governance, GPT-5.1-codex is ideal for automated code generation and review, accelerating development cycles with intelligent code suggestions, refactoring, and bug detection. GPT-5.1-codex-mini is a compact, efficient variant optimized for resource-constrained environments. It maintains near state-of-the-art performance, multimodal intelligence, and the same safety stack and tool access as GPT-5.1-codex, making it best for cost-effective, scalable solutions in education, startups, and cost-conscience settings. Together, these Codex models empower teams to innovate faster and with greater confidence. Selecting Your AI Engine: Match Model Strengths to Your Business Goals One of the advantages of the GPT-5.1 series is unified access to deep reasoning, adaptive chat, and advanced coding—all in one place. Here’s how to match model strengths to your needs: Opt for GPT-5.1 for general ai application use—tasks like analytics, research, legal/financial review, or consolidating large documents and codebases. It’s the model of choice for reliability and high-impact outputs. Go with GPT-5.1-chat for interactive assistants and product UX, especially when adaptive reasoning is required for complex cases. Reasoning hints and adaptive reasoning help with customer latency perception. Leverage GPT-5.1-codex for deep, stepwise reasoning in complex code generation, refactoring, or multi-step analysis—ideal for demanding agentic workflows and enterprise automation. Utilize GPT-5.1-codex-mini for efficient, cost-effective coding intelligence in broad-scale deployment, education, or resource-constrained environments—delivering near-mainline performance in a compact model. Deployment and Pricing Model Deployment Available Regions Pricing ($/million tokens) Input Cached Input Output GPT-5.1 Standard Global Global $1.25 $0.125 $10.00 Standard Data Zone Data Zone (US & EU) $1.38 $0.14 $11.00 GPT-5.1-chat Standard Global Global $1.25 $0.125 $10.00 GPT-5.1-codex Standard Global Global $1.25 $0.125 $10.00 GPT-5.1-codex-mini Standard Global Global $0.25 $0.025 $2.00 Start Building Today The GPT-5.1 series is now available in Foundry Models. Whether you’re building for enterprise, small and medium-sized business, or launching the next digital-native app, these models and the Foundry platform are designed to help you innovate faster, safer, and at scale.22KViews1like22Comments