model catalog
63 TopicsOptiMind: A small language model with optimization expertise
Turning a real world decision problem into a solver ready optimization model can take days—sometimes weeks—even for experienced teams. The hardest part is often not solving the problem; it’s translating business intent into precise mathematical objectives, constraints, and variables. OptiMind is designed to try and remove that bottleneck. This optimization‑aware language model translates natural‑language problem descriptions into solver‑ready mathematical formulations, can help organizations move from ideas to decisions faster. Now available through public preview as an experimental model through Microsoft Foundry, OptiMind targets one of the more expertise‑intensive steps in modern optimization workflows. Addressing the Optimization Bottleneck Mathematical optimization underpins many enterprise‑critical decisions—from designing supply chains and scheduling workforces to structuring financial portfolios and deploying networks. While today’s solvers can handle enormous and complex problem instances, formulating those problems remains a major obstacle. Defining objectives, constraints, and decision variables is an expertise‑driven process that often takes days or weeks, even when the underlying business problem is well understood. OptiMind tries to address this gap by automating and accelerating formulation. Developed by Microsoft Research, OptiMind transforms what was once a slow, error‑prone modeling task into a streamlined, repeatable step—freeing teams to focus on decision quality rather than syntax. What makes OptiMind different? OptiMind is not just as a language model, but as a specialized system built for real-world optimization tasks. Unlike general-purpose large language models adapted for optimization through prompting, OptiMind is purpose-built for mixed integer linear programming (MILP), and its design reflects this singular focus. At inference time, OptiMind follows a multi‑stage process: Problem classification (e.g., scheduling, routing, network design) Hint retrieval tailored to the identified problem class Solution generation in solver‑compatible formats such as GurobiPy Optional self‑correction, where multiple candidate formulations are generated and validated This design can improve reliability without relying on agentic orchestration or multiple large models. In internal evaluations on cleaned public benchmarks—including IndustryOR, Mamo‑Complex, and OptMATH—OptiMind demonstrated higher formulation accuracy than similarly sized open models and competitive performance relative to significantly larger systems. OptiMind improved accuracy by approximately 10 percent over the base model. In comparison to open-source models under 32 billion parameters, OptiMind was also found to match or exceed performance benchmarks. For more information on the model, please read the official research blog or the technical paper for OptiMind. Practical use cases: Unlocking efficiency across domains OptiMind is especially valuable where modeling effort—not solver capability—is the primary bottleneck. Typical use cases include: Supply Chain Network Design: Faster formulation of multi‑period network models and logistics flows Manufacturing and Workforce Scheduling: Easier capacity planning under complex operational constraints Logistics and Routing Optimization: Rapid modeling that captures real‑world constraints and variability Financial Portfolio Optimization: More efficient exploration of portfolios under regulatory and market constraints By reducing the time and expertise required to move from problem description to validated model, OptiMind helps teams reach actionable decisions faster and with greater confidence. Getting started OptiMind is available today as an experimental model, and Microsoft Research welcomes feedback from practitioners and enterprise teams. Next steps: Explore the research details: Read more about the model on Foundry Labs and the technical paper on arXiv Try the model: Access OptiMind through Microsoft Foundry Test sample code: Available in the OptiMind GitHub repository Take the next step in optimization innovation with OptiMind—empowering faster, more accurate, and cost-effective problem solving for the future of decision intelligence.106Views0likes0CommentsBlack Forest Labs FLUX.2 Visual Intelligence for Enterprise Creative now on Microsoft Foundry
Black Forest Labs’ (BFL) FLUX.2 is now available on Microsoft Foundry. Building on FLUX1.1 [pro] and FLUX.1 Kontext [pro], we’re excited to introduce FLUX.2 [pro] which continues to push the frontier for visual intelligence. FLUX.2 [pro] delivers state-of-the-art quality with pre-optimized settings, matching the best closed models for prompt adherence and visual fidelity while generating faster at lower cost. Prompt: "Cinematic film still of a woman walking alone through a narrow Madrid street at night, warm street lamps, cool blue shadows, light rain reflecting on cobblestones, moody and atmospheric, shallow depth of field, natural skin texture, subtle film grain and introspective mood" This prompt shines because it taps into FLUX.2 [pro]'s cinematic‑lighting engine, letting the model fuse warm street‑lamp glow and cool shadows into a visually striking, film‑grade composition. What’s game-changing about FLUX.2 [pro]? FLUX.2 is designed for real-world creative workflows where consistency, accuracy, and iteration speed determine whether AI generation can replace traditional production pipelines. The model understands lighting, perspective, materials, and spatial relationships. It maintains characters and products consistent across up to 10 reference images simultaneously. It adheres to brand constraints like exact hex colors and legible text. The result: production-ready assets with fewer touchups and stronger brand fidelity. What’s New: Production‑grade quality up to 4MP: High‑fidelity, coherent scenes with realistic lighting, spatial logic, and fine detail suitable for product photography and commercial use cases. Multi‑reference consistency: Reference up to 10 images simultaneously with the best character, product, and style consistency available today. Generate dozens of brand-compliant assets where identity stays perfectly aligned shot to shot. Brand‑accurate results: Exact hex‑color matching, reliable typography, and structured controls (JSON, pose guidance) mean fewer manual fixes and stronger brand compliance. Strong prompt fidelity for complex directions: Improved adherence to complex, structured instructions including multi-part prompts, compositional constraints, and JSON-based controls. 32K token context supports long, detailed workflows with exact positioning specifications, physics-aware lighting, and precise compositional requirements in a single prompt. Optimized inference: FLUX.2 [pro] delivers state-of-the-art quality with pre-optimized inference settings, generating faster at lower cost than competing closed models. FLUX.2 transforms creative production economics by enabling workflows that weren't possible with earlier systems. Teams ship complete campaigns in days instead of weeks, with fewer manual touchups and stronger brand fidelity at scale. This performance stems from FLUX.2's unified architecture, which combines generation and editing in a single latent flow matching model. How it Works FLUX.2 combines image generation and editing in a single latent flow matching architecture, coupling a Mistral‑3 24B vision‑language model (VLM) with a rectified flow transformer. The VLM brings real‑world knowledge and contextual understanding, while the flow transformer models spatial relationships, material properties, and compositional logic that earlier architectures struggled to render. FLUX.2’s architecture unifies visual generation and editing, fuses language‑grounded understanding with flow‑based spatial modeling, and delivers production‑ready, brand‑safe images with predictable control especially when you need consistent identity, exact colors, and legible typography at high resolution. Technical details can be found in the FLUX.2 VAE blog post. Top enterprise scenarios & patterns to try with FLUX.2 [pro] The addition of FLUX.2 [pro] is the next step in the evolution for delivering faster, richer, and more controllable generation unlocking a new wave of creative potential for enterprises. Bring FLUX.2 [pro] into your workflow and transform your creative pipeline from concept to production by trying out these patterns: Enterprise scenarios Patterns to try E‑commerce hero shots Start with a small set of references (product front, material/texture, logo). Prompt for a studio hero shot on a white seamless background, three‑quarter view, softbox key + subtle rim light. Include exact hex for brand accents and specify logo placement. Output at 4MP. Product variants at scale Reuse the hero references; ask for specific colorway, angle, and background variants (e.g., “Create {COLOR} variant, {ANGLE} view, {BG} background”). Keep brand hex and logo position constant across variants. Campaign consistency (character/product identity) Provide 5–10 reference images for the character/product (faces, outfits, mood boards). Request the same identity across scenes with consistent lighting/style (e.g., cinematic warm daylight) and defined environments (e.g., urban rooftop). Marketing templates & localization Define a template (e.g., 3‑column grid: left image, right text). Set headline/body sizes (e.g., 24pt/14pt), contrast ≥ 4.5:1, and brand font. Swap localized copy per locale while keeping layout and spacing consistent. Best practices to get to production readiness with Microsoft Foundry FLUX.2 [pro] brings state-of-the-art image quality to your fingertips. In Microsoft Foundry, you can turn those capabilities into predictable, governed outcomes by standardizing templates, managing references, enforcing brand rules, and controlling spend. These practices below leverage FLUX.2 [pro]’s visual intelligence and turn them into repeatable recipes, auditable artifacts, and cost‑controlled processes within a governed Foundry pipeline. Best Practice What to do Foundry tip Approved templates Create 3–5 templates (e.g., hero shot, variant gallery, packaging, social card) with sections for Composition (camera, lighting, environment), Brand (hex colors, logo placement), Typography (font, sizes, contrast), and Output (resolution, format). Store templates in Foundry as approved artifacts; version them and restrict edits via RBAC. Versioned reference sets Keep 3–10 references per subject (product: front/side/texture; talent: face/outfit/mood) and link them to templates. Save references in governed Foundry storage; reference IDs travel with the job metadata. Resolution staging Use a three‑stage plan: Concept (1–2MP) → Review (2–3MP) → Final (4MP). Leverage FLUX.1 [pro] and FLUX1.1 Kontext [pro] before the Final stage for fast iteration and cost control Enforce stage‑based quotas and cap max resolution per job; require approval to move to 4MP. Automated QA & approvals Run post‑generation checks for color match, text legibility, and safe‑area compliance; gate final renders behind a review step. Use Foundry workflows to require sign‑off at the Review stage before Final stage. Telemetry & feedback Track latency, success rate, usage, and cost per render; collect reviewer notes and refine templates. Dashboards in Foundry: monitor job health, cost, and template performance. Foundry Models continues to grow with cutting-edge additions to meet every enterprise need—including models from Black Forest Labs, OpenAI, and more. From models like GPT‑image‑1, FLUX.2 [pro], and Sora 2, Microsoft Foundry has become the place where creators push the boundaries of what’s possible. Watch how Foundry transforms creative workflows with this demo: Customer Stories As seen at Ignite 2025, real‑world customers like Sinyi Realty have already demonstrated the efficiency of Black Forest Lab’s models on Microsoft Foundry by choosing FLUX.1 Kontext [pro] for its superior performance and selective editing. For their new 'Clear All' feature, they preferred a model that preserves the original room structure and simply removes clutter, rather than generating a new space from scratch, saving time and money. Read the story to learn more. “We wanted to stay in the same workspace rather than having to maintain different platforms,” explains TeWei Hsieh, who works in data engineering and data architecture. “By keeping FLUX Kontext model in Foundry, our data scientists and data engineers can work in the same environment.” As customers like Sinyi Realty have already shown, BFL FLUX models raise the bar for speed, precision, and operational efficiency. With FLUX.2 now on Microsoft Foundry, organizations can bring that same competitive edge directly into their own production pipelines. FLUX.2 [pro] Pricing Foundry Models are fully hosted and managed on Azure. FLUX.2 [pro] is available through pay-as-you-go and on Global Standard deployment type with the following pricing: Generated image: The first generated megapixel (MP) is charged $0.03. Each subsequent megapixel is charged $0.015. Reference image(s): We charge $0.015 for each megapixel. Important Notes: For pricing, resolution is always rounded up to the next megapixel, separately for each reference image and for the generated image. 1 megapixel is counted as 1024x1024 pixels For multiple reference images, each reference image is counted as 1 megapixel Images exceeding 4 megapixels are resized to 4 megapixels Reference the Foundry Models pricing page for pricing. Build Trustworthy AI Solutions Black Forest Labs models in Foundry Models are delivered under the Microsoft Product Terms, giving you enterprise-grade security and compliance out of the box. Each FLUX endpoint offers Content Safety controls and guardrails. Runtime protections include built-in content-safety filters, role-based access control, virtual-network isolation, and automatic Azure Monitor logging. Governance signals stream directly into Azure Policy, Purview, and Microsoft Sentinel, giving security and compliance teams real-time visibility. Together, Microsoft's capabilities let you create with more confidence, knowing that privacy, security, and safety are woven into every Black Forest Labs deployment from day one. Getting Started with FLUX.2 in Microsoft Foundry If you don’t have an Azure subscription, you can sign up for an Azure account here. Search for the model name in the model catalog in Foundry under “Build.” FLUX.2-pro Open the model card in the model catalog. Click on deploy to obtain the inference API and key. View your deployment under Build > Models. You should land on the deployment page that shows you the API and key in less than a minute. You can try out your prompts in the playground. You can use the API and key with various clients. Learn More ▶️ RSVP for the next Model Monday LIVE on YouTube or On-Demand 👩💻 Explore FLUX.2 Documentation on Microsoft Learn 👋 Continue the conversation on Discord1.2KViews0likes2CommentsOpen AI’s GPT-5.1-codex-max in Microsoft Foundry: Igniting a New Era for Enterprise Developers
Announcing GPT-5.1-codex-max: The Future of Enterprise Coding Starts Now We’re thrilled to announce the general availability of OpenAI's GPT-5.1-codex-max in Microsoft Foundry Models; a leap forward that redefines what’s possible for enterprise-grade coding agents. This isn’t just another model release; it’s a celebration of innovation, partnership, and the relentless pursuit of developer empowerment. At Microsoft Ignite, we unveiled Microsoft Foundry: a unified platform where businesses can confidently choose the right model for every job, backed by enterprise-grade reliability. Foundry brings together the best from OpenAI, Anthropic, xAI, Black Forest Labs, Cohere, Meta, Mistral, and Microsoft’s own breakthroughs, all under one roof. Our partnership with Anthropic is a testament to our commitment to giving developers access to the most advanced, safe, and high-performing models in the industry. And now, with GPT-5.1-codex-max joining the Foundry family, the possibilities for intelligent applications and agentic workflows have never been greater. GPT 5.1-codex-max is available today in Microsoft Foundry and accessible in Visual Studio Code via the Foundry extension . Meet GPT-5.1-codex-max: Enterprise-Grade Coding Agent for Complex Projects GPT-5.1-codex-max is engineered for those who build the future. Imagine tackling complex, long-running projects without losing context or momentum. GPT-5.1-codex-max delivers efficiency at scale, cross-platform readiness, and proven performance with top scores on SWE-Bench (77.9), the gold standard for AI coding. With GPT-5.1-codex-max, developers can focus on creativity and problem-solving, while the model handles the heavy lifting. GPT-5.1-codex-max isn’t just powerful; it’s practical, designed to solve real challenges for enterprise developers: Multi-Agent Coding Workflows: Automate repetitive tasks across microservices, maintaining shared context for seamless collaboration. Enterprise App Modernization: Effortlessly refactor legacy .NET and Java applications into cloud-native architectures. Secure API Development: Generate and validate secure API endpoints, with `compliance checks built-in for peace of mind. Continuous Integration Support: Integrate GPT-5.1-codex-max into CI/CD pipelines for automated code reviews and test generation, accelerating delivery cycles. These use cases are just the beginning. GPT-5.1-codex-max is your partner in building robust, scalable, and secure solutions. Foundry: Platform Built for Developers Who Build the Future Foundry is more than a model catalog—it’s an enterprise AI platform designed for developers who need choice, reliability, and speed. • Choice Without Compromise: Access the widest range of models, including frontier models from leading model providers. • Enterprise-Grade Infrastructure: Built-in security, observability, and governance for responsible AI at scale. • Integrated Developer Experience: From GitHub to Visual Studio Code, Foundry connects with tools developers love for a frictionless build-to-deploy journey. Start Building Smarter with GPT-5.1-codex-max in Foundry The future is here, and it’s yours to shape. Supercharge your coding workflows with GPT-5.1-codex-max in Microsoft Foundry today. Learn more about Microsoft Foundry: aka.ms/IgniteFoundryModels. Watch Ignite sessions for deep dives and demos: ignite.microsoft.com. Build faster, smarter, and with confidence on the platform redefining enterprise AI.4.3KViews3likes5CommentsIntroducing OpenAI’s GPT-image-1.5 in Microsoft Foundry
Developers building with visual AI can often run into the same frustrations: images that drift from the prompt, inconsistent object placement, text that renders unpredictably, and editing workflows that break when iterating on a single asset. That’s why we are excited to announce OpenAI's GPT Image 1.5 is now generally available in Microsoft Foundry. This model can bring sharper image fidelity, stronger prompt alignment, and faster image generation that supports iterative workflows. Starting today, customers can request access to the model and start building in the Foundry platform. Meet GPT Image 1.5 AI driven image generation began with early models like OpenAI's DALL-E, which introduced the ability to transform text prompts into visuals. Since then, image generation models have been evolving to enhance multimodal AI across industries. GPT Image 1.5 represents continuous improvement in enterprise-grade image generation. Building on the success of GPT Image 1 and GPT Image 1 mini, these enhanced models introduce advanced capabilities that cater to both creative and operational needs. The new image models offer: Text-to-image: Stronger instruction following and highly precise editing. Image-to-image: Transform existing images to iteratively refine specific regions Improved visual fidelity: More detailed scenes and realistic rendering. Accelerated creation times: Up to 4x faster generation speed. Enterprise integration: Deploy and scale securely in Microsoft Foundry. GPT Image 1.5 delivers stronger image preservation and editing capabilities, maintaining critical details like facial likeness, lighting, composition, and color tone across iterative changes. You’ll see more consistent preservation of branded logos and key visuals, making it especially powerful for marketing, brand design, and ecommerce workflows—from graphics and logo creation to generating full product catalogs (variants, environments, and angles) from a single source image. Benchmarks Based on an internal Microsoft dataset, GPT Image 1.5 performs higher than other image generation models in prompt alignment and infographics tasks. It focuses on making clear, strong edits – performing best on single-turn modification, delivering the higher visual quality in both single and multi-turn settings. The following results were found across image generation and editing: Text to image Prompt alignment Diagram / Flowchart GPT Image 1.5 91.2% 96.9% GPT Image 1 87.3% 90.0% Qwen Image 83.9% 33.9% Nano Banana Pro 87.9% 95.3% Image editing Evaluation Aspect Modification Preservation Visual Quality Face Preservation Metrics BinaryEval SC (semantic) DINO (Visual) BinaryEval AuraFace Single-turn GPT image 1 99.2% 51.0% 0.14 79.5% 0.30 Qwen image 81.9% 63.9% 0.44 76.0% 0.85 GPT Image 1.5 100% 56.77% 0.14 89.96% 0.39 Multi-turn GPT Image 1 93.5% 54.7% 0.10 82.8% 0.24 Qwen image 77.3% 68.2% 0.43 77.6% 0.63 GPT image 1.5 92.49% 60.55% 0.15 89.46% 0.28 Using GPT Image 1.5 across industries Whether you’re creating immersive visuals for campaigns, accelerating UI and product design, or producing assets for interactive learning GPT Image 1.5 gives modern enterprises the flexibility and scalability they need. Image models can allow teams to drive deeper engagement through compelling visuals, speed up design cycles for apps, websites, and marketing initiatives, and support inclusivity by generating accessible, high‑quality content for diverse audiences. Watch how Foundry enables developers to iterate with multimodal AI across Black Forest Labs, OpenAI, and more: Microsoft Foundry empowers organizations to deploy these capabilities at scale, integrating image generation seamlessly into enterprise workflows. Explore the use of AI image generation here across industries like: Retail: Generate product imagery for catalogs, e-commerce listings, and personalized shopping experiences. Marketing: Create campaign visuals and social media graphics. Education: Develop interactive learning materials or visual aids. Entertainment: Edit storyboards, character designs, and dynamic scenes for films and games. UI/UX: Accelerate design workflows for apps and websites. Microsoft Foundry provides security and compliance with built-in content safety filters, role-based access, network isolation, and Azure Monitor logging. Integrated governance via Azure Policy, Purview, and Sentinel gives teams real-time visibility and control, so privacy and safety are embedded in every deployment. Learn more about responsible AI at Microsoft. Pricing Model Pricing (per 1M tokens) - Global GPT-image-1.5 Input Tokens: $8 Cached Input Tokens: $2 Output Tokens: $32 Cost efficiency improves as well: image inputs and outputs are now cheaper compared to GPT Image 1, enabling organizations to generate and iterate on more creative assets within the same budget. For detailed pricing, refer here. Getting started Learn more about image generation, explore code samples, and read about responsible AI protections here. Try GPT Image 1.5 in Microsoft Foundry and start building multimodal experiences today. Whether you’re designing educational materials, crafting visual narratives, or accelerating UI workflows, these models deliver the flexibility and performance your organization needs.5.5KViews2likes1CommentFine-tuning at Ignite 2025: new models, new tools, new experience
Fine‑tuning isn’t just “better prompts.” It’s how you tailor a foundation model to your domain and tasks to get higher accuracy, lower cost, and faster responses -- then run it at scale. As Agents become more critical to businesses, we’re seeing growing demand for fine tuning to ensure agents are low latency, low cost, and call the right tools and the right time. At Ignite 2025, we saw how Docusign fine-tuned models that powered their document management system to achieve major gains: more than 50% cost reduction per document, 2x faster inference time, and significant improvements in accuracy. At Ignite, we launched several new features in Microsoft Foundry that make fine‑tuning easier, more scalable, and more impactful than ever with the goal of making agents unstoppable in the real world: New Open-Source models – Qwen3 32B, Ministral 3B, GPT-OSS-20B and Llama 3.3 70B – to give users access to Open-Source models in the same low friction experience as OpenAI Synthetic data generation to jump start your training journey – just upload your documents and our multi-agent system takes care of the rest Developer Training tier to reduce the barrier to entry by offering discounted training (50% off global!) on spot capacity Agentic Reinforcement Fine-tuning with GPT-5: leverage tool calling during chain of thought to teach reasoning models to use your tools to solve complex problems And if that wasn’t enough, we also released a re-imagined fine tuning experience in Foundry (new), providing access to all these capabilities in a simplified and unified UI. New Open-Source Models for Fine-tuning (Public Preview): Bringing open-source innovation to your fingertips We’ve expanded our model lineup to new open-source models you can fine-tune without worrying about GPUs or compute. Ministral-3B and Qwen3 32B are now available to fine-tune with Supervised Fine-Tuning (SFT) in Microsoft Foundry, enabling developers to adapt open-source models to their enterprise-specific domains with ease. Look out for Llama 3.3 70B and GPT-OSS-20B, coming next week! These OSS models are offered through a unified interface with OpenAI via the UI or Foundry SDK which means the same experience, regardless of model choice. These models can be used alongside your favorite Foundry tools, from AI Search to Evaluations, or to power your agents. Note: New OSS models are only available in "New" Foundry – so upgrade today! Like our OpenAI models, Open-Source models in Foundry charge per-token for training, making it simple to forecast and estimate your costs. All models are available on Global Standard tier, making discoverability easy. For more details on pricing, please see our Microsoft Foundry Models pricing page. Customers like Co-Star Group have already seen success leveraging fine tuning with Mistral models to power their home search experience on Homes.com. They selected Ministral-3B as a small, efficient model to power high volume, low latency processing with lower costs and faster deployment times than Frontier models – while still meeting their needs for accuracy, scalability, and availability thanks to fine tuning in Foundry. Synthetic data generation (Public Preview): Create high-quality training data automatically Developers can now generate high-quality, domain-specific synthetic datasets to close those persistent data gaps with synthetic data generation. One of the biggest challenges we hear teams face during fine-tuning is not having enough data or the right kind of data because it’s scarce, sensitive, or locked behind compliance constraints (think healthcare and finance). Our new synthetic data generation capability solves this by giving you a safe, scalable way to create realistic, diverse datasets tailored to your use case so you can fine-tune and evaluate models without waiting for perfect real-world data. Now, you can produce realistic question–answer pairs from your documents, or simulate multi‑turn tool‑use dialogues that include function calls without touching sensitive production data. How it works: Fine‑tuning datasets: Upload a reference file (PDF/Markdown/TXT) and Foundry converts it into SFT‑formatted Q&A pairs that reflect your domain’s language and nuances so your model learns from the right examples. Agent tool‑use datasets: Provide an OpenAPI (Swagger) spec, and Foundry simulates multi‑turn assistant–user conversations with tool calls, producing SFT‑ready examples that teach models to call your APIs reliably. Evaluation datasets: Generate distinct test queries tailored to your scenarios so you can measure model and agent quality objectively—separate from your training data to avoid false confidence. Agents succeed when they reliably understand domain intent and call the right tools at the right time. Foundry’s synthetic data generation does exactly that: it creates task‑specific training and test data so your agent learns from the right examples and you can prove it works before you go live so they are reliable in the real world. Developer Training Tier (Public Preview): 50% discount on training jobs Fine-tuning can be expensive, especially when you may need to run multiple experiments to create the right model for your production agents. To make it easier than ever to get started, we’re introducing Developer Training tier – providing users with a 50% discount when they choose to run workloads on pre-emptible capacity. It also lets users iterate faster: we support up to 10 concurrent jobs on Developer tier, making it ideal for running experiments in parallel. Because it uses reclaimable capacity, jobs may be pre‑empted and automatically resumed, so they may take longer to complete. When to use Developer Training tier: When cost matters - great for early experimentation or hyperparameter tuning thanks to 50% lower training cost. When you need high concurrency - supports up to 10 simultaneous jobs, ideal for running multiple experiments in parallel. When the workload is non‑urgent - suitable for jobs that can tolerate pre-emption and longer, capacity-dependent runtimes. Agentic Reinforcement Fine‑Tuning (RFT) (Private Preview): Train reasoning models to use your tools through outcome based optimization Building reliable AI agents requires more than copying correct behavior; models need to learn which reasoning paths lead to successful outcomes. While supervised fine-tuning trains models to imitate demonstrations, reinforcement fine-tuning optimizes models based on whether their chain of thought actually generates a successful outcome. It teaches them to think in new ways, about new domains – to solve complex problems. Agentic RFT applies this to tool-using workflows: the model generates multiple reasoning traces (including tool calls and planning steps), receives feedback on which attempts solved the problem correctly, and updates its reasoning patterns accordingly. This helps models learn effective strategies for tool sequencing, error recovery, and multi-step planning—behaviors that are difficult to capture through demonstrations alone. The difference now is that you can provide your own custom tools for use during chain of thought: models can interact with your own internal systems, retrieve the data they need, and access your proprietary APIs to solve your unique problems. Agentic RFT is currently available in private preview for o4-mini and GPT-5, with configurable reasoning effort, sampling rates, and per-run telemetry. Request access at aka.ms/agentic-rft-preview. What are customers saying? Fine-tuning is critical to achieve the accuracy and latency needed for enterprise agentic workloads. Decagon is used by many of the world’s most respected enterprises to build, manage and scale AI agents that can resolve millions of customer inquiries across chat, email, and voice – 24 hours a day, seven days a week. This experience is powered by fine-tuning: “Providing accurate responses with minimal latency is fundamental to Decagon’s product experience. We saw an opportunity to reduce latency while improving task-specific accuracy by fine-tuning models using our proprietary datasets. Via fine-tuning, we were able to exceed the performance of larger state of the art models with smaller, lighter-weight models which could be served significantly faster.” -- Cyrus Asgari, Lead Research Engineer for fine-tuning at Decagon But it’s not just agent-first startups seeing results. Companies like Discover Bank are using fine tuned models to provide better customer experiences with personal banking agents: We consolidated three steps into one, response times that were previously five or six seconds came down to one and a half to two seconds on average. This approach made the system more efficient and the 50% reduction in latency made conversations with Discovery AI feel seamless. - Stuart Emslie, Head of Actuarial and Data Science at Discovery Bank Fine-tuning has evolved from an optimization technique to essential infrastructure for production AI. Whether building specialized agents or enhancing existing products, the pattern is clear: custom-trained models deliver the accuracy and speed that general-purpose models can't match. As techniques like Agentic RFT and synthetic data generation mature, the question isn't whether to fine-tune, but how to build the systems to do it systematically. Learn More 🧠 Get Started with fine-tuning with Azure AI Foundry on Microsoft Learn Docs ▶️ Watch On-Demand: https://ignite.microsoft.com/en-US/sessions/BRK188?source=sessions 👩 Try the demos: aka.ms/FT-ignite-demos 👋 Continue the conversation on Discord636Views0likes0CommentsIntroducing Cohere Rerank 4.0 in Microsoft Foundry
These new retrieval models deliver state-of-the-art accuracy, multilingual coverage across 100+ languages, and breakthrough performance for enterprise search and retrieval-augmented generation (RAG) systems. With Rerank 4.0, customers can dramatically improve the quality of search, reduce hallucinations in RAG applications, and strengthen the reasoning capabilities of their AI agents, all with just a few lines of code. Why Rerank Models Matter for Enterprise AI Retrieval is the foundation of grounded AI systems. Whether you are building an internal assistant, a customer-facing chatbot, or a domain-specific knowledge engine, the quality of the retrieved documents determines the quality of the final answer. Traditional embeddings get you close, but reranking is what gets you the right answer. Rerank improves this step by reading both the query and document together (cross-encoding), producing highly precise semantic relevance scores. This means: More accurate search results More grounded responses in RAG pipelines Lower generative model usage , reducing cost Higher trust and quality across enterprise workloads Introducing Cohere Rerank 4.0 Fast and Rerank 4.0 Pro Microsoft Foundry now offers two versions of Rerank 4.0 to meet different enterprise needs: Rerank 4.0 Fast Best balance of speed and accuracy Same latency as Cohere Rerank 3.5, with significantly higher accuracy Ideal for high-traffic applications and real-time systems Rerank 4.0 Pro Highest accuracy across all benchmarks Excels at complex, reasoning-heavy, domain-specific retrieval Tuned for industries like finance, healthcare, manufacturing, government, and energy Multilingual & Cross-Domain Performance Rerank 4.0 delivers unmatched multilingual and cross-domain performance, supporting more than 100 languages and enabling powerful cross-lingual search across complex enterprise datasets. The models achieve state-of-the-art accuracy in 10 of the world’s most important business languages, including Arabic, Chinese, French, German, Hindi, Japanese, Korean, Portuguese, Russian, and Spanish, making them exceptionally well suited for global organizations with multilingual knowledge bases, compliance archives, or international operations. Effortless Integration: Add Rerank to Any System One of the biggest benefits of Rerank 4.0 is how easy it is to adopt. You can add reranking to: Existing enterprise search Vector DB pipelines Keyword search systems Hybrid retrieval setups RAG architectures Agent workflows No infrastructure changes required. Just a few lines of code.This makes it one of the fastest ways to meaningfully upgrade grounding, precision, and search quality in enterprise AI systems. Better RAG, Better Agents, Better Outcomes In Foundry, customers can pair Cohere Rerank 4.0 with Azure Search, vector databases, Agent Service, Azure Functions, Foundry orchestration, and any LLM—including GPT-4.1, Claude, DeepSeek, and Mistral—to deliver more grounded copilots, higher-fidelity agent actions, and better reasoning from cleaner context windows. This reduces hallucinations, lowers LLM spend, and provides a foundational upgrade for mission-critical AI systems. Built for Enterprise: Security, Observability, Governance As a direct from Azure model, Rerank 4.0 is fully integrated with: Azure role-based access control (RBAC) Virtual network isolation Customer-managed keys Logging & observability Entra ID authentication Private deployments You can run Rerank 4.0 in environments that meet the strictest enterprise security and compliance needs. Optimized for Enterprise Models & High-Value Industries Rerank 4.0 is built for sectors where accuracy matters: Finance - Delivers precise retrieval for complex disclosures, compliance documents, and regulatory filings. Healthcare- Accurately retrieves clinical notes, biomedical literature, and care protocols for safer, more reliable insights. Manufacturing- Surfaces the right engineering specs, manuals, and parts data to streamline operations and reduce downtime. Government & Public Sector - Improves access to policy documents, case archives, and citizen service information with semantic precision. Energy- Understands industrial logs, safety manuals, and technical standards to support safer and more efficient operations. Pricing Model Name Deployment Type Azure Resource Region Price /1K Search Units Availability Cohere Rerank 4.0 Pro Global Standard All regions (Check this page for region details) $2.50 Public Preview, Dec 11, 2025 Cohere Rerank 4.0 Fast Global Standard All regions (Check this page for region details) $2.00 Public Preview, Dec 11, 2025 Get Started Today Cohere Rerank 4.0 Fast and Rerank 4.0 Pro are now available in Microsoft Foundry. Rerank 4.0 is one of the simplest and highest impact upgrades you can make to your enterprise AI stack, bringing better retrieval, better agents, and more trustworthy AI to every application.3.1KViews2likes0CommentsExpanded Models Available in Microsoft Foundry Agent Service
Announcement Summary Foundry Agent Service now supports an expanded ecosystem of frontier and specialist models. Access models from Anthropic, DeepSeek AI, Meta, Microsoft, xAI, and more. Avoid model lock-in and choose the best model for each scenario. Build complex, multimodal, multi-agent workflows at enterprise scale. From document intelligence to operational automation, Microsoft Foundry makes AI agents ready for mission-critical workloads.422Views0likes0CommentsFoundry Agent Service at Ignite 2025: Simple to Build. Powerful to Deploy. Trusted to Operate.
The upgraded Foundry Agent Service delivers a unified, simplified platform with managed hosting, built-in memory, tool catalogs, and seamless integration with Microsoft Agent Framework. Developers can now deploy agents faster and more securely, leveraging one-click publishing to Microsoft 365 and advanced governance features for streamlined enterprise AI operations.7.5KViews3likes1CommentRosettaFold3 Model at Ignite 2025: Extending Frontier of Biomolecular Modeling in Microsoft Foundry
Today at Microsoft Ignite 2025, we are excited to launch RosettaFold3 (RF3) on Microsoft Foundry - making a new generation of multi-molecular structure prediction models available to researchers, biotech innovators, and scientific teams worldwide. RF3 was developed by the Baker lab and DiMaio lab from the Institute for Protein Design (IPD) at the University of Washington, in collaboration with Microsoft’s AI for Good lab and other research partners. RF3 is now available in Foundry Models, offering scalable access to a new generation of biomolecular modeling capabilities. Try RF3 now in Foundry Models A new multi-molecular modeling system, now accessible in Foundry Models RF3 represents a leap forward in biomolecular structure prediction. Unlike previous generation models focused narrowly on proteins, RF3 can jointly model: Proteins (enzymes, antibodies, peptides) Nucleic acids (DNA, RNA) Small molecules/ligands Multi-chain complexes This unified modeling approach allows researchers to explore entire interaction systems—protein–ligand docking, protein–RNA assembly, protein–DNA binding, and more—in a single end-to-end workflow. Key advances in RF3 RF3 incorporates several advancements in protein and complex prediction, making it the state-of-the-art open-source model. Joint atom-level modeling across molecular types RF3 can simultaneously model all atom types across proteins, nucleic acids, and ligands—enabled by innovations in multimodal transformers and generative diffusion models. Unprecedented control: atom-level conditioning Users can provide the 3D structure of a ligand or compound, and RF3 will fold a protein around it. This atom-level conditioning unlocks: Targeted drug-design workflows Protein pocket and surface engineering Complex interaction modeling Example showing how RF3 allows conditioning on user inputs offering greater control of the model’s predictions. Broad templating support for structure-guided design RF3 allows users to guide structure prediction using: Distance constraints Geometric templates Experimental data (e.g., cryo-EM) This flexibility is limited in other models and makes RF3 ideal for hybrid computation–wet-lab workflows. Extensible foundation for scientific and industrial research RF3 can be adapted to diverse application areas—including enzyme engineering, materials science, agriculture, sustainability, and synthetic biology. Use cases RF3’s multimolecular modeling capabilities have broad applicability beyond fundamental biology. The model enables breakthroughs across medicine, materials science, sustainability, and defense—where structure-guided design directly translates into measurable innovation. Sector Illustrative Use Cases Medicine Gene therapy research: RF3 enables the design of custom proteins that bind specific DNA sequences for targeted genome repair. Materials Science Inspired by natural protein fibers such as wool and silk, IPD researchers are designing synthetic fibers with tunable mechanical properties and texture—enabling sustainable textiles and advanced materials. Sustainability RF3 supports enzyme design for plastic degradation and waste recycling, contributing to circular bioeconomy initiatives. Disease & Vaccine Development RF3-powered workflows will contribute to structure-guided vaccine design, building on IPD’s prior success with the SKYCovione COVID-19 nanoparticle vaccine developed with SK Bioscience and GSK. Crop Science and Food security Support for gene-editing technology (due to protein-DNA binding prediction capabilities) for agricultural research, design of small proteins called Anti-Microbial Peptides or Anti-Fungal peptides to fight crop diseases and tree diseases such as citrus greening. Defense & Biosecurity Enables detection and rapid countermeasure design against toxins or novel pathogens; models of this class are being studied for biosafety applications (Horvitz et al., Science, 2025). Aerospace & Extreme Environments Supports design of lightweight, self-healing, and radiation-resistant biomaterials capable of functioning under non-terrestrial conditions (e.g., high temperature, pressure, or radiation exposure). RF3 has the potential to lower the cost of exploratory modeling, raise success rates in structure-guided discovery, and expand biomolecular AI into domains that were previously limited by sparse experimental structures or difficult multimolecular interactions. Because the model and training framework are open and extensible, partners can also adapt RF3 for their own research, making it a foundation for the next generation of biomolecular AI on Microsoft Foundry. Get started today RosettaFold3 (RF3) brings advanced multimolecular modeling capabilities into Foundry Models, enabling researchers and biotech teams to run structure-guided workflows with greater flexibility and speed. Within Microsoft Foundry, you can integrate RF3 into your existing scientific processes—combining your data, templates, and downstream analysis tools in one connected environment. Start exploring the next frontier of biomolecular modeling with RosettaFold3 in the Foundry Models. You can also discover other early-stage AI innovations in Foundry Labs. If you’re attending Microsoft Ignite 2025, or watching on demand, be sure to check out our session: Session: AI Frontier in Foundry Labs: Experiment Today, Lead Tomorrow About the session: “Curious about the next wave of AI breakthroughs? Get a sneak peek into the future of AI with Azure AI Foundry Labs—your front door to experimental models, multi-agent orchestration prototypes, Agent Factory blueprints, and edge innovations. If you’re a researcher eager to test, validate, and influence what’s next in enterprise AI, this session is your launchpad. See how Labs lets you experiment fast, collaborate with innovators, and turn new ideas into real impact.”534Views0likes0CommentsGPT‑5.1 in Foundry: A Workhorse for Reasoning, Coding, and Chat
The pace of AI innovation is accelerating, and developers—across startups and global enterprises—are at the heart of this transformation. Today marks a significant moment for enterprise AI innovation: Azure AI Foundry is unveiling OpenAI’s GPT-5.1 series, the next generation of reasoning, analytics, and conversational intelligence. The following models will be rolling out in Foundry today: GPT-5.1: adaptive, more efficient reasoning GPT-5.1-chat: chat with new chain-of-thought for end-users GPT-5.1-codex: optimized for long-running conversations with enhanced tools and agentic workflows GPT-5.1-codex-mini: a compact variant for resource-constrained environments What’s new with GPT-5.1 series The GPT-5.1 series is built to respond faster to users in a variety of situations with adaptive reasoning, improving latency and cost efficiency across the series by varying thinking time more significantly. This, combined with other tooling improvements, enhanced stepwise reasoning visibility, multimodal intelligence, and enterprise-grade compliance. GPT-5.1: Adaptive and Efficient Reasoning GPT-5.1 is the mainline model engineered to deliver adaptive, stepwise reasoning that adjusts its approach based on the complexity of each task. Core capabilities included: Adaptive reasoning for nuanced, context-aware thinking time Multimodal intelligence: supporting text, image, and audio inputs/outputs Enterprise-grade performance, security, and compliance This model’s flexibility empowers developers to tackle a wide spectrum of tasks—from simple queries to deep, multi-step workflows for enterprise-grade solutions. With its ability to intelligently balance speed, cost, and intelligence, GPT-5.1 sets a new standard for both performance and efficiency in AI-powered development. GPT-5.1-chat: Elevating Interactive Experiences with Smart, Safe Conversations GPT-5.1-chat powers fast, context-aware chat experiences with adaptive reasoning and robust safety guardrails. With chain-of-thought added in the chat for the first time, it brings an interactive experience to the next level. It’s tuned for safety and instruction-following, making it ideal for customer support, IT helpdesk, HR, and sales enablement. Multimodal chat (text, image, and audio) improves long-turn consistency for real problem solving, delivering brand-aligned, safe conversations, and supporting next-best-action recommendations. GPT-5.1-codex and GPT-5.1-codex-mini: Frontier Models for Agentic Coding GPT-5.1-codex builds on the foundation set by GPT-5-codex, advancing developer tooling with: Enhanced reasoning frameworks for stepwise, context-aware code analysis and generation; plus Enhanced tool handling for certain development scenario's Multimodal intelligence for richer developer experiences when coding With Foundry’s enterprise-grade security and governance, GPT-5.1-codex is ideal for automated code generation and review, accelerating development cycles with intelligent code suggestions, refactoring, and bug detection. GPT-5.1-codex-mini is a compact, efficient variant optimized for resource-constrained environments. It maintains near state-of-the-art performance, multimodal intelligence, and the same safety stack and tool access as GPT-5.1-codex, making it best for cost-effective, scalable solutions in education, startups, and cost-conscience settings. Together, these Codex models empower teams to innovate faster and with greater confidence. Selecting Your AI Engine: Match Model Strengths to Your Business Goals One of the advantages of the GPT-5.1 series is unified access to deep reasoning, adaptive chat, and advanced coding—all in one place. Here’s how to match model strengths to your needs: Opt for GPT-5.1 for general ai application use—tasks like analytics, research, legal/financial review, or consolidating large documents and codebases. It’s the model of choice for reliability and high-impact outputs. Go with GPT-5.1-chat for interactive assistants and product UX, especially when adaptive reasoning is required for complex cases. Reasoning hints and adaptive reasoning help with customer latency perception. Leverage GPT-5.1-codex for deep, stepwise reasoning in complex code generation, refactoring, or multi-step analysis—ideal for demanding agentic workflows and enterprise automation. Utilize GPT-5.1-codex-mini for efficient, cost-effective coding intelligence in broad-scale deployment, education, or resource-constrained environments—delivering near-mainline performance in a compact model. Deployment and Pricing Model Deployment Available Regions Pricing ($/million tokens) Input Cached Input Output GPT-5.1 Standard Global Global $1.25 $0.125 $10.00 Standard Data Zone Data Zone (US & EU) $1.38 $0.14 $11.00 GPT-5.1-chat Standard Global Global $1.25 $0.125 $10.00 GPT-5.1-codex Standard Global Global $1.25 $0.125 $10.00 GPT-5.1-codex-mini Standard Global Global $0.25 $0.025 $2.00 Start Building Today The GPT-5.1 series is now available in Foundry Models. Whether you’re building for enterprise, small and medium-sized business, or launching the next digital-native app, these models and the Foundry platform are designed to help you innovate faster, safer, and at scale.22KViews1like22Comments