Blog Post

Microsoft Foundry Blog
6 MIN READ

Fine-tuning at Ignite 2025: new models, new tools, new experience

AliciaFrame
Microsoft
Dec 11, 2025

At Ignite 2025, Microsoft introduced major upgrades that make fine‑tuning cheaper, easier, and more capable than ever. New open‑source models give teams more choice and flexibility, while synthetic data generation removes one of the biggest barriers to getting started. Developers can now train at 50% lower cost with the new Developer Training tier, and Agentic RFT unlocks smarter, tool‑using reasoning for complex workflows. Together, these innovations create a faster, more scalable path to building unstoppable enterprise‑grade AI agents.

Fine‑tuning isn’t just “better prompts.” It’s how you tailor a foundation model to your domain and tasks to get higher accuracy, lower cost, and faster responses -- then run it at scale. As agents become more critical to businesses, we’re seeing growing demand for fine-tuning to ensure agents are low latency, low cost, and call the right tools at the right time. At Ignite 2025, we saw how Docusign fine-tuned the models powering their document management system to achieve major gains: more than 50% cost reduction per document, 2x faster inference time, and significant improvements in accuracy.

At Ignite, we launched several new features in Microsoft Foundry that make fine‑tuning easier, more scalable, and more impactful than ever with the goal of making agents unstoppable in the real world:

  • New open-source models – Qwen3 32B, Ministral 3B, GPT-OSS-20B, and Llama 3.3 70B – giving users access to open-source models in the same low-friction experience as OpenAI models
  • Synthetic data generation to jump start your training journey – just upload your documents and our multi-agent system takes care of the rest
  • Developer Training tier to reduce the barrier to entry by offering discounted training (50% off global!) on spot capacity
  • Agentic Reinforcement Fine-tuning with GPT-5: leverage tool calling during chain of thought to teach reasoning models to use your tools to solve complex problems

And if that wasn’t enough, we also released a re-imagined fine-tuning experience in Foundry (new), providing access to all these capabilities in a simplified, unified UI.

New Open-Source Models for Fine-tuning (Public Preview): Bringing open-source innovation to your fingertips

We’ve expanded our model lineup with new open-source models you can fine-tune without worrying about GPUs or compute. Ministral-3B and Qwen3 32B are now available to fine-tune with Supervised Fine-Tuning (SFT) in Microsoft Foundry, enabling developers to adapt open-source models to their enterprise-specific domains with ease. Look out for Llama 3.3 70B and GPT-OSS-20B, coming next week!

These OSS models are offered through a unified interface with OpenAI, via the UI or the Foundry SDK, which means the same experience regardless of model choice. These models can be used alongside your favorite Foundry tools, from AI Search to Evaluations, or to power your agents.
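As a rough sketch of what that unified interface looks like in practice, here is a minimal example using the OpenAI Python client against a Foundry resource. The endpoint, API version, file name, and base model identifier are placeholders for illustration -- check the Foundry documentation for the values that apply to your project.

```python
# Minimal sketch: submitting a supervised fine-tuning (SFT) job for an
# open-source model through the OpenAI-compatible interface in Foundry.
# The endpoint, API version, and model identifier are assumptions --
# substitute the values from your own Foundry project.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["FOUNDRY_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com/
    api_key=os.environ["FOUNDRY_API_KEY"],
    api_version="2025-04-01-preview",               # placeholder API version
)

# Upload the SFT training file (JSONL of chat-formatted examples).
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Create the fine-tuning job exactly as you would for an OpenAI model,
# swapping in the open-source base model name.
job = client.fine_tuning.jobs.create(
    model="Ministral-3B",            # hypothetical identifier for illustration
    training_file=training_file.id,
)
print(job.id, job.status)
```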

Note: New OSS models are only available in "New" Foundry – so upgrade today!

Like our OpenAI models, open-source models in Foundry are billed per token for training, making it simple to forecast and estimate your costs. All models are available on the Global Standard tier, making them easy to discover and deploy. For more details on pricing, please see our Microsoft Foundry Models pricing page.
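Because training is billed per token, a back-of-the-envelope cost forecast is simple arithmetic. The sketch below uses made-up token counts and a hypothetical per-million-token rate purely to show the calculation; real rates are on the pricing page linked above.

```python
# Back-of-the-envelope training-cost estimate under per-token billing.
# The token counts and the $3.00 / 1M-token rate below are illustrative
# placeholders, not actual Foundry prices.
def estimate_training_cost(dataset_tokens: int, epochs: int,
                           price_per_million_tokens: float) -> float:
    """Billed tokens = dataset tokens x epochs; cost scales linearly."""
    billed_tokens = dataset_tokens * epochs
    return billed_tokens / 1_000_000 * price_per_million_tokens

# Example: a 2M-token dataset trained for 3 epochs at a hypothetical rate.
cost = estimate_training_cost(dataset_tokens=2_000_000, epochs=3,
                              price_per_million_tokens=3.00)
print(f"Estimated training cost: ${cost:.2f}")   # -> $18.00
```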

Customers like CoStar Group have already seen success using fine-tuning with Mistral models to power their home search experience on Homes.com. They selected Ministral-3B as a small, efficient model to power high-volume, low-latency processing with lower costs and faster deployment times than frontier models – while still meeting their needs for accuracy, scalability, and availability thanks to fine-tuning in Foundry.

Synthetic data generation (Public Preview): Create high-quality training data automatically  

One of the biggest challenges we hear teams face during fine-tuning is not having enough data, or the right kind of data, because it’s scarce, sensitive, or locked behind compliance constraints (think healthcare and finance). Developers can now generate high-quality, domain-specific synthetic datasets to close those persistent data gaps.

Our new synthetic data generation capability solves this by giving you a safe, scalable way to create realistic, diverse datasets tailored to your use case, so you can fine-tune and evaluate models without waiting for perfect real-world data. Now, you can produce realistic question–answer pairs from your documents, or simulate multi-turn tool-use dialogues that include function calls, without touching sensitive production data.

How it works: 

  • Fine-tuning datasets: Upload a reference file (PDF/Markdown/TXT) and Foundry converts it into SFT-formatted Q&A pairs that reflect your domain’s language and nuances, so your model learns from the right examples.
  • Agent tool-use datasets: Provide an OpenAPI (Swagger) spec, and Foundry simulates multi-turn assistant–user conversations with tool calls, producing SFT-ready examples that teach models to call your APIs reliably (a hand-written sketch of both record formats appears after this list).
  • Evaluation datasets: Generate distinct test queries tailored to your scenarios so you can measure model and agent quality objectively, kept separate from your training data to avoid false confidence.
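To make those outputs concrete, here is roughly what generated records can look like once they land in a chat-formatted SFT JSONL file. This is a hand-written illustration, not the exact schema Foundry emits; the document content, function name, and values are invented.

```python
# Hand-written illustration (not the exact schema Foundry emits) of chat-formatted
# SFT JSONL records: one Q&A pair from a document, one multi-turn tool-use example.
import json

qa_example = {
    "messages": [
        {"role": "system", "content": "You answer questions about the Contoso claims manual."},
        {"role": "user", "content": "How long do customers have to file a claim?"},
        {"role": "assistant", "content": "Claims must be filed within 30 days of the incident."},
    ]
}

tool_use_example = {
    "messages": [
        {"role": "user", "content": "Check the status of claim 4821."},
        {"role": "assistant", "tool_calls": [{
            "id": "call_1", "type": "function",
            "function": {"name": "get_claim_status",
                         "arguments": json.dumps({"claim_id": "4821"})},
        }]},
        {"role": "tool", "tool_call_id": "call_1", "content": json.dumps({"status": "approved"})},
        {"role": "assistant", "content": "Claim 4821 has been approved."},
    ],
    "tools": [{"type": "function", "function": {
        "name": "get_claim_status",
        "parameters": {"type": "object",
                       "properties": {"claim_id": {"type": "string"}}}}}],
}

# Write both records as one-line JSON objects, one per line (JSONL).
with open("synthetic_sft.jsonl", "w") as f:
    for record in (qa_example, tool_use_example):
        f.write(json.dumps(record) + "\n")
```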

Agents succeed when they reliably understand domain intent and call the right tools at the right time. Foundry’s synthetic data generation supports exactly that: it creates task-specific training and test data so your agent learns from the right examples, and so you can prove it works before you go live and rely on it in the real world.

Developer Training Tier (Public Preview): 50% discount on training jobs 

Fine-tuning can be expensive, especially when you may need to run multiple experiments to create the right model for your production agents. To make it easier than ever to get started, we’re introducing the Developer Training tier, which provides a 50% discount when you choose to run workloads on preemptible capacity. It also lets you iterate faster: we support up to 10 concurrent jobs on the Developer tier, making it ideal for running experiments in parallel. Because it uses reclaimable capacity, jobs may be preempted and automatically resumed, so they may take longer to complete.

When to use Developer Training tier: 

  • When cost matters - great for early experimentation or hyperparameter tuning thanks to 50% lower training cost. 
  • When you need high concurrency - supports up to 10 simultaneous jobs, ideal for running multiple experiments in parallel. 
  • When the workload is non-urgent - suitable for jobs that can tolerate preemption and longer, capacity-dependent runtimes.
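As a sketch of how that concurrency can be put to work, the snippet below submits a small hyperparameter sweep as parallel jobs. How the Developer tier is selected on a job (the extra_body field and its value), along with the model name and file ID, are assumptions made purely for illustration -- consult the Foundry documentation for the exact option.

```python
# Sketch: a small hyperparameter sweep run as parallel fine-tuning jobs on the
# Developer Training tier. The trainingType field and its value are hypothetical,
# as are the base model and training file ID.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["FOUNDRY_ENDPOINT"],
    api_key=os.environ["FOUNDRY_API_KEY"],
    api_version="2025-04-01-preview",   # placeholder API version
)

jobs = []
for lr_multiplier in (0.5, 1.0, 2.0):
    job = client.fine_tuning.jobs.create(
        model="gpt-4.1-mini",                          # placeholder base model
        training_file="file-abc123",                   # placeholder file ID
        hyperparameters={"n_epochs": 2,
                         "learning_rate_multiplier": lr_multiplier},
        extra_body={"trainingType": "developertier"},  # hypothetical tier selector
    )
    jobs.append(job.id)

print("Submitted:", jobs)  # Developer tier supports up to 10 concurrent jobs
```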

Agentic Reinforcement Fine-Tuning (RFT) (Private Preview): Train reasoning models to use your tools through outcome-based optimization

Building reliable AI agents requires more than copying correct behavior: models need to learn which reasoning paths lead to successful outcomes. While supervised fine-tuning trains models to imitate demonstrations, reinforcement fine-tuning optimizes models based on whether their chain of thought actually generates a successful outcome. It teaches them to think in new ways, about new domains – to solve complex problems.

Agentic RFT applies this to tool-using workflows: the model generates multiple reasoning traces (including tool calls and planning steps), receives feedback on which attempts solved the problem correctly, and updates its reasoning patterns accordingly. This helps models learn effective strategies for tool sequencing, error recovery, and multi-step planning—behaviors that are difficult to capture through demonstrations alone. The difference now is that you can provide your own custom tools for use during chain of thought: models can interact with your own internal systems, retrieve the data they need, and access your proprietary APIs to solve your unique problems. 

Agentic RFT is currently available in private preview for o4-mini and GPT-5, with configurable reasoning effort, sampling rates, and per-run telemetry. Request access at aka.ms/agentic-rft-preview. 
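For orientation, here is a rough sketch of what a reinforcement fine-tuning job with a grader can look like through the OpenAI-style API. Agentic RFT with custom tools is in private preview, so the exact configuration (in particular, how your own tools are registered for use during chain of thought) is not shown here, and every field below should be treated as illustrative rather than definitive.

```python
# Rough sketch of a reinforcement fine-tuning (RFT) job with a simple grader.
# Field names, the grader templates, and the hyperparameters are illustrative;
# the private-preview Agentic RFT configuration may differ.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["FOUNDRY_ENDPOINT"],
    api_key=os.environ["FOUNDRY_API_KEY"],
    api_version="2025-04-01-preview",   # placeholder API version
)

job = client.fine_tuning.jobs.create(
    model="o4-mini",                    # GPT-5 is also listed for the preview
    training_file="file-rft-prompts",   # placeholder: prompts plus reference answers
    method={
        "type": "reinforcement",
        "reinforcement": {
            # The grader scores each sampled rollout; a string-check grader is
            # the simplest case, and real tasks usually need richer scoring.
            "grader": {
                "type": "string_check",
                "name": "exact_answer",
                "input": "{{sample.output_text}}",      # assumed template variable
                "reference": "{{item.correct_answer}}",  # assumed dataset column
                "operation": "eq",
            },
            "hyperparameters": {"reasoning_effort": "medium"},  # assumption
        },
    },
)
print(job.id)
```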

What are customers saying? 

Fine-tuning is critical to achieve the accuracy and latency needed for enterprise agentic workloads. Decagon is used by many of the world’s most respected enterprises to build, manage and scale AI agents that can resolve millions of customer inquiries across chat, email, and voice – 24 hours a day, seven days a week. This experience is powered by fine-tuning: 

"Providing accurate responses with minimal latency is fundamental to Decagon’s product experience. We saw an opportunity to reduce latency while improving task-specific accuracy by fine-tuning models using our proprietary datasets. Via fine-tuning, we were able to exceed the performance of larger state-of-the-art models with smaller, lighter-weight models which could be served significantly faster." -- Cyrus Asgari, Lead Research Engineer for fine-tuning at Decagon

But it’s not just agent-first startups seeing results. Companies like Discovery Bank are using fine-tuned models to provide better customer experiences with personal banking agents:

"We consolidated three steps into one; response times that were previously five or six seconds came down to one and a half to two seconds on average. This approach made the system more efficient, and the 50% reduction in latency made conversations with Discovery AI feel seamless." - Stuart Emslie, Head of Actuarial and Data Science at Discovery Bank

Fine-tuning has evolved from an optimization technique to essential infrastructure for production AI. Whether building specialized agents or enhancing existing products, the pattern is clear: custom-trained models deliver the accuracy and speed that general-purpose models can't match. As techniques like Agentic RFT and synthetic data generation mature, the question isn't whether to fine-tune, but how to build the systems to do it systematically. 

 

Learn More 

🧠 Get Started with fine-tuning with Azure AI Foundry on Microsoft Learn Docs 

▶️ Watch On-Demand: https://ignite.microsoft.com/en-US/sessions/BRK188?source=sessions

👩 Try the demos: aka.ms/FT-ignite-demos

👋 Continue the conversation on Discord 

Updated Dec 11, 2025
Version 1.0