Azure AI Foundry Blog

What’s new in Azure AI Foundry Fine Tuning

davevoutila
Microsoft
May 19, 2025

All the new things from Build 2025

Fine-tuning has become a critical tool for organizations looking to push the boundaries of what their AI can achieve. It enables teams to tailor models for specific tasks, reduce operational costs, and improve overall performance—all while reducing the complexity of their AI workflows. As Stuart Emslie, Head of Actuarial and Data Science at Discovery Bank, explains:

“We were struggling to achieve the performance we required in defined processors within our agentic applications. Sometimes we required two steps with large models to achieve the result we needed, and we required large prompts to ensure the models interpreted our business context and requirements in the way we envisioned. Fine-tuning allowed us to collapse these multiple steps into one, use a smaller model, and still achieve better performance without the need for as much context within our agentic applications. The Microsoft Team has provided us with the flexibility and support we needed to achieve these results at scale. There is a high likelihood that without fine-tuning and the support provided, we would not have been able to launch our AI solution at scale.”

 

With that in mind, we’re excited to share a slew of new features and offerings at //Build 2025, designed to make fine-tuning more accessible, scalable, and impactful than ever before. Hot on the heels of our latest model releases, what we're showcasing today represents a significant step forward in helping organizations like yours unlock the full potential of their AI models.

🌐 Reinforcement Fine-Tuning (RFT) with o4-mini: Now in Public Preview

Earlier this month, we announced the upcoming launch of Reinforcement Fine-Tuning (RFT) with the o4-mini model, and today we’re excited to share that it is now in public preview. This marks a significant milestone in bringing advanced reasoning capabilities to Azure AI Foundry.

RFT empowers developers to fine-tune models using reinforcement signals, enabling more adaptive reasoning, context-aware responses, and domain-specific logic. Unlike traditional supervised fine-tuning, RFT uses feedback signals – rewarding accurate outputs and penalizing errors – to continuously improve a model’s decision-making capabilities. This approach makes it ideal for applications that require ongoing learning and adaptation, especially when business logic evolves over time.

Whether you’re building reasoning engines that learn from real-world interactions, dynamic agents capable of handling nuanced scenarios, or AI systems that need to incorporate custom decision rules, RFT with o4-mini offers a powerful foundation. It’s designed to work seamlessly within your existing Azure AI Foundry workflows, ensuring consistent performance and flexibility.

  • Availability: East US 2 and Sweden Central
  • Use Cases: Complex decision-making, adaptive workflows, dynamic response generation, and context-sensitive interactions
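To make the shape of an RFT job concrete, here is a minimal sketch of the request body you would send when creating a reinforcement fine-tuning job, following the general shape of OpenAI's reinforcement fine-tuning API. The file IDs and the string-check grader are illustrative assumptions; consult the Azure AI Foundry docs for the exact schema supported in your region.

```python
# Sketch: request body for a Reinforcement Fine-Tuning (RFT) job with o4-mini.
# File IDs and the grader definition are illustrative placeholders.
import json

def build_rft_job(training_file: str, validation_file: str) -> dict:
    """Assemble an example payload for POST /fine_tuning/jobs."""
    return {
        "model": "o4-mini",
        "training_file": training_file,
        "validation_file": validation_file,
        "method": {
            "type": "reinforcement",
            "reinforcement": {
                # A simple string-check grader: reward exact matches against
                # the reference answer carried in each training sample.
                "grader": {
                    "type": "string_check",
                    "name": "exact_match",
                    "input": "{{sample.output_text}}",
                    "reference": "{{item.answer}}",
                    "operation": "eq",
                },
            },
        },
    }

payload = build_rft_job("file-train-123", "file-valid-456")
print(json.dumps(payload, indent=2))
```

The grader is where RFT differs from supervised fine-tuning: it encodes the reward signal, so richer graders (model-scored, multi-criteria) are where domain-specific logic usually lives.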

🥳 Welcoming Global Standard and Regional Provisioned Throughput to General Availability

Originally unveiled at Ignite 2024, both the Global Standard pay-as-you-go deployment type and Regional Provisioned Throughput have now been promoted to General Availability.

Our Global Standard public preview saw record customer adoption across model versions. It’s been so well received that it’s now the default launch point for all new fine-tuned models. Every fine-tunable Azure OpenAI model can be deployed with Global Standard from wherever you trained it, at a more favorable price point than our regional Standard deployments.

Regional Provisioned Throughput (née Provisioned Throughput Managed) now supports the full GPT-4o model family in both North Central US and Sweden Central. Do you have EU data residency needs? No longer a problem! Look for the GPT-4.1 family coming soon to Regional Provisioned Throughput.

Whether you prioritize cost-efficiency or data residency, Azure AI Foundry now supports a broader range of deployment options.

 

🌐 Global Training: Expanding Fine-Tuning Accessibility

We’re making fine-tuning even more accessible by launching Global Training in public preview. You can now fine-tune Azure OpenAI models from a dozen Azure OpenAI regions at launch, with reduced per-token training rates, significantly lowering the barrier to entry for model customization.

  • Why Global Training?
    • Train the latest OpenAI models from over a dozen Azure OpenAI regions, with more coming soon.
    • Lower costs compared to Standard regional training.
    • Flexible data handling with Azure’s strict privacy policies.
  • Where is Global Training available at launch?
    • Australia East
    • Brazil South
    • France Central
    • Germany West Central
    • Italy North
    • Japan East (no vision support at this time)
    • Korea Central
    • Norway East
    • Poland Central
    • Southeast Asia
    • Spain Central
    • South Africa North

For customers with data residency requirements, you can still utilize regional Standard training in North Central US, East US 2, or Sweden Central.

Switching to Global Training is easy—just toggle the option in the Azure AI Foundry UI or use the REST API. Stay tuned for full API support and even more regions coming shortly after //Build.
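For a sense of what the REST call might look like, here is a hedged sketch of a fine-tuning job request that opts into Global Training. The `trainingType` field name and its value are placeholder assumptions (as is the resource endpoint); verify the exact parameter in the Azure OpenAI fine-tuning API reference before relying on it.

```python
# Sketch: submitting a fine-tuning job with Global Training via the REST API.
# The `trainingType` field is a placeholder assumption -- confirm the exact
# parameter name and value in the Azure OpenAI API reference.
import json

AZURE_ENDPOINT = "https://my-resource.openai.azure.com"  # hypothetical resource
API_VERSION = "2025-04-01-preview"                        # illustrative version

def build_global_training_request(training_file: str) -> tuple[str, dict]:
    """Return the request URL and body for creating a global-training job."""
    url = f"{AZURE_ENDPOINT}/openai/fine_tuning/jobs?api-version={API_VERSION}"
    body = {
        "model": "gpt-4.1-mini",
        "training_file": training_file,
        "trainingType": "globalstandard",  # assumed name of the global flag
    }
    return url, body

url, body = build_global_training_request("file-train-123")
print(url)
print(json.dumps(body, indent=2))
```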

 

🧪 Developer Tier: Affordable Model Evaluation

Introducing the Developer Tier – the most cost-effective way for developers and data scientists to evaluate their fine-tuned OpenAI models.

  • Key Features:
    • Deploy fine-tuned GPT-4.1 and GPT-4.1-mini models from any training region, including Global Training.
    • Free hosting for 24 hours per deployment.
    • Pay per token at the same rate as Global Standard, helping you budget your testing costs.
    • Simultaneously evaluate multiple models to choose the best candidate for production.

Designed specifically for model testing, the Developer Tier does not come with SLAs or data residency guarantees. If those are critical, continue using Standard or Global deployments.

Getting started is simple – deploy your fine-tuned model via the Foundry UI or REST API, setting the SKU to Developer Tier. Remember, the TPM quota is shared across all concurrent Developer deployments in a region, so plan capacity accordingly.
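As a rough illustration of "setting the SKU to Developer Tier," here is a sketch of an ARM-style deployment payload. The SKU name string and capacity figure are assumptions for illustration; confirm both against the Azure AI Foundry deployment docs.

```python
# Sketch: a deployment payload targeting the Developer Tier SKU.
# The SKU name ("DeveloperTier") and capacity are assumptions, not
# confirmed values -- check the deployment API reference.
import json

def build_developer_deployment(model_name: str, capacity: int = 50) -> dict:
    """Assemble an example body for a PUT on a deployments resource."""
    return {
        "sku": {"name": "DeveloperTier", "capacity": capacity},  # assumed SKU name
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,  # your fine-tuned model ID
                "version": "1",
            },
        },
    }

deployment = build_developer_deployment("gpt-4.1-mini.ft-abc123")
print(json.dumps(deployment, indent=2))
```

Because the TPM quota is shared across all concurrent Developer deployments in a region, the capacity you request here should account for any sibling deployments running at the same time.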

Learn more in our accompanying blog post!

📝 End-to-End Distillation with Expanded Evaluations and Stored Completions

Fine-tuning is just one piece of the puzzle. To truly optimize your models for production, you need a seamless, end-to-end pipeline for testing, evaluation, and iteration. That’s why we’re excited to announce that Evaluations and Stored Completions are now available in all Azure OpenAI regions, supporting a complete distillation workflow from training to production.

  • Comprehensive Evaluations: Test and compare fine-tuned models against benchmarks to ensure they meet your performance goals. With fine-grained evaluation metrics, you can identify performance gaps, validate task-specific accuracy, and fine-tune your models for the exact outcomes you need.
  • Stored Completions: Automatically capture model outputs during evaluation, providing rich insights into real-world performance. Use this data to refine prompts, adjust training objectives, and identify edge cases – all within the same environment.
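Stored Completions can also be opted into directly from inference calls by setting `store: true` on a chat-completions request and tagging it with metadata for later filtering. The deployment name and metadata values below are illustrative:

```python
# Sketch: a chat-completions request that persists its output as a Stored
# Completion (store: true) and tags it with metadata for later filtering.
# Deployment name and metadata values are illustrative.
import json

def build_stored_completion_request(deployment: str, user_msg: str) -> dict:
    """Assemble an example chat-completions body with storage enabled."""
    return {
        "model": deployment,  # your fine-tuned deployment name
        "store": True,        # persist this completion for later evaluation
        "metadata": {"experiment": "distillation-v1"},  # free-form tags
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": user_msg},
        ],
    }

req = build_stored_completion_request("gpt-4.1-ft-demo", "Summarize our returns policy.")
print(json.dumps(req, indent=2))
```

The captured completions, filterable by those metadata tags, then become the raw material for the distillation loop: curate them into a training set, fine-tune a smaller model, and evaluate it against the original.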

Whether you’re refining RFT with o4-mini for dynamic reasoning or scaling 4.1-nano for customer interactions, these tools ensure your models are optimized for real-world impact.

 

Both capabilities are available in Azure AI Foundry and via the API; check out the docs to get started.

🌉 Extended Support for fine-tuned GPT-4o and GPT-4o-mini

While we at Microsoft share our customers’ enthusiasm for OpenAI’s GPT-4.1 series of models, we understand that not everyone has the bandwidth to retrain and evaluate in the few months left before the retirement of the GPT-4o series. Today, it’s my pleasure to announce extended support beyond the retirement dates for fine-tuned GPT-4o and GPT-4o-mini models in AI Foundry:

  • If you trained a custom GPT-4o or GPT-4o-mini model prior to its retirement date, you can continue to deploy and run inference against your fine-tuned model for 12 months past the model retirement.
  • If you have any of these models deployed at retirement, there’s no action required! You’re automatically covered by extended support.
  • Have your requirements or scale needs changed? You’ll also be free to redeploy to additional regions and deployment types, and to scale just as you do today.
  • There’s no additional fee for extended support!

If you aren’t currently using a fine-tuned GPT-4o-series model today, you can still get started. Training remains available until model retirement.

 

These new capabilities reflect our commitment to making fine-tuning more versatile, accessible, and scalable. From RFT with o4-mini to Global Training and Developer Tier, Azure AI Foundry continues to evolve with the needs of developers and enterprises alike.

🔗 Get Started:

Stay tuned for more announcements and technical deep dives throughout //Build 2025!

Updated May 16, 2025
Version 1.0