Microsoft Foundry Blog

NVIDIA’s Open Models on Microsoft Foundry: Building a Unified Platform for Frontier Agentic and Physical AI Systems

vaidyas
Microsoft
Mar 16, 2026

As enterprises and governments move from AI experimentation to production, they face several structural challenges: fragmented AI stacks across environments, operational complexity in customizing models, and strict requirements around data sovereignty, privacy, and security that limit the use of proprietary models. These challenges often lead to long development cycles, duplicated engineering effort, and stalled AI adoption. 

Microsoft Foundry is designed to address these challenges by providing a unified, interoperable platform to build, optimize, and govern AI innovation at scale across environments. Through Foundry, developers can access leading models, infrastructure, and AI development tools in a single platform that supports enterprise security, governance, and lifecycle management. 

Today, Microsoft and NVIDIA are expanding this platform through deeper integrations that bring NVIDIA’s open models, accelerated computing, and AI development frameworks into the Foundry ecosystem. By combining Microsoft Foundry’s AI platform with NVIDIA’s optimized inference frameworks and accelerated computing, developers can build specialized AI systems once and deploy them across cloud, hybrid, and sovereign environments. 

At GTC 2026, Microsoft and NVIDIA are announcing the first deliverables of this collaboration, focused on: 

  • Specialized agentic systems powered by Nemotron models 
  • Sovereign and on-prem AI deployments through Foundry Local 
  • Production-grade physical AI workflows built on Azure and NVIDIA AI platforms 

 

Building Specialized Agents with Nemotron on Foundry 

Developers can build and deploy specialized AI agents using NVIDIA Nemotron models directly within Microsoft Foundry. Through Foundry’s managed compute environment, teams can deploy and customize Nemotron models on Azure infrastructure accelerated by NVIDIA GPUs. 

Microsoft Foundry expands its open model catalog with several NVIDIA Nemotron models available through NVIDIA NIM microservices, providing enterprises with production-ready open-weight reasoning models accessible through a unified platform. 

These models include: 

  • NVIDIA Nemotron Nano 9B v2 
  • Llama 3.1 Nemotron Nano VL 8B 
  • Llama 3.3 Nemotron Super 49B v1.5 
  • NVIDIA Nemotron Super 3 

Over the coming quarters, the Nemotron lineup in Microsoft Foundry will expand to include: 

Nemotron 3 family 

  • Nano – optimized for low-latency and cost-efficient targeted agent tasks 
  • Super – designed for deep research and high-accuracy reasoning (available now) 
  • Ultra – designed for large-scale multi-agent enterprise applications requiring the highest reasoning performance 

Additional models planned include: 

  • Nemotron Speech – enterprise-grade open speech models for low-latency voice agents 
  • Nemotron Vision – vision-language models for document intelligence and video understanding 
  • Nemotron AI Safety models – guardrail models designed to detect harmful content, jailbreak attempts, and sensitive data exposure 

Later this year, Azure will offer Nemotron models through serverless pay-as-you-go APIs, allowing developers to call them without provisioning or managing infrastructure. 
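As a rough sketch of what calling such a serverless endpoint could look like, the snippet below assumes the endpoints follow the OpenAI-compatible chat-completions convention used across Foundry today; the endpoint URL, API key, and model ID are placeholders, not real values.

```python
# Hypothetical sketch: calling a serverless Nemotron endpoint over REST.
# Assumes an OpenAI-compatible chat-completions API; all identifiers below
# (endpoint URL, API key, model ID) are placeholders.
import json
import urllib.request


def build_request(endpoint: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble the HTTP request without sending it."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": prompt},
        ],
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def ask(endpoint: str, api_key: str, model: str, prompt: str) -> str:
    """Send the request and return the model's reply text."""
    req = build_request(endpoint, api_key, model, prompt)
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Build (but do not send) a request with placeholder values; call
    # ask(...) with real credentials to get a completion.
    req = build_request(
        "https://example-endpoint.models.ai.azure.com",  # placeholder endpoint
        "YOUR_API_KEY",                                  # placeholder key
        "nvidia-nemotron-nano-9b-v2",                    # placeholder model ID
        "Summarize the benefits of open-weight models.",
    )
    print("POST", req.full_url)
```

Because the endpoint is managed, no GPU provisioning or container orchestration appears anywhere in application code; capacity and scaling are handled by the service.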

Plus, following our recent announcement with Fireworks AI, developers will soon be able to deploy open-weight models they’ve trained or fine-tuned elsewhere on Azure using bring-your-own-weights (BYOW) in Microsoft Foundry. NVIDIA Nemotron models are fully open-weight and designed for this kind of developer-led customization.   

 

Extending Sovereign AI with Foundry Local 

Many organizations, particularly governments and regulated industries, require AI systems that operate within sovereign environments while maintaining full control over their data and infrastructure. 

Foundry Local extends Microsoft Foundry’s capabilities into these environments, enabling organizations to run AI models and workloads closer to their data across on-premises datacenters, edge locations, and sovereign private cloud infrastructure. 

Through integrations with NVIDIA accelerated computing platforms, customers can run advanced AI systems powered by GPUs such as the NVIDIA RTX PRO 6000 Blackwell Server Edition, with future support for next-generation platforms including NVIDIA Rubin. 

Azure extends to these sovereign environments with Azure Local infrastructure, with tools like Azure Arc bringing edge compute into a single management layer in the Azure portal. Stacks like Azure Kubernetes Service (AKS) and Foundry Local then enable organizations to deploy, operate, and scale advanced AI models directly within those sovereign environments while maintaining enterprise-grade governance, security, and operational control. 

This allows governments and regulated industries to bring powerful AI capabilities to sensitive workloads without compromising sovereignty requirements.
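As a minimal setup sketch, the Foundry Local CLI can stand up a model on local hardware in a few commands; the model alias below is a placeholder, and actual availability depends on the device's GPU and memory.

```shell
# Install the Foundry Local CLI (Windows; a Homebrew package exists for macOS).
winget install Microsoft.FoundryLocal

# Enumerate models available for this device's hardware.
foundry model list

# Download (on first run) and start an interactive session with a model.
# "qwen2.5-0.5b" is a placeholder alias; substitute a Nemotron alias when available.
foundry model run qwen2.5-0.5b

# Confirm the local inference service is running and note its endpoint,
# which exposes an OpenAI-compatible API for local applications.
foundry service status
```

Because the local service speaks the same OpenAI-compatible protocol as cloud endpoints, an application can move between sovereign and cloud deployments by changing only its endpoint configuration.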

 

Powering Physical AI on Azure 

Microsoft is also collaborating with NVIDIA to support the next generation of physical AI and robotics systems. 

At GTC, Microsoft is introducing an open Azure Physical AI toolchain that integrates the NVIDIA Physical AI Data Factory Blueprint with Azure services including: 

  • Azure IoT Operations 
  • Microsoft Fabric Real-Time Intelligence 
  • Microsoft Foundry 
  • GitHub Copilot 

This toolchain enables robotics and physical AI developers to automate and scale data curation, augmentation, and evaluation across perception, mobility, imitation learning, and reinforcement learning pipelines. 

Developers can also leverage the NVIDIA Metropolis VSS Blueprint, designed to accelerate the development of video analytics AI agents. These blueprints are powered by NVIDIA Cosmos world foundation models, enabling synthetic world generation and large-scale physical AI reasoning optimized for accelerated inference on Azure. 

Additionally, the upcoming NVIDIA Alpamayo open model will support advanced reasoning capabilities for data processing, closed-loop simulation, and evaluation workflows for autonomous driving systems. 

 

Foundry: AI Across Every Environment 

Today’s announcements represent an important step toward a broader vision for Microsoft Foundry. 

Microsoft is building Foundry as a unified AI platform spanning every deployment environment, including: 

  • Global public cloud 
  • Hybrid infrastructure 
  • Sovereign public clouds 
  • Sovereign private environments 

Through integrations with NVIDIA accelerated computing and infrastructure, Foundry enables organizations to deploy AI systems consistently across environments while maintaining a unified development and operations platform. 

This approach allows enterprises to build specialized AI systems once and deploy them wherever their workloads and data reside. 

More details on this roadmap will be shared at Microsoft Build 2026. 

 

Get Started Today 

Explore models 
Access NVIDIA Nemotron models in the Microsoft Foundry model catalog.

Deploy sovereign AI 
Learn more about Foundry Local deployments with NVIDIA accelerated infrastructure. 

Build physical AI systems 
Explore the Azure Physical AI toolchain and NVIDIA Physical AI Data Factory. 

See it live at GTC 
Visit the Microsoft booth to experience end-to-end demos of agentic and physical AI workflows. 

Updated Mar 16, 2026
Version 1.0