Microsoft Foundry Blog

NVIDIA’s Open Models on Microsoft Foundry: Building a Unified Platform for Frontier Agentic and Physical AI Systems

vaidyas
Microsoft
Mar 16, 2026

As enterprises and governments move from AI experimentation to production, they face several structural challenges: fragmented AI stacks across environments, operational complexity in customizing models, and strict requirements around data sovereignty, privacy, and security that limit the use of proprietary models. These challenges often lead to long development cycles, duplicated engineering effort, and stalled AI adoption. 

Microsoft Foundry is designed to address these challenges by providing a unified, interoperable platform to build, optimize, and govern AI innovation at scale across environments. Through Foundry, developers can access leading models, infrastructure, and AI development tools in a single platform that supports enterprise security, governance, and lifecycle management. 

Today, Microsoft and NVIDIA are expanding this platform through deeper integrations that bring NVIDIA’s open models, accelerated computing, and AI development frameworks into the Foundry ecosystem. By combining Microsoft Foundry’s AI platform with NVIDIA’s optimized inference frameworks and accelerated computing, developers can build specialized AI systems once and deploy them across cloud, hybrid, and sovereign environments. 

At GTC 2026, Microsoft and NVIDIA are announcing the first deliverables of this collaboration, focused on: 

  • Specialized agentic systems powered by Nemotron models 
  • Sovereign and on-prem AI deployments through Foundry Local 
  • Production-grade physical AI workflows built on Azure and NVIDIA AI platforms 

 

Building Specialized Agents with Nemotron on Foundry 

Developers can build and deploy specialized AI agents using NVIDIA Nemotron models directly within Microsoft Foundry. Through Foundry’s managed compute environment, teams can deploy and customize Nemotron models on Azure infrastructure accelerated by NVIDIA GPUs. 

Microsoft Foundry expands its open model catalog with several NVIDIA Nemotron models available through NVIDIA NIM microservices, providing enterprises with production-ready open-weight reasoning models accessible through a unified platform. 

These models include: 

  • NVIDIA Nemotron Nano 9B v2 
  • Llama 3.1 Nemotron Nano VL 8B 
  • Llama 3.3 Nemotron Super 49B v1.5 
  • NVIDIA Nemotron Super 3 

Over the coming quarters, the Nemotron lineup in Microsoft Foundry will expand to include: 

Nemotron 3 family 

  • Nano – optimized for low-latency and cost-efficient targeted agent tasks 
  • Super – designed for deep research and high-accuracy reasoning (available now) 
  • Ultra – designed for large-scale multi-agent enterprise applications requiring the highest reasoning performance 

Additional models planned include: 

  • Nemotron Speech – enterprise-grade open speech models for low-latency voice agents 
  • Nemotron Vision – vision-language models for document intelligence and video understanding 
  • Nemotron AI Safety models – guardrail models designed to detect harmful content, jailbreak attempts, and sensitive data exposure 

Later this year, Azure will offer Nemotron models through serverless pay-as-you-go APIs, allowing developers to call them without provisioning or managing infrastructure. 
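As a rough sketch of what calling such a serverless endpoint could look like, the snippet below assumes the endpoints follow the OpenAI-compatible chat-completions convention used across Foundry today; the endpoint URL, API key, and model ID are placeholders, not real values.

```python
# Hypothetical sketch: calling a serverless Nemotron endpoint over REST.
# Assumes an OpenAI-compatible chat-completions API; all identifiers below
# (endpoint URL, API key, model ID) are placeholders.
import json
import urllib.request


def build_request(endpoint: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble the HTTP request without sending it."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": prompt},
        ],
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def ask(endpoint: str, api_key: str, model: str, prompt: str) -> str:
    """Send the request and return the model's reply text."""
    req = build_request(endpoint, api_key, model, prompt)
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Build (but do not send) a request with placeholder values; call
    # ask(...) with real credentials to get a completion.
    req = build_request(
        "https://example-endpoint.models.ai.azure.com",  # placeholder endpoint
        "YOUR_API_KEY",                                  # placeholder key
        "nvidia-nemotron-nano-9b-v2",                    # placeholder model ID
        "Summarize the benefits of open-weight models.",
    )
    print("POST", req.full_url)
```

Because the endpoint is managed, no GPU provisioning or container orchestration appears anywhere in application code; capacity and scaling are handled by the service.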

Plus, following our recent announcement with Fireworks AI, developers will soon be able to deploy open-weight models they’ve trained or fine-tuned elsewhere on Azure using bring-your-own-weights (BYOW) in Microsoft Foundry. NVIDIA Nemotron models are fully open-weight and designed for this kind of developer-led customization.   

 

Extending Sovereign AI with Foundry Local 

Many organizations, particularly governments and regulated industries, require AI systems that operate within sovereign environments while maintaining full control over their data and infrastructure. 

Foundry Local extends Microsoft Foundry’s capabilities into these environments, enabling organizations to run AI models and workloads closer to their data across on-premises datacenters, edge locations, and sovereign private cloud infrastructure. 

Through integrations with NVIDIA accelerated computing platforms, customers can run advanced AI systems powered by GPUs such as the NVIDIA RTX PRO 6000 Blackwell Server Edition, with future support for next-generation platforms including NVIDIA Rubin. 

Azure extends to these sovereign environments with Azure Local infrastructure, with tools like Azure Arc bringing edge compute into a single management layer in the Azure portal. Stacks like Azure Kubernetes Service (AKS) and Foundry Local then enable organizations to deploy, operate, and scale advanced AI models directly within those sovereign environments while maintaining enterprise-grade governance, security, and operational control. 

This allows governments and regulated industries to bring powerful AI capabilities to sensitive workloads without compromising sovereignty requirements.
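As a minimal setup sketch, the Foundry Local CLI can stand up a model on local hardware in a few commands; the model alias below is a placeholder, and actual availability depends on the device's GPU and memory.

```shell
# Install the Foundry Local CLI (Windows; a Homebrew package exists for macOS).
winget install Microsoft.FoundryLocal

# Enumerate models available for this device's hardware.
foundry model list

# Download (on first run) and start an interactive session with a model.
# "qwen2.5-0.5b" is a placeholder alias; substitute a Nemotron alias when available.
foundry model run qwen2.5-0.5b

# Confirm the local inference service is running and note its endpoint,
# which exposes an OpenAI-compatible API for local applications.
foundry service status
```

Because the local service speaks the same OpenAI-compatible protocol as cloud endpoints, an application can move between sovereign and cloud deployments by changing only its endpoint configuration.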

 

Powering Physical AI on Azure 

Microsoft is also collaborating with NVIDIA to support the next generation of physical AI and robotics systems. 

At GTC, Microsoft is introducing an open Azure Physical AI toolchain that integrates the NVIDIA Physical AI Data Factory Blueprint with Azure services including: 

  • Azure IoT Operations 
  • Microsoft Fabric Real-Time Intelligence 
  • Microsoft Foundry 
  • GitHub Copilot 

This toolchain enables robotics and physical AI developers to automate and scale data curation, augmentation, and evaluation across perception, mobility, imitation learning, and reinforcement learning pipelines. 

Developers can also leverage the NVIDIA Metropolis VSS Blueprint, designed to accelerate the development of video analytics AI agents. These blueprints are powered by NVIDIA Cosmos world foundation models, enabling synthetic world generation and large-scale physical AI reasoning optimized for accelerated inference on Azure. 

Additionally, the upcoming NVIDIA Alpamayo open model will support advanced reasoning capabilities for data processing, closed-loop simulation, and evaluation workflows for autonomous driving systems. 

 

Foundry: AI Across Every Environment 

Today’s announcements represent an important step toward a broader vision for Microsoft Foundry. 

Microsoft is building Foundry as a unified AI platform spanning every deployment environment, including: 

  • Global public cloud 
  • Hybrid infrastructure 
  • Sovereign public clouds 
  • Sovereign private environments 

Through integrations with NVIDIA accelerated computing and infrastructure, Foundry enables organizations to deploy AI systems consistently across environments while maintaining a unified development and operations platform. 

This approach allows enterprises to build specialized AI systems once and deploy them wherever their workloads and data reside. 

More details on this roadmap will be shared at Microsoft Build 2026. 

 

Get Started Today 

Explore models 
Access NVIDIA Nemotron models in the Microsoft Foundry model catalog.

Deploy sovereign AI 
Learn more about Foundry Local deployments with NVIDIA accelerated infrastructure. 

Build physical AI systems 
Explore the Azure Physical AI toolchain and NVIDIA Physical AI Data Factory. 

See it live at GTC 
Visit the Microsoft booth to experience end-to-end demos of agentic and physical AI workflows. 

Updated Mar 16, 2026
Version 1.0