Announcing Qwen3-VL family of models on Azure AI Foundry

Microsoft

Oct 29, 2025

We’re excited to share that Qwen3-VL family of vision-language models are now available in Azure AI Foundry thanks to our collaboration with Hugging Face. Our collaboration with Hugging Face models represents a major leap forward in operationalizing open-source AI at scale. By bridging the gap between community-driven innovation and enterprise-grade infrastructure, this collaboration enables developers to deploy state-of-the-art vision-language models rapidly, with enterprise-grade security and performance.

Azure AI Foundry provides a robust environment where Hugging Face models can be:

Deployed instantly via managed endpoints.
Governed effectively with built-in telemetry, access controls, and lifecycle management.

This unlocks new possibilities for building intelligent agents, apps, and services using the latest OSS models securely from your Azure tenancy, with optimized performance.

What is Qwen3-VL?

Qwen3-VL is the most powerful vision-language model in the Qwen series to date, marking a leap from perception to true multimodal understanding and reasoning.

Qwen3-VL integrates advanced visual agent abilities to operate interfaces, generate code from images, and reason over long contexts of up to 1 million tokens. It shows major gains in spatial understanding, STEM reasoning, and multilingual OCR (32 languages), alongside broader visual recognition across everyday and professional domains. Altogether, Qwen3-VL represents a new generation of models that not only see and describe the world — but truly understand it.

Overall, the largest mixture-of-experts Qwen3-VL-235B-A22B-Instruct achieves top performance on most metrics among non-reasoning models, significantly outperforming closed-source models such as Gemini 2.5 Pro and GPT-5, while also setting new state-of-the-art results among open-source multimodal models, demonstrating its strong generalization ability and comprehensive performance on complex visual tasks.

To see more benchmarks, take a look at Qwen’s research post.

Available models in the Azure AI Foundry

The Qwen3-VL family comprises 24 models. There are dense models with 2, 4, 8 and 32 billion parameters, but also mixture-of-experts with 235 billion parameters (22 billions active) and 30 billion parameters (3 billion active). Each model comes with a FP8 quantized checkpoint and is available in two modes, “Instruct” and “Thinking”.

22+ models are available in Azure AI Foundry:

You can try their flagship model for free on Hugging Face Spaces to test it on your scenario.

Deploy the model in your Azure account

In order to deploy the model in your Azure account, navigate to the Azure AI Foundry and search for the model of your choice. Once on the model page, click on “Use this model” and in a matter of minutes, an endpoint will be deployed for you.

For every deployment, there are pre-configured environments you can choose from with varying instance sizes, depending on the scale of traffic you expect. All environments for Qwen3-VL models are built on top of vLLM. For more advanced users, we recommend modifying the configuration according to your needs.

Whether you're building agents, apps, or research prototypes, the Foundry platform aims to offer unmatched flexible, secure and fast access to the latest OSS models.

Get started and search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.

- Simon Pagezy (Hugging Face) and Vaidyaraman Sambasivam (Microsoft)

Updated Oct 29, 2025

Version 1.0

vaidyas

Microsoft

Joined October 28, 2021

View Profile

Microsoft Foundry Blog

Follow this blog board to get notified when there's new activity