Blog Post

Azure AI Foundry Blog
4 MIN READ

Deploying OpenAI’s First Open-Source Model on Azure AKS with KAITO

maljazaery's avatar
maljazaery
Icon for Microsoft rankMicrosoft
Aug 15, 2025

In this tutorial, we’ll walk through deploying the openai/gpt-oss-20B model on Azure’s Standard_NV36ads_A10_v5 GPU instances using vLLM for fast inference — all running on AKS via KAITO. By the end, you’ll have a public endpoint where you can send API requests in OpenAI’s format.

Special Thanks Thanks to Andrew Thomas, Kurt Niebuhr, and Sachi Desai for their invaluable support, insightful discussions, and for providing the compute resources that made testing this deployment ...
Updated Aug 15, 2025
Version 5.0