Releasing as Serverless offering today, Llama 4 Scout and Maverick Instruct models. Also available on GitHub Models.
Last week, we kicked off the arrival of Meta’s powerful new Llama 4 models in Azure with the launch of three models across Azure AI Foundry and Azure Databricks. Today, we’re expanding the herd with the addition of two new 17B-parameter instruction-tuned models — now available in the Azure AI Foundry model catalog as Models as a Service (MaaS) endpoints
New Models
- Llama-4-Scout-17B-16E-Instruct
A fast, low-latency 17B model with 16 experts — optimized for general-purpose tasks with strong instruction following. - Llama-4-Maverick-17B-128E-Instruct-FP8
A larger, more expressive variant with 128 experts and FP8 precision — built for heavier, higher-quality reasoning under constrained compute.
Both models are:
- Hosted as serverless MaaS endpoints — no infrastructure setup required
- Available on GitHub Models and playground
What Makes These Llama 4 Models Special?
These models are part of Meta’s mixture-of-experts (MoE) family of Llama 4 variants. Unlike dense models, these MoE architectures selectively activate a subset of model parameters (experts) per token, yielding improved efficiency without sacrificing output quality.
- Scout-17B-16E offers fast inference for common enterprise workloads like summarization, Q&A, and structured output tasks.
- Maverick-17B-128E-FP8 introduces aggressive expert scaling and FP8 precision, enabling high-throughput inference with improved energy efficiency.
How to Get Started
You can find these models in the Azure AI Foundry model catalog — just search for "Llama-4" or navigate to the Meta model family. With a few clicks, you can:
- Deploy the model as a serverless endpoint
- Invoke it via the Azure AI Foundry playground
- Integrate using the Azure OpenAI-compatible REST API or Python SDK
Use Cases
These 17B models are a great fit for:
- Knowledge assistant copilots
- Long-form summarization
- Table-to-text transformation
- Conversational agents
- Internal developer tools
Explore More
To learn more about last week’s launch of Llama 4 models, including 8B and 70B variants, check out the official Azure blog: Introducing the Llama 4 Herd in Azure AI Foundry and Azure Databricks
Try these models today in Azure AI Foundry and let us know what you build!