Announcing Global Provisioned Managed Deployments for Scaling Azure OpenAI Service Workloads

Microsoft

Sep 18, 2024

We’re excited to announce a major advancement in AI deployments with Azure OpenAI Service: Global Provisioned Managed Deployments, now Generally Available (GA) as of September 18, 2024. This launch marks a significant milestone in our commitment to making AI more accessible, scalable, and flexible for customers worldwide, building on our August release of Provisioned Throughput Units (PTU) for self-service regional deployments.

What is Global Provisioned Managed?

Global Provisioned Managed is a new deployment type within the Azure OpenAI Service that leverages Azure's global infrastructure to serve provisioned traffic more efficiently. It supports the latest GPT-4o (2024-08-06) and GPT-4o-mini (2024-07-18) models, making them accessible to customers without the limitations of region-specific quotas or capacities. Because capacity is no longer tied to a single region, customers can deploy models faster and with greater flexibility.

Dual Availability: Global and Regional

We are also pleased to announce that the GPT-4o (2024-08-06) model is now available not only through Global Provisioned Managed deployments but also for Provisioned Regional Deployments via self-service. Customers can therefore choose between a globally managed deployment and a region-specific deployment that gives them direct control over capacity, depending on their needs.

Key Benefits of Global Provisioned Managed Deployments

Access to the Latest Models Everywhere: The Global Provisioned Managed deployment model removes regional limitations, allowing customers to access the newest AI models like GPT-4o and GPT-4o-mini across all supported Azure regions, including eastus, westeurope, japaneast, and more.
Simplified Deployment and Management: Unlike traditional deployment approaches, Global Provisioned Managed decouples capacity management from specific regions, and all eligible customers automatically receive access to the new global quota.
Data Residency and Compliance Flexibility: While API traffic may be processed globally, all customer data is securely stored in the region of the Azure OpenAI Service resource, in line with regional data residency and compliance requirements.
Transparent and Flexible Pricing: Billing for Global Provisioned Managed follows the same model as existing Provisioned Managed deployments, ensuring predictable costs with options for hourly pricing and reservations to accommodate diverse usage scenarios.
Dual Deployment Options for Greater Flexibility: The availability of the GPT-4o model for both Global Provisioned Managed and Provisioned Regional Deployments gives customers the freedom to choose the most suitable deployment strategy for their organizational needs.
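
To make the choice between hourly pricing and reservations more concrete, here is a minimal sketch of the break-even arithmetic. It assumes a reservation is a fixed monthly charge per PTU regardless of usage, which this announcement does not spell out. The rates are made-up placeholders, not published Azure prices, and the function names are hypothetical; substitute the figures that apply to your model and agreement.

```python
def break_even_hours(hourly_rate_per_ptu: float, monthly_reservation_per_ptu: float) -> float:
    """Hours per month above which a monthly reservation is cheaper than hourly billing."""
    if hourly_rate_per_ptu <= 0:
        raise ValueError("hourly_rate_per_ptu must be positive")
    return monthly_reservation_per_ptu / hourly_rate_per_ptu


def hourly_cost(ptus: int, hours: float, hourly_rate_per_ptu: float) -> float:
    """Total cost of running `ptus` units for `hours` hours on hourly billing."""
    return ptus * hours * hourly_rate_per_ptu


# Placeholder rates for illustration only (NOT real Azure prices).
PLACEHOLDER_HOURLY_RATE = 1.00           # currency units per PTU per hour
PLACEHOLDER_MONTHLY_RESERVATION = 300.0  # currency units per PTU per month

threshold = break_even_hours(PLACEHOLDER_HOURLY_RATE, PLACEHOLDER_MONTHLY_RESERVATION)
print(f"Reservation pays off after {threshold:.0f} hours of use per month")
```

With these placeholder figures, a deployment that runs more than 300 hours in a month would cost less under a reservation, while a short-lived test deployment would be cheaper on hourly billing.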

Why Choose Global Provisioned Managed?

This new deployment type represents a significant evolution in our approach to AI, offering:

Global Reach: Deploy AI models anywhere without the constraints of regional quotas or capacities.
Cost Efficiency: Benefit from cost management options, including monthly and yearly reservations.
Enhanced Flexibility: Deploy and scale AI solutions faster with less complexity and administrative burden, allowing you to focus more on innovation.
Regional Control: For customers needing specific regional deployments, the GPT-4o model remains available through self-service, enabling full control over capacity management.

How to Get Started

Deploying your AI models globally or regionally is simple:

For Global Provisioned Managed Deployments: This option is available in your Azure OpenAI Service regional resources as of September 18, 2024. To use it, create a regional resource or select an existing one, then choose the Global Provisioned Managed deployment option.
For Provisioned Regional Deployments: The GPT-4o (2024-08-06) model is available for self-service regional deployments, giving you flexibility to manage regional capacities and resources according to your needs.
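
As a rough illustration of the two options above, the sketch below builds the kind of request body one might send when creating a deployment through Azure Resource Manager. The SKU names `GlobalProvisionedManaged` and `ProvisionedManaged`, the body layout, and the capacity value are assumptions that do not appear in this announcement, and the helper name is hypothetical; verify all of them against the current Azure documentation before use.

```python
import json

# Assumed SKU names; this announcement does not state them.
SKU_GLOBAL = "GlobalProvisionedManaged"
SKU_REGIONAL = "ProvisionedManaged"


def build_deployment_body(model_name: str, model_version: str, ptus: int,
                          global_deployment: bool = True) -> dict:
    """Return a deployment request body for a global or regional provisioned deployment."""
    if ptus <= 0:
        raise ValueError("ptus must be a positive number of Provisioned Throughput Units")
    return {
        "sku": {
            "name": SKU_GLOBAL if global_deployment else SKU_REGIONAL,
            "capacity": ptus,  # number of PTUs requested
        },
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": model_name,
                "version": model_version,
            }
        },
    }


# The capacity of 15 PTUs is an arbitrary placeholder, not a recommended size.
global_body = build_deployment_body("gpt-4o", "2024-08-06", ptus=15)
regional_body = build_deployment_body("gpt-4o", "2024-08-06", ptus=15,
                                      global_deployment=False)
print(json.dumps(global_body, indent=2))
```

Under these assumptions, only the SKU name differs between the two bodies: the model and version stay the same, which matches the dual availability of GPT-4o (2024-08-06) described above.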

Looking Ahead: More Models and Regions

Our initial rollout of Global Provisioned Managed supports the GPT-4o and GPT-4o-mini models, and we plan to add more models to this deployment type. For those requiring specific regional support, the existing Provisioned Managed deployment remains available.

Embrace the Future of AI with Azure OpenAI Service

Azure OpenAI Service is committed to pushing the boundaries of AI capabilities. With the new Global Provisioned Managed deployments, we’re breaking down barriers, providing more flexibility, and ensuring our customers can fully leverage AI's potential anywhere in the world.
