Announcing the General Availability of GPT-4 Turbo with Vision on Azure OpenAI Service

Former Employee

May 01, 2024

We are excited to announce the general availability (GA) of GPT-4 Turbo with Vision on the Azure OpenAI Service. The GA model, gpt-4-turbo-2024-04-09, is a multimodal model capable of processing both text and image inputs to generate text outputs. This model replaces the following preview models:

gpt-4-1106-preview
gpt-4-0125-preview
gpt-4-vision-preview

Our customers and partners have been utilizing GPT-4 Turbo with Vision to create new processes, enhance efficiencies, and innovate within their businesses. Applications range from retailers improving the online shopping experience, to media and entertainment companies enriching digital asset management, and various organizations deriving insights from charts and diagrams. We will be showcasing detailed case studies from these applications at the upcoming Build conference.

Existing Azure OpenAI Service customers can now deploy gpt-4-turbo-2024-04-09 in Sweden Central and East US 2. For more information, please visit our model availability page.

Guide to Deploying GPT-4 Turbo with Vision GA

To deploy this GA model from the Studio UI, select "GPT-4" and then choose the "turbo-2024-04-09" version from the dropdown menu. The default quota for the gpt-4-turbo-2024-04-09 model will be the same as current quota for GPT-4-Turbo. See the regional quota limits.

Upgrade Path from Preview to GA Models

We are targeting the upgrade of deployments that utilize any of the three preview models (gpt-4-1106-preview, gpt-4-0125-preview, and gpt-4-vision-preview) and are configured for auto-update on the Azure OpenAI Service. These deployments will be upgraded to gpt-4-turbo-2024-04-09 starting on June 10th or later. We will notify all customers with these preview deployments at least two weeks before the start of the upgrades. We will publish an upgrade schedule detailing the order of regions and model versions that we will follow during the upgrades in our public documentation.

Upcoming Features for image (vision) inputs: JSON Mode and Function Calling

JSON mode and function calling for inference requests involving image (vision) inputs will be available in GA in the coming weeks. Please note that text-based inputs will continue to support both JSON mode and function calling.

Changes to GPT-4 Vision Enhancements

Enhancements such as Optical Character Recognition (OCR), object grounding, video prompts, and "Azure OpenAI Service on your data with images", that were integrated with the gpt-4-vision-preview model will not be available with the GA model. We are dedicated to enhancing our products to provide value to our customers, and are actively exploring how to best integrate these features into future offerings.