Microsoft Foundry Blog
Introducing OpenAI's GPT-image-2 in Microsoft Foundry

Naomi Moneypenny
Apr 21, 2026

Unlock next-generation visual AI for enterprise workflows

Take a small design team running a global social campaign. They have the creative vision to produce localized imagery for every market, but not the resources to reshoot, reformat, or outsource that scale. Every asset needs to fit a different platform, a different dimension, a different cultural context, and they all need to ship at the same time. This is where flexible image generation comes in handy.

OpenAI's GPT-image-2 is now generally available and rolling out today to Microsoft Foundry, introducing a step change in image generation. Developers and designers now get more control over image output, so a small team can execute with the reach and flexibility of a much larger one.

What is new in GPT-image-2?

GPT-image-2 brings real-world intelligence, multilingual understanding, improved instruction following, increased resolution support, and an intelligent routing layer, giving developers the tools to scale image generation for production workflows.

Real world intelligence

GPT-image-2 has a knowledge cutoff of December 2025, so it can give you more contextually relevant and accurate outputs. The model also comes with enhanced thinking capabilities that allow it to search the web, check its own outputs, and create multiple images from just one prompt. These enhancements shift image generation models away from being simple tools and turn them into creative sidekicks.

Multilingual understanding

GPT-image-2 includes increased language support across Japanese, Korean, Chinese, Hindi, and Bengali, as well as new thinking capabilities. This means the model can create images and render text that feels localized.

Increased resolution support

GPT-image-2 introduces 4K resolution support, giving developers the ability to generate rich, detailed, and photorealistic images at custom dimensions.

Resolution guidelines to keep in mind:

| Constraint | Detail |
| --- | --- |
| Total pixel budget | Maximum pixels in final image cannot exceed 8,294,400. Minimum pixels in final image cannot be less than 655,360. Requests exceeding this are automatically resized to fit. |
| Resolutions | 4K, 1024x1024, 1536x1024, and 1024x1536 |
| Dimension alignment | Each dimension must be a multiple of 16 |

Note: If your requested resolution exceeds the pixel budget, the service will automatically resize it down.
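To make the guidelines concrete, here is a minimal Python sketch that checks a requested size against the pixel budget and alignment rule before you send it. The function names `check_size` and `fit_to_budget` are hypothetical helpers, not part of any SDK, and the downscaling logic is only an assumption about one reasonable way to stay within budget; the service's own resizing behavior may differ.

```python
import math

MAX_PIXELS = 8_294_400   # upper pixel budget from the guidelines (equals 3840 x 2160)
MIN_PIXELS = 655_360     # lower pixel budget from the guidelines
ALIGNMENT = 16           # each dimension must be a multiple of 16


def check_size(width: int, height: int) -> list[str]:
    """Return a list of guideline violations; an empty list means the size fits."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % ALIGNMENT != 0:
            problems.append(f"{name} {value} is not a positive multiple of {ALIGNMENT}")
    pixels = width * height
    if pixels > MAX_PIXELS:
        problems.append(f"{pixels} pixels exceeds the maximum of {MAX_PIXELS}")
    if pixels < MIN_PIXELS:
        problems.append(f"{pixels} pixels is below the minimum of {MIN_PIXELS}")
    return problems


def fit_to_budget(width: int, height: int) -> tuple[int, int]:
    """Hypothetical client-side downscale (an assumption, not the service's
    algorithm): shrink proportionally, then round each side down to a
    multiple of 16 so the result stays within MAX_PIXELS."""
    pixels = width * height
    if pixels <= MAX_PIXELS:
        return width, height
    scale = math.sqrt(MAX_PIXELS / pixels)
    new_w = int(width * scale) // ALIGNMENT * ALIGNMENT
    new_h = int(height * scale) // ALIGNMENT * ALIGNMENT
    return new_w, new_h
```

For example, `check_size(1024, 1024)` returns an empty list, while `check_size(512, 512)` reports that the image falls below the minimum pixel budget. Validating on the client side lets you choose the final dimensions yourself instead of relying on automatic resizing.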

Intelligent routing layer

GPT-image-2 also includes an expanded routing layer with two distinct modes, allowing the service to intelligently select the right generation configuration for a request without requiring an explicitly set size value.

Mode 1 — Legacy size selection

In Mode 1, the routing layer selects one of the three legacy size tiers to use for generation:

| Size tier | Description |
| --- | --- |
| smimage | Small image output |
| image | Standard image output |
| xlimage | Large image output |

This mode is useful for teams already familiar with the legacy size tiers who want to benefit from automatic selection without making any manual changes.

Mode 2 — Token size bucket selection

In Mode 2, the routing layer selects from six token size buckets — 16, 24, 36, 48, 64, 96 — which map roughly to the legacy size tiers:

| Token bucket | Approximate legacy size |
| --- | --- |
| 16, 24 | smimage |
| 36, 48 | image |
| 64, 96 | xlimage |

This approach can allow for more flexibility in the number of tokens generated, which in turn helps to better optimize output quality and efficiency for a given prompt.
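The bucket-to-tier relationship can be written down as a small lookup, which may be handy when comparing Mode 2 behavior against logs or dashboards that still use the legacy tier names. This is only an illustrative sketch: the routing itself happens inside the service, and `BUCKET_TO_TIER`, `legacy_tier`, and `buckets_for` are hypothetical names, not API fields.

```python
# Token size buckets and their approximate legacy tiers, as listed for Mode 2.
BUCKET_TO_TIER = {
    16: "smimage",
    24: "smimage",
    36: "image",
    48: "image",
    64: "xlimage",
    96: "xlimage",
}


def legacy_tier(bucket: int) -> str:
    """Return the approximate legacy size tier for a Mode 2 token bucket."""
    try:
        return BUCKET_TO_TIER[bucket]
    except KeyError:
        valid = sorted(BUCKET_TO_TIER)
        raise ValueError(f"unknown token bucket {bucket}; expected one of {valid}") from None


def buckets_for(tier: str) -> list[int]:
    """Return the token buckets that roughly correspond to a legacy tier."""
    return sorted(b for b, t in BUCKET_TO_TIER.items() if t == tier)
```

Each legacy tier corresponds to two buckets, which is where the extra flexibility in Mode 2 comes from: `buckets_for("image")` returns `[36, 48]`.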

See it in action

GPT-image-2 shows improved image fidelity across visual styles, generating more detailed and refined images. But don’t just take our word for it. Let's see the model in action with a few prompts and edits. Here is the example we used:

Prompt: Interior of an empty subway car (no people).
Wide-angle view looking down the aisle. Clean, modern subway car with seats, poles, route map strip, and ad frames above the windows.
Realistic lighting with a slight cool fluorescent tone, realistic materials (metal poles, vinyl seats, textured floor).

Figure 1. Created with GPT-image-1

Figure 2. Created with GPT-image-1.5

Figure 3. Created with GPT-image-2

As you can see, when using the same base prompt, the image quality and realism improved with each model. Now let’s take a look at adding incremental changes to the same image:

Prompt: Populate the ad frames with a cohesive ad campaign for “Zava Flower Delivery” and use an array of flower types.

Figure 4. Created with GPT-image-2

And our subway is now full of ads for the new Zava flower delivery service. Let's ask for another small change:

Prompt: In all Zava Flower Delivery advertisements, change the flowers shown to roses (red and pink roses).

Figure 5. Created with GPT-image-2

And in three simple prompts, we've created a mockup of a flower delivery ad. From marketing material to website creation to UX design, GPT-image-2 now allows developers to deliver production-grade assets for real business use cases.

Image generation across industries

These new capabilities open the door to richer, more production-ready image generation workflows across a range of enterprise scenarios:

  • Retail & e-commerce: Generate product imagery at exact platform-required dimensions, from square thumbnails to wide banners, without post-processing.
  • Marketing: Produce crisp, richly colored campaign visuals and social assets localized to different markets.
  • Media & entertainment: Generate storyboard panels and scenes at resolutions suited to production pipelines.
  • Education & training: Create visual learning aids and course materials formatted to exact display requirements across devices.
  • UI/UX design: Accelerate mockup and prototype workflows by generating interface assets at the precise dimensions your design system requires.

Trust and safety

At Microsoft, our mission to empower people and organizations remains constant. As part of this commitment, models made available through Foundry undergo internal reviews and are deployed with safeguards designed to support responsible use at scale. Learn more about responsible AI at Microsoft.

For GPT-image-2, Microsoft applied an in-depth safety approach that addresses disallowed content and misuse while maintaining human oversight. The deployment combines OpenAI’s image generation safety mitigations with Azure AI Content Safety, including filters and classifiers for sensitive content.

Pricing

| Model | Offer type | Pricing - Image | Pricing - Text |
| --- | --- | --- | --- |
| GPT-image-2 | Standard Global | Input Tokens: $8; Cached Input Tokens: $2; Output Tokens: $30 | Input Tokens: $5; Cached Input Tokens: $1.25; Output Tokens: $10 |

Note: All prices are per 1M tokens.
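As a rough aid for budgeting, the per-token rates can be turned into a small cost estimator. The sketch below uses only the Standard Global prices listed here; the `estimate_cost` helper and the shape of its `usage` argument are hypothetical, and the token counts in the usage example are made up for illustration rather than measured from the service.

```python
from decimal import Decimal

# USD per 1M tokens for GPT-image-2 (Standard Global), taken from the pricing above.
PRICES = {
    "image": {"input": Decimal("8"), "cached_input": Decimal("2"), "output": Decimal("30")},
    "text": {"input": Decimal("5"), "cached_input": Decimal("1.25"), "output": Decimal("10")},
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(usage: dict[str, dict[str, int]]) -> Decimal:
    """Estimate cost in USD.

    `usage` maps a modality ("image" or "text") to token counts keyed by
    "input", "cached_input", or "output". This layout is an assumption made
    for this sketch, not the shape of any API response.
    """
    total = Decimal(0)
    for modality, counts in usage.items():
        for kind, tokens in counts.items():
            total += PRICES[modality][kind] * Decimal(tokens) / TOKENS_PER_UNIT
    return total


# Hypothetical request: 200 text input tokens and 5,000 image output tokens.
example = estimate_cost({"text": {"input": 200}, "image": {"output": 5000}})
```

With those illustrative counts, the estimate comes to $0.151: $0.001 for the text input plus $0.15 for the image output. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.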

Getting started

Whether you’re building a personalized retail experience, automating visual content pipelines, or accelerating design workflows, GPT-image-2 gives your team the resolution control and intelligent routing to generate images that fit your exact needs. Try GPT-image-2 in Microsoft Foundry today!

Deploy the model in Microsoft Foundry

Experiment with the model in the Image playground

Read the documentation to learn more

Updated Apr 21, 2026

1 Comment

  • rcanepa

    Thanks for providing access to this awesome model so quickly! However, responses take a very long time in comparison to the official OpenAI endpoint. For example, a 1024x1024, high-quality image takes on average between 4-5 minutes. That's too long for a real production application.