Unlock the next generation of visual content creation, advancing visual AI for enterprise scenarios
Developers building with visual AI often run into the same frustrations: images that drift from the prompt, inconsistent object placement, text that renders unpredictably, and editing workflows that break when iterating on a single asset.
That’s why we are excited to announce that OpenAI's GPT Image 1.5 is now generally available in Microsoft Foundry. The model brings sharper image fidelity, stronger prompt alignment, and faster image generation that supports iterative workflows. Starting today, customers can request access to the model and start building in the Foundry platform.
Meet GPT Image 1.5
AI-driven image generation began with early models like OpenAI's DALL-E, which introduced the ability to transform text prompts into visuals. Since then, image generation models have evolved to enhance multimodal AI across industries. GPT Image 1.5 continues that improvement in enterprise-grade image generation. Building on the success of GPT Image 1 and GPT Image 1 mini, it introduces advanced capabilities that cater to both creative and operational needs.
The new model offers:
- Text-to-image: Stronger instruction following and highly precise editing.
- Image-to-image: Transform existing images and iteratively refine specific regions.
- Improved visual fidelity: More detailed scenes and realistic rendering.
- Accelerated creation times: Up to 4x faster generation speed.
- Enterprise integration: Deploy and scale securely in Microsoft Foundry.
GPT Image 1.5 delivers stronger image preservation and editing capabilities, maintaining critical details like facial likeness, lighting, composition, and color tone across iterative changes. You’ll see more consistent preservation of branded logos and key visuals, making it especially powerful for marketing, brand design, and e-commerce workflows, from graphics and logo creation to generating full product catalogs (variants, environments, and angles) from a single source image.
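To make the text-to-image flow concrete, here is a minimal offline sketch of preparing a request and decoding a response. It assumes the deployment accepts an OpenAI Images-style JSON body (`model`, `prompt`, `size`, `n`) and returns base64 image data under `data[].b64_json`. The deployment name and the helper functions are hypothetical, and the response is simulated so nothing is sent over the network.

```python
import base64
import json

# Assumption: placeholder deployment name; use whatever name your Foundry deployment has.
DEPLOYMENT_NAME = "gpt-image-1.5"


def build_generation_request(prompt: str, size: str = "1024x1024", n: int = 1) -> dict:
    """Build an Images-style JSON body for a text-to-image request (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"model": DEPLOYMENT_NAME, "prompt": prompt, "size": size, "n": n}


def extract_images(response_body: dict) -> list[bytes]:
    """Decode every base64-encoded image found under data[].b64_json."""
    return [base64.b64decode(item["b64_json"]) for item in response_body.get("data", [])]


if __name__ == "__main__":
    body = build_generation_request("A ceramic mug on an oak table, studio lighting")
    print(json.dumps(body, indent=2))

    # Simulated response so the sketch runs without credentials or network access.
    fake_response = {
        "data": [{"b64_json": base64.b64encode(b"placeholder-bytes").decode("ascii")}]
    }
    images = extract_images(fake_response)
    print(f"decoded {len(images)} image(s)")
```

In a real integration, the request body would be sent through your SDK or HTTP client of choice, and the decoded bytes written to a file or passed to the next editing step.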
Benchmarks
Based on an internal Microsoft dataset, GPT Image 1.5 scores higher than other image generation models on prompt alignment and infographics tasks. It focuses on making clear, strong edits: it performs best on single-turn modification and delivers the highest visual quality in both single-turn and multi-turn settings. The following results cover image generation and editing:
Text to image
| Model | Prompt alignment | Diagram / Flowchart |
| --- | --- | --- |
| GPT Image 1.5 | 91.2% | 96.9% |
| GPT Image 1 | 87.3% | 90.0% |
| Qwen Image | 83.9% | 33.9% |
| Nano Banana Pro | 87.9% | 95.3% |
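As a quick way to read the text-to-image results, the sketch below copies the scores from the table above and reports the top model per metric and the margin over GPT Image 1 in percentage points. The function names are illustrative only, and nothing here reruns the benchmark.

```python
# Text-to-image scores (percent), copied from the table above.
SCORES = {
    "GPT Image 1.5": {"Prompt alignment": 91.2, "Diagram / Flowchart": 96.9},
    "GPT Image 1": {"Prompt alignment": 87.3, "Diagram / Flowchart": 90.0},
    "Qwen Image": {"Prompt alignment": 83.9, "Diagram / Flowchart": 33.9},
    "Nano Banana Pro": {"Prompt alignment": 87.9, "Diagram / Flowchart": 95.3},
}


def best_model(metric: str) -> str:
    """Return the model with the highest score for the given metric."""
    return max(SCORES, key=lambda model: SCORES[model][metric])


def margin_over(model: str, baseline: str, metric: str) -> float:
    """Return the lead of `model` over `baseline` in percentage points."""
    return round(SCORES[model][metric] - SCORES[baseline][metric], 1)


if __name__ == "__main__":
    for metric in ("Prompt alignment", "Diagram / Flowchart"):
        lead = margin_over("GPT Image 1.5", "GPT Image 1", metric)
        print(f"{metric}: best = {best_model(metric)}, lead over GPT Image 1 = {lead} points")
```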
Image editing
| Setting | Model | Modification: BinaryEval | Preservation: SC (semantic) | Preservation: DINO (visual) | Visual Quality: BinaryEval | Face Preservation: AuraFace |
| --- | --- | --- | --- | --- | --- | --- |
| Single-turn | GPT Image 1 | 99.2% | 51.0% | 0.14 | 79.5% | 0.30 |
| Single-turn | Qwen Image | 81.9% | 63.9% | 0.44 | 76.0% | 0.85 |
| Single-turn | GPT Image 1.5 | 100% | 56.77% | 0.14 | 89.96% | 0.39 |
| Multi-turn | GPT Image 1 | 93.5% | 54.7% | 0.10 | 82.8% | 0.24 |
| Multi-turn | Qwen Image | 77.3% | 68.2% | 0.43 | 77.6% | 0.63 |
| Multi-turn | GPT Image 1.5 | 92.49% | 60.55% | 0.15 | 89.46% | 0.28 |
Using GPT Image 1.5 across industries
Whether you’re creating immersive visuals for campaigns, accelerating UI and product design, or producing assets for interactive learning, GPT Image 1.5 gives modern enterprises the flexibility and scalability they need. Image models help teams drive deeper engagement through compelling visuals, speed up design cycles for apps, websites, and marketing initiatives, and support inclusivity by generating accessible, high‑quality content for diverse audiences. Foundry also lets developers iterate with multimodal AI models from Black Forest Labs, OpenAI, and more.
Microsoft Foundry empowers organizations to deploy these capabilities at scale, integrating image generation seamlessly into enterprise workflows. AI image generation applies across industries such as:
- Retail: Generate product imagery for catalogs, e-commerce listings, and personalized shopping experiences.
- Marketing: Create campaign visuals and social media graphics.
- Education: Develop interactive learning materials or visual aids.
- Entertainment: Edit storyboards, character designs, and dynamic scenes for films and games.
- UI/UX: Accelerate design workflows for apps and websites.
Microsoft Foundry provides security and compliance with built-in content safety filters, role-based access, network isolation, and Azure Monitor logging. Integrated governance via Azure Policy, Purview, and Sentinel gives teams real-time visibility and control, so privacy and safety are embedded in every deployment. Learn more about responsible AI at Microsoft.
Pricing
| Model | Pricing (per 1M tokens) - Global |
| --- | --- |
| GPT-image-1.5 | Input Tokens: $8; Cached Input Tokens: $2; Output Tokens: $32 |
Cost efficiency improves as well: image inputs and outputs are now cheaper than with GPT Image 1, enabling organizations to generate and iterate on more creative assets within the same budget. For detailed pricing, refer to the pricing page.
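To see how the per-token rates translate into a bill, here is a minimal cost estimate using the Global prices listed above. The token counts in the example are hypothetical, and the sketch assumes cached input tokens are counted separately from regular input tokens; check your actual usage reports for how your deployment meters them.

```python
from decimal import Decimal

# Global prices in USD per 1M tokens, taken from the pricing table above.
PRICE_PER_MILLION = {
    "input": Decimal("8"),
    "cached_input": Decimal("2"),
    "output": Decimal("32"),
}
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one workload from its token counts.

    Assumption: the three counts are disjoint (cached tokens are not
    also included in input_tokens).
    """
    usage = {
        "input": input_tokens,
        "cached_input": cached_input_tokens,
        "output": output_tokens,
    }
    return sum(
        (Decimal(count) * PRICE_PER_MILLION[kind] / TOKENS_PER_MILLION
         for kind, count in usage.items()),
        Decimal(0),
    )


if __name__ == "__main__":
    # Hypothetical request: 2,000 input, 1,000 cached input, 5,000 output tokens.
    # 2000 * 8/1e6 + 1000 * 2/1e6 + 5000 * 32/1e6 = 0.016 + 0.002 + 0.160
    print(f"${estimate_cost(2_000, 1_000, 5_000)}")  # prints $0.178
```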
Getting started
Learn more about image generation, explore code samples, and read about responsible AI protections here.
Try GPT Image 1.5 in Microsoft Foundry and start building multimodal experiences today. Whether you’re designing educational materials, crafting visual narratives, or accelerating UI workflows, these models deliver the flexibility and performance your organization needs.