microsoft foundry
71 TopicsIntroducing GPT-5.4 in Microsoft Foundry
Today, we’re thrilled to announce that OpenAI’s GPT‑5.4 is now generally available in Microsoft Foundry: a model designed to help organizations move from planning work to reliably completing it in production environments. As AI agents are applied to longer, more complex workflows; consistency and follow‑through become as important as raw intelligence. GPT‑5.4 combines stronger reasoning with built in computer use capabilities to support automation scenarios, and dependable execution across tools, files, and multi‑step workflows at scale. GPT-5.4: Enhanced Reliability in Production AI GPT-5.4 is designed for organizations operating AI in real production environments, where consistency, instruction adherence, and sustained context are critical to success. The model brings together advances in reasoning, coding, and agentic workflows to help AI systems not only plan tasks but complete them with fewer interruptions and reduced manual oversight. Compared with earlier generations, GPT-5.4 emphasizes stability across longer interactions, enabling teams to deploy agentic AI with greater confidence in day-to-day production use. GPT-5.4 introduces advancements that aim for production grade AI: More consistent reasoning over time, helping maintain intent across multi‑turn and multi‑step interactions Enhanced instruction alignment to reduce prompt tuning and oversight Latency improved performance for responsive, real-time workflows Integrated computer use capabilities for structured orchestration of tools, file access, data extraction, guarded code execution, and agent handoffs More dependable tool invocation reducing prompt tuning and human oversight Higher‑quality generated artifacts, including documents, spreadsheets, and presentations with more consistent structure Together, these improvements support AI systems that behave more predictably as tasks grow in length and complexity. From capability to real-world outcomes GPT‑5.4 delivers practical value across a wide range of production scenarios where follow‑through and reliability are essential: Agent‑driven workflows, such as customer support, research assistance, and business process automation Enterprise knowledge work, including document drafting, data analysis, and presentation‑ready outputs Developer workflows, spanning code generation, refactoring, debugging support, and UI scaffolding Extended reasoning tasks, where logical consistency must be preserved across longer interactions Teams benefit from reduced task drift, fewer mid‑workflow failures, and more predictable outcomes when deploying GPT‑5.4 in production. GPT-5.4 Pro: Deeper analysis for complex decision workflows GPT‑5.4 Pro, a premium variant designed for scenarios where analytical depth and completeness are prioritized over latency. Additional capabilities include: Multi‑path reasoning evaluation, allowing alternative approaches to be explored before selecting a final response Greater analytical depth, supporting problems with trade‑offs or multiple valid solutions Improved stability across long reasoning chains, especially in sustained analytical tasks Enhanced decision support, where rigor and thoroughness outweigh speed considerations Organizations typically select GPT‑5.4 Pro when deeper analysis is required such as scientific research and complex problems, while GPT‑5.4 remains the right choice for workloads that prioritize reliable execution and agentic follow‑through. Microsoft Foundry: Enterprise‑Grade Control from Day One GPT‑5.4 and GPT‑5.4 Pro are available through Microsoft Foundry, which provides the operational controls organizations need to deploy AI responsibly in production environments. Foundry supports policy enforcement, monitoring, version management, and auditability, helping teams manage AI systems throughout their lifecycle. By deploying GPT‑5.4 through Microsoft Foundry, organizations can integrate advanced agentic capabilities into existing environments while aligning with security, compliance, and operational requirements from day one. Customer Spotlight Get Started with GPT-5.4 in Microsoft Foundry GPT‑5.4 sets a new bar for production‑ready AI by combining stronger reasoning with dependable execution. Through enterprise‑grade deployment in Microsoft Foundry, organizations can move beyond experimentation and confidently build AI systems that complete complex work at scale. Computer use capabilities will be introduced shortly after launch. GPT‑5.4 <272K input tokens context length in Microsoft Foundry is priced at $2.50 per million input tokens, $0.25 per million cached input tokens, and $15.00 per million output tokens. The GPT‑5.4 >272K input tokens context length in Microsoft Foundry is priced at $5.00 per million input tokens, $0.50 per million cached input tokens, and $22.50 per million output tokens. The GPT-5.4 is available at launch in Standard Global and Standard Data Zone (US), with additional deployment options coming soon. GPT‑5.4 Pro is priced at $30.00 per million input tokens, and $180.00 per million output tokens, and is available at launch in Standard Global. Build agents for real-world workloads. Start building with GPT‑5.4 in Microsoft Foundry today.23KViews4likes2CommentsAnnouncing GPT‑5.2‑Codex in Microsoft Foundry: Enterprise‑Grade AI for Secure Software Engineering
Enterprise developers know the grind: wrestling with legacy code, navigating complex dependency challenges, and waiting on security reviews that stall releases. OpenAI’s GPT‑5.2‑Codex flips that equation and helps engineers ship faster without cutting corners. It’s not just autocomplete; it’s a reasoning engine for real-world software engineering. Generally available starting today through Azure OpenAI in Microsoft Foundry Models, GPT‑5.2‑Codex is built for the realities of enterprise codebases, large repos, evolving requirements, and security constraints that can’t be overlooked. As OpenAI’s most advanced agentic coding model, it brings sustained reasoning, and security-aware assistance directly into the workflows enterprise developers already rely on with Microsoft’s secure and reliable infrastructure. GPT-5.2-Codex at a Glance GPT‑5.2‑Codex is designed for how software gets built in enterprise teams. You start with imperfect inputs including legacy code, partial docs, screenshots, diagrams, and work through multi‑step changes, reviews, and fixes. The model helps keep context, intent, and standards intact across that entire lifecycle, so teams can move faster without sacrificing quality or security. What it enables Work across code and artifacts: Reason over source code alongside screenshots, architecture diagrams, and UI mocks — so implementation stays aligned with design intent. Stay productive in long‑running tasks: Maintain context across migrations, refactors, and investigations, even as requirements evolve. Build and review with security in mind: Get practical support for secure coding patterns, remediation, reviews, and vulnerability analysis — where correctness matters as much as speed. Feature Specs (quick reference) Context window: 400K tokens (approximately 100K lines of code) Supported languages: 50+ including Python, JavaScript/TypeScript, C#, Java, Go, Rust Multimodal inputs: Code, images (UI mocks, diagrams), and natural language API compatibility: Drop-in replacement for existing Codex API calls Use cases where it really pops Legacy modernization with guardrails: Safely migrate and refactor “untouchable” systems by preserving behavior, improving structure, and minimizing regression risk. Large‑scale refactors that don’t lose intent: Execute cross‑module updates and consistency improvements without the typical “one step forward, two steps back” churn. AI‑assisted code review that raises the floor: Catch risky patterns, propose safer alternatives, and improve consistency, especially across large teams and long‑lived codebases. Defensive security workflows at scale: Accelerate vulnerability triage, dependency/path analysis, and remediation when speed matters, but precision matters more. Lower cognitive load in long, multi‑step builds: Keep momentum across multi‑hour sessions: planning, implementing, validating, and iterating with context intact. Pricing Model Input Price/1M Tokens Cached Input Price/1M Tokens Output Price/1M Tokens GPT-5.2-Codex $1.75 $0.175 $14.00 Security Aware by Design, not as an Afterthought For many organizations, AI adoption hinges on one nonnegotiable question: Can this be trusted in security sensitive workflows? GPT-5.2-Codex meaningfully advances the Codex lineage in this area. As models grow more capable, we’ve seen that general reasoning improvements naturally translate into stronger performance in specialized domains — including defensive cybersecurity. With GPT‑5.2‑Codex, this shows up in practical ways: Improved ability to analyze unfamiliar code paths and dependencies Stronger assistance with secure coding patterns and remediation More dependable support during code reviews, vulnerability investigations, and incident response At the same time, Microsoft continues to deploy these capabilities thoughtfully balancing access, safeguards, and platform level controls so enterprises can adopt AI responsibly as capabilities evolve. Why Run GPT-5.2-Codex on Microsoft Foundry? Powerful models matter — but where and how they run matters just as much for enterprise. Organizations choose Microsoft Foundry because it combines Foundry frontier AI with Azure enterprise grade fundamentals: Integrated security, compliance, and governance Deploy GPT-5.2-Codex within existing Azure security boundaries, identity systems, and compliance frameworks — without reinventing controls. Enterprise ready orchestration and tooling Build, evaluate, monitor, and scale AI powered developer experiences using the same platform teams already rely on for production workloads. A unified path from experimentation to scale Foundry makes it easier to move from proof of concept to real deployment —without changing platforms, vendors, or operating assumptions. Trust at the platform level For teams working in regulated or security critical environments, Foundry and Azure provide assurances that go beyond the model itself. Together with GitHub Copilot, Microsoft Foundry provides a unified developer experience — from in‑IDE assistance to production‑grade AI workflows — backed by Azure’s security, compliance, and global scale. This is where GPT-5.2-Codex becomes not just impressive but adoptable. Get Started Today Explore GPT‑5.2‑Codex in Microsoft today. Start where you already work: Try GPT‑5.2‑Codex in GitHub Copilot for everyday coding and scale the same model to larger workflows using Azure OpenAI in Microsoft Foundry. Let’s build what’s next with speed and security.17KViews3likes1CommentIntroducing MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 in Microsoft Foundry
Another Step Towards a Complete AI Platform Since inception, our goal with Microsoft Foundry has been to deliver the most complete AI and app agent factory; giving developers access to the latest frontier models, tools, infrastructure, security, and reliability to confidently build and scale their AI solutions. Today, we're taking another step towards that vision by announcing the public preview of three new models from Microsoft AI in Microsoft Foundry: MAI-Transcribe-1: Our first-generation speech recognition model, delivering enterprise-grade accuracy across 25 languages at approximately 50% lower GPU cost than leading alternatives. MAI-Voice-1: A high-fidelity speech generation model capable of producing 60 seconds of expressive audio in under one second on a single GPU. MAI-Image-2: Our highest-capability text-to-image model, which debuted on #3 on the Arena.ai leaderboard for image model families. These are the same models already powering our own products such as Copilot, Bing, PowerPoint, and Azure Speech, and now they're available exclusively on Foundry for developers to use. We can't wait to see what you create with these new multimedia AI models in public preview. Read on for a deeper look at each model's capabilities and how to start building with them in Foundry! MAI-Transcribe-1 & Voice-1: End-To-End Voice Experiences Voice and speech are rapidly becoming the primary interface for the next generation of AI agents, and building great voice experiences requires models that can both speak and listen with precision. With MAI-Voice-1 and MAI-Transcribe-1, Microsoft is delivering exactly that: a comprehensive, first-party audio AI stack purpose-built for developers. MAI-Voice-1 is a lightning-fast speech generation model capable of producing a full minute of audio in under a second on a single GPU; making it one of the most efficient speech systems available today. On the listening side, MAI-Transcribe-1 supports up to 25 languages and is engineered for enterprise-grade reliability across accents, languages, and real-world audio conditions. But what truly sets it apart is its efficiency: when benchmarked against leading transcription models, MAI-Transcribe-1 delivers competitive accuracy at nearly half the GPU cost; an advantage that translates directly into more predictable, scalable pricing for enterprises 1 . Use cases for MAI-Transcribe-1 and MAI-Voice-1 MAI-Voice-1 and MAI-Transcribe-1 are designed for production use across a broad set of real-world scenarios: Conversational AI & Agent Assist: Enable real‑time transcription for IVR systems, virtual assistants, and call‑center workflows to power voice‑driven interfaces, live agent assist, and post‑call summarization. Live Captioning & Accessibility: Deliver real‑time captions for large events, enterprise meetings, and digital communications to improve accessibility and inclusivity across spoken experiences. Media, Subtitling & Archiving: Automate video subtitling, dialogue indexing, and transcription to support scalable content production, searchability, and long‑term media archiving. Education & Training Platforms: Transcribe lectures, learning modules, and certification programs to enhance discoverability, reviewability, and knowledge retention in e‑learning environments. Customer & Market Insights: Convert spoken interactions across research interviews, focus groups, and support channels into structured data for downstream analytics and business intelligence. We're also applying these model capabilities inside Microsoft's own products. MAI-Voice-1 powers the expressive voice experiences in Copilot's Audio Expressions and podcast features. MAI-Transcribe-1 drives Copilot's Voice Mode transcriptions and the new dictation feature, connecting natural voice input with the generative power of Copilot's language models. Both models are available through Azure Speech, where developers can tap into first-party MAI model quality alongside the enterprise-grade reliability, scalability, and 700+ voice gallery of the Azure Speech ecosystem. Try MAI-Transcribe-1 & Voice-1 Today MAI-Transcribe-1 and Voice-1 are available now through Azure Speech. Here's how to get started: Experiment in MAI Playground: Speak, record, or upload audio to see the models in action at the MAI playground. Build in Foundry: deploy MAI-Transcribe-1 and MAI-Voice-1 in Azure Speech. MAI-Transcribe-1 starts at $0.36 USD per hour, while MAI-Voice-1 pricing starts at $22 USD per 1M characters. Developers looking to create custom voices using MAI-Voice-1 can do so through the Personal Voice feature in Azure Speech — including the ability to clone a voice from a short 10-second audio sample. Note that custom voice creation requires an approval process consistent with Microsoft's responsible AI policies. MAI-Image-2: Limitless Creativity For Every Builder Images are at the center of how developers build compelling AI-powered creative experiences; from marketing tools to content platforms to multimodal agents. MAI-Image-2 is Microsoft's answer to that demand. This model has been developed in close collaboration with photographers, designers, and visual storytellers and debuted in the top-3 text-to-image model families on the Arena.ai leaderboard. It raises the bar across the capabilities that matter most in real creative workflows; more natural, photorealistic image generation, stronger in-image text rendering for infographics and diagrams, and greater precision on complex layouts, detailed scenes, and cinematic visuals. Use cases for MAI-Image-2 Developers can integrate MAI-Image-2 across a range of high-impact workflows: Media & Creative Ideation: Designers, illustrators, and creative teams use text‑to‑image generation to explore visual directions, styles, and compositions early in the creative process—moving from concept to exploration faster. Enterprise Communications & Internal Branding: Organizations create custom visuals for internal campaigns, training materials, and executive communications directly from text, ensuring clarity, polish, and brand alignment without relying on stock imagery. UX & Product Concept Visualization: Product teams visualize interfaces, workflows, environments, and conceptual product scenarios from text descriptions, helping teams communicate ideas and align early—before engineering or design resources are engaged. WPP, one of the world's largest marketing and communications groups, is among the first enterprise partners building with MAI-Image-2 at scale, using it to power creative production workflows that previously required significant manual effort. "MAI-Image-2 is a genuine game-changer. It's a platform that not only responds to the intricate nuance of creative direction, but deeply respects the sheer craft involved in generating real-world, campaign-ready images. WPP has some of the best creative talent in the world and MAI-Image-2 is making them even better." -Rob Reilly, Global Chief Creative Officer, WPP We’re also implementing MAI-Image-2 to power image generation within Microsoft’s own products, including Copilot, Bing Image Creator, and PowerPoint, and now you have access to this powerful, cost effective model for your own apps. Try MAI-Image-2 Today Experiment in the MAI Playground: Preview MAI-Image-2 at MAI playground and share feedback directly with the team. Build in Foundry: deploy MAI-Image-2 via the API and start building your apps and agents! MAI-Image-2 starts at $5 USD per 1M tokens for text input and $33 USD per 1M tokens for image output. We look forward to your feedback on these models in Foundry. References: 1 1 st on overall WER on the FLEURS benchmark. Out of the top 25 global languages, MAI-Transcribe-1 ranks 1st by FLEURS in 11 core languages. It wins against Whisper-large-v3 on the remaining 14 and Gemini 3.1 Flash on 11 of those 14.16KViews1like0CommentsNew Azure Open AI models bring fast, expressive, and real‑time AI experiences in Microsoft Foundry
Modern AI applications, whether voice‑first experiences or building large software systems, rarely fit into a single prompt. Real work unfolds over time: maintaining context, following instructions, invoking tools, and adapting as requirements evolve. When these foundations break down through latency spikes, instruction drift, or unreliable tool calls, both user conversations and developer workflows are impacted. OpenAI’s latest models address this shared challenge by prioritizing continuity and reliability across real‑time interaction and long‑running engineering tasks. Starting today, GPT-Realtime-1.5, GPT-Audio-1.5, and GPT-5.3-Codex are rolling out into Microsoft Foundry. Together, these models reflect the growing needs of the modern developer and push the needle from short, stateless interactions toward AI systems that can reason, act, and collaborate over time. GPT-5.3-Codex at a glance GPT‑5.3‑Codex brings together advanced coding capability with broader reasoning and professional problem solving in a single model built for real engineering work. It unifies the frontier coding performance of GPT-5.2-Codex with the reasoning and professional knowledge capabilities of GPT5.2 in one system. This shifts the experience from optimizing isolated outputs to supporting longer running development efforts; where repositories are large, changes span multiple steps, and requirements aren’t always fully specified at the start. What’s improved Model experiences 25% faster execution time, according to Open AI, than its predecessors so developers can accelerate development of new applications. Built for long-running tasks that involve research, tool use, and complex, multi‑step execution while maintaining context. Midtask steerability and frequent updates allow developers to redirect and collaborate with the model as it works without losing context. Stronger computer-use capabilities allow developers to execute across the full spectrum of technical work. Common use cases Developers and teams can apply GPT‑5.3‑Codex across a wide range of scenarios, including: Refactoring and modernizing large or legacy applications Performing multi‑step migrations or upgrades Running agentic developer workflows that span analysis, implementation, testing, and remediation Automating code reviews, test generation, and defect detection Supporting development in security‑sensitive or regulated environments Pricing Model Input Price/1M Tokens Cached Input Price/1M Tokens Output Price/1M Tokens GPT-5.3-Codex $1.75 $0.175 $14.00 GPT-Realtime-1.5 and GPT-Audio-1.5 at a glance The models deliver measurable gains in reasoning and speech understanding for real‑time voice interactions on Microsoft Foundry. In OpenAI’s evaluations, it shows a +5% lift on Big Bench Audio (reasoning), a +10.23% improvement in alphanumeric transcription, and a +7% gain in instruction following, while maintaining low‑latency performance. Key improvements include: What's improved More natural‑sounding speech: Audio output is smoother and more conversational, with improved pacing and prosody. Higher audio quality: Clearer, more consistent audio output across supported voices. Improved instruction following: Better alignment with developer‑provided system and user instructions during live interactions. Function calling support: Enables structured, tool‑driven interactions within real‑time audio flows. Common use cases Developers are using GPT-Realtime-1.5 and GPT-Audio-1.5 for scenarios where low‑latency voice interaction is essential, including: Conversational voice agents for customer support or internal help desks Voice‑enabled assistants embedded in applications or devices Live voice interfaces for kiosks, demos, and interactive experiences Hands‑free workflows where audio input and output replace keyboard interaction Pricing Model Text Audio Image Input Cached Input Output Input Cached Input Output Input Cached Input Output GPT-Realtime-1.5 $4.00 $0.04 $16.0 $32.0 $0.40 $64.00 $4.00 $0.04 $16.0 GPT-Audio-1.5 $2.50 n/a $10.0 $32.00 n/a $64.00 $2.50 n/a $10.0 Getting started in Microsoft Foundry Start building in Microsoft Foundry, evaluate performance, and explore Azure Open AI models today. Foundry brings evaluation, deployment, and governance into a single workflow, helping teams progress from experiments to scalable applications while maintaining security and operational controls.14KViews1like0CommentsStop Drawing Architecture Diagrams Manually: Meet the Open-Source AI Architecture Review Agents
Designing and documenting software architecture is often a battle against static diagrams that become outdated the moment they are drawn. The Architecture Review Agent changes that by turning your design process into a dynamic, AI-powered workflow. In this post, we explore how to leverage Microsoft Foundry Hosted Agents, Azure OpenAI, and Excalidraw to build an open-source tool that instantly converts messy text descriptions, YAML, or README files into editable architecture diagrams. Beyond just drawing boxes, the agent acts as a technical co-pilot, delivering prioritized risk assessments, highlighting single points of failure, and mapping component dependencies. Discover how to eliminate manual diagramming, catch security flaws early, and deploy your own enterprise-grade agent with zero infrastructure overhead.13KViews6likes5CommentsIntroducing OpenAI's GPT-image-2 in Microsoft Foundry
Take a small design team running a global social campaign. They have the creative vision to produce localized imagery for every market, but not the resources to reshoot, reformat, or outsource that scale. Every asset needs to fit a different platform, a different dimension, a different cultural context, and they all need to ship at the same time. This is where flexible image generation comes in handy. OpenAI's GPT-image-2 is now generally available and rolling out today to Microsoft Foundry, introducing a step change in image generation. Developers and designers now get more control over image output, so a small team can execute with the reach and flexibility of a much larger one. What is new in GPT-image-2? GPT-image-2 brings real world intelligence, multilingual understanding, improved instruction following, increased resolution support, and an intelligent routing layer giving developers the tools to scale image generation for production workflows. Real world intelligence GPT-image-2 has a knowledge cut off of December 2025, meaning that it is able to give you more contextually relevant and accurate outputs. The model also comes with enhanced thinking capabilities that allow it to search the web, check its own outputs, and create multiple images from just one prompt. These enhancements shift image generation models away from being simple tools and runs them into creative sidekicks. Multilingual understanding GPT-image-2 includes increased language support across Japanese, Korean, Chinese, Hindi, and Bengali, as well as new thinking capabilities. This means the model can create images and render text that feels localized. Increased resolution support GPT-image-2 introduces 4K resolution support, giving developers the ability to generate rich, detailed, and photorealistic images at custom dimensions. Resolution guidelines to keep in mind: Constraint Detail Total pixel budget Maximum pixels in final image cannot exceed 8,294,400 Minimum pixels in final image cannot be less than 655,360 Requests exceeding this are automatically resized to fit. Resolutions 4K, 1024x1024, 1536x1024, and 1024x1536 Dimension alignment Each dimension must be a multiple of 16 Note: If your requested resolution exceeds the pixel budget, the service will automatically resize it down. Intelligent routing layer GPT-image-2 also includes an expanded routing layer with two distinct modes, allowing the service to intelligently select the right generation configuration for a request without requiring an explicitly set size value. Mode 1 — Legacy size selection In Mode 1, the routing layer selects one of the three legacy size tiers to use for generation: Size tier Description smimage Small image output image Standard image output xlimage Large image output This mode is useful for teams already familiar with the legacy size tiers who want to benefit from automatic selection without making any manual changes. Mode 2 — Token size bucket selection In Mode 2, the routing layer selects from six token size buckets — 16, 24, 36, 48, 64, 96 — which map roughly to the legacy size tiers: Token bucket Approximate legacy size 16, 24 smimage 36, 48 image 64, 96 xlimage This approach can allow for more flexibility in the number of tokens generated, which in turn helps to better optimize output quality and efficiency for a given prompt. See it in action GPT-image-2 shows improved image fidelity across visual styles, generating more detailed and refined images. But, don’t just take our word for it, let's see the model in action with a few prompts and edits. Here is the example we used: Prompt: Interior of an empty subway car (no people). Wide-angle view looking down the aisle. Clean, modern subway car with seats, poles, route map strip, and ad frames above the windows. Realistic lighting with a slight cool fluorescent tone, realistic materials (metal poles, vinyl seats, textured floor). As you can see, when using the same base prompt, the image quality and realism improved with each model. Now let’s take a look at adding incremental changes to the same image: Prompt: Populate the ad frames with a cohesive ad campaign for “Zava Flower Delivery” and use an array of flower types. And our subway is now full of ads for the new ZAVA flower delivery service. Let's ask for another small change: Prompt: In all Zava Flower Delivery advertisements, change the flowers shown to roses (red and pink roses). And in three simple prompts, we've created a mockup of a flower delivery ad. From marketing material to website creation to UX design, GPT-image-2 now allows developers to deliver production-grade assets for real business use cases. Image generation across industries These new capabilities open the door to richer, more production-ready image generation workflows across a range of enterprise scenarios: Retail & e-commerce: Generate product imagery at exact platform-required dimensions, from square thumbnails to wide banners, without post-processing. Marketing: Produce crisp, rich in color campaign visuals and social assets localized to different markets. Media & entertainment: Generate storyboard panels and scene at resolutions suited to production pipelines. Education & training: Create visual learning aids and course materials formatted to exact display requirements across devices. UI/UX design: Accelerate mockup and prototype workflows by generating interface assets at the precise dimensions your design system requires. Trust and safety At Microsoft, our mission to empower people and organizations remains constant. As part of this commitment, models made available through Foundry undergo internal reviews and are deployed with safeguards designed to support responsible use at scale. Learn more about responsible AI at Microsoft. For GPT-image-2, Microsoft applied an in-depth safety approach that addresses disallowed content and misuse while maintaining human oversight. The deployment combines OpenAI’s image generation safety mitigations with Azure AI Content Safety, including filters and classifiers for sensitive content. Pricing Model Offer type Pricing - Image Pricing - Text GPT-image-2 Standard Global Input Tokens: $8 Cached Input Tokens: $2 Output Tokens: $30 Input Tokens: $5 Cached Input Tokens: $1.25 Note: All prices are per 1M token. There is no billing for output tokens for the GPT-image-2 model. Getting started Whether you’re building a personalized retail experience, automating visual content pipelines or accelerating design workflows. GPT-image-2 gives your team the resolution control and intelligent routing to generate images that fit your exact needs. Try the GPT-image-2 in Microsoft Foundry today! Deploy the model in Microsoft Foundry Experiment with the model in the Image playground Read the documentation to learn more13KViews3likes3CommentsIntroducing OpenAI’s GPT-5.4 mini and GPT-5.4 nano for low-latency AI
Imagine you’re a developer building a research assistant agent on top of GPT‑5.4. The agent retrieves documents, summarizes findings, and answers follow‑up questions across multiple turns. In early testing, the reasoning quality is strong, but as the agent chains together retrieval, tool calls, and generation, latency starts to add up. For interactive experiences, those delays matter—so many teams adopt a multi‑model approach, using a larger model to plan and smaller models to execute subtasks quickly at scale. This is where GPT‑5.4 mini and GPT‑5.4 nano come in. These smaller variants of GPT-5.4 are optimized for developer workloads where latency, cost savings, and agentic design are top of mind. GPT-5.4 mini and GPT-5.4 nano will be rolling out today in Microsoft Foundry, so you can evaluate them in the model catalog and deploy the right option for each workload. GPT-5.4 mini: efficient reasoning for production workflows GPT-5.4 mini distills GPT-5.4’s strengths into a smaller, more efficient model for developer workloads where responsiveness matters. It significantly improves over GPT-5 mini across coding, reasoning, multimodal understanding, and tool use while running about 2X faster. Text and image inputs: build multimodal experiences that combine prompts with screenshots or other images. Tool use and function calling: reliably invoke tools and APIs for agentic workflows. Web search and file search: ground responses in external or enterprise content as part of multi-step tasks. Computer use: support software-interaction loops where the model interprets UI state and takes well-scoped actions. Where GPT-5.4 mini thrives Developer copilots and coding assistants: latency-sensitive coding help, code review suggestions, and fast iteration loops where turnaround time matters. Multimodal developer workflows: applications that interpret screenshots, understand UI state, or process images as part of coding and debugging loops. Computer-use sub-agents: fast executors that take well-scoped actions in software (for example, navigating UIs or completing repetitive steps) within a larger agent loop coordinated by a planner model. GPT-5.4 nano: ultra-low latency automation at scale GPT-5.4 nano is the smallest and fastest model in the lineup, designed for low-latency and low-cost API usage at high throughput. It’s optimized for short-turn tasks like classification, extraction, and ranking, plus lightweight sub-agent work where speed and cost are the priority and extended multi-step reasoning isn’t required. Strong instruction following: consistent adherence to developer intent across short, well-defined interactions. Function and tool calling: dependable invocation of tools and APIs for lightweight agent and automation scenarios. Coding support: optimized performance for common coding tasks where fast turnaround is required. Image understanding: multimodal image input support for basic image interpretation alongside text. Low-latency, low-cost execution: designed to deliver responses quickly and efficiently at scale. Where GPT-5.4 nano thrives GPT-5.4 nano is a strong fit when you need predictable behavior at very high throughput and the task can be expressed as short, well-scoped instructions. Classification and intent detection: fast labeling and routing decisions for high-volume requests. Extraction and normalization: pull structured fields from text, validate formats, and standardize outputs. Ranking and triage: reorder candidates, prioritize tickets/leads, and select best-next actions under tight latency budgets. Guardrails and policy checks: lightweight safety and policy classification, prompt gating, and enforcement decisions before dispatching to tools or larger models. High-volume text processing pipelines: batch transformation, cleanup, deduping, and normalization steps where unit cost and throughput dominate. Routing and prioritization at the edge: select the right downstream workflow (template, queue, or model) for each request under tight latency budgets. Choosing the right GPT-5.4 model Microsoft Foundry makes it possible to deploy multiple GPT-5.4 variants side by side, so teams can route requests to the model that best fits each task. Here’s a practical way to think about the lineup: Model Best suited for Typical workloads GPT-5.4 Sustained, multi-step reasoning with reliable follow-through Agentic workflows, research assistants, document analysis, complex internal tools GPT-5.4 Pro Deeper, higher-reliability reasoning for complex production scenarios High-stakes agentic workflows, long-form analysis and synthesis, complex planning, advanced internal copilots GPT-5.4 mini Balanced reasoning with lower latency for interactive systems Real-time agents, developer tools, retrieval-augmented applications GPT-5.4 nano Ultra-low latency and high throughput High-volume request routing, real-time chat, lightweight automation Responsible AI in Microsoft Foundry At Microsoft, our mission to empower people and organizations remains constant. In the age of AI, trust is foundational to adoption, and earning that trust requires a commitment to transparency, safety, and accountability. Microsoft Foundry provides governance controls, monitoring, and evaluation capabilities to help organizations deploy GPT-5.4 models responsibly in production environments, aligned with Microsoft's Responsible AI principles. Pricing Model Deployment Input (USD $/M tokens) Cached input (USD $/M tokens) Output (USD $/M tokens) GPT-5.4 mini Standard Global $0.75 $0.075 $4.5 GPT-5.4 nano Standard Global $0.20 $0.02 $1.25 The models are also available in Data Zone US. It is rolling out to Data Zone EU. Getting started Explore the models in Microsoft Foundry. Sign in to the Foundry portal and browse the model catalog to evaluate GPT-5.4 mini and GPT-5.4 nano alongside other options, then deploy the right model for each workload.13KViews0likes1CommentIntroducing OpenAI’s GPT-image-1.5 in Microsoft Foundry
Developers building with visual AI can often run into the same frustrations: images that drift from the prompt, inconsistent object placement, text that renders unpredictably, and editing workflows that break when iterating on a single asset. That’s why we are excited to announce OpenAI's GPT Image 1.5 is now generally available in Microsoft Foundry. This model can bring sharper image fidelity, stronger prompt alignment, and faster image generation that supports iterative workflows. Starting today, customers can request access to the model and start building in the Foundry platform. Meet GPT Image 1.5 AI driven image generation began with early models like OpenAI's DALL-E, which introduced the ability to transform text prompts into visuals. Since then, image generation models have been evolving to enhance multimodal AI across industries. GPT Image 1.5 represents continuous improvement in enterprise-grade image generation. Building on the success of GPT Image 1 and GPT Image 1 mini, these enhanced models introduce advanced capabilities that cater to both creative and operational needs. The new image model offer: Text-to-image: Stronger instruction following and highly precise editing. Image-to-image: Transform existing images to iteratively refine specific regions Improved visual fidelity: More detailed scenes and realistic rendering. Accelerated creation times: Up to 4x faster generation speed. Enterprise integration: Deploy and scale securely in Microsoft Foundry. GPT Image 1.5 delivers stronger image preservation and editing capabilities, maintaining critical details like facial likeness, lighting, composition, and color tone across iterative changes. You’ll see more consistent preservation of branded logos and key visuals, making it especially powerful for marketing, brand design, and ecommerce workflows—from graphics and logo creation to generating full product catalogs (variants, environments, and angles) from a single source image. Benchmarks Based on an internal Microsoft dataset, GPT Image 1.5 performs higher than other image generation models in prompt alignment and infographics tasks. It focuses on making clear, strong edits – performing best on single-turn modification, delivering the higher visual quality in both single and multi-turn settings. The following results were found across image generation and editing: Text to image Prompt alignment Diagram / Flowchart GPT Image 1.5 91.2% 96.9% GPT Image 1 87.3% 90.0% Qwen Image 83.9% 33.9% Nano Banana Pro 87.9% 95.3% Image editing Evaluation Aspect Modification Preservation Visual Quality Face Preservation Metrics BinaryEval SC (semantic) DINO (Visual) BinaryEval AuraFace Single-turn GPT image 1 99.2% 51.0% 0.14 79.5% 0.30 Qwen image 81.9% 63.9% 0.44 76.0% 0.85 GPT Image 1.5 100% 56.77% 0.14 89.96% 0.39 Multi-turn GPT Image 1 93.5% 54.7% 0.10 82.8% 0.24 Qwen image 77.3% 68.2% 0.43 77.6% 0.63 GPT image 1.5 92.49% 60.55% 0.15 89.46% 0.28 Using GPT Image 1.5 across industries Whether you’re creating immersive visuals for campaigns, accelerating UI and product design, or producing assets for interactive learning GPT Image 1.5 gives modern enterprises the flexibility and scalability they need. Image models can allow teams to drive deeper engagement through compelling visuals, speed up design cycles for apps, websites, and marketing initiatives, and support inclusivity by generating accessible, high‑quality content for diverse audiences. Watch how Foundry enables developers to iterate with multimodal AI across Black Forest Labs, OpenAI, and more: Microsoft Foundry empowers organizations to deploy these capabilities at scale, integrating image generation seamlessly into enterprise workflows. Explore the use of AI image generation here across industries like: Retail: Generate product imagery for catalogs, e-commerce listings, and personalized shopping experiences. Marketing: Create campaign visuals and social media graphics. Education: Develop interactive learning materials or visual aids. Entertainment: Edit storyboards, character designs, and dynamic scenes for films and games. UI/UX: Accelerate design workflows for apps and websites. Microsoft Foundry provides security and compliance with built-in content safety filters, role-based access, network isolation, and Azure Monitor logging. Integrated governance via Azure Policy, Purview, and Sentinel gives teams real-time visibility and control, so privacy and safety are embedded in every deployment. Learn more about responsible AI at Microsoft. Pricing Model Pricing (per 1M tokens) - Global GPT-image-1.5 Input Tokens: $8 Cached Input Tokens: $2 Output Tokens: $32 Cost efficiency improves as well: image inputs and outputs are now cheaper compared to GPT Image 1, enabling organizations to generate and iterate on more creative assets within the same budget. For detailed pricing, refer here. Getting started Learn more about image generation, explore code samples, and read about responsible AI protections here. Try GPT Image 1.5 in Microsoft Foundry and start building multimodal experiences today. Whether you’re designing educational materials, crafting visual narratives, or accelerating UI workflows, these models deliver the flexibility and performance your organization needs.9.1KViews2likes1CommentBuilding Production-Ready, Secure, Observable, AI Agents with Real-Time Voice with Microsoft Foundry
We're excited to announce the general availability of Foundry Agent Service, Observability in Foundry Control Plane, and the Microsoft Foundry portal — plus Voice Live integration with Agent Service in public preview — giving teams a production-ready platform to build, deploy, and operate intelligent AI agents with enterprise-grade security and observability.8.8KViews2likes0CommentsIntroducing Cohere Rerank 4.0 in Microsoft Foundry
These new retrieval models deliver state-of-the-art accuracy, multilingual coverage across 100+ languages, and breakthrough performance for enterprise search and retrieval-augmented generation (RAG) systems. With Rerank 4.0, customers can dramatically improve the quality of search, reduce hallucinations in RAG applications, and strengthen the reasoning capabilities of their AI agents, all with just a few lines of code. Why Rerank Models Matter for Enterprise AI Retrieval is the foundation of grounded AI systems. Whether you are building an internal assistant, a customer-facing chatbot, or a domain-specific knowledge engine, the quality of the retrieved documents determines the quality of the final answer. Traditional embeddings get you close, but reranking is what gets you the right answer. Rerank improves this step by reading both the query and document together (cross-encoding), producing highly precise semantic relevance scores. This means: More accurate search results More grounded responses in RAG pipelines Lower generative model usage , reducing cost Higher trust and quality across enterprise workloads Introducing Cohere Rerank 4.0 Fast and Rerank 4.0 Pro Microsoft Foundry now offers two versions of Rerank 4.0 to meet different enterprise needs: Rerank 4.0 Fast Best balance of speed and accuracy Same latency as Cohere Rerank 3.5, with significantly higher accuracy Ideal for high-traffic applications and real-time systems Rerank 4.0 Pro Highest accuracy across all benchmarks Excels at complex, reasoning-heavy, domain-specific retrieval Tuned for industries like finance, healthcare, manufacturing, government, and energy Multilingual & Cross-Domain Performance Rerank 4.0 delivers unmatched multilingual and cross-domain performance, supporting more than 100 languages and enabling powerful cross-lingual search across complex enterprise datasets. The models achieve state-of-the-art accuracy in 10 of the world’s most important business languages, including Arabic, Chinese, French, German, Hindi, Japanese, Korean, Portuguese, Russian, and Spanish, making them exceptionally well suited for global organizations with multilingual knowledge bases, compliance archives, or international operations. Effortless Integration: Add Rerank to Any System One of the biggest benefits of Rerank 4.0 is how easy it is to adopt. You can add reranking to: Existing enterprise search Vector DB pipelines Keyword search systems Hybrid retrieval setups RAG architectures Agent workflows No infrastructure changes required. Just a few lines of code.This makes it one of the fastest ways to meaningfully upgrade grounding, precision, and search quality in enterprise AI systems. Better RAG, Better Agents, Better Outcomes In Foundry, customers can pair Cohere Rerank 4.0 with Azure Search, vector databases, Agent Service, Azure Functions, Foundry orchestration, and any LLM—including GPT-4.1, Claude, DeepSeek, and Mistral—to deliver more grounded copilots, higher-fidelity agent actions, and better reasoning from cleaner context windows. This reduces hallucinations, lowers LLM spend, and provides a foundational upgrade for mission-critical AI systems. Built for Enterprise: Security, Observability, Governance As a direct from Azure model, Rerank 4.0 is fully integrated with: Azure role-based access control (RBAC) Virtual network isolation Customer-managed keys Logging & observability Entra ID authentication Private deployments You can run Rerank 4.0 in environments that meet the strictest enterprise security and compliance needs. Optimized for Enterprise Models & High-Value Industries Rerank 4.0 is built for sectors where accuracy matters: Finance - Delivers precise retrieval for complex disclosures, compliance documents, and regulatory filings. Healthcare- Accurately retrieves clinical notes, biomedical literature, and care protocols for safer, more reliable insights. Manufacturing- Surfaces the right engineering specs, manuals, and parts data to streamline operations and reduce downtime. Government & Public Sector - Improves access to policy documents, case archives, and citizen service information with semantic precision. Energy- Understands industrial logs, safety manuals, and technical standards to support safer and more efficient operations. Pricing Model Name Deployment Type Azure Resource Region Price /1K Search Units Availability Cohere Rerank 4.0 Pro Global Standard All regions (Check this page for region details) $2.50 Public Preview, Dec 11, 2025 Cohere Rerank 4.0 Fast Global Standard All regions (Check this page for region details) $2.00 Public Preview, Dec 11, 2025 Get Started Today Cohere Rerank 4.0 Fast and Rerank 4.0 Pro are now available in Microsoft Foundry. Rerank 4.0 is one of the simplest and highest impact upgrades you can make to your enterprise AI stack, bringing better retrieval, better agents, and more trustworthy AI to every application.8.6KViews5likes0Comments