Blog Post

AI - Azure AI services Blog
2 MIN READ

Introducing the GPT-4o-Mini Audio Models: Adding More Choice to Audio-Enhanced AI Interaction

Allan_Carranza's avatar
Feb 05, 2025

We are thrilled to announce the release of the new GPT-4o-Mini-Realtime-Preview and GPT-4o-Mini-Audio-Preview models, both now available in preview. These new models introduce advanced audio capabilities at just 25% of the cost of GPT-4o audio models. Adding on to the existing GPT-4o audio models, this expansion enhances the potential for AI applications in text and voice-based interactions. Starting today, developers can unlock immersive, voice-driven experiences by harnessing the advanced capabilities of all Azure OpenAI Service advanced audio models, now in public preview.

Key Benefits

  • Advanced Audio Capabilities: Enjoy high-quality audio interactions at a fraction of the cost of GPT-4o audio models.
  • Seamless Compatibility: Our new models are compatible with existing Realtime API and Chat Completion API, ensuring smooth integration and consistent functionality across model families.
  • Innovative Interactions: Experience natural and intuitive interactions with our voice-based capabilities, making your interactions more engaging and effective.

Detailed Features

GPT-4o-Mini-Realtime-Preview:

  • Real-Time Voice Interaction: Enable real-time, natural voice-based interactions for a more engaging user experience.
  • When to Use: Ideal for applications requiring immediate, real-time responses, such as customer service chatbots and virtual assistants.

GPT-4o-Mini-Audio Preview:

  • Advanced Audio Capabilities: Provides high-quality audio interactions at a reduced cost.
  • When to Use: Perfect for applications requiring asynchronous audio capabilities, such as recording sentiment analysis and text-to-audio content creation.

Real-World Applications

The potential of our new products spans across various industries, transforming how businesses operate and how users interact with technology:

  • Customer Service: Voice-based chatbots and virtual assistants can now handle customer inquiries more naturally and efficiently, reducing wait times and improving overall satisfaction.
  • Content Creation: Media producers can revolutionize their workflows by leveraging speech generation for use in video games, podcasts, and film studios.
  • Real-Time Translation: Industries such as healthcare and legal services can benefit from real-time audio translation, breaking down language barriers and fostering better communication in critical contexts.

 Ready to get started?

Updated Feb 05, 2025
Version 1.0
No CommentsBe the first to comment