AMA: GPT-4o Audio model revolutionizes your Copilot and other AI applications
Event details
Unlock the potential of your applications with the latest GPT-4o-realtime API with Audio, available on Azure from October 1st, 2024. Join us to explore how this model, integrated as part of the new Copilot Voice product, will take your AI applications to new heights with natural and multilingual conversations, enhanced customer interactions with faster responses, and streamlined workflows and business operations.
Don’t miss out on the future of AI innovation—register now and be part of the transformation!
An AMA is a live text-based online event similar to an “Ask Me Anything” on Reddit. This AMA gives you the opportunity to connect with Microsoft product experts who will be on hand to answer your questions and listen to feedback. The AMA takes place entirely in the comments below. There is no additional video or audio link as this is text-based.
Feel free to post your questions anytime in the comments below beforehand, if it fits your schedule or time zone better, though questions will not be answered until the live hour.
64 Comments
- EricStarker
Former Employee
- cdrguru
Occasional Reader
I've been exploring the new GPT-4o-Realtime API with Audio and wanted to share how I've integrated it into an Azure Function for a solution called AInsights. This setup allows me to tag specific parts of conversations—like "perfect prompts" or key "assistant responses"—and later retrieve them using voice commands. It's been incredibly helpful when I need to quickly reference past demos or insights. For example, I might say:
Remember the demo where [person/company] mentioned [specific keyword/phrase]? Can you recall that and provide the follow-up insight?
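The tag-and-recall idea described above can be sketched roughly as follows. This is only an illustration and not the AInsights implementation: `TaggedSnippet`, `SnippetStore`, `tag`, and `recall` are hypothetical names, the store is a plain in-memory list rather than Azure AI Chat history, and the company names and snippet texts are made-up sample data. It assumes the voice request has already been transcribed and reduced to a speaker and a keyword.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedSnippet:
    """One saved piece of a conversation (hypothetical structure)."""
    speaker: str
    text: str
    tags: set[str] = field(default_factory=set)


class SnippetStore:
    """In-memory stand-in for wherever the chat history is actually kept."""

    def __init__(self) -> None:
        self._snippets: list[TaggedSnippet] = []

    def tag(self, speaker: str, text: str, *tags: str) -> TaggedSnippet:
        # Store tags in lowercase so lookups are case-insensitive.
        snippet = TaggedSnippet(speaker, text, {t.lower() for t in tags})
        self._snippets.append(snippet)
        return snippet

    def recall(self, speaker: str, keyword: str) -> list[TaggedSnippet]:
        """Return snippets from `speaker` whose text or tags mention `keyword`."""
        speaker_l, keyword_l = speaker.lower(), keyword.lower()
        return [
            s for s in self._snippets
            if s.speaker.lower() == speaker_l
            and (keyword_l in s.text.lower() or keyword_l in s.tags)
        ]


# Made-up sample data for illustration only.
store = SnippetStore()
store.tag("Contoso", "Use a system prompt that pins the output language.", "perfect prompt")
store.tag("Contoso", "Latency dropped once audio was streamed in chunks.", "assistant response")
store.tag("Fabrikam", "Streamed audio needs a jitter buffer.", "assistant response")

matches = store.recall("Contoso", "streamed")
print([m.text for m in matches])
```

In a real setup the `recall` step would presumably be exposed to the model as a tool or function call, so that a spoken request like the one above ends up as a structured lookup.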
The problem is that I forget how to type and spell and find it so much quicker to ask my "AI" to do it. I've created a short demo showcasing how I use GPT-4o's voice capabilities to efficiently search my Azure AI Chat history and streamline my workflow. You can watch it here: https://www.youtube.com/watch?v=9D0i-J-KIa0
- EricStarker
Former Employee
Thanks for joining us for the GPT-4o-realtime API with Audio AMA! We'll be posting a summary of the questions and answers soon. See you next time!
- Thanks, excellent session!!
- SriniTech
Brass Contributor
Thank you for the AMA session. It was a brilliant session. Maybe we could do this again sometime.
- Travis_Wilson_MSFT
Microsoft
It's been great talking with you! This is for sure just the beginning; we'll have more coming soon.
- LonChen
Microsoft
Glad to hear this is helpful, Srini. We will do it more often in the future. Stay tuned for our future events.
- Allan_Carranza
Microsoft
Thanks for joining us - We love opportunities to connect with our wonderful community of developers!
- EricStarker
Former Employee
Just ten minutes to go! Get your questions in!
- Allan_Carranza
Microsoft
In case you missed it, Azure AI Search built a RAG + Voice demo utilizing the GPT-4o Realtime API. Check out the blog post below that includes more details, including a code sample to get started with RAG + Voice!
VoiceRAG: An App Pattern for RAG + Voice Using Azure AI Search and the GPT-4o Realtime API for Audio - Microsoft Community Hub
- CaptainAmazing
Copper Contributor
Will this new feature be included in PowerApps and Power Automate, or will it only be available in Azure OpenAI as an API for coding? Following on from this, is there an API syntax or reference site we can consult for help on how to interface with the API? Are there any specific code or framework requirements for making this part of the project we are working on? Finally, what sort of costs are being planned for this (will it be by subscription, per voice line, or by volume of data)?
Okay, my bad, I just saw the SDK discussion earlier. Any chance we can get a view of some of those examples mentioned?
- Allan_Carranza
Microsoft
As for pricing, we will be sharing more details very soon! Similar to other Azure OpenAI models, we will start with Pay-as-you-go pricing and introduce others like provisioned, batch, etc. over time. If there are other pricing models that would better help you scale your applications, we always welcome feedback!
- jrwarwick
Brass Contributor
Will there be an OotB hardware option similar to Amazon Echo Dot or Google Home voice assistant nodes? (This is our chance to have a second shot at an awesome Cortana implementation.) Or, if not, will there be enough API and persistence to implement something like that?
- Allan_Carranza
Microsoft
As the Azure AI Platform team, it is our responsibility to ensure that state-of-the-art technology is available for any developer to integrate into their exciting products and applications. With the improvements the GPT-4o-realtime API provides in speech and audio capabilities, there are endless opportunities to integrate speech capabilities into any product, whether old or new. 😁
- CaptainAmazing
Copper Contributor
Even text messages! LOL 😉 Any chance this will be integrated into Dynamix or PABX solutions, or are we just talking about the API today?
- riyazlambat
Occasional Reader
Is this only a text-based event?
- EricStarker
Former Employee
Yes, this is a text-based event.
- CaptainAmazing
Copper Contributor
On a more serious note, are there any Visual Studio examples or tutorials that work with these real-time audio chat APIs that we can use to get started on our own projects?
- Travis_Wilson_MSFT
Microsoft
For some basic "getting started" resources, check out https://github.com/azure-samples/aoai-realtime-audio-sdk -- this has an interactive localhost web demo using a standalone TypeScript SDK library, an interactive console demo with tools using the official .NET SDK's latest beta, and some non-interactive, file-based demonstrations using a standalone Python library.
- Allan_Carranza
Microsoft
We have prepared SDKs and samples to help builders get up and running as quickly as possible -> https://github.com/azure-samples/aoai-realtime-audio-sdk. This repository is actively monitored by our team, and we welcome any suggestions and contributions per these guidelines -> https://github.com/Azure-Samples/aoai-realtime-audio-sdk/blob/main/CONTRIBUTING.md. Our goal is to continuously improve the experience and make getting started with any new model as easy as possible!