In July we told this audience that OpenAI Whisper would be coming soon to Azure AI services, and we are very happy to announce that today is the day! Customers of Azure OpenAI Service and Azure AI Speech can now use Whisper.
The OpenAI Whisper model is an encoder-decoder Transformer that can transcribe audio into text in 57 languages. It can also translate speech from those languages into English, producing English-only output, and it creates transcripts with enhanced readability.
OpenAI Whisper model in Azure OpenAI Service
Azure OpenAI Service enables developers to run OpenAI’s Whisper model in Azure, mirroring the OpenAI Whisper API in features and functionality, including transcription and translation capabilities.
The Whisper model's REST APIs for transcription and translation are available from the Azure OpenAI Service portal.
See details on how to use the Whisper model with the Azure OpenAI Service here: Speech to text with Azure OpenAI Service - Azure OpenAI | Microsoft Learn
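As a rough illustration of the transcription and translation REST APIs mentioned above, the Python sketch below builds (but does not send) a request for a Whisper deployment. The resource endpoint, deployment name, key, and the `build_whisper_request` helper are hypothetical, and the `api-version` value is an assumption; check the Microsoft Learn article for the version that applies to your resource.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request


def build_whisper_request(endpoint, deployment, api_key, audio_bytes, filename,
                          task="transcriptions",
                          api_version="2023-09-01-preview"):  # assumed version
    """Build a multipart POST for a Whisper deployment (hypothetical helper).

    task is "transcriptions" (text in the spoken language) or
    "translations" (English-only output).
    """
    if task not in ("transcriptions", "translations"):
        raise ValueError("task must be 'transcriptions' or 'translations'")

    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/audio/{task}?{urlencode({'api-version': api_version})}")

    # Minimal multipart/form-data body with a single "file" part.
    boundary = uuid.uuid4().hex
    head = (f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    headers = {
        "api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(url, data=head + audio_bytes + tail,
                   headers=headers, method="POST")
```

To actually call the service, the returned object could be passed to `urllib.request.urlopen`; switching `task` to `"translations"` targets the translation endpoint instead.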
OpenAI Whisper model in Azure AI Speech
Users of Azure AI Speech can leverage OpenAI’s Whisper model in conjunction with the Azure AI Speech batch transcription API. This enables customers to easily transcribe large volumes of audio content at scale. This capability is particularly useful for processing extensive collections of audio data stored within the Azure platform.
Users of Whisper in Azure AI Speech benefit from existing features including async processing, speaker diarization, customization (available soon), and larger file sizes.
See details on how to use the Whisper model with Azure AI Speech here: Create a batch transcription - Speech service - Azure AI services | Microsoft Learn
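For a sense of what a batch transcription job that uses Whisper might look like, here is a minimal Python sketch that assembles (without sending) the job request. The region, key, audio URLs, and `build_batch_transcription_request` helper are hypothetical. The API version in the path and the Whisper model URI are assumptions, and the field names follow the batch transcription REST API as commonly documented; verify all of them against the Microsoft Learn article.

```python
import json
from urllib.request import Request


def build_batch_transcription_request(region, speech_key, content_urls,
                                      whisper_model_uri, locale="en-US",
                                      api_version="v3.2-preview.1",  # assumed
                                      diarization=False):
    """Build a POST that would create a batch transcription job (hypothetical helper)."""
    url = (f"https://{region}.api.cognitive.microsoft.com"
           f"/speechtotext/{api_version}/transcriptions")

    payload = {
        "displayName": "Whisper batch transcription",
        "locale": locale,
        # Audio files to transcribe, for example blobs stored in Azure.
        "contentUrls": list(content_urls),
        # Selecting a Whisper base model instead of the default Azure Speech
        # model; the URI is a placeholder you would look up for your region.
        "model": {"self": whisper_model_uri},
        "properties": {"diarizationEnabled": diarization},
    }
    headers = {
        "Ocp-Apim-Subscription-Key": speech_key,
        "Content-Type": "application/json",
    }
    return Request(url, data=json.dumps(payload).encode("utf-8"),
                   headers=headers, method="POST")
```

Because batch transcription is asynchronous, creating the job only starts the work; the job status and result files would be fetched with later requests.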
Getting started
Azure AI Studio
Users can access the Whisper model in Azure OpenAI through Azure AI Studio.
Azure AI Speech Studio
Users can experiment with the Whisper model in Azure AI Speech Studio.
In Speech Studio you can find both the Whisper Model in Azure OpenAI Service try-out and the Batch speech to text try-out, which now includes the Whisper model.
Speech to text try-outs and tools in Speech Studio:
The Batch speech to text try-out allows you to compare the output of the Whisper model side by side with an Azure Speech model as a quick initial evaluation of which model may work better for your specific scenario.
Comparing Whisper to Azure Speech model in the Batch speech to text try-out:
Conclusion
The Whisper model is a great addition to the broad portfolio of capabilities the Azure AI Speech Service offers. We are looking forward to seeing the innovative ways in which developers will take advantage of this new offering to improve business productivity and to delight users.