Azure AI speech to text enables developers to quickly and accurately transcribe audio to text in more than 100 languages and variants. It also supports custom models to improve accuracy for domain-specific terminology. At Microsoft Build we are announcing a new Fast Transcription API, available in preview in June, which enables developers to create accurate transcripts of audio at 40 times real-time speed (a 40x real-time factor, or RTF). For example, a 10-minute audio file can be transcribed in 15 seconds. With this simple, synchronous REST API, scenarios that require a transcript to be generated quickly from audio become easy to implement. The Azure AI Speech Service also provides a language ID capability that identifies the spoken language from the audio itself, which lets developers simplify the experience for users who work with audio in multiple languages.
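To make the 40x figure concrete, here is a minimal sketch of the arithmetic behind it. The function name and the default speed-up value are illustrative assumptions taken from the numbers quoted above; actual processing time will depend on the audio and the service, so treat this as a rough estimate rather than a guarantee.

```python
def estimated_processing_seconds(audio_seconds: float, speedup: float = 40.0) -> float:
    """Estimate transcription time for audio processed `speedup` times faster than real time.

    `speedup` defaults to the 40x figure quoted in the announcement (an assumption
    for illustration; real throughput may differ).
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# A 10-minute recording is 600 seconds of audio.
ten_minutes = 10 * 60
print(estimated_processing_seconds(ten_minutes))  # 15.0
```

The same helper could be used to set a sensible client-side timeout or to show users an expected wait time before a synchronous transcription call returns.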
Azure AI text to speech enables developers to convert text into human-like synthesized speech. Neural TTS is a text to speech system that uses deep neural networks to make synthesized voices nearly indistinguishable from recordings of people. It provides natural, human-like prosody and clear articulation of words, which significantly reduces listening fatigue when interacting with AI systems. Azure AI text to speech offers more than 400 voices across more than 140 languages and locales. A single pre-built, realistic neural voice with multilingual support makes it easy to read content in a broad range of languages in the same voice. You can try the demo and hear the voices in the voice gallery.
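As a rough illustration of how one multilingual voice might read content in several languages, the sketch below builds an SSML document that wraps each text segment in a `<lang>` element inside a single `<voice>` element. The helper name `build_ssml`, the sample text, and the choice of `en-US-AvaMultilingualNeural` as the default voice are assumptions made for this example; check the voice gallery for the voices available to you. The code only constructs the SSML string and does not call the service.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(segments, voice="en-US-AvaMultilingualNeural"):
    """Build an SSML string that reads (locale, text) segments in one voice.

    `voice` is an example multilingual voice name (an assumption for this sketch).
    Text and attribute values are XML-escaped so characters like '&' stay valid.
    """
    body = "".join(
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        for locale, text in segments
    )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


# Hypothetical article snippets in two languages, read by the same voice.
ssml = build_ssml([
    ("en-US", "Today's headlines from R&D."),
    ("de-DE", "Guten Tag und willkommen."),
])
print(ssml)
```

The resulting string is the kind of payload that could be passed to a speech synthesis call that accepts SSML; keeping the construction separate makes it easy to test without network access.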
OPPO, a leading global technology brand known for its innovative smartphones and smart devices, will announce a new AI phone that pilots a user experience built on these technologies. New features include fast transcription of audio recordings into notes and to-dos, and reading articles aloud so that users can use their smartphones without looking at the screen.
With the new fast transcription feature of Azure AI Speech Service, OPPO has been able to use the following architecture to create a smooth user experience for audio recording transcription:
The “article reading” feature can be easily implemented based on Azure Text to Speech (TTS) Service with the pre-built multilingual neural voices:
With its outstanding fast speech transcription capabilities and advanced speech synthesis capabilities, Azure AI Speech Service has greatly elevated the AI speech and text experience on OPPO AI phones.