This post was co-authored by Melissa Ma, Yueying Liu, Anny Dow and Sheng Zhao
Online learning has grown rapidly over the last couple of months as schools and organizations adapt to new ways of connecting and methods of education. Speech technology can play a significant role in making distance learning more engaging and accessible to students of all backgrounds. With Azure Cognitive Services, developers can quickly add speech capabilities to applications, bringing online learning to life.
One key element in language learning is improving pronunciation skills. For new language learners, practicing pronunciation and getting timely feedback are essential to becoming a more fluent speaker. In the current environment, online language learning and the ability to practice anytime, anywhere, have become even more important.
At the Build conference in May, we announced the preview of the pronunciation assessment capability, powered by Speech to Text.
The pronunciation assessment capability evaluates speech pronunciation and gives speakers feedback on the accuracy and fluency of spoken audio.
With pronunciation assessment, language learners can practice, get instant feedback, and improve their pronunciation. Online learning solution providers or educators can use the capability to evaluate the pronunciation of multiple speakers in real time. Pronunciation assessment currently supports the English language.
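As a rough illustration of how an application might request this kind of assessment, the sketch below builds the parameters for a pronunciation assessment request using only the Python standard library. It assumes the Speech to Text REST API for short audio accepts a `Pronunciation-Assessment` header containing base64-encoded JSON with the fields `ReferenceText`, `GradingSystem`, `Granularity`, and `Dimension`. Those names and values are not taken from this post, so check them against the current Speech service documentation. The helper name `pronunciation_assessment_header` is hypothetical.

```python
import base64
import json


def pronunciation_assessment_header(
    reference_text,
    grading_system="HundredMark",
    granularity="Phoneme",
    dimension="Comprehensive",
):
    """Build a header dict describing how the spoken audio should be scored.

    Assumption: the field names and default values below mirror the Speech
    service's REST parameters; verify them before relying on this sketch.
    """
    params = {
        "ReferenceText": reference_text,  # the text the learner reads aloud
        "GradingSystem": grading_system,  # scale used for the scores
        "Granularity": granularity,       # level of detail of the feedback
        "Dimension": dimension,           # which score dimensions to return
    }
    # Serialize to JSON, then base64-encode so the value is header-safe.
    encoded = base64.b64encode(json.dumps(params).encode("utf-8"))
    return {"Pronunciation-Assessment": encoded.decode("ascii")}


headers = pronunciation_assessment_header("Good morning.")
```

The returned dictionary would be merged with the usual authentication and content-type headers of a recognition request. Sending the request itself is left out here because it needs a subscription key and an audio file.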
Educational organizations, like the Tomorrow Advancing Life (TAL) Education Group, are already building applications using pronunciation assessment to help students practice language learning remotely.
“Effectively and efficiently teaching accurate pronunciation to students of different levels is a big challenge, both in class and outside of class. The Speech service’s pronunciation assessment capability provides a powerful solution to address this challenge. We’ve been highly impressed by the robustness of pronunciation assessment and its ability to deal with noisy environments, and how well it correlates with pronunciation evaluations conducted by our teachers.”
- Xiangyu Hu, AI Scientist of Tomorrow Advancing Life (TAL) Education Group
Learn how to get started with pronunciation assessment in our tutorial video, and download the source code from GitHub to try it out.
Another way that Speech technology can support better online learning experiences is through Text to Speech, a Speech service feature that converts text to lifelike speech. Educators can create interactive materials with highly expressive and humanlike voices using Neural Text to Speech (Neural TTS), now available in 36 voices across 31 languages. (Learn about our most recent languages here.)
With Neural TTS, developers can add natural-sounding voice to learning materials, for scenarios like slide narration. Neural TTS can also be used for reading aloud any content, facilitating new ways for students to interact with material as well as increasing accessibility for students with learning differences. Educational organizations can also use Neural TTS to create AI-powered virtual “teachers” that interact with students to make online courses more engaging.
With the Custom Neural Voice capability, online learning solution providers can further create interactive learning experiences for their students in a voice that represents their brand, or develop unique voices for different characters. For example, Duolingo, one of the world’s most popular language learning apps, is creating unique voices for different characters used in the lessons.
Using SSML or the Audio Content Creation tool, users can further fine-tune audio characteristics like speaking rate, pitch, and pronunciation to fit their scenarios, with no code required. Neural TTS also supports different speaking styles, such as cheerfulness and empathy, making it easier to bring audiobooks to life. We recently added 10 new voice styles, currently available in Chinese (Xiaoxiao voice), and will expand them to other languages. With these new styles, online education solution providers can create more engaging interactive courses that express rich emotions.
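For readers who prefer to generate SSML programmatically, here is a minimal sketch of a document that sets speaking rate, pitch, and a speaking style. The helper `build_ssml` is hypothetical. The voice name `zh-CN-XiaoxiaoNeural`, the style value `cheerful`, and the `mstts` namespace URI are assumptions based on the Speech service's SSML conventions rather than details from this post, so confirm them against the current documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, lang, style=None, rate="medium", pitch="medium"):
    """Return an SSML string that speaks `text` with the given settings."""
    # <prosody> carries the rate and pitch adjustments; the text is
    # XML-escaped so characters like & and < cannot break the markup.
    body = (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
    )
    # Assumption: speaking styles are applied with the mstts:express-as element.
    if style is not None:
        body = (
            f"<mstts:express-as style={quoteattr(style)}>"
            f"{body}</mstts:express-as>"
        )
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


ssml = build_ssml(
    "Welcome to today's lesson.",
    voice="zh-CN-XiaoxiaoNeural",  # assumed voice name
    lang="zh-CN",
    style="cheerful",              # assumed style value
    rate="slow",
    pitch="high",
)
```

The resulting string can be pasted into a tool that accepts SSML or passed to a Text to Speech request. Omitting `style` yields plain prosody markup that does not depend on the `mstts` extension.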
To learn more about Audio Content Creation, watch the video tutorial.
To learn more and get started adding speech to your educational applications, check out our resources below:
Pronunciation Assessment
Text to Speech