Educational organizations, like the Tomorrow Advancing Life (TAL) Education Group, are already building applications using pronunciation assessment to help students practice language learning remotely.
“Effectively and efficiently teaching accurate pronunciation to students of different levels is a big challenge, both in class and outside of class. The Speech service’s pronunciation assessment capability provides a powerful solution to address this challenge. We’ve been highly impressed by the robustness of pronunciation assessment and its ability to deal with noisy environments, and how well it correlates with pronunciation evaluations conducted by our teachers.”
- Xiangyu Hu, AI Scientist of Tomorrow Advancing Life (TAL) Education Group
To get started with pronunciation assessment, watch our tutorial video and download the source code from GitHub to try it out.
Developing interactive courses with Text to Speech
Another way that Speech technology can support better online learning experiences is through Text to Speech, a Speech service feature that converts text to lifelike speech. Educators can create interactive materials with highly expressive and humanlike voices using Neural Text to Speech (Neural TTS), now available in 36 voices across 31 languages. (Learn about our most recent languages here.)
With Neural TTS, developers can add natural-sounding voice to learning materials, for scenarios like slide narration. Neural TTS can also be used for reading aloud any content, facilitating new ways for students to interact with material as well as increasing accessibility for students with learning differences. Educational organizations can also use Neural TTS to create AI-powered virtual “teachers” that interact with students to make online courses more engaging.
Experience the Neural Voices with the new Edge browser
With the Custom Neural Voice capability, online learning solution providers can go further and create interactive learning experiences for their students in a voice that represents their brand, or develop unique voices for different characters. For example, Duolingo, one of the world’s most popular language learning apps, is creating unique voices for the different characters used in its lessons.
Using SSML or the Audio Content Creation tool, users can further fine-tune audio characteristics like speaking rate, pitch, and pronunciation to fit their scenarios, with no code required. Neural TTS also supports different speaking styles, such as cheerfulness and empathy, making it easier to bring audiobooks to life. We recently added 10 new voice styles, currently available in Chinese (Xiaoxiao voice), and these will be expanded to other languages. With these new styles, online education solution providers can create more engaging interactive courses that express rich emotions.
To learn more about Audio Content Creation, watch the video tutorial.
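For developers who prefer to generate SSML programmatically, the adjustments described above (speaking rate, pitch, and speaking style) can be sketched in a few lines of Python. This is a minimal illustration, not the Speech service's own tooling: the helper name `build_ssml` and its parameters are hypothetical, and the voice names and the "cheerful" style are used as examples that should be checked against the current voice list. The resulting string would then be passed to whichever Text to Speech client you use.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespaces used by SSML and by the Speech service's "mstts" extensions.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(text, voice="zh-CN-XiaoxiaoNeural", lang="zh-CN",
               style=None, rate="default", pitch="default"):
    """Hypothetical helper: wrap text in SSML with prosody and an optional style."""
    # Escape the text so characters like & and < do not break the XML.
    body = (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
    )
    # Speaking styles are expressed with the mstts:express-as element.
    if style:
        body = (
            f"<mstts:express-as style={quoteattr(style)}>"
            f"{body}</mstts:express-as>"
        )
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        f"</speak>"
    )


if __name__ == "__main__":
    # Example: a slightly slower, slightly higher-pitched cheerful narration.
    print(build_ssml("Welcome to today's lesson!", voice="en-US-AriaNeural",
                     lang="en-US", style="cheerful", rate="-10%", pitch="+5%"))
```

Building the markup through a small helper keeps escaping in one place, so lesson text containing characters such as `&` or `<` cannot produce malformed SSML.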
To learn more and get started adding speech to your educational applications, check out our resources below:
Pronunciation Assessment
- Try out our demo
- Learn more with our documentation
- Check out easy-to-deploy samples
- Watch the video introduction and video tutorial
Text to Speech
- Check out our demo
- Learn more with our documentation
- Follow the QuickStart
- Learn more about responsible deployment of Custom Neural Voice
- Video tutorial for the Audio Content Creation tool