Dragons have fascinated the world for centuries. We find them in the folklore of many cultures, from Mesopotamian art and literature to nearly all Indo-European and Near Eastern traditions and, of course, Chinese mythology. Wouldn't it be cool if we were able to (safely) summon our own dragon in the comfort of our home? It turns out this is possible, and probably less complicated than you might think. In this article, we will discuss the high-level concepts and technologies needed, and I will share a demo for you to try in your browser (best on mobile) or on your Mixed Reality setup (Augmented/Virtual Reality).
If, like me, you tend to be a little impatient, you can start the free "Create and deploy a voice activated WebXR app with Babylon.js and Azure Cognitive Services" Learn Module to learn how to create this application, or you can try the browser demo (click this link or scan the QR code below to open it on your phone) to get a feel for what you will be learning to make. (Hint: In the demo, after you summon your dragon, try telling the dragon to “go red” or “go blue” *wink wink*) If you are feeling adventurous, check out the source code of the demo.
Let’s take a moment to talk about Extended Reality on the web. Mixed, Extended, or Cross Reality (often abbreviated XR) is a term referring to the use of Virtual Reality (VR) and/or Augmented Reality (AR). The term Virtual Reality probably reminds you of headsets that let gamers be fully immersed in the world of games such as Beat Saber (which I am personally a huge fan of).
On the other hand, Augmented Reality is a technology you have probably seen or even used many times, especially on your mobile phone. You could have been previewing a piece of furniture in your living room before ordering it online, playing the famous Pokémon Go mobile game, or even just applying TikTok/Snap/Instagram filters.
In short, the module will teach you how to code a web app that will listen for a vocal command and render a virtual, animated dragon that will float in the space in front of you. Finally, you will learn how to easily deploy such an app to preview it from the web (no app installation needed) from your mobile phone, headset, or even your desktop/laptop browser (though losing the cool Augmented Reality effect).
To summon a virtual dragon, we use four key technologies: Speech-to-text, WebXR, Babylon.js, and Azure Blob Storage.
Speech-to-text: the process of transcribing spoken audio to text.
In the Learn module, we use an Azure cloud service for speech-to-text so that the app is accurate, performant, and truly cross-platform. For the demo linked above, we used an experimental browser API instead, which shows that a cloud service is not strictly required, although such services are more accurate and work across all platforms. Transcription matters when building experiences without a mouse and keyboard: for an application to understand the user's commands or answers, it converts the audio signal into text that your code can easily process.
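To make "text that your code can easily process" concrete, here is a minimal sketch of turning a transcript into a command. It is not the code from the demo or the Learn module. The names `DragonCommand`, `normalize`, and `parseCommand` are hypothetical, and the rule for recognizing a summon phrase is an assumption; only the phrases "go red" and "go blue" come from the demo hint above.

```typescript
// Hypothetical command shape for illustration; not taken from the demo source.
type DragonCommand =
  | { kind: "summon" }
  | { kind: "color"; color: "red" | "blue" }
  | { kind: "unknown"; transcript: string };

// Speech services often return capitalized, punctuated text ("Go red!"),
// so lowercase it, replace non-letters with spaces, and collapse whitespace.
function normalize(transcript: string): string {
  return transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

function parseCommand(transcript: string): DragonCommand {
  const text = normalize(transcript);

  // "go red" / "go blue" are the phrases mentioned in the demo hint.
  const colorMatch = text.match(/\bgo (red|blue)\b/);
  if (colorMatch) {
    return { kind: "color", color: colorMatch[1] as "red" | "blue" };
  }

  // Assumption: any phrase containing both "summon" and "dragon" summons it.
  if (/\bsummon\b/.test(text) && /\bdragon\b/.test(text)) {
    return { kind: "summon" };
  }

  return { kind: "unknown", transcript };
}
```

Whichever speech-to-text service produces the transcript, the rest of the app only needs to react to the returned `kind`, which keeps the recognition and rendering code independent of each other.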
WebXR: a low-level Extended Reality web API available in many modern browsers (support still varies by browser and device). As described in the MDN entry: "WebXR is an API for web content and apps to use to interface with mixed reality hardware such as VR headsets and glasses with integrated augmented reality features. This includes both managing the process of rendering the views needed to simulate the 3D experience and the ability to sense the movement of the headset (or other motion-sensing gear) and providing the needed data to update the imagery shown to the user." WebXR, however, is not a rendering technology, and most developers don't use WebXR directly. They instead rely on higher-level rendering frameworks that use WebGL or WebGPU. So while WebXR is a key technology, it is hidden behind the nice APIs provided by the rendering framework.
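Even when a framework hides WebXR, it can help to see the kind of capability check that happens underneath. The sketch below is an illustration, not what Babylon.js does internally. It relies on the standard `isSessionSupported` method of `navigator.xr` and the standard session modes `immersive-ar`, `immersive-vr`, and `inline`; the names `XRSystemLike`, `SessionMode`, and `pickSessionMode` are hypothetical.

```typescript
// Minimal structural type for the part of navigator.xr used here, so the
// function can also be exercised outside a browser with a stub object.
interface XRSystemLike {
  isSessionSupported(mode: string): Promise<boolean>;
}

type SessionMode = "immersive-ar" | "immersive-vr" | "inline";

// Prefer AR, then VR, and fall back to ordinary on-page rendering.
async function pickSessionMode(
  xr: XRSystemLike | undefined
): Promise<SessionMode> {
  // Browsers without WebXR do not expose navigator.xr at all.
  if (!xr) return "inline";

  for (const mode of ["immersive-ar", "immersive-vr"] as const) {
    try {
      if (await xr.isSessionSupported(mode)) return mode;
    } catch (_err) {
      // The check itself can reject (for example, when blocked by policy);
      // treat that as "not supported" and try the next mode.
    }
  }
  return "inline";
}
```

In a browser you would pass `navigator.xr` to this function. The `inline` result corresponds to the desktop case described earlier, where the dragon still renders but without the Augmented Reality effect.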
Azure Blob Storage: Microsoft's cloud solution, which we use to easily deploy and serve our app and assets. Azure Blob Storage is designed for storing massive amounts of unstructured data, such as text or binary data, which makes it great for storing and sharing images or videos for distributed access. Its Static Website feature also lets us host a website in just two steps: (1) enabling the feature from the Azure portal and (2) uploading the files to the storage container.
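For step (2), static website content is served from a container named `$web`, and each file should be uploaded with a sensible Content-Type so browsers handle it correctly. The sketch below only prepares that information and does not call any Azure SDK. The names `planUpload` and `UploadItem`, the extension table, and the file names in the usage note are all hypothetical.

```typescript
// Container that Azure Blob Storage serves static website content from.
const STATIC_WEBSITE_CONTAINER = "$web";

// A small, non-exhaustive table of standard MIME types for a 3D web app.
const CONTENT_TYPES: Record<string, string> = {
  ".html": "text/html",
  ".css": "text/css",
  ".js": "text/javascript",
  ".json": "application/json",
  ".png": "image/png",
  ".glb": "model/gltf-binary",
  ".gltf": "model/gltf+json",
};

interface UploadItem {
  container: string;
  blobName: string;
  contentType: string;
}

// Map local relative paths to blob names and content types.
function planUpload(relativePaths: string[]): UploadItem[] {
  return relativePaths.map((path) => {
    // Blob names use forward slashes and no leading "./" or "/".
    const blobName = path.replace(/\\/g, "/").replace(/^\.?\//, "");
    const dot = blobName.lastIndexOf(".");
    const ext = dot >= 0 ? blobName.slice(dot).toLowerCase() : "";
    return {
      container: STATIC_WEBSITE_CONTAINER,
      blobName,
      // Unknown extensions fall back to a generic binary type.
      contentType: CONTENT_TYPES[ext] ?? "application/octet-stream",
    };
  });
}
```

For example, `planUpload(["./index.html", "assets\\dragon.glb"])` yields the blob names `index.html` and `assets/dragon.glb` with the content types `text/html` and `model/gltf-binary`. The actual upload can then be done with whichever tool you prefer, such as the Azure portal mentioned above.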
As a developer, the coolest part is how Babylon.js offers a straightforward and efficient API that makes 3D and Extended Reality development feel close to traditional web development, while still offering advanced features. Furthermore, WebXR offers a great user experience because users don’t need to install anything to try a Virtual Reality or Augmented Reality app. They simply open their browser, go to the site, and everything just works. This also means that native app developers can embed a web view and extend their app very quickly.
With this set of technologies, you can safely summon a pet dragon in your living room (or wherever you like). Whether it has been your dream to become a wizard, or the cool speech and Extended Reality technology just caught your attention, check out the Learn module for creating this app to start your journey toward becoming a cool WebXR magician!
Special thanks to Matt Aimonetti for his help in making this blog post and demo app come to life.