Microsoft Mesh - A Technical Overview
Published Mar 02 2021 08:38 AM

Today we announced Microsoft Mesh, a new platform built on Microsoft Azure that enables developers to build immersive, multiuser, cross-platform mixed reality apps. Microsoft Mesh lets users connect with presence, share across space, and collaborate immersively as if they were together in person, regardless of physical location. Customers can use Mesh to enhance virtual meetings, conduct virtual design sessions, provide better remote assistance, learn together virtually, and host virtual social gatherings and meet-ups.


In this blog, we’ll peek under the hood of Microsoft Mesh to understand the building blocks of the platform. But first, let’s look at why we need such a platform in the first place.


Mixed Reality is the fourth wave in computing, following mainframes, PCs, and smartphones. It is going mainstream across the consumer and commercial markets, liberating screen-bound experiences into instinctual interactions in your space, among your things, with your people. Hundreds of millions of Niantic explorers around the world have experienced MR through the devices in their pockets. As we miss in-person social connections, meet-ups like concerts and fitness training are moving to the virtual world. More than 50% of Fortune 500 organizations have deployed HoloLens and other Mixed Reality solutions to drive material ROI for their business. Given such big numbers, one would assume that hundreds, if not thousands, of Mixed Reality experiences are being developed today. But that is not the case. Some hard underlying problems prevent developers from creating these immersive experiences, notably:


  • Representing people in MR with appropriate realism requires a lot of time and resources.
  • Keeping a hologram stable in a shared MR space across time and device types is a non-trivial problem.
  • Bringing high-fidelity 3D models into MR in the file formats our customers already use is hard.
  • Synchronizing actions and expressions of people in a geographically distributed MR session is complex.

These challenges have prevented developers from enabling MR experiences for multiuser scenarios. It is this set of challenges that Microsoft Mesh intends to solve. Mesh provides a platform for developers to design immersive multiuser MR apps without having to worry about complex technical problems. Let’s look at the core components of the Microsoft Mesh platform for developers:


[Figure: Microsoft Mesh developer platform]


Multi-device support: First and foremost, Mesh meets users where they are. It supports a range of devices: fully immersive head-mounted displays (HMDs) like Microsoft HoloLens, HP Reverb G2, or Oculus Quest 2 for a three-dimensional volumetric experience; phones and tablets on iOS or Android for convenience; and PCs and Macs for a fully tethered, two-dimensional viewpoint. Users can connect from anywhere.


Developer Platform: Next is the comprehensive developer platform and tooling that Mesh provides. The core of the developer platform is Azure. Identity services like Azure Active Directory and Microsoft Accounts bring duly authenticated and authorized users into a secure and trusted session. Microsoft Graph travels with users, letting them bring their connections, content, and preferences from both the commercial and consumer space. And, as a developer, you don’t need to worry about core infrastructure such as billing, audio/video transmission, and the underlying live-state management.


Beyond the core platform, Mesh has key AI-powered capabilities that address some of the most complicated technical challenges in enabling massively multiuser online (MMO) scenarios for mixed reality. These include immersive presence, spatial maps, holographic rendering, and multiuser sync.


[Figure: Microsoft Mesh AI-powered capabilities]


Immersive presence: A fundamental aspect of multiuser scenarios is the ability to represent participants in distinct forms depending on the device they’re joining from. Mesh delivers the most accessible 3D presence with representative avatars driven by the devices’ inside-out sensors. The Mesh platform comes with an avatar rig and a customization studio so you can use the out-of-the-box avatars. The platform can also drive existing avatar rigs, using its AI-powered motion models to capture motions and expressions that match the user’s actions.


Alongside avatars, Mesh also enables the most photorealistic 360° holoportation with outside-in sensors. These outside-in sensors can be a custom camera setup like the Mixed Reality Capture Studio, which captures in 3D with full fidelity, or an Azure Kinect, which captures depth-sensed images to assist in producing the holographic representations. Once the holograms are produced, they can be used within Mesh-enabled apps on immersive mixed reality headsets or everyday phones, PCs, and tablets to holoport users in their most life-like representations and deliver a sense of true presence.


Spatial maps: Building apps that persist holographic content in the real world requires a common perspective of the space around each participant as well as an understanding of the physical world. Whether it is service records for a technician or wayfinding for a customer, reliably placing holograms that persist across time, space, and devices is a common need. Mesh enables all of this via spatial maps. Prior to Mesh, each device had its own local view of the world. With Mesh, these local caches are merged and optimized into a global understanding of the space or environment the devices are in. This framework enables content to be anchored, device points of view to be shared, and 3D models to be collaborated on.


Mesh helps you create a map of your world that is orders of magnitude more accurate than GPS, and it can even work in places without GPS access. It helps deliver ‘world locked holograms’ that can be tied to specific points of interest. Additionally, Mesh can generate the same understanding aligned to the precise layout and geometry of a given object, allowing developers to easily build apps that may require overlaying objects with visual information like instructions, service records, and other important data, precisely aligned to the components of the object.
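
To make the idea of a shared frame concrete, here is a minimal sketch of how a hologram placed by one device could be re-expressed in another device's local coordinates once both have located the same anchor. It is not the Mesh SDK: `Pose2D` and `relocalize` are hypothetical names, and the math is reduced to the 2D floor plane (position plus heading) to keep it short.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Hypothetical rigid pose on the floor plane: metres and radians."""
    x: float
    y: float
    theta: float

    def compose(self, other: "Pose2D") -> "Pose2D":
        """Express `other` (given relative to self) in self's parent frame."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        return Pose2D(
            self.x + c * other.x - s * other.y,
            self.y + s * other.x + c * other.y,
            self.theta + other.theta,
        )

    def inverse(self) -> "Pose2D":
        """Return the pose that undoes this one (R^T, -R^T t)."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        return Pose2D(
            -(c * self.x + s * self.y),
            s * self.x - c * self.y,
            -self.theta,
        )


def relocalize(hologram_in_a: Pose2D, anchor_in_a: Pose2D, anchor_in_b: Pose2D) -> Pose2D:
    """Map a hologram pose from device A's frame to device B's frame.

    Both devices are assumed to have located the same physical anchor,
    each in its own local coordinates.
    """
    # Step 1: where is the hologram relative to the shared anchor?
    hologram_in_anchor = anchor_in_a.inverse().compose(hologram_in_a)
    # Step 2: re-express that anchor-relative pose in device B's frame.
    return anchor_in_b.compose(hologram_in_anchor)


# Device A placed a hologram 1 m past the anchor along the anchor's x axis.
# Device B sees the same anchor at (0, 3), rotated by 90 degrees.
hologram_for_b = relocalize(
    Pose2D(3.0, 0.0, 0.0),
    Pose2D(2.0, 0.0, 0.0),
    Pose2D(0.0, 3.0, math.pi / 2),
)
print(hologram_for_b)
```

A production system works with full 3D poses and many anchors, but the principle is the same: content is stored relative to anchors rather than relative to any one device.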


Holographic rendering: A quintessential instantiation of the intelligent edge and intelligent cloud architecture, holographic rendering delivers uncompromised fidelity powered by Mesh regardless of the device’s compute and thermal budget. Mesh allows the choice between local stand-alone rendering or cloud-connected remote rendering seamlessly within your app, for each scene and model. This gives the flexibility to design apps that can optimize for latency vs fidelity depending on the device it is being experienced on. Not only that, but holographic rendering also supports most 3D file formats to natively render in Mesh-enabled apps, solving the challenge of bringing in users’ existing 3D models for collaboration.
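
The latency-versus-fidelity trade-off described above can be sketched as a per-model decision. This is an illustration only, not how Mesh decides: the `Device` type, the triangle budget, and the 60 ms round-trip threshold are all made-up assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class RenderMode(Enum):
    LOCAL = "local"                        # full model rendered on the device
    REMOTE = "remote"                      # full model rendered in the cloud and streamed
    LOCAL_SIMPLIFIED = "local-simplified"  # reduced model rendered on the device


@dataclass(frozen=True)
class Device:
    """Hypothetical device profile."""
    name: str
    triangle_budget: int  # assumed number of triangles renderable at target frame rate


def choose_render_mode(
    model_triangles: int,
    device: Device,
    round_trip_ms: float,
    max_round_trip_ms: float = 60.0,  # assumed threshold, not a Mesh figure
) -> RenderMode:
    """Pick a rendering mode for one model on one device."""
    # The device can handle the model itself: lowest latency, no network needed.
    if model_triangles <= device.triangle_budget:
        return RenderMode.LOCAL
    # Too heavy for the device, but the network is fast enough to stream frames.
    if round_trip_ms <= max_round_trip_ms:
        return RenderMode.REMOTE
    # Too heavy and the network is slow: trade fidelity for responsiveness.
    return RenderMode.LOCAL_SIMPLIFIED


headset = Device("standalone headset", triangle_budget=500_000)
for triangles, rtt in [(100_000, 40.0), (20_000_000, 40.0), (20_000_000, 250.0)]:
    print(triangles, rtt, choose_render_mode(triangles, headset, rtt).value)
```

Making the choice per model rather than per app matches the article's point that the mode can vary for each scene and model.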


Multiuser sync: Creating a common perspective of the hologram and each other within a collaborative session is a big challenge. Within Mesh this shared context is enabled via multiuser sync. It propagates pose updates, motions, and expressions from participants, as well as any holographic transform happening in the space. All this happens with less than 100 milliseconds of latency, whether the participants are in the same physical space or halfway around the world. Spatial audio in Mesh adds to the sense of being in the same physical space in a multiuser scenario.
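
As a rough illustration of what keeping a shared session state involves, the sketch below applies pose updates with a simple newest-sequence-wins rule and counts updates that miss the 100 millisecond figure quoted above. The `PoseUpdate` and `SessionState` types are hypothetical, the timestamps are assumed to come from synchronized clocks, and none of this reflects the actual Mesh protocol.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

LATENCY_BUDGET_MS = 100.0  # the latency figure quoted in the article


@dataclass(frozen=True)
class PoseUpdate:
    """Hypothetical message carrying one participant's head position."""
    participant_id: str
    sequence: int       # increases with every update a participant sends
    sent_at_ms: float   # sender timestamp; clocks are assumed to be synchronized
    position: Tuple[float, float, float]


class SessionState:
    """Keeps the most recent pose per participant."""

    def __init__(self) -> None:
        self.latest: Dict[str, PoseUpdate] = {}
        self.late_updates = 0

    def apply(self, update: PoseUpdate, received_at_ms: float) -> bool:
        """Apply an update; return False if it is older than what we already hold."""
        # Track how often delivery exceeds the latency budget.
        if received_at_ms - update.sent_at_ms > LATENCY_BUDGET_MS:
            self.late_updates += 1
        # Drop out-of-order updates so a stale pose never overwrites a newer one.
        current = self.latest.get(update.participant_id)
        if current is not None and current.sequence >= update.sequence:
            return False
        self.latest[update.participant_id] = update
        return True


state = SessionState()
state.apply(PoseUpdate("alice", 2, 1020.0, (0.1, 1.6, 0.0)), received_at_ms=1060.0)
state.apply(PoseUpdate("alice", 1, 1000.0, (0.0, 1.6, 0.0)), received_at_ms=1150.0)
print(state.latest["alice"].sequence, state.late_updates)
```

Sequence numbers rather than timestamps decide ordering here, so the result does not depend on clock accuracy; the timestamps are used only to measure latency.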


To use these capabilities and the core platform features, Mesh provides a cross-platform developer SDK so that developers can create apps targeting their choice of platform and devices, whether AR, VR, PCs, or phones. Today it supports Unity alongside native C++ and C#, and in the coming months Mesh will also support Unreal, Babylon, and React Native. Beyond access to the capabilities, the SDK also provides pre-built UX constructs for developers to use in their apps. These prefabs are designed to make developing engaging mixed reality experiences simpler and faster.


Mesh-enabled apps: On top of the development platform, Microsoft Mesh also delivers some app experiences that bring the platform alive. The HoloLens 2 Mesh app and AltspaceVR with new enterprise capabilities are instantiations of the collaborative experience Mesh can light up for immersive headsets. These are just the first amongst many other experiences on their way, built by Microsoft and our partners.


The Mesh developer platform is comprehensive, and the tools and capabilities are designed to help developers get started quickly and deliver engaging multiuser mixed reality experiences. As we learn from early adopters and preview customers, we’ll continue to evolve the SDK to support more engines and frameworks. If you have a compelling app scenario and would like to onboard to the preview, please join the MR developer program. This allows you to build your MR app with the help of some of the pioneers in the MR space and contribute to the Mesh platform along the way.


For more resources on Microsoft Mesh, check out the following links:
