Announcing public preview of Azure Virtual Desktop RDP Shortpath for public networks

Microsoft

Today I have the pleasure of announcing the public preview of RDP Shortpath for public networks. This Remote Desktop Protocol (RDP) feature establishes a direct UDP data flow between the Remote Desktop Client and Session host. RDP uses this data flow to deliver Remote Desktop and RemoteApp.

 

Why does UDP matter? What is wrong with using TCP?

 

Reliability

First of all, TCP is an unreliable transport for long-lived user sessions. That is right, let me repeat: TCP is unreliable. If you know networking, you might think I'm crazy saying that. But trust me, it's true. TCP is an excellent protocol for guaranteed delivery of small amounts of data. It's easy to implement. Applications like browsers or email clients just send the data and forget about it. They don't need to implement the logic to verify that data is delivered, or is delivered in time and with no errors. The protocol will ensure packet consistency and order, and retry the transmission if delivery fails. However, RDP uses long-running connections, and long-running TCP connections are problematic. Let me explain this.

 

When the Remote Desktop Client establishes a reverse connect session, the session consists of two TCP connections: one from the client to the gateway and another from the session host to the same gateway. It looks straightforward, but let's check what is going on over the wire.

 

Let's take the connection from the session host to a gateway as an example. First, Remote Desktop Service opens a local TCP socket on the local network interface. Then, it sends a TCP SYN to the gateway. What happens with the packet?

The packet goes out of the Virtual Machine NIC. Then, it travels over the Azure Virtual Network, reaching the NAT gateway, Load Balancer, Azure Firewall, or other NVA. All those virtual elements perform either connection tracking or network address translation, which means that virtual appliances track the status of the TCP connection in memory. Then, after the NAT, the TCP packet travels over the Azure backbone to the Azure Virtual Desktop Gateway. The gateway is not a single big VM. Instead, it is a distributed cluster of applications running on Azure App Service. On the backend, multiple load balancers and firewalls are tracking the TCP session and translating the packet again to the private IP address and port of the App Service instance. And believe me, this is a simplified description. Software-Defined networks perform a lot more translations and packet encapsulations while tracking the state of the connections.
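
To make the point about in-memory connection tracking concrete, here is a toy sketch in Python of a source NAT table. The class `NatTable`, its methods, and the addresses are hypothetical and exist only for illustration; real appliances are far more involved. It shows that once an appliance loses its state, return traffic for an existing flow has nowhere to go.

```python
class NatTable:
    """Toy source NAT (hypothetical): maps a private endpoint to a public port."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Allocate (or reuse) a public port for an outgoing flow."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Return the private endpoint, or None when no state exists (packet dropped)."""
        return self.inbound.get(public_port)

    def restart(self):
        """Servicing or failover without state sync forgets every tracked flow."""
        self.outbound.clear()
        self.inbound.clear()


nat = NatTable("198.51.100.1")
public_endpoint = nat.translate_outbound("10.0.0.4", 50123)
before_restart = nat.translate_inbound(public_endpoint[1])
nat.restart()
after_restart = nat.translate_inbound(public_endpoint[1])
```

In this sketch `before_restart` resolves back to the session host, while `after_restart` is `None`: the flow is still alive on both endpoints, but the device in the middle no longer knows about it.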

 

On the client side, a similar story. First, the packet is sent to the home router that performs the translation on the client. Then it may pass the packet inspection firewall. At some point, the packet will reach the AVD gateway and pass through those load balancers again.

 

Microsoft is doing a lot to improve the reliability of the Azure part of the path your TCP packet takes, including fault-tolerant load balancers and scalable NAT gateways. However, not all components are in our control. For example, customers deploy forced tunneling on-premises, use Zero Trust Network services, and apply deep packet inspection.

 

This is complicated even more by dynamic routing, VPNs, and software-defined networking setups.

Any one of the dozens of physical or virtual appliances in the path of the RDP flow may fail or may need to be serviced. In such cases, the TCP session could be dropped. Such network failures always come as a surprise, because the TCP protocol stack does not report network errors to the application until the connection is no longer recoverable.

We take this seriously at Azure Virtual Desktop. We have proactive monitoring of the session and a fast reconnect for TCP-based transport. However, even if the sessions are automatically re-established, it takes some time and affects the user experience.

 

The solution comes with using UDP-based transport. First, the tracking of UDP streams is done differently on the load balancers, firewalls, and NAT devices. Second, because of the connectionless nature of UDP, those network devices cannot reset the UDP flow by sending the RST signal. Each packet in the UDP stream is independent of the others and could be lost without affecting the health of the entire flow. Third, UDP is more tolerant of the temporary network interruptions caused by wireless interference or by changes in dynamic routing.

 

UDP does not care about the order or delivery of individual packets. It does not have built-in congestion or rate control, which means that if you want to use UDP, you need to implement all of this on your own. And that is what we did by implementing URCP for RDP Shortpath.

With this setup, we have better visibility into the network. We see delays in every packet we send and immediately recognize if some data was lost in transit. However, we resend it only if we need to do that.
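
The general idea of per-packet visibility with selective resend can be sketched in a few lines of Python. This is not URCP and does not reflect how RDP Shortpath is implemented; `FlowTracker`, `SentPacket`, the `still_needed` flag, and the 200 ms timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SentPacket:
    seq: int
    sent_at: float
    still_needed: bool  # e.g. False for a video frame that a newer frame replaces


@dataclass
class FlowTracker:
    """Hypothetical sender-side bookkeeping for a UDP flow (not URCP)."""
    loss_timeout: float = 0.2  # seconds without an ACK before a packet counts as lost
    in_flight: dict = field(default_factory=dict)
    rtt_samples: list = field(default_factory=list)
    next_seq: int = 0

    def on_send(self, now, still_needed=True):
        """Record a packet as sent and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight[seq] = SentPacket(seq, now, still_needed)
        return seq

    def on_ack(self, seq, now):
        """An ACK gives a delay sample for exactly this packet."""
        packet = self.in_flight.pop(seq, None)
        if packet is not None:
            self.rtt_samples.append(now - packet.sent_at)

    def collect_lost(self, now):
        """Return (to_resend, dropped) sequence numbers for packets past the timeout."""
        to_resend, dropped = [], []
        for seq, packet in sorted(self.in_flight.items()):
            if now - packet.sent_at >= self.loss_timeout:
                (to_resend if packet.still_needed else dropped).append(seq)
        for seq in to_resend + dropped:
            del self.in_flight[seq]
        return to_resend, dropped


tracker = FlowTracker()
first = tracker.on_send(0.0)
stale_frame = tracker.on_send(0.01, still_needed=False)
input_event = tracker.on_send(0.02)
tracker.on_ack(first, 0.05)
lost = tracker.collect_lost(0.3)
```

Here the acknowledged packet yields a delay sample, the lost packet that still matters is queued for resend, and the lost packet whose content is already outdated is simply dropped rather than retransmitted.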

 

Bandwidth

TCP is great for local networks but not on the Internet. Yes, if the packet is lost, it will be retransmitted, but that's not the worst thing that could happen.

Bandwidth availability is an essential factor. Unfortunately, TCP congestion control algorithms limit the ability to saturate the network. It is also highly inefficient in window scaling, especially on high latency networks.
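
A rough way to see the latency effect: a TCP sender can have at most one window of unacknowledged data in flight per round trip, so the throughput of a single connection is bounded by the window size divided by the round-trip time. The Python helper below works through that arithmetic. The function name and the sample numbers are illustrative, not measurements from Azure Virtual Desktop; 65,535 bytes is the largest window TCP can advertise without the window scaling option.

```python
def max_tcp_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on single-connection throughput: one window per round trip."""
    bits_per_second = window_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000


# Same window, two different round-trip times.
lan_like = max_tcp_throughput_mbps(65_535, rtt_ms=10)
wan_like = max_tcp_throughput_mbps(65_535, rtt_ms=100)
```

With these inputs the bound is roughly 52 Mbit/s at 10 ms and roughly 5.2 Mbit/s at 100 ms, regardless of how fast the underlying link is.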

Because we know the network better and TCP algorithms no longer hide its state from us, we can signal back to the RDP stack, which then adjusts the encoding parameters or changes the frame rate of the graphics stream.

This is not news for those who manage VoIP or real-time communications like Teams. Most of those applications use UDP as the primary transport.

Graphics is not the only thing improved by UDP. Your file transfers, print jobs, MMR, and device redirection take advantage of increased bandwidth and reduced latency. In addition, you can now use VoIP applications on your remote desktops even if they have no specific optimizations for VDI environments.

 

Latency

So UDP is suitable for RDP, but is UDP enough? Customers implement UDP-based gateways in many on-premises deployments and other virtualization products. Is it good? It's easy to implement. But in the case of a multitenant cloud service like Azure Virtual Desktop, it would require inbound firewall rules to be configured, which is unacceptable to most customers. On top of that, such a gateway is just another address translation device that acts as a performance bottleneck and reduces the available bandwidth. It also requires packets to travel to the gateway location, which increases the network latency.

 

Solution

We understand the challenges of remote protocols in the cloud. Because of that, when we developed RDP Shortpath, we focused not just on enabling UDP for your user sessions but also on enabling it most efficiently. For this, we focused on establishing a direct UDP flow between client and session host, bypassing all unnecessary gateways.

Many of you are familiar with RDP Shortpath for managed networks. It works great for many customers, with users accessing their remote desktops from enterprise and office settings. However, the feedback that we hear from you clearly shows that while RDP Shortpath is great for managed networks such as ExpressRoute, it is a non-starter for users who travel or work from their homes. We recognize these challenges, and our protocol team worked hard on the feature released to public preview today.

 

Meet RDP Shortpath for public networks.

Like its older brother, this feature establishes a direct UDP flow for RDP. However, it does not require any inbound ports to be opened on the firewall. Instead, it automatically adapts to the network conditions. It uses a combination of NAT traversal protocols such as STUN and UPnP and the process of Interactive Connectivity Establishment (ICE). RDP can then establish the direct UDP flow in most network setups.
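
For readers who have not met STUN before, the sketch below shows in Python what a basic STUN exchange looks like on the wire, following the message layout in the STUN specification (RFC 5389): a 20-byte binding request, and a response carrying the public address in an XOR-MAPPED-ADDRESS attribute. The function names are hypothetical, only IPv4 is handled, and nothing here describes how the Remote Desktop client or session host actually implements it.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by the STUN specification
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Build a 20-byte STUN binding request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_binding_response(data, transaction_id):
    """Return (ip, port) as seen by the STUN server, or None if not present."""
    msg_type, length, cookie, tid = struct.unpack("!HHI12s", data[:20])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE or tid != transaction_id:
        raise ValueError("not a matching STUN binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[offset:offset + 4])
        value = data[offset + 4:offset + 4 + attr_len]
        # Family 0x01 is IPv4; this sketch skips IPv6 addresses.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


# Offline round trip with a hand-built response (documentation address, made-up port).
tid = bytes(range(12))
request = build_binding_request(tid)
xport = 50000 ^ (MAGIC_COOKIE >> 16)
xaddr = struct.unpack("!I", socket.inet_aton("203.0.113.7"))[0] ^ MAGIC_COOKIE
attribute = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 1, xport, xaddr)
response = struct.pack("!HHI12s", BINDING_SUCCESS, len(attribute), MAGIC_COOKIE, tid) + attribute
mapped = parse_binding_response(response, tid)
```

In a real exchange the request would be sent from a UDP socket to a STUN server, and the mapped address in the reply tells the endpoint how it appears from outside its NAT. That is the piece of information both sides need before they can attempt a direct flow.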

As a result, your users would get lower latency, better network utilization, and high tolerance to packet loss or network configuration changes.

To demonstrate the benefits of RDP Shortpath, I recorded a video that shows the commercial for Microsoft Flight Simulator. I watched the video over two RDP sessions: one with the reverse connect TCP transport, another with RDP Shortpath. To keep the setup closer to reality, I used WAN emulator software to introduce packet loss. For reference, I added the original video to the bottom of the screen.

 

 

As you can see, UDP, even with a horrible 10% packet loss, gives you smoother playback and better image quality.

 

How does RDP Shortpath work?

RDP Shortpath for public networks performs dynamic analysis of your network. It works in many cases, but some configurations are not compatible. First of all, UDP traffic must be allowed on your network. But even if UDP is allowed, RDP Shortpath may fail if you use double NAT setups. This includes the Carrier-Grade NAT used by some cellular operators. It may also fail because some firewalls specifically block NAT traversal protocols or are configured to prevent port reuse.

In such cases, you may increase the chance of establishing the Shortpath connection by enabling native IPv6 or using Teredo networking. You may also use Azure Load Balancer for outbound network access or assign a public IP address to the VM.

There's no need to allow any inbound connectivity in any of these cases. No need to open port 3389 or any other port.

If RDP Shortpath fails to establish, the user won't notice a thing and will continue to use the TCP-based reverse connect transport.

 

Getting started with RDP Shortpath for public networks

You can find information about RDP Shortpath configuration in Azure Virtual Desktop documentation. It also includes recommendations for troubleshooting.

 

Thanks

This release results from the work of multiple teams at Microsoft, and I would like to thank all my colleagues for their outstanding work. I am also grateful to all customers and MVPs that participated in the private previews and provided their feedback.

 

28 Replies
Congratulations on rolling out this great feature. With one simple registry change, I can make this feature work...
Great work!

Is this coming to XBOX cloud-play as well? :eyes:

Nice feature, will it be possible to use it, while using Azure NAT Gateways?
I understand that during the preview, RDP Shortpath for managed networks is incompatible with RDP Shortpath for public networks.
How will paths be prioritized if and when they can both be used side by side?
Yes, we are working with Azure Networking on improving the connectivity over the Azure NAT Gateway
When we enable both transports side by side, the managed path will get a higher priority
Great work, this should really help our remote workers. So my understanding is that this is supported by UDRs to NVAs provided they don't double NAT on the Firewall. Is this what "Local NAT does not use Port Preservation - custom port range may not work with Shortpath" means in the troubleshooting script provided? I'm not defining a custom port range on my session hosts, and I'm seeing STUN traffic successfully establish a binding request at both ends but no UDP.
Also would using the AzureVirtualDesktop service tag on a custom UDR (to send that straight out of the session host and bypass NVA inspection) help? I've tried and it doesn't appear to.
Any additional logs other than the Log Analytics ones?

thanks

@fdwl 

 

Earlier in the AVD / WVD lifecycle, we could enable some server fast path UDP support by setting fUseUdpPortRedirector (and UdpPortNumber).

 

Does this replace those settings? Are they different somehow?

@fdwl 
Great stuff! Managed to get this working in under 10 minutes and the performance improvement is noticeable immediately.

A few questions:

  • Which clients support the use of RDP Shortpath (Windows/MacOS/iOS etc) and if not all of them is that in scope for the future?
  • Is there a way in AVD Insights to show how many users are making use of Shortpath vs falling back to TCP?
  • Are there any plans to make the STUN network addresses into a service tag for simpler management of Azure Firewalls

Thanks!
Alex

update: it seems like this has now just started to work - even though I've made no additional host or network changes, so maybe a backend change. Either way it's looking good!

@pcluskey 
We started testing this feature today and also getting the same error.

 

Failed to communicate [2a01:111:202f::155]:3478 with error: Unable to send data, check if InterNetworkV6 is configured
Local NAT does not use port preservation, custom port range may not work with Shortpath

 

Anyone else running into this?

I also have the same error. This script seems to verify whether your VM has a public IP or a NAT that is used for UDP binding. It seems that NAT port preservation is a prerequisite, or one of the multiple ways that the agent tries to connect over UDP. It would be nice if we had a verbose log to know where the problem came from. As far as I understand, without NAT translation or port preservation or UPnP it will not work. (Didn't manage to get it to work)
well I spoke too soon. no changes on our side, client on the same network as yesterday now using TCP again. I can't get it to work on my home broadband (Virgin) unless I disable UDR to our NVA (so can't do that in production). I'm wondering if my home mesh TPlink Deco is causing problems.
Great update! For now we are unable to make use of it with a "standard setup" behind an azure firewall. troubleshooting script complains about port preservation. we'll wait for an update and see if we can get more advanced troubleshooting capabilities. great feature though! we are using it on managed networks successfully already.
This is exactly the error we are seeing and we have opened all firewall etc.
Yup same error - unsure what to do next - our network team has confirmed we are allowing UDP, and can see it making outbounds to STUN. Still no UDP sessions
We have the same, and whilst outbound to the STUN servers is visible, I believe there also needs to be an outbound connection to the client's public IP. If I remove my UDRs then I see that communication, but with the UDR to our NVAs it's not happening. I suspect it's partly because our NVA doesn't support STUN, and also down to lack of NAT for UDP traffic outbound. I've raised a support ticket (even though it's in preview) - hopefully MS can have a look at my matching Wireshark traces and advise.
Will this work for AVD classic session hosts?

@dextraa9791 

I'm seeing the same error message.

Could anyone here find out the reason?