Jun 27 2022 07:10 AM - edited Jan 30 2023 12:50 PM
VPN or Secure Web Gateway (SWG) client software is very commonly seen in Windows 365 deployments, where it provides tunneled access to On-Premises resources as well as protected internet access, either via a cloud-based SWG or via a legacy VPN and On-Premises proxy path. This is especially the case in the Microsoft Hosted Network (MHN) model, where the Cloud PC is located on a network with direct, open, high-speed internet available. The more modern, cloud-based SWG solutions fit very well with a Zero Trust approach and generally perform at a higher level than traditional, legacy VPN software, where internet browsing is hairpinned through On-Premises proxies and back out to the internet.
Because many Windows 365 customers use such solutions as part of their deployment, this post outlines specific configuration guidelines which Microsoft recommends applying to optimize key traffic and provide the highest levels of user experience.
What is the Problem?
Many of these VPN/SWG solutions build a tunnel in the user context, which means that when a user logs in, the service starts and creates the tunnels required to provide both internet and private access as defined for that user. With a physical device the tunnel is normally up and running before or shortly after the user sees their desktop on screen, meaning they can then quickly get on with their work without noticing its presence.
However, as with any virtualized device which needs a remote connection to access, the above model poses several challenges:
1. Additional Latency
Firstly, the remote desktop traffic is latency sensitive: any delay in the traffic reaching its destination can translate into a poor user experience, with lag on actions and desktop display. Routing this traffic through a tunnel to an intermediary device inevitably adds latency and can restrict throughput, regardless of how well configured or performing that device is. Modern SWG solutions tend to perform at a much higher level than a traditional VPN/Proxy approach, but the highest level of experience is always achieved via a direct connection that avoids any inspection or intermediary devices. Much like Teams media traffic, the RDP traffic in the Windows 365 case should be routed via the most optimal path between the two endpoints so as to deliver the very highest levels of performance. This is almost always the direct path via the nearest network egress.
2. RDP Connection Drops
An additional challenge comes from the use of user-based tunnels. As the user initiates a connection to the Cloud PC, the connection reaches the session host without issue and the user successfully sees the initial logon screen. However, once the user login starts and the client software builds the tunnels to the SWG/VPN for the user, the login screen freezes. The connection then drops, and the user has to go through the reconnection process to re-establish the connection to the Cloud PC. Once this is complete, the user can successfully use the Cloud PC without further issue. Users may also experience disconnects of the remote session if there is any issue with the tunnel, for example if the tunnel temporarily drops for some reason. Overall, this doesn’t provide a great user experience with the Cloud PC, especially on initial login.
Why does this occur?
It occurs because the tunnels built to route internet traffic to the SWG generally capture all internet-bound traffic unless configured not to do so. This is known as a forced tunnel or ‘inverse split tunnel’. The initial login therefore works without issue, but as soon as the tunnel is established upon user logon, the RDP traffic is moved into it and, because this is a new path, the session has to reconnect. Equally, because the traffic is inside this tunnel, if the tunnel drops momentarily and needs to reconnect, the RDP session also has to reconnect inside the re-established tunnel.
In the diagram below, you can see a simplified representation of this indirect connectivity approach with a forced tunnel in place. RDP traffic has to traverse the VPN/SWG resources before hitting the gateway handling the traffic. Whilst this is not a problem for less sensitive traffic and general web browsing, for latency critical traffic such as Teams and the RDP traffic, it is non-optimal.
What’s the Solution?
Microsoft strongly recommends implementing a forced tunnel exception for the critical RDP traffic which means that it does not enter the tunnel to the SWG or VPN gateway and is instead directly routed to its destination. This solves both of the above problems by providing a direct path for the RDP traffic and also ensuring it isn’t impacted by changes in the tunnel state. This is the same model as used by specific ‘Optimize’ marked Office 365 traffic such as Teams media traffic.
The diagram below shows this direct path being taken to the RDP Gateway for the RDP traffic whereby any other internet/On-Premises bound traffic continues via the VPN/SWG.
What exactly do I need to bypass from these tunnels?
The critical traffic which carries the RDP stream is contained within the following endpoints:
You can obtain the IP information for the WindowsVirtualDesktop tag manually via the Azure IP Ranges JSON file. However, to make this process simpler and quicker, my talented colleague Donna Ryan has written a PowerShell script to obtain this information and output it in CSV format, which is what most solutions need.
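For readers who want to see what the manual route involves, here is a minimal Python sketch of pulling one tag's prefixes out of an already-downloaded copy of the Azure IP Ranges JSON file. It is not the PowerShell script mentioned above. The function names are our own, the IPv4-only filter is an assumption about what a given bypass field accepts, and the sample data uses documentation ranges rather than real gateway addresses.

```python
import json


def service_tag_prefixes(service_tags, tag_name="WindowsVirtualDesktop", ipv4_only=True):
    """Return the address prefixes listed for one service tag.

    `service_tags` is the parsed JSON document, whose top-level "values"
    list holds one entry per tag.
    """
    for entry in service_tags.get("values", []):
        if entry.get("name") == tag_name:
            prefixes = entry["properties"]["addressPrefixes"]
            if ipv4_only:
                # Assumption: the target bypass field only takes IPv4 entries.
                prefixes = [p for p in prefixes if ":" not in p]
            return prefixes
    raise KeyError(f"service tag {tag_name!r} not found")


def to_csv_line(prefixes):
    """Join prefixes into a single comma-separated line for pasting."""
    return ",".join(prefixes)


# Placeholder data in the shape of the downloaded file. The prefixes are
# documentation ranges, NOT real gateway addresses.
sample = json.loads("""
{
  "values": [
    {"name": "WindowsVirtualDesktop",
     "properties": {"addressPrefixes": ["192.0.2.0/24", "198.51.100.0/25", "2001:db8::/32"]}},
    {"name": "Storage",
     "properties": {"addressPrefixes": ["203.0.113.0/24"]}}
  ]
}
""")

csv_line = to_csv_line(service_tag_prefixes(sample))
print(csv_line)
```

In practice you would replace `sample` with `json.load()` on the file you downloaded, and check which address families your VPN/SWG solution accepts before filtering.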
In some network equipment/software we can configure the bypass using wildcard FQDNs alone, and we’d recommend using this method where it is available, as the FQDN does not change over time. However, some solutions cannot handle wildcard FQDNs, so it’s common to see only IP addresses used for this bypass configuration.
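To illustrate what a wildcard FQDN bypass does conceptually, the sketch below matches hostnames against a wildcard pattern in Python. This is only a model of the idea, not how any particular VPN/SWG client implements it. The pattern `*.wvd.microsoft.com` is the one raised in the comments on this post, and the hostnames tested are made-up examples.

```python
from fnmatch import fnmatchcase

# Wildcard pattern taken from the discussion of this post; treat the list as
# an assumption and confirm it against the endpoint list for your deployment.
BYPASS_PATTERNS = ["*.wvd.microsoft.com"]


def bypasses_tunnel(hostname, patterns=BYPASS_PATTERNS):
    """Return True if the hostname matches any wildcard bypass pattern."""
    # Normalise case and drop a trailing root dot before matching.
    host = hostname.rstrip(".").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


# Hypothetical hostnames, used only to show the matching behaviour.
print(bypasses_tunnel("gateway-example.wvd.microsoft.com"))
print(bypasses_tunnel("www.example.com"))
```

Note that the pattern requires a label in front of `.wvd.microsoft.com`, so a lookalike such as `wvd.microsoft.com.example.net` does not match.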
How do I implement the RDP bypass in common VPN/SWG solutions?
Microsoft is working with several partners in this space to provide bespoke guidance but thanks to assistance from our friends at Zscaler, the guide for bypassing the RDP traffic in the Zscaler Client Connector can be found below. We’ll add detailed guidance for other solutions here as we get them confirmed.
Zscaler Client Connector
In some network equipment/software we can configure the bypass using FQDNs alone. However, it’s common to see IP addresses used for this functionality, and this is the case with the Zscaler Client Connector and Tunnel 2.0, which can do the bypass very efficiently using the ‘Hostname or IP Bypass for VPN Gateway’ function.
a. Firstly, we’d recommend you are running the latest version of the Client Connector available to ensure you have all the latest updates from Zscaler.
b. Next, you’ll need to obtain an up-to-date list of IP addresses and get them into the right format to cut and paste into the Zscaler Client Connector Portal. The PowerShell script found here will provide this information for you in the right format programmatically. It outputs a CSV file which you can open in Notepad, then use ‘Select All’ to copy the contents.
c. In the Zscaler Client Connector Portal go to ‘App Profiles’ then choose the policy to be applied to the Cloud PCs and click Edit
d. In the App Profile you’ve selected, copy and paste the IP addresses from step b into the ‘HOSTNAME OR IP ADDRESS BYPASS FOR VPN GATEWAY’ field and click the plus sign. Within a few seconds the IP addresses should be successfully added to the configuration.
e. Two IP addresses used for critical communication with the Azure fabric must also be offloaded, so add these to the configuration as well:
169.254.169.254 - Azure Instance Metadata Service endpoint
168.63.129.16 - Cloud PC Health Monitoring
f. Once this is done, simply update the policy on the Zscaler Client Connector. If you have permissions, you can do this instantly in the More > About section of the Client Connector. If Zscaler is already connected, you’ll find a disconnect occurs if this is successful, as the traffic gets redirected out of the tunnel. Once reconnected, it will remain outside of the tunnel.
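Before pasting a long list into a bypass field, it can be worth checking that every entry is a well-formed address or range, since a stray space or mistyped octet is easy to miss. The following Python sketch shows one way to do that with the standard `ipaddress` module. The function name and sample entries are illustrative, and it only handles IP entries, so any hostnames would need to be checked separately.

```python
import ipaddress


def split_valid_entries(entries):
    """Separate well-formed IP addresses/CIDR ranges from malformed entries."""
    valid, rejected = [], []
    for raw in entries:
        text = raw.strip()
        try:
            # strict=False accepts a plain host address as well as a CIDR range.
            ipaddress.ip_network(text, strict=False)
        except ValueError:
            rejected.append(raw)
        else:
            valid.append(text)
    return valid, rejected


# Illustrative entries: two good, one with a stray space, one with a bad octet.
candidates = ["192.0.2.0/24", "169.254.169.254", "169. 254.169.254", "198.51.100.300"]
valid, rejected = split_valid_entries(candidates)
print("paste:", ",".join(valid))
print("check:", rejected)
```

Anything that lands in `rejected` should be corrected by hand rather than silently dropped, as a missing entry means that traffic stays in the tunnel.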
Other VPN/SWG solutions
Microsoft is currently working with other partners in this space to provide detailed guidance for other VPN/SWG solutions and will list them here as they are complete. Please let us know in the comments if you’d like us to list a particular solution and we’ll aim to prioritize based on feedback.
Q: In a Microsoft Hosted Network deployment, is there anything else I need to do?
A: Unless the local firewall is configured to block access to the IP addresses noted, there should be nothing else required. The network the virtual NIC sits in has direct, high-speed connectivity to Microsoft’s backbone and the internet.
Q: In an Azure Network Connection scenario, is there anything further I need to do?
A: In this scenario the recommended path for the traffic is directly out of the VNet into Microsoft’s backbone. Depending on the configuration it may require allowing the IP addresses noted in this article through a firewall or NSG. The WindowsVirtualDesktop service tag or FQDN tag may help with automating rules in firewalls or configuring User Defined Routing.
Q: Do I need to configure the bypass on just the Cloud PC?
A: It is strongly advised that the bypass is applied to both the Cloud PC and the connecting client if that also uses the SWG/VPN to connect. If both are using the same configuration profile then this should happen automatically.
Q: How often do the IP addresses change?
A: The Gateway addresses change roughly once a month. We aim to improve the script over time to provide better assistance with automation of a check for changes in this data.
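Until such automation is available, a simple way to spot changes is to keep the previous list and compare it with a freshly obtained one. The Python sketch below shows that comparison as a set difference. The function name and both snapshot lists are hypothetical, and the addresses are documentation ranges rather than real gateway addresses.

```python
def diff_prefixes(previous, current):
    """Report which prefixes were added or removed between two snapshots."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),    # present now, missing last time
        "removed": sorted(prev - curr),  # present last time, gone now
    }


# Hypothetical snapshots taken a month apart (placeholder ranges).
last_month = ["192.0.2.0/24", "198.51.100.0/25"]
this_month = ["192.0.2.0/24", "203.0.113.0/24"]

changes = diff_prefixes(last_month, this_month)
print(changes)
if changes["added"] or changes["removed"]:
    print("Bypass configuration needs updating")
```

Running a check like this on a schedule tells you when the bypass list needs refreshing, without having to compare the lists by eye.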
Q: Can I add more than the RDP traffic to the bypass?
A: Microsoft only provides IP addresses for the RDP connectivity at present. However, if your solution can be configured by FQDN alone, you can add other service endpoints to your optimized path. These can be found on this Microsoft docs page.
Q: I’m using a true split tunnel, does this impact me?
A: The above advice is for a forced tunnel scenario (inverse split tunnel), where the default path is via the tunnel and only defined exceptions are sent direct. This is often referred to as a split tunnel in common parlance and is the most commonly seen deployment model for such solutions. However, a split tunnel in the technically accurate sense of the words, where the default path is the internet and only defined endpoints (such as corp server ranges/names) are sent down the tunnel, shouldn’t need such configuration, as the RDP traffic should follow the default path to the internet.
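The difference between the two models comes down to what the defined list means. A minimal Python sketch of that routing decision is shown below. The function, the mode names and all addresses are illustrative assumptions (192.0.2.10 stands in for a gateway address), not the behaviour of any specific client.

```python
import ipaddress


def next_hop(destination, mode, defined_ranges):
    """Decide whether traffic goes 'direct' or via the 'tunnel'.

    mode == "forced": the tunnel is the default, listed ranges go direct.
    mode == "split":  direct is the default, listed ranges go via the tunnel.
    """
    addr = ipaddress.ip_address(destination)
    listed = any(addr in ipaddress.ip_network(r) for r in defined_ranges)
    if mode == "forced":
        return "direct" if listed else "tunnel"
    if mode == "split":
        return "tunnel" if listed else "direct"
    raise ValueError(f"unknown mode: {mode!r}")


gateway = "192.0.2.10"  # placeholder for an RDP gateway address

# Forced tunnel with no exception: RDP is captured by the tunnel.
print(next_hop(gateway, "forced", []))
# Forced tunnel with the bypass configured: RDP goes direct.
print(next_hop(gateway, "forced", ["192.0.2.0/24"]))
# True split tunnel listing only corp ranges: RDP already goes direct.
print(next_hop(gateway, "split", ["10.0.0.0/8"]))
```

This is why the bypass only matters in the forced tunnel case: in a true split tunnel the gateway address is never in the list that gets sent down the tunnel.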
Q: Does this also optimize RDP shortpath?
A: RDP shortpath for Public Networks is currently in public preview and works to provide a direct UDP connection between the client and Cloud PC if enabled and achievable. This connection is in addition to the TCP based connection described above and the dynamic virtual channels such as graphics, input etc are switched into the UDP connection if deemed optimal. The diagram below shows this in place, with the TCP based connection continuing via the Gateway and an additional direct UDP connection. UDP shortpath will only work if direct UDP connectivity to any public IP address is available on both the client and the Cloud PC, so may not be possible in some enterprise environments where tunnels such as those above are in use (as all UDP traffic would have to be offloaded from the tunnel). More detail can be found here on endpoint requirements for this feature.
Feb 16 2023 01:59 AM
Feb 16 2023 05:47 AM - edited Feb 16 2023 05:56 AM
@Paul Collinge Thank you so much for the speedy response and the clarification! Apologies, it was not clear to me from the article that either the IPs OR the URL are used, and that both are not needed. We are using Zscaler so URLs are supported (step b confused me since it mentioned using the IPs, but I guess that was included for informational purposes and is not required). When we create an exclusion for *.wvd.microsoft.com, do we also still need to exclude 169.254.169.254 and 168.63.129.16, or does that URL cover them as well?
If we do still exclude those 2 IPs, is there any potential for them to ever change?
Feb 17 2023 05:51 AM
Feb 18 2023 03:43 PM
@Paul Collinge Yes, haha, I found out the hard way that wildcards are not supported when I tried to add it!
I have engaged with our Zscaler team and added the context that this is impacting our entire agency cloud VDI journey. They informed me they already have an existing ER (enhancement request), and they added us to it. Anyone reading this who is a customer, please do call in and get added to the ER. They are also researching internally with the engineering team to see if there is any other way they could exclude wildcard-based traffic from ZIA, and I will share here if they come up with anything.
Also, since URL exceptions are supported by Zscaler, I'm wondering if there is any way we could find out the appropriate itemized non-wildcard URLs. Would the AVD URL checker do the trick?