12-12-2019 11:59 AM - edited 12-12-2019 12:06 PM
Hello, we have been experiencing random but recurring disconnects from our WVD pool. We have roughly 10 users and have been getting different Event Viewer logs each time they disconnect. We have thin clients on Windows 10 version 1607. When users are disconnected, it happens multiple times per day; on some days, however, there are no disconnects at all. The Event Viewer logs are attached.
12-19-2019 08:57 AM
12-20-2019 09:19 AM Solution
@Sbarnet: Thank you for sharing the logs. Unfortunately there isn't enough information to debug the disconnects. I recommend opening a support case and providing the activity IDs for the connections you see issues with.
03-09-2020 01:08 PM
@Sbarnet Hi, we have also been experiencing these issues for two weeks now (50-user environment).
- Happens from multiple locations (so independent of network or ISP).
- Happens on both Windows 10 and HTML5 clients.
- Seems to happen completely at random.
- So far the problem is not related to a particular WVD host in the pool.
- Some days there are no problems at all, the next day it could be 80+ disconnects in one day.
- It seems to be tied to a session. When user A gets it in the morning, chances are high he will get it multiple times that day, until he completely logs off and on again so that a completely new session is built up.
We already have a Microsoft case open for this, but so far there is no solution, not even a cause. I suspect Azure's WVD gateway, but can't prove this yet.
Does anyone else experience this in the West Europe region?
03-09-2020 04:08 PM
@Marco Brouwer We've been having the same issue for about a week and a half. We have about 100 users on WVD. Last week everyone got kicked off and was unable to connect. When I log them off via the WVDAdmin tool, they can get back in.
Today we had a similar issue, but only a handful of users had their sessions kicked off. I also believe this has to do with the WVD session gateway.
03-09-2020 04:20 PM - edited 03-09-2020 04:22 PM
Thanks for the heads up! We do not experience the same. When we are disconnected, users auto-reconnect, or they connect again to their session manually.
- Sometimes it happens just for one user.
- Sometimes it happens for multiple users simultaneously (but not all). Those users are not (always) on the same WVD host, and not on the same client network / ISP (we have a few clients connecting from a 4G LTE router to rule out the customer's network or ISP).
- Today it happened for all logged users at once, for the first time.
What really sticks out is that it seems connected to particular sessions. User A gets kicked out 4 times today, but tomorrow he's fine. Instead, user B gets kicked out multiple times tomorrow.
We asked users to completely log off and log back in again after experiencing a disconnect, and that "seems" to help. They don't get kicked out again that day.
You would almost say that a "new" session is brokered / load balanced through another server in the Azure gateway cluster, one that does not have the problem of disconnecting users at random.
Do you have contact with Microsoft about this? Can you tell me what troubleshooting steps have been taken so far?
03-10-2020 01:38 PM
We are experiencing the same issue: sometimes users disconnect randomly during the day, while others can work without any issue. It looks like an issue with one or a couple of RD Gateway servers serving multiple customers in the West-EU WVD infrastructure. However, I cannot find any information in the logs of the WVD hosts themselves.
Has anyone had luck with Microsoft Support going through this issue?
Thanks in advance
03-11-2020 02:52 AM
Hi, thanks for the response! I really wonder if we experience the problems at the same time.
- Friday March 6th: No problems
- Monday 9th: Over 75 disconnects (16 concurrent users)
- Yesterday: No problems
- Today: No problems (yet).
Can you run this:
Import-Module -Name AzureAD
Import-Module -Name Microsoft.RDInfra.RDPowerShell
Add-RdsAccount -DeploymentUrl https://rdbroker.wvd.microsoft.com
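# Use an unambiguous ISO date: PowerShell reads an unquoted 07-03-2020 as July 3, not 7 March
Get-RdsDiagnosticActivities -TenantName "<<TENANTNAME>>" -Outcome Failure -Detailed -Type Connection -StartTime "2020-03-07" | Sort-Object EndTime | Format-Table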
This shows you when the disconnects took place. If they happened at exactly the same times as ours, that would strongly suggest something was wrong in the Azure network / WVD gateway / WVD load balancer or broker.
Please let me know!
03-11-2020 03:19 AM
We also had problems on Monday, between 9:30 and 11:30 (Dutch time). No problems in the last 2 days so far.
03-11-2020 03:34 AM
Hi, let's get a bit more precise. Last Monday, the disconnects happened in batches, with multiple or all users disconnecting at once.
Just got into the logs, and we see disconnect waves at these times (according to the PowerShell command output):
- 11:03:19 AM
- 11:03:22 AM
- 11:12:20 AM
- 11:24:29 AM
- 11:37:52 AM
- 1:45:18 PM
Between these times, we also had multiple disconnects for just one user at a time.
I really wonder if you see the exact same times for the disconnect waves. That would definitely point to something in the WVD plane.
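For anyone who wants to compare their own export against ours, here is a rough Python sketch of how such a comparison could be done. It groups disconnect times into "waves" and checks which waves start at about the same moment in two tenants. The tenant A times are the ones listed above; the tenant B times are made up purely for illustration, and the date (Monday 9 March 2020), the 30-second gap, and the 30-second tolerance are assumptions, not anything the diagnostics output prescribes.

```python
from datetime import datetime, timedelta


def parse_times(lines, day="2020-03-09"):
    """Parse 'hh:mm:ss AM/PM' strings into sorted datetimes on the given day (day is an assumption)."""
    return sorted(
        datetime.strptime(f"{day} {t}", "%Y-%m-%d %I:%M:%S %p") for t in lines
    )


def group_waves(times, gap=timedelta(seconds=30)):
    """Group sorted timestamps into waves; a new wave starts when the gap to the previous event exceeds `gap`."""
    waves = []
    for t in times:
        if waves and t - waves[-1][-1] <= gap:
            waves[-1].append(t)
        else:
            waves.append([t])
    return waves


def shared_waves(waves_a, waves_b, tolerance=timedelta(seconds=30)):
    """Return start times of waves in A that have a wave in B starting within `tolerance`."""
    starts_b = [w[0] for w in waves_b]
    return [
        w[0] for w in waves_a
        if any(abs(w[0] - s) <= tolerance for s in starts_b)
    ]


# Tenant A: the wave times reported in this thread.
tenant_a = ["11:03:19 AM", "11:03:22 AM", "11:12:20 AM",
            "11:24:29 AM", "11:37:52 AM", "01:45:18 PM"]
# Tenant B: hypothetical values, only here to make the sketch runnable.
tenant_b = ["11:03:25 AM", "11:12:31 AM", "11:50:02 AM"]

waves_a = group_waves(parse_times(tenant_a))
waves_b = group_waves(parse_times(tenant_b))
for start in shared_waves(waves_a, waves_b):
    print("Wave seen in both tenants around", start.strftime("%H:%M:%S"))
```

If most of your waves line up with someone else's, the cause is unlikely to be in either tenant's own hosts or network. Keep in mind that both exports need to be in the same time zone before comparing.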
03-11-2020 03:36 AM
03-11-2020 03:41 AM - edited 03-11-2020 03:42 AM
That seems to be a pattern :). These are our results.
No problems today and yesterday.
Was the 26th of February also a "bad day" for you?
03-11-2020 04:42 AM
03-11-2020 06:42 AM
Same result here. Multiple WVD deployments all reported connection loss at the same times as yours. Like yours, they're also WVD deployments in the West Europe region, so I figure it was a local and temporary issue. That's what you get for outsourcing the gateway I guess, but some sort of health monitoring would be a good thing. At the very least we know the issue is out of our hands.
03-11-2020 06:54 AM
Did anyone else also experience that many disconnects on 26 February like we did?
03-11-2020 07:00 AM
03-11-2020 07:10 AM
To be honest, my experience with MS Support regarding issues like these isn't that great. I don't even bother opening tickets about performance issues like these. They had similar latency problems with Exchange Online during the same period. Most people wouldn't have noticed them very much with Outlook running in Cached Mode, but we also run a 200+ user RDS deployment without Cached Mode enabled and those users got hit hard during the same period. Given the reports from you two, I'm writing it off as a local issue in the EU West region.
Experience seems to have improved a tad over the past day or two but that's not based on any actual data I've collected. I don't handle the frontline helpdesk myself but this is what they tell me.
03-12-2020 03:48 AM
Several users (internal and external) had performance issues between 11:00 and now (11:45). Did anyone else experience performance issues at that time? There were no disconnects during that period.
03-12-2020 04:05 AM
Hi, yes, screen performance has seemed a bit slow over the last hours. For example, dragging windows across the screen stutters. It seems like connection lag, not a CPU problem or something like that. So the WVD gateway again, I guess.
Also over here no disconnects yet today!
A reaction from Microsoft we got today:
"we have some disconnection issues but the Product group didn’t provide any updates regard changes but all we know that they are investigating and should provide a fix."
Well, if they want to provide a fix, at least they acknowledge something is broken :).
03-12-2020 04:12 AM
Although the problem seems widespread in EU West, it isn't the case for all WVD connections. Our own internal WVD works fine, and at one client we're even seeing some users with 4 bars of WVD connectivity and decent (not great) performance while others in the same office are experiencing 1 bar and piss-poor performance. FYI, standard RDP connectivity remains normal at those times. It's 100% an issue with (some of?) their WVD gateway entry points in EU West.
03-12-2020 04:18 AM
Did anyone maybe try to make a second WVD tenant within the same Azure (onmicrosoft.com) tenant? Would that make a difference?
I really wonder how MS configures this. It seems some customers are really assigned to certain gateways (clusters).
03-12-2020 04:53 AM
I doubt that would make for a good test case. We see different behaviour within a single tenant, so my guess at the moment would be that the load balancing mechanism isn't based on tenancy.
They do need some sort of health monitoring mechanism though. It's bad enough that we have to tell our customers it's an Azure problem and out of our hands, but not even having a clear monitoring mechanism other than this forum to determine whether it is Azure-side is worrisome.
03-12-2020 05:03 AM
Apparently they are aware of issues in EU West with Recovery Vault. Either the problem is more widespread than they know or are acknowledging, or their attempts at fixing the issue are causing issues elsewhere in the same datacenter.
03-12-2020 05:41 AM
03-12-2020 05:44 AM - edited 03-12-2020 05:53 AM
Your filter is too narrow. On your screen you can click through to the only active Health Issue at the moment: the Recovery Vault one.
I logged a case for the WVD performance degradation as well. I'd suggest we all report them. If enough people complain, they will create a Health issue for it, I guess. Given that the problems have been happening for more than a week now and they're still not "aware" of the issue, it's the only way forward I see.
03-13-2020 03:41 AM
The Backup health case has been "resolved" but the latency issues with WVD remain.
I have an open ticket but awaiting feedback. If you run https://azure.microsoft.com/en-us/services/virtual-desktop/assessment/ it sometimes peaks over 100 ms but I'm not sure if that's a valid test. Would be nice to get some sort of historic graph on that. I assume it just pings an end-point so we could do it ourselves but haven't found what exact end-point to ping yet.
03-13-2020 09:13 AM
03-16-2020 02:01 AM - edited 03-16-2020 02:02 AM
Just so we're all on the same page here: is everyone experiencing these issues connecting to infrastructure in the West Europe region?
Further breakdown: users usually have 1-2 bars on their WVD connection while experiencing the issue, although that's a poor metric of course. Sometimes the latency is present with 4 bars as well. I'm 100% certain it's the gateway infrastructure though. If I RDP straight to the VMs I get 4 bars and no latency issues at all.
I have escalated the case (through Ingram) to severity A but still not even an acknowledgement from MS. Does anyone else have any open tickets with MS? Mine is: 120031223001948 if you want to link them.
03-16-2020 03:48 AM
I will try to open a ticket and link.
My biggest clients are now logging in over a site-to-site VPN, and then it's OK.
03-16-2020 05:54 AM - edited 03-16-2020 05:55 AM
Just got complaints from our client as well: they are working with 9 per VM and feedback is slow.
Thinking about migrating them back to our own datacenter environment where we can troubleshoot the whole infrastructure.
03-16-2020 06:19 AM - edited 03-16-2020 06:19 AM
Just got off the line with an MS engineer. He did little more than run PsPing:
psping -t rdweb.wvd.microsoft.com:443
You see a lot of spikes. How bad it gets depends on which site you run it from, but all sites show the spiking. Naturally, you don't see such spikes toward other internet infrastructure, which rules out a local issue.
After seeing that he is escalating once more internally and will provide feedback.
I suggest we all try running these continuously and keep comparing. This is a screenshot of the current "performance", although I do have to mention it's a whole lot better than this morning.
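For those who can't run PsPing everywhere, a small Python sketch along these lines could collect comparable numbers. It times how long a TCP connection takes to open, which I assume is roughly what PsPing measures when given a host:port target, and then summarizes the samples. The function names, the spike threshold (3x the median), and the demo numbers are all my own assumptions for illustration.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=3.0):
    """Return the time in milliseconds to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def sample(host, port, count=20, interval=1.0):
    """Collect `count` connect-time samples, one every `interval` seconds."""
    results = []
    for _ in range(count):
        results.append(tcp_connect_ms(host, port))
        time.sleep(interval)
    return results


def summarize(samples_ms, spike_factor=3.0):
    """Report median, max, failed connects, and samples above spike_factor x median."""
    ok = [s for s in samples_ms if s is not None]
    if not ok:
        return {"median": None, "max": None, "failed": len(samples_ms), "spikes": 0}
    median = statistics.median(ok)
    spikes = sum(1 for s in ok if s > spike_factor * median)
    return {
        "median": median,
        "max": max(ok),
        "failed": len(samples_ms) - len(ok),
        "spikes": spikes,
    }


# Offline demo on made-up numbers (milliseconds); None marks a failed connect.
demo = [21.0, 19.5, 22.3, 180.4, 20.1, None, 23.0]
print(summarize(demo))
```

To measure for real, something like `summarize(sample("rdweb.wvd.microsoft.com", 443))` would use the same endpoint as the PsPing command above. Running it against a second, unrelated host at the same time gives a baseline for ruling out the local network.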
03-16-2020 06:27 AM
03-16-2020 07:10 AM
03-16-2020 07:17 AM
I just tested more on our environment.
The only difference is that I am now going through MS RD Gateway infrastructure.
This is exactly the issue that our users are experiencing (I even think this is also related to the disconnect issue)
03-16-2020 08:28 AM
03-16-2020 08:43 AM
03-16-2020 09:11 AM
@knowlite We've had multiple reports regarding high latency in the Europe region and are investigating. Thanks for letting us know. I will circle back to this thread once I know more.
03-16-2020 09:55 AM
Status update for everyone:
The MS engineer told me that the PG team has applied a policy change to mitigate the latency issues we are seeing. They did ask us to reboot the WVD hosts to apply this change, so he advises everyone on this thread to do so.
Afterwards, the WVD deployments I rebooted have decent performance, but then we had decent performance before the reboot as well, given the time of day. I allowed him to downgrade the issue to severity B, and we will check tomorrow morning whether the problem persists.
He did also acknowledge that given the Corona outbreak the WVD infrastructure has a peak in usage and the infrastructure team are looking at improvements to assist in that area.
So I'm not sure if their fix actually resolves the issue or if it's just a case of increasing capacity for WVD after the increased usage of the last weeks. If it's the second case, I assume the problems will persist for some time. I can move single-server customers to plain RDP over VPN without too many changes, but customers who require load balancing will need a special temporary solution then :-S
03-16-2020 11:32 AM
03-16-2020 11:39 AM
I know it is after business hours and after rebooting the host, but here is the current connection status:
03-16-2020 01:20 PM
We have the same issue in Canada, with different customers, some in Canada Central and some in Canada East. Both are getting random but recurring disconnects. This is not related to a specific VM: I can RDP to port 3389 the whole day without any disconnect, while my customer complains about multiple disconnects (over 5 per user per day) that can take from 3 up to 30 minutes to come back. When a disconnect occurs, it affects several users at the same time, on different servers and with different session start times, with error 14 "Unexpected network disconnect" or "50331694 AutoReconnect due to Network Error".
Do you have an ETA for the fix for this problem? Customers need to work remotely more than ever due to the current situation. The Azure gateway is very unreliable. Do you support a way to connect a Windows Server 2019 gateway to WVD?
03-16-2020 02:56 PM