Anyone experiencing session hosts becoming unavailable at random?

Since the end of last week we have had three occasions where one of the session hosts randomly becomes unavailable. This happened in two separate AVD environments.

  • Users get kicked out of their session and cannot reconnect.
    • The user sessions are still marked as Active/Disconnected according to the Azure portal.
  • We cannot RDP to the session host through the internal network.

After we shut down and restart the session host, everything works fine again.

 

We noticed the following notable things:

  1. There are no event logs generated at all, starting 30-60 min prior to the 'crash'.
  2. Since the 28th of October Event Viewer is getting spammed by the following warning:
    1. Microsoft.RDInfra.RDAgent.Service.AgentUpdateStateImpl
      1. Unexpected last recorded state
  3. The "Remote Desktop Services Infrastructure Agent" was updated on the 25th of October to version 1.0.5555.1008
  4. The "Remote Desktop Services SxS Network Stack" was updated on the 31st of October to version 1.0.2208.17300
    1. This is also the first day that we experienced the problem.

 

I have yet to find anything on this problem. Is anyone else experiencing this with their AVD environments?

92 Replies

@ITCE_Bert 

Thanks Bert. I got the info below today; we had a Teams call yesterday. We are supposed to deliver a memory dump, but the issue has not returned today, so that makes me wonder...

Yesterday:

I have checked internally and there are some problems related to the latest Windows Defender updates, 4.18.2210.4 and 4.18.2210.5, and FSLogix.
But to confirm that this is related, I need a full dump to check.
In the meantime, you can roll back to a previous version of the Defender AV platform to try to mitigate the issue. The last known good version is 4.18.2209.7.


And a few minutes ago I got this:

I hope that you are great today.

Good news: if the issue is related to Defender, the Product Group is working to deliver a new update to fix this.

 

I have gotten a hopeful response from Microsoft.

They believe it's been caused by Windows Defender. There has now been an update released that is available to download through Windows Update.

(version 4.18.2210.6)
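For anyone checking a fleet of hosts, here is a minimal Python sketch that sorts a Defender platform version string into the buckets mentioned in this thread. The version numbers come only from the posts above (affected: 4.18.2210.4 and 4.18.2210.5, last known good: 4.18.2209.7, fix: 4.18.2210.6); the function names and bucket labels are made up for illustration, and how you collect the version string from each host is left out.

```python
from typing import Tuple

# Versions as reported in this thread, not an official Microsoft list.
AFFECTED = {(4, 18, 2210, 4), (4, 18, 2210, 5)}
LAST_KNOWN_GOOD = (4, 18, 2209, 7)
FIXED = (4, 18, 2210, 6)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.18.2210.6' into (4, 18, 2210, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def classify_defender_platform(version: str) -> str:
    """Hypothetical helper: bucket a Defender platform version string."""
    parsed = parse_version(version)
    if parsed in AFFECTED:
        return "affected"
    if parsed >= FIXED:
        return "fixed-or-newer"
    if parsed <= LAST_KNOWN_GOOD:
        return "last-known-good-or-older"
    # Anything between the known-good and affected builds is not covered
    # by the information in this thread.
    return "unknown"


if __name__ == "__main__":
    for v in ("4.18.2209.7", "4.18.2210.4", "4.18.2210.5", "4.18.2210.6"):
        print(v, "->", classify_defender_platform(v))
```

Comparing parsed tuples rather than raw strings avoids mistakes such as "4.18.2210.10" sorting before "4.18.2210.6".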
Indeed, had the same feedback. All our AVD hosts are updated, let's see how the day goes.
I checked some hosts. Here as well, they are updated to 4.18.2210.6. Fingers crossed.
So far, no issues on Monday and Tuesday. Same everywhere? I asked Microsoft to keep the case open until the end of this week, as the issue was very random.
Same here. No problems on Monday and Tuesday.
However, the event log continues to be flooded with warnings about:
Microsoft.RDInfra.RDAgent.Service.AgentUpdateStateImpl
Unexpected last recorded state

Is this the same for you guys?
I don't see that warning anymore on our session hosts. (Others yes, but not that one)
Correction! We still see that error when the RDAgent version is 1.0.5555.1008, we don't see it on the downgraded versions. I will report this to Microsoft in our ticket also.
Here is what is still interesting about the error we still have:

Windows User with Remote Desktop Client: Connection NOT possible
Windows User via Web: Connection possible

Mac User with Remote Desktop Client: Connection possible
Mac User via Web: Connection possible

Same Session Host.
Agent version 1.0.5555.1008
Defender version 4.18.2210.6
Strange! If you have a case open with MS, please send me a private message; I don't mind sharing our MS ticket numbers.

@KristofH FYI, the feedback I got from Microsoft after mentioning this:

 

I see this error, Microsoft.RDInfra.RDAgent.Service.AgentUpdateStateImpl, in my lab too. I have asked the SME, and this error does not impact the agent at all. It is just a warning from the agent updater.

This error shows up because the agent checks the repository for a new agent version and, if there is none, it logs this unexpected error. When there is a new version in the repository, it updates without any issue.

 

Of course I responded that I find this "not normal", since it has only occurred since this version and it occurs 6 times in 2 minutes. And it shouldn't be a warning.
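To put a number on "6 times in 2 minutes" when reporting this to support, a minimal Python sketch along these lines could count the densest burst of warnings. It assumes you have already exported the timestamps of the AgentUpdateStateImpl warnings from the event log into a list of datetime objects; the export step and the function name are not from this thread and are purely illustrative.

```python
from datetime import datetime, timedelta
from typing import Iterable


def max_events_in_window(timestamps: Iterable[datetime],
                         window: timedelta = timedelta(minutes=2)) -> int:
    """Return the largest number of events that fall within any one window."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(ordered):
        # Slide the left edge forward until the span fits inside the window.
        while ts - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


if __name__ == "__main__":
    # Made-up sample data: six warnings 20 seconds apart, one much later.
    base = datetime(2022, 11, 8, 9, 0, 0)
    sample = [base + timedelta(seconds=20 * i) for i in range(6)]
    sample.append(base + timedelta(minutes=10))
    print(max_events_in_window(sample))
```

Running the same function over exports from hosts on different agent versions would show whether the burst rate really is tied to one version.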

 

Hi guys, glad to see the end of these issues. Just a note that we have had this agent version since Monday morning: 1.0.5555.1010. The connection logs for a user give a little more detail, calling it "SessionHostAgentVersion 1.0.5555.1010_hotfixProdR1", which I find interesting.

Not sure if anyone else has this version. Either way it seems stable.
There is nothing in the KB about it, but it seems normal for that to be out of date.
https://learn.microsoft.com/en-us/azure/virtual-desktop/whats-new-agent
Hi Paul, how strange... This is the one we are seeing through the AVD deep insights workbook, we don't see that suffix in the AVD session hosts overview: 1.0.5555.1008_hotfixProdR1

@KristofH Yes I witnessed it being installed on Monday morning.

We are in Aus East and have no scheduled updates configured.

 

 

Hi, same problem again today. 2 out of 5 session hosts went unavailable. I already created an MS ticket a month ago, the first time the issue occurred. Microsoft said this is not a global problem. They said this should be the solution:

- Remove the session host from the host pool
- Generate a new registration key for the VM
- Reinstall the agent and boot loader
- Restart Session Hosts

I did all these steps a month ago, but now we have the same problem again...

Does anyone have another idea?

Greets Mario

@mariolener , I am also seeing this event in my logs. I had to restart our VDI instance for a user who was seeing issues, but at the expense of the others who weren't (it's a shared instance).  We're using Sophos endpoint instead of Defender, though I'm sure some of the process is still present, even if AV duty has been handed off. 

We have not seen this issue arise again yet (AVD in Aus East, metadata in East US).

 

Note: our SH agent version is now listed as 1.0.5739.9800. As usual, there is nothing in the "What's new" KB about it, so I have no idea whether this has any fixes or includes new issues: What's new in the Azure Virtual Desktop Agent? - Azure | Microsoft Learn

 

Agent appears to have been updated sometime this week.

We do not have the issue again so far.
Our session hosts (in North+West Europe, UK West ) are all 1.0.5555.1010 and now getting auto upgraded to 1.0.5739.9800. Probably in Feb 2023 this will be mentioned in the release notes. :(

@tomdw We have the same problem on AVD agent v1.0.5739.9800; it started today.

We are running everything in the North Europe region. We have submitted a ticket with MS.