Anyone experiencing session hosts becoming unavailable at random?

Since the end of last week we have had three occasions where one of the session hosts randomly becomes unavailable. This happened in two separate AVD environments.

  • Users get kicked out of their session and cannot reconnect.
    • The user sessions are still marked as Active/Disconnected according to the Azure portal.
  • We cannot RDP to the session host through the internal network.

After we shut down and reboot the session host, everything works fine again.

We noticed the following notable things:

  1. There are no event logs generated at all, starting 30-60 min prior to the 'crash'.
  2. Since the 28th of October Event Viewer is getting spammed by the following warning:
    1. Microsoft.RDInfra.RDAgent.Service.AgentUpdateStateImpl
      1. Unexpected last recorded state
  3. The "Remote Desktop Services Infrastructure Agent" was updated on the 25th of October to version 1.0.5555.1008
  4. The "Remote Desktop Services SxS Network Stack" was updated on the 31st of October to version 1.0.2208.17300
    1. This is also the first day that we experienced the problem.
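
One way to check other hosts for the silent window described in point 1 is to look for gaps in exported event timestamps. This is only a minimal sketch in Python, assuming the events have already been exported as ISO 8601 timestamps; the `find_log_gaps` helper, the 30-minute threshold and the sample data are illustrative assumptions and not part of any AVD tooling.

```python
from datetime import datetime, timedelta


def find_log_gaps(timestamps, min_gap=timedelta(minutes=30)):
    """Return (start, end) pairs where no event was logged for at least min_gap."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = []
    # Compare each event with the next one in chronological order.
    for earlier, later in zip(times, times[1:]):
        if later - earlier >= min_gap:
            gaps.append((earlier, later))
    return gaps


# Hypothetical sample data: events stop at 09:10 and resume after a reboot at 10:05.
sample = [
    "2022-10-31T09:00:00",
    "2022-10-31T09:05:00",
    "2022-10-31T09:10:00",
    "2022-10-31T10:05:00",
    "2022-10-31T10:06:00",
]

for start, end in find_log_gaps(sample):
    print(f"No events between {start} and {end} ({end - start})")
```

A gap that ends at the reboot time would match what we saw; a gap at any other time is probably unrelated.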

 

I have yet to find anything on this problem. Is anyone else experiencing this with their AVD environments?

91 Replies

@tomdw

Which Azure region are you in, and what is your AV/firewall stack?

We are running everything in the West Europe region. We don't have any special AV/firewall.
We noticed the same issue on multiple tenants. Similar timeframe and same versions of RDAgent and SxS Network Stack.
Did you open a ticket with MS? We noticed this in West Europe and UK West.
Good to hear we are not alone. We have submitted a ticket with MS today.
We have the same problem here in Belgium.

@tomdw Yes, we have the same issue with version 1.0.5555.1008: the AVD hosts are unavailable to connect to and take a long time in the upgrading process. This is AVD in LATAM regions.

Yes, we had this today for the first time. It only affected 1 host out of 10, completely at random. All hosts are using the same agent version, 1.0.5555.1008.

We drain the bad host so users can log in to another one.

Metadata in East US, VMs in Australia East.

Just curious, any useful feedback? (I also opened a ticket, nothing so far.)
We are also waiting for a Microsoft reply.
For now, as a "bandage", we are rolling back to SxsStack 1.0.2207 whenever it occurs.
This makes the users very unhappy!

Same here. I have one on 1.0.5555.1008 that works fine, and the other one, on 1.0.5555.1200, is always failing...

@All FYI:
Multiple AVD hosts on version 1.0.5555.1008 are showing the same issues.
As a test case, we reverted the RD Agent to version 1.0.4739.1000 on one AVD VM. It is stable for now...
Hi,
We also had the same problem yesterday on 2 out of 5 session hosts.
We also created an MS ticket.

Has anyone heard anything from Microsoft yet?
Just curious, how do you go about logging an MS ticket?

@ITCE_Bert

How did you revert the RD Agent version? Thank you!

@tomdw Any updates on this from anyone? To be honest, I have not seen a repeat since Friday (it is now Tuesday afternoon here in NZ).

Agents are still on 1.0.5555.1008, but our 10 session hosts have been rock solid for 4 days ** TOUCH WOOD **

We still had repeats over the weekend and yesterday (7/11).
We have been reverting RD Agents to version 1.0.4739.1000 because hosts on that version remain stable, and we disabled the scheduled agent updates in the host pool settings.
This docs page was updated today:
https://learn.microsoft.com/en-us/azure/virtual-desktop/whats-new-agent
They now talk about v 1.0.5555.1008, but no word about v 1.0.5555.1200
...
We are sticking with our fix for now.
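
To keep track of which hosts are on a build newer than the one we roll back to, a comparison along these lines can help. It is a minimal sketch, assuming the agent versions have already been collected per host by some other means; the host names and the `hosts_newer_than` helper are hypothetical, and the version numbers are the ones mentioned in this thread.

```python
def parse_version(version):
    """Turn '1.0.5555.1008' into (1, 0, 5555, 1008) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def hosts_newer_than(agent_versions, baseline):
    """Return the hosts whose agent version is newer than the baseline, sorted by name."""
    base = parse_version(baseline)
    return sorted(
        host
        for host, version in agent_versions.items()
        if parse_version(version) > base
    )


# Hypothetical inventory: host names are made up, versions are from this thread.
inventory = {
    "avd-host-0": "1.0.4739.1000",
    "avd-host-1": "1.0.5555.1008",
    "avd-host-2": "1.0.5555.1200",
}

print(hosts_newer_than(inventory, "1.0.4739.1000"))
```

Comparing the parts as integers rather than as strings avoids mis-ordering builds whose numbers have a different digit count.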
Thanks mate, good info. Perhaps we have just been lucky, but it's a ticking time bomb.

Can you please advise, at a high level, how you force the older agent?