SOLVED

Started a few days ago at most: Agent on Session host no longer communicates properly with WVD


My host pool and tenant are probably from the May timeframe. The agent has version 1.0.1006.2006.

 

I notice that the agent posts a heartbeat when the session host restarts and then posts no more.

 

When connecting, the RDBroker appears to decide that no session host is available. 

 

I also notice that session hosts that are shut down are marked as Available in Status according to Get-RdsSessionHost. 

 

Have you made some recent (incompatible/bad) upgrade to WVD that I should be aware of? Do I need to do anything, or is this an issue on the MSFT side?

 

Here are diagnostics from one connect attempt:

PS C:\Users\johan> (Get-RdsDiagnosticActivities -TenantName "prod-fdt-avance" -ActivityId 8b9460d6-a900-4940-85ac-7b46031e0000 -Detailed).Errors


ErrorSource : RDBroker
ErrorOperation : OrchestrateSessionHost
ErrorCode : -2146233088
ErrorCodeSymbolic : ConnectionFailedRDAgentBrokerConnectionNotFound
ErrorMessage : RD Agent from host 6ce85f83-63c8-4c87-adb2-1d7d33dc12f7 is not connected to the Broker instance is
associated with during orchestration request.
ErrorInternal : False
ReportedBy : RDGateway
Time : 8/13/2019 12:02:32 PM

 

Here is the session host status (the last two have been shut down since the end of May):

PS C:\Users\johan> Get-RdsSessionHost -TenantName "prod-fdt-avance" -HostPoolName "avance"


SessionHostName : prod-rds-vm00.fdtcloud.se
TenantName : prod-fdt-avance
TenantGroupName : Default Tenant Group
HostPoolName : avance
AllowNewSession : True
Sessions : 1
LastHeartBeat : 8/13/2019 10:01:52 AM
AgentVersion : 1.0.1006.2006
AssignedUser :
OsVersion : 10.0.14393
SxSStackVersion : rdp-sxs190614002
Status : Available
UpdateState : Succeeded
LastUpdateTime : 8/6/2019 8:06:44 PM
UpdateErrorMessage :

 


SessionHostName : prod-rds-vm01.fdtcloud.se
TenantName : prod-fdt-avance
TenantGroupName : Default Tenant Group
HostPoolName : avance
AllowNewSession : True
Sessions : 0
LastHeartBeat : 5/28/2019 2:11:07 PM
AgentVersion : 1.0.407.7
AssignedUser :
OsVersion : 10.0.14393
SxSStackVersion : rdp-sxs190329002
Status : Available
UpdateState : Succeeded
LastUpdateTime : 5/28/2019 9:10:18 AM
UpdateErrorMessage :

 

 

SessionHostName : prod-rds-vm02.fdtcloud.se
TenantName : prod-fdt-avance
TenantGroupName : Default Tenant Group
HostPoolName : avance
AllowNewSession : True
Sessions : 0
LastHeartBeat : 5/28/2019 2:19:21 PM
AgentVersion : 1.0.407.7
AssignedUser :
OsVersion : 10.0.14393
SxSStackVersion : rdp-sxs190329002
Status : Available
UpdateState : Succeeded
LastUpdateTime : 5/28/2019 9:28:22 AM
UpdateErrorMessage :
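
A quick way to spot entries like the last two above, where Status still reads Available long after the last heartbeat, is to compare LastHeartBeat against the current time. The following is only a minimal sketch. It assumes the objects returned by Get-RdsSessionHost expose LastHeartBeat as a value that casts to DateTime and Status as shown in the output. The function name Get-StaleSessionHost and the 10-minute threshold are made up for illustration, and the heartbeat times are assumed to be in UTC, which may not hold in every environment:

```powershell
# Hypothetical helper (not part of the WVD module): emits session hosts that
# report Status "Available" although their last heartbeat is older than a threshold.
function Get-StaleSessionHost {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$SessionHost,

        # Assumed threshold; pick whatever fits your environment.
        [int]$MaxAgeMinutes = 10,

        # Reference time. LastHeartBeat is assumed to be reported in UTC.
        [datetime]$Now = [datetime]::UtcNow
    )
    process {
        $age = $Now - [datetime]$SessionHost.LastHeartBeat
        if ($SessionHost.Status -eq 'Available' -and $age.TotalMinutes -gt $MaxAgeMinutes) {
            $SessionHost
        }
    }
}

# Usage (tenant and host pool names taken from the output above):
# Get-RdsSessionHost -TenantName "prod-fdt-avance" -HostPoolName "avance" |
#     Get-StaleSessionHost |
#     Select-Object SessionHostName, Status, LastHeartBeat
```

With the listing above, this would be expected to return prod-rds-vm01 and prod-rds-vm02, whose heartbeats date from 5/28/2019.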

 

23 Replies

@Johan_Eriksson When we started the two shut down session hosts, both went on to upgrade and got an upgrade failure. This put them both in Status UpgradeFailed. 

The odd thing is that this caused connections to succeed. (Maybe sessions were incorrectly being routed to session hosts that are shut down.)

This leads me to suspect that the marking of shut down session hosts as "Available" is part of the bug that was somehow introduced in a recent deploy by Microsoft.

@Johan_Eriksson I'm having what I think is a related issue. Our setup is also from May 2019, and in the last 48 hours we're having constant issues.

 

Ever since our virtual desktop VMs auto-updated to Remote Desktop Services Infrastructure Agent 1.0.1006.2006, they keep falling into a bad state of "NoHeartbeat" after only a few minutes. When the VMs first boot, the status is "Available" and "LastHeartBeat" reflects the time of boot. We're able to connect to the pool without issue and can maintain sessions, but cannot start new sessions because the status changes to "NoHeartbeat".

 

If an admin restarts the "RDAgentBootloader" service, the status changes to "Available" and "LastHeartBeat" reflects the time of the service restart.

 

For reference, I am checking on the status of the pool with the "Get-RdsSessionHost" cmdlet.

As a quick workaround for my issue, I had to set up a task with Task Scheduler that runs NET STOP and NET START of RdAgentBootloader every 1 min, starting at system boot and continuing indefinitely. This gives us about 55 seconds out of every minute where the session hosts are "Available", and 5 seconds where they're "NoHeartbeat". It's an awful hack, but it's better than nothing until MSFT can push a version of the RD Agent that can actually post heartbeats.

Thanks for bringing this to our attention. Let me review this and get back to you.

Thanks for bringing this to our attention. Can you reply with the output of Get-RdsSessionHost, please? Thanks.

This was from a screenshot I took right before implementing the workaround that restarts the RD Agent Bootloader every 1 min.

@Roop_Kiran_Chevuri 

I am also having this issue. Attached is a screenshot of the RdsSessionHost output: it was working yesterday and is failing today. I have tried rebooting the desktops.

Cannot_login.PNG

 

Thanks. Can you explain the exact scenario you are seeing?

@Roop_Kiran_Chevuri Reiterating what was written above -- all session hosts start out as "Available" up to 1 min after boot, and then fall into "NoHeartbeat" status. Once this happens, it's not possible to connect to the host pool using Remote Desktop -- it displays this message:

 

14f8e1a0-7a28-45a5-b6f5-4f29097f5794.png

 

If new session host VMs are created from an image that worked back in May, as soon as they update to Remote Desktop Services Infrastructure Agent 1.0.1006.2006, they drop to "NoHeartbeat" status after 1 min. Restarting the new session hosts exhibits the same behavior as the existing session hosts -- they only stay Available for up to 1 min after boot.

@Roop_Kiran_Chevuri 

No worries,

Yesterday I set up a fresh pool of dedicated machines. I entitled three users and they logged in successfully. None of the three has done anything more than log in and open a web browser.

This morning, we are unable to connect with any of the users. We can connect to the pool gateway but to none of the session hosts.

No_Resources.PNG

 

@Roop_Kiran_Chevuri 

 

We have the same problem as described by GuyPaddock. The session host starts with status 'Available', which changes shortly afterwards to 'NoHeartbeat'. As a consequence, our users got the '...no available resources' message. As mentioned, restarting the RdAgentBootloader service helps, for a minute...

 

Hope this will be solved soon.

Would you mind sending me your Task Scheduler settings, please? I am in a bind and, by reaching out, would like to skip the Task Scheduler testing you have already done. Thanks a million! @GuyPaddock

@iconicmlee Sure!

 

I believe you will need to configure the task to run with highest privs under an admin account, since it needs to be able to stop and start system services:

<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2019-08-14T17:06:39.646478</Date>
    <Author>CLOUD\guy</Author>
    <URI>\ITSA-46 Workaround</URI>
  </RegistrationInfo>
  <Triggers>
    <BootTrigger>
      <Repetition>
        <Interval>PT1M</Interval>
        <StopAtDurationEnd>false</StopAtDurationEnd>
      </Repetition>
      <Enabled>true</Enabled>
    </BootTrigger>
  </Triggers>
  <Principals>
    <Principal id="Author">
      <UserId>S-1-5-21-2615066866-973913876-715106564-1108</UserId>
      <LogonType>Password</LogonType>
      <RunLevel>HighestAvailable</RunLevel>
    </Principal>
  </Principals>
  <Settings>
    <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
    <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
    <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
    <AllowHardTerminate>true</AllowHardTerminate>
    <StartWhenAvailable>false</StartWhenAvailable>
    <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
    <IdleSettings>
      <StopOnIdleEnd>true</StopOnIdleEnd>
      <RestartOnIdle>false</RestartOnIdle>
    </IdleSettings>
    <AllowStartOnDemand>true</AllowStartOnDemand>
    <Enabled>true</Enabled>
    <Hidden>false</Hidden>
    <RunOnlyIfIdle>false</RunOnlyIfIdle>
    <WakeToRun>false</WakeToRun>
    <ExecutionTimeLimit>PT72H</ExecutionTimeLimit>
    <Priority>7</Priority>
  </Settings>
  <Actions Context="Author">
    <Exec>
      <Command>NET</Command>
      <Arguments>STOP RdAgentBootloader</Arguments>
    </Exec>
    <Exec>
      <Command>NET</Command>
      <Arguments>START RdAgentBootloader</Arguments>
    </Exec>
  </Actions>
</Task>

 

Also: once the task is created, you have to restart the VM to kick it off. Just running the task manually doesn't cause it to recur. It also seems like editing the task causes it to stop running until the next reboot.
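
For anyone who would rather import the task from the command line than through the Task Scheduler UI, something like the following might work. It is only a sketch and is untested against this exact XML. It assumes the XML above was saved as C:\Temp\RdAgentWorkaround.xml (a made-up path), that the UserId SID in the XML has been replaced with one valid in your own domain, and that the task name is your own choice. Register-ScheduledTask with -Xml, -User and -Password comes from the built-in ScheduledTasks module:

```powershell
# Sketch only: path, task name and account are placeholders (assumptions).
$xmlPath  = 'C:\Temp\RdAgentWorkaround.xml'   # hypothetical location of the XML above
$taskName = 'RdAgentBootloader restart workaround'
$cred     = Get-Credential -Message 'Admin account the task should run as'

# LogonType "Password" in the XML means the account password must be supplied.
Register-ScheduledTask -TaskName $taskName `
    -Xml (Get-Content -Path $xmlPath -Raw) `
    -User $cred.UserName `
    -Password $cred.GetNetworkCredential().Password

# As noted above, the boot trigger only starts repeating after a reboot.
# Restart-Computer
```

Run this from an elevated PowerShell session on the session host itself.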

One other power user tip: I found a PowerShell cmdlet that makes it really easy to monitor the status of the session hosts so you can see the scheduled task in action.

 

You can get the cmdlet here:
http://wragg.io/watch-for-changes-with-powershell/

 

You can run it to monitor session hosts this way:

{ Get-RdsSessionHost -TenantName "TENANT" -HostPoolName "HOST POOL" } | Watch-Command -Verbose -Continuous

 

If any values like status, heartbeat time, session count, etc. change, it prints the output.

@GuyPaddock Thanks for reporting the issue. This is a service-side issue, and we fixed it a few minutes ago. Can you please verify whether it is working now? We would like to investigate further if that's not the case. Thanks.

@soloji Thanks for reporting the issue. This is a service-side issue, and we fixed it a few minutes ago. Can you please verify whether it is working now? We would like to investigate further if that's not the case. Thanks.

best response confirmed by Christian_Montoya (Microsoft)
Solution

@ClearForward Thanks for reporting the issue. This is a service-side issue, and we fixed it a few minutes ago. Can you please verify whether it is working now? We would like to investigate further if that's not the case. Thanks.
