SOLVED

WVD outage / hosts unavailable with status "NoHeartbeat"


Last night, our WVD instance became unavailable, with clients unable to connect. The initial error is "ConnectionFailedNoHealthyRdshAvailableErrorMessage", and it looks like the RD server / endpoint is now in status "NoHeartbeat".

 

However, the endpoint is up & healthy when we connect directly in Azure / via tunneled RDP.

 

The RdAgentBootloader & RDAgent services are running. We've tried restarting the machine & those services, with no change in the status.

 

SessionHostName    : ***
TenantName         : ***
TenantGroupName    : Default Tenant
GroupHostPoolName  : ***
AllowNewSession    : True
Sessions           : 1
LastHeartBeat      : 11/25/2019 3:46:07 AM
AgentVersion       : 1.0.1534.2000
AssignedUser       :
OsVersion          : 10.0.18363
SxSStackVersion    : rdp-sxs190927002
Status             : NoHeartbeat
UpdateState        : Succeeded
LastUpdateTime     : 11/19/2019 12:06:32 PM
UpdateErrorMessage :

 

Any idea how we can debug this further?
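
For readers who want to find affected hosts across a pool, here is a minimal sketch of a query that would surface them. It assumes the Microsoft.RDInfra.RDPowerShell module (the one that provides the Rds cmdlets used later in this thread) is installed and already signed in. `$tenantName` and `$hostPoolName` are placeholders, not values from this thread.

```powershell
# Sketch only: assumes the Microsoft.RDInfra.RDPowerShell module is loaded
# and the session is already signed in to the WVD tenant.
# Both names below are placeholders, not values from this thread.
$tenantName   = '<TenantName>'
$hostPoolName = '<HostPoolName>'

# List every session host in the pool that has stopped reporting heartbeats.
# The property names match the fields shown in the output above.
Get-RdsSessionHost -TenantName $tenantName -HostPoolName $hostPoolName |
    Where-Object { $_.Status -eq 'NoHeartbeat' } |
    Select-Object SessionHostName, Status, LastHeartBeat, AgentVersion, SxSStackVersion
```

Comparing LastHeartBeat against the current time gives a rough idea of when each host dropped off.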

7 Replies

FWIW we've followed the simple steps here without any luck: https://docs.microsoft.com/en-us/azure/virtual-desktop/troubleshoot-vm-configuration#error--windows-...

 

  • RDAgentBootLoader is running; and nothing happens if we restart it or the machine
    • The Event Viewer shows some RD Agent logs; but they all seem to be healthy / info messages -- and are similar to those before we stopped getting heartbeats
  • psping to test port 443 to the WVD selfhost url works fine (psping.exe rdbroker.wvdselfhost.microsoft.com:443)
  • qwinsta seems to show everything working & awaiting a connection:
    • PS C:\> qwinsta
       SESSIONNAME       USERNAME        ID  STATE   TYPE   DEVICE
       services                           0  Disc
       console                            1  Conn
      >rdp-tcp#1         nick             2  Active
       31c5ce94259d4...               65536  Listen
       rdp-tcp                        65537  Listen
       rdp-sxs190927002               65538  Listen
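
The checks listed above can be run in one pass from PowerShell on the session host. This is only a sketch: the service names are taken from the posts in this thread, and `Test-NetConnection` is used here as a built-in substitute for psping rather than what the poster actually ran.

```powershell
# Sketch only: run on the affected session host.
# Service names are as written in this thread; adjust if they differ on your image.
Get-Service -Name 'RDAgentBootLoader', 'RDAgent' -ErrorAction SilentlyContinue |
    Select-Object Name, Status, StartType

# Built-in alternative to psping for checking outbound TCP 443 to the broker URL
# mentioned above.
Test-NetConnection -ComputerName 'rdbroker.wvdselfhost.microsoft.com' -Port 443 |
    Select-Object ComputerName, RemotePort, TcpTestSucceeded

# Show the session listeners; the rdp-sxs... entry corresponds to the
# SxSStackVersion reported for the session host.
qwinsta
```

As the thread shows, all of these can look healthy while the host still reports NoHeartbeat, so a clean result here does not rule out the registration problem.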

@Nicholas Semenkovich 

 

We experienced the same issue today. Came in this morning and 3 session hosts had a status of NoHeartbeat. Restarting the services/session host had no effect. Only way to fix it was to remove the session hosts from their pools and re-add them.

 

Happened again to a 4th session host this afternoon.

 

I've raised a ticket with Microsoft and sent them some logs.

@DanRobb Thanks!

 

When you say re-add session hosts (haven't done that before), do you re-install the agent and use New-RdsRegistrationInfo & Export-RdsRegistrationInfo?

best response confirmed by Nicholas Semenkovich (Contributor)
Solution

@Nicholas Semenkovich 

These are the steps I followed:

 

1. Remove the session host from the host pool:

Remove-RdsSessionHost -TenantName [TenantName] -HostPoolName [HostPoolName] -Name [SessionHostName] -Force

 

2. Get the host pool registration token (replace with New-RdsRegistrationInfo if the token has already expired)

Export-RdsRegistrationInfo -TenantName [TenantName] -HostPoolName [HostPoolName] | Select -Expand Token

 

3. Log in to the session host and update the following registry values:

Key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RDInfraAgent

Name: RegistrationToken

Value: Token Goes Here

 

Key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RDInfraAgent

Name: IsRegistered

Value: 0

 

4. Restart the RDAgentBootLoader service
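
The four steps above can be put together roughly as follows. Treat this as a sketch under assumptions rather than a tested script. The variable values are placeholders. The first half is assumed to run on a management machine that is already signed in with the Microsoft.RDInfra.RDPowerShell module. The second half is assumed to run in an elevated session on the session host itself, with the token copied across by hand.

```powershell
# --- Part 1: management machine (already signed in to the WVD tenant) ---
# Placeholder values, not taken from this thread.
$tenantName      = '<TenantName>'
$hostPoolName    = '<HostPoolName>'
$sessionHostName = '<SessionHostName>'

# Step 1: remove the stuck session host from the host pool.
Remove-RdsSessionHost -TenantName $tenantName -HostPoolName $hostPoolName `
    -Name $sessionHostName -Force

# Step 2: read the current registration token.
# If the token has expired, create a new one with New-RdsRegistrationInfo instead.
$token = Export-RdsRegistrationInfo -TenantName $tenantName -HostPoolName $hostPoolName |
    Select-Object -ExpandProperty Token
$token   # copy this value to the session host

# --- Part 2: elevated PowerShell on the session host ---
$agentKey = 'HKLM:\SOFTWARE\Microsoft\RDInfraAgent'
$token    = '<paste token here>'

# Step 3: store the token and mark the agent as unregistered.
Set-ItemProperty -Path $agentKey -Name 'RegistrationToken' -Value $token -Type String
Set-ItemProperty -Path $agentKey -Name 'IsRegistered' -Value 0 -Type DWord

# Step 4: restart the boot loader so the agent registers again.
Restart-Service -Name 'RDAgentBootLoader'
```

Once the service has restarted, querying the host with Get-RdsSessionHost should show whether it has re-registered and its Status has changed.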

 

 

@DanRobb Worked perfectly -- many thanks!

@DanRobb 

Nice one, I had gone for the more manual method of re-installing the agent. This option is much better, thanks!

 

I've got a support case open with MS as well, hopefully they will come back with something soon.

 

Ben

 

Perfect guys! Thanks for this smart solution.