SCVMM update host agent upgrade fails


Hi all,

We recently updated SCVMM 2019 to SCVMM 2019 UR2. The update ran smoothly, and SCVMM was back up and running. As a post-update task, the SCVMM host agents on the Hyper-V hosts managed by SCVMM have to be updated. We have done this during previous updates and upgrades and never ran into an issue before.

This time we did run into an issue, and it had a major impact. After updating the agent on a single host and seeing that the update was applied correctly, we started applying the update to multiple hosts simultaneously (all running WS2019 DTC). Suddenly we noticed a loss of connectivity to VMs on 2 of our Hyper-V clusters. 1 host in cluster 1 and 2 hosts in cluster 2 failed the SCVMM agent update job, and also failed to drain themselves of their running virtual machines. Closer inspection showed that vmms (Virtual Machine Management Service) was stuck in the Stop Pending state on all 3 hosts. After we manually killed the service, it started back up, and the VMs eventually became available again.

We soon noticed that all affected VMs had suffered an unexpected shutdown, so we spent considerable time checking whether they were still functioning correctly. In some cases they weren't, and we had to manually fix the application or resort to a restore.

We have already decided that in future updates/upgrades we will drain hosts before attempting to update their SCVMM agent. I am, however, left wondering whether this is a new issue, or whether it was already known and we somehow overlooked it in the countless times we read through the SCVMM update/upgrade manual.
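
To make the intended order concrete, here is a rough sketch of the drain-first, one-host-at-a-time sequence we plan to follow. It is only an illustration in Python: the four callables (`drain`, `running_vm_count`, `update_agent`, `resume`) are hypothetical placeholders for whatever tooling performs those steps, not SCVMM or Hyper-V APIs.

```python
from typing import Callable, Iterable, List


def update_agents_serially(
    hosts: Iterable[str],
    drain: Callable[[str], None],            # hypothetical: move VMs off the host
    running_vm_count: Callable[[str], int],  # hypothetical: VMs still on the host
    update_agent: Callable[[str], bool],     # hypothetical: True if the update succeeded
    resume: Callable[[str], None],           # hypothetical: put the host back in service
) -> List[str]:
    """Update hosts one at a time and return the hosts that were updated."""
    updated: List[str] = []
    for host in hosts:
        drain(host)
        # Only touch the agent once the host is confirmed empty.
        if running_vm_count(host) != 0:
            break  # stop the whole run instead of risking running VMs
        if not update_agent(host):
            break  # investigate the failure before moving on to the next host
        resume(host)
        updated.append(host)
    return updated
```

The run stops at the first host that cannot be emptied or fails its update, so a problem stays limited to a single, already drained host instead of hitting several hosts at once.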
