Azure VMs stuck at status unknown or updating (solution)

Hi all,

 

I want to share this action plan.


Issue after deallocating Azure VMs:

- Azure Portal: the VM status is "Unknown"

- PowerShell: the VM ProvisioningState is "Updating"
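
For anyone who wants to check this from a script rather than the portal, here is a minimal Python sketch that reads the JSON printed by `az vm get-instance-view` and pulls out the two states listed above. The field names (`provisioningState`, `instanceView.statuses[].code`) reflect my understanding of that output and should be checked against your own CLI version. The sample document and the function name `summarize_vm_state` are made up for illustration.

```python
import json


def summarize_vm_state(vm: dict) -> dict:
    """Pull provisioning and power state out of an instance-view document."""
    statuses = (vm.get("instanceView") or {}).get("statuses") or []
    power = None
    for status in statuses:
        code = status.get("code", "")
        # Power state codes are assumed to look like "PowerState/deallocated".
        if code.startswith("PowerState/"):
            power = code.split("/", 1)[1]
    provisioning = vm.get("provisioningState")
    return {
        "provisioning": provisioning,
        "power": power,  # None when no power state is reported
        "updating": provisioning == "Updating",
    }


# Hypothetical sample, not captured from a real VM.
SAMPLE = json.loads("""
{
  "name": "vm-example",
  "provisioningState": "Updating",
  "instanceView": {
    "statuses": [
      {"code": "ProvisioningState/updating", "level": "Info"}
    ]
  }
}
""")

print(summarize_vm_state(SAMPLE))
```

A single "Updating" reading is normal while an operation is in progress. What matches the issue here is the state staying at "Updating" long after the deallocation was requested.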

 

Cause:

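The VMs are stuck because the deallocation did not complete cleanly on the Azure platform side: a lock on an internal resource was never released (see the update below).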

 

Resolution:

Please raise a ticket with Azure support to get the VMs unlocked.

 

Update 05/18/2023

The Azure product group (PG) will fix the code to avoid these failures in the future. They said:

"When there are multiple thread updating the same internal resource, the code take a lock to avoid saving conflict, one of the code paths fail to release the lock in one specific partition where this customer’s data lives, it make consequently update for same resource return a retry error, we fix the code by ensuring the lock is released”. 

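To make the explanation above more concrete, here is a small Python sketch of that kind of bug. It is only an illustration of the pattern described in the quote, not Azure's actual code. The class `Resource`, the exception `RetryError`, and the method names are all hypothetical.

```python
import threading


class RetryError(Exception):
    """Raised when the resource lock is already held by another update."""


class Resource:
    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0

    def update_buggy(self, fail: bool = False) -> None:
        if not self._lock.acquire(blocking=False):
            raise RetryError("resource is locked, retry later")
        if fail:
            # Bug: this code path exits without releasing the lock.
            raise ValueError("save failed")
        self.version += 1
        self._lock.release()

    def update_fixed(self, fail: bool = False) -> None:
        if not self._lock.acquire(blocking=False):
            raise RetryError("resource is locked, retry later")
        try:
            if fail:
                raise ValueError("save failed")
            self.version += 1
        finally:
            # Fix: the lock is released on every code path.
            self._lock.release()


def demo(method_name: str) -> str:
    """Fail one update, then try a second update on the same resource."""
    resource = Resource()
    update = getattr(resource, method_name)
    try:
        update(fail=True)
    except ValueError:
        pass
    try:
        update()
        return "ok"
    except RetryError:
        return "retry error"


print(demo("update_buggy"))  # retry error
print(demo("update_fixed"))  # ok
```

With the buggy path, every later update on the same resource gets a retry error until something releases the lock. That fits the symptom of VMs staying at "Updating" until support intervenes.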
 

1 Reply
Hello @MathieuVandenHautte

1. Verify Azure Service Health: Check the Azure Service Health dashboard or the Azure status page to see if there are any ongoing service issues or outages that could be affecting the VMs' status. If there are any known issues, wait for the resolution before proceeding.

2. Check the VM diagnostics: Use Azure Monitor or the Azure Diagnostics extension to gather diagnostic information about the VMs. Check the logs and metrics to identify any specific errors or issues that could be causing the VMs to be stuck in an unknown or updating state.

3. Restart the VMs: Try restarting the VMs from the Azure Portal or using PowerShell commands. This can help refresh the VM's state and resolve any temporary issues. Wait a few minutes to see if the VMs' status changes to running or another appropriate state.

4. Review the VM extensions: If you have any VM extensions installed, such as custom script extensions or Azure monitoring extensions, check if they are causing any conflicts or errors. Temporarily disable or remove the extensions, restart the VMs, and observe if the status updates correctly.

5. Check the Virtual Machine Scale Sets (VMSS), if applicable: If the VMs are part of a VMSS, review the scale set configuration, including the health probes, load balancer settings, and autoscale rules. Ensure that the scale set configuration is correct and not causing any issues with the VMs' status.

6. Reimage or redeploy the VMs: If the above steps don't resolve the issue, consider reimaging or redeploying the VMs. Reimaging replaces the OS disk with a fresh copy of the image, so anything stored on the OS disk is lost. Redeploying moves the VM to a new underlying host and keeps the OS and data disks, but data on the temporary disk is lost. Both cause downtime, so make sure to take proper backups before performing these actions.

7. Contact Azure Support: If none of the above steps resolve the issue, contact Azure Support for further assistance. Provide them with the details of the issue, the troubleshooting steps you have already taken, and any relevant diagnostic information you have gathered.

Always exercise caution when making changes to your VMs, and ensure you have backups or snapshots of critical data before performing any potentially disruptive actions.