Jun 09 2021 07:28 AM
Summary: I'm having errors during the session host deployment process caused by a failure of the new VM to connect to the Internet and download a .zip file. A Microsoft support engineer is advising me to use a completely different deployment method. Before I start arguing with them, I want to get a reality check from the community.
Details: I want to make available a Windows 10 multi-session desktop that has my company's applications installed. It takes quite a bit of time to install and configure these apps, so I don't want to start from scratch with the latest bare-bones Azure gallery image each time there's an application or OS update. Instead, I want to grab an Azure image one time, customize it, and then keep updating that image using the steps outlined in Windows Virtual Desktop (WVD) – Image Management : How to manage and deploy custom images (including.... That article is a year old and there have been some important changes to the UI since then, but basically the idea is that you customize and update a non-domain-joined golden image, and then when you're ready to put the updated image into production you can add one or more VMs to an existing host pool, and they will join the domain as part of the deployment process. Automation is important here because if you're trying to quickly add multiple VMs in response to new load demands, you don't want to have to do a lot of fiddling with each new VM after it's spun up.
I've had some success with this procedure, but twice in the last six weeks I've had failed deployments. The VMs are created, but there is an error during the process:
'The DSC Extension failed to execute: Error downloading https://wvdportalstorageblob.blob.core.windows.net/galleryartifacts/Configuration_3-10-2021.zip after 29 attempts: Unable to connect to the remote server.'
In a nutshell, the newly-created VM has a problem accessing the Internet, can't get this configuration file, and fails to complete the deployment process correctly. It doesn't join the domain and it isn't accessible to remote desktop clients.
I created a ticket with Azure support in April. They never found the root cause, and after a couple of weeks I successfully deployed a session host using the same procedure that had been failing. So we chalked it up to some transient back-end failure and closed the ticket. Now the problem is back and I have a second ticket open with a new technician, who also seems unable to explain the root cause. They are telling me I should not be adding my new VMs to the host pool using the Host Pool > Session Hosts > Add button, but should instead manually add the VMs to the pool after they're spun up, using the process described at https://docs.microsoft.com/en-us/azure/virtual-desktop/create-host-pools-powershell#register-the-vir....
Specifically, they said: "The Manual WVD Hostpool registration process is recommended by Microsoft and is applicable for custom / sysprep image scenario. The VM machine contains .dll files and other OS building system files which on multiple syspreps and snapshots might broke it. This has been seen on many previous cases and as per those case analysis I’m sharing the information with you. Microsoft always recommend to create a new vm with the latest snapshot (latest version VM) because of the azure VM URL whitelisting."
Can this be true? If so, it basically invalidates Robin Hobo's recommendations for image management and suggests that deploying updated session hosts is going to be a ton of work. I'm not sure why the tech is referencing "multiple syspreps," because I never sysprep a VM more than once. My sysprepped machines are captured to my Shared Image Gallery and deleted as part of that process. It's true that there are multiple snapshots involved, but the snapshots are always pre-sysprep.
I don't think the problem has to do with Azure VM URL Whitelisting, anyway (Azure Virtual Desktop required URL list - Azure | Microsoft Docs). On the VM whose deployment failed, the configuration file isn't the only thing that can't be reached: I can't access any web sites using Edge. Name resolution is working, and I can ping other VMs within my vnet, but there's clearly something wrong with Internet connectivity in general.
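For anyone who wants to reproduce the "name resolution works but nothing is reachable" symptom outside of Edge, here is a minimal sketch of a check that separates the DNS step from the outbound TCP step. It assumes Python is available on the failed VM (it is not part of the image by default), and the function name `check_endpoint` and its parameters are my own, not anything from the WVD tooling. It only tests a plain TCP connection to port 443, so it will not detect problems that show up later, such as TLS inspection or a proxy that only affects certain accounts.

```python
import socket
from urllib.parse import urlparse


def check_endpoint(url, resolve=socket.getaddrinfo,
                   connect=socket.create_connection, timeout=5.0):
    """Report whether DNS lookup and a TCP connection succeed for a URL.

    `resolve` and `connect` default to the standard socket functions and
    are parameters only so the check can be exercised without a network.
    """
    parsed = urlparse(url)
    host = parsed.hostname
    # Fall back to the scheme's usual port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    result = {"host": host, "port": port,
              "dns_ok": False, "tcp_ok": False, "error": None}

    # Step 1: name resolution (reported as working on the failed VM).
    try:
        resolve(host, port)
        result["dns_ok"] = True
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"DNS lookup failed: {exc}"
        return result

    # Step 2: outbound TCP connection (the step that appears to fail).
    try:
        conn = connect((host, port), timeout=timeout)
        conn.close()
        result["tcp_ok"] = True
    except OSError as exc:  # covers timeouts and refused/unreachable errors
        result["error"] = f"TCP connect failed: {exc}"
    return result
```

Calling `check_endpoint(...)` with the .zip URL from the error message, and then with any unrelated public HTTPS site, should show whether the failure is specific to the blob storage host or affects all outbound traffic. A result with `dns_ok` true and `tcp_ok` false for both would point toward routing, NSG, or firewall configuration rather than URL whitelisting.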
Anyway, I would love to know your reaction to the comment that deployment failures like this have "been seen on many previous cases." Should I push back on this with MS support or is this in fact a widely known problem? And if it's widely known, what's the fix? Please tell me it's not starting with a fresh Azure gallery image each time and then manually registering each host to the host pool!
Thanks for reading all this.
Jun 15 2021 02:37 PM
@David Schrag Hi David, we are working on solutions to make an image easier to customize and apply to existing host pools. Meanwhile, both options you listed are supported and should work without issues. There are many possible reasons why the VM can't reach the URL you provided, including proxy or firewall settings within your Azure subscription or Windows image. I won't try to troubleshoot this issue over a forum, but please try to rule out as many factors on your end as you can that could affect the network access Windows needs to download the required package.
Microsoft support should be able to help you troubleshoot. If you can share your CSS ticket through a private message, I can see whether there's anything I can help with.
Jun 15 2021 02:46 PM
Jun 22 2021 03:15 PM