Forum Discussion
Azure Virtual Desktop - Black Screens on logins - What we've tried so far
TLDR - Azure Virtual Desktop Black Screens. Could be 2 Min long, could be much longer. Tried removing stuck profiles, spun up all new VMs to see if that would fix it, finally disabled an application service that was polluting the Event logs constantly with appcrashes. Hoping that maybe the event logs weren't able to keep up so we had a black screen while events caught up. Grasping at straws.
We started getting reports of black screens when users log in to one of our AVD host pools. Our users are using FSLogix for profiles, but we've also seen the issue when logging in via RDP with a local admin account. We tested and saw similar results: you log in, the FSLogix prompt goes by, then Preparing Windows, then a black screen.
- In a normal login, this black screen will last 10-20 seconds before the desktop becomes available and the user can begin their session.
- With this issue, we were seeing black screens that just stayed there until you forced a logout of your account.
We saw some profile issues: profiles appeared to be stuck on a VM when they should have been removed at logoff by FSLogix, and we saw some stuck local_username FSLogix profiles still in the Users folder. Instead of looking for the needle in a haystack, we spun up a new group of VMs and put the others in drain mode / excluded them.
With the new VMs, logins from the RD Client were working fine yesterday afternoon, evening and this AM. But later in the morning, we saw some issues with users getting a black screen lasting 90 sec - 2 min before the desktop loaded. I had it happen to me when logging in, but it seemed to go away once I tried a couple more times. I even RDPd directly into the host that had given me the 2 min black screen and was able to get in quickly. So the issue appears to still be there, but not as bad.
We looked in the event logs and saw that one particular application, the Aspen Multicase Web service, was polluting them with appcrash errors every few seconds. So we've disabled that application service on all the VMs in the pool, and logins have been normal since. We also found event 4625 (failed login) entries, but the event text said the event log couldn't keep up and needed to suppress duplicate events. So our thinking was: if this service is constantly trying to run, failing, and writing to the event logs, could that cause the slow logins because the logs can't write the login info in time?
But with every other change we made, things seemed fine for a while afterward, and then the black screen came back for at least 90 sec - 2 min.
Any suggestions on things we can try / look at that could be causing this?
- paulosilva_PI Copper Contributor
Microsoft has advised that the fix will be released in the October patch; meanwhile a KIR (Known Issue Rollback) has been released to fix this Appx crash issue.
Please download this KIR and follow instructions here.
This will install a new computer Local Group Policy, which then needs to be set to Disabled. Please reboot and you'll see that the Appx crash stops.
This has been the solution from Microsoft to deal with the issue until the next Windows patch release.
Give it a try and let me know if it works.
- djordan1910 Copper Contributor
So far so good... we just updated all the hosts in one of the hostpools. The black screens *seem* to be resolved. I'll update after a full day of usage tomorrow. Thanks for the assist!
- paulcruwys Copper Contributor
djordan1910 interested to hear feedback before we apply and get this headache closed down
- KevHal Iron Contributor
Thanks for this, I asked Microsoft support rep for an update and he still said wait until October patches, I then forwarded him this so will see what he says. Guess they don't talk to each other.
All my AVD hostpools have the Appx issue. Some pools seem to be getting through it regardless of the Appx problem whilst others just crash and burn; I think it's just pot luck whether your pool is affected or not.
I have applied this fix to two of the most affected pools and will see what happens tomorrow, thanks paulosilva_PI
- tristian1250 Copper Contributor
Do you know if there is a similar KIR for Win11 hosts? Dealing with the exact same thing.
- jlou65535 Iron Contributor
Looks like Microsoft should fix it soon:
Microsoft has identified a bug affecting Azure Virtual Desktop. Users are now experiencing intermittent black screens for 30 seconds to 3 minutes during sign-in.
This issue is linked to the patch update KB5043064 (OS Builds 19044.4894 and 19045.4894), installed on September 10, 2024.
After reviewing application logs, we found that the AppXSvc service is crashing due to an uninitialized m_targetUserSidString in OSIntegrationManagerHost::Initialize (event ID 1000).
Per Microsoft, a permanent fix is pending and will most likely arrive with the October patch.
Here are some workarounds suggested by Microsoft :
Uninstall the problematic updates: Go to Settings → Windows Update → Update History → Uninstall Updates. Select KB5043064 then click on Uninstall and reboot your system.
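To check whether a given host is on one of the builds named above, here is a minimal Python sketch. It assumes the build string has already been read from the host (for example from `winver`), and it treats any Windows 10 revision from 4894 up to, but not including, 5073 as potentially affected. The lower bound is the KB5043064 build quoted here; the upper bound is the KB5045594 preview build (19045.5073) reported in this thread, and the range in between is an assumption, not something Microsoft has published. The function name `is_potentially_affected` is hypothetical.

```python
# Sketch only: the affected range is an assumption based on builds quoted in this thread.
AFFECTED_MAJOR_BUILDS = {19044, 19045}   # Windows 10 21H2 / 22H2
FIRST_BAD_REVISION = 4894                # KB5043064 (September 10, 2024)
FIRST_FIXED_REVISION = 5073              # KB5045594 preview (assumed fix point)


def is_potentially_affected(build: str) -> bool:
    """Return True if a 'major.revision' build string falls in the assumed range."""
    major_text, _, revision_text = build.strip().partition(".")
    major, revision = int(major_text), int(revision_text)
    return (
        major in AFFECTED_MAJOR_BUILDS
        and FIRST_BAD_REVISION <= revision < FIRST_FIXED_REVISION
    )


if __name__ == "__main__":
    for build in ("19045.4894", "19045.5073", "22631.4317"):
        print(build, is_potentially_affected(build))
```

A host that reports a build inside the range is only a candidate; the AppXSvc crash (event ID 1000) in the application log is still the thing to confirm.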
- Nicolai Copper Contributor
We also have a Microsoft ticket open, because even after installing the fix we still had problems with the login in connection with the AAD.Broker plugin.
Microsoft has now written me this workaround, which we will implement today:
"""
RCA seems to be understood, we hit the deadlock in auto repair due to a corruption that took place due to the 7D-known-Issue (10D was the resolution for).
One workaround that has brought good result is a combination of:
a.) Redeployment of Hosts with 10D
b.) Run this script as part of the UserLogon
"Add-AppxPackage -Register -Path "C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugin_cw5n1h2txyewy\AppxManifest.xml" -DisableDevelopmentMode"
Background information:
When we fall into the Event ID 10 - Scenario, we are in the progress of "AutoRepair" that will try to do additional steps that could lead to the deadlock. By calling this as a "User Logon Script" we avoid the auto repair and should see an improvement.
"""
Let's see if it gets better.
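The path in that command has to match the package folder exactly, including the underscore before the publisher ID, so it can be worth sanity-checking the string before it goes into a logon script. The Python sketch below checks that the folder holding `AppxManifest.xml` looks like a package family name: a package name, an underscore, and a 13-character publisher ID. The regular expression and the function name `looks_like_package_family_path` are illustrative assumptions, not an official validation rule, and the check only inspects the string without touching the file system.

```python
import re
from pathlib import PureWindowsPath

# Assumed shape: <PackageName>_<13-character publisher ID>, e.g. ..._cw5n1h2txyewy
FAMILY_NAME = re.compile(r"^[A-Za-z0-9.\-]+_[a-z0-9]{13}$")


def looks_like_package_family_path(manifest_path: str) -> bool:
    """Check the folder that holds AppxManifest.xml against the assumed pattern."""
    path = PureWindowsPath(manifest_path)
    if path.name.lower() != "appxmanifest.xml":
        return False
    return FAMILY_NAME.match(path.parent.name) is not None


if __name__ == "__main__":
    good = r"C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugin_cw5n1h2txyewy\AppxManifest.xml"
    bad = r"C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugincw5n1h2txyewy\AppxManifest.xml"
    print(looks_like_package_family_path(good))
    print(looks_like_package_family_path(bad))
```

`PureWindowsPath` is used so the check behaves the same on a non-Windows admin workstation.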
- chrismaga Copper Contributor
Does it work for you?
We added the logon script via GPO yesterday for one Pool as testing.
For the moment no problems were reported. Fingers crossed 🙃
- cathaldub Copper Contributor
Added to the User-GPO-logon scripts🤞
- cathaldub Copper Contributor
Just in case anyone else is still experiencing the issues: this script fixed ours. It didn't work immediately, as if some users still had cached issues, but it's now been 1 week with no tickets in an environment of 100 users.
- MattNowicki Copper Contributor
We use App Attach with Virtual Desktops and are seeing very similar issues, all of which started on Monday at approximately 2pm Central. When did this issue start for you? Your post, coupled with another below from Monday, suggests that there is an Azure-wide problem that is not specific to my environment or yours.
We are in the process of destroying and rebuilding our host pools and hosts, but it sounds like you've already gone through that exercise. We did switch to our failover environment yesterday, and it worked fine all day. But this morning, the same problem exists in that environment as well.
We've been engaged with Microsoft support since Monday and they have been absolutely no help at all.
Hoping you can post back here if/when your situation changes for the better!
- addysidd27 Copper Contributor
For the issue of black screens on Azure Virtual Desktop, here are some suggestions you could try:
1) FSLogix Profiles: Ensure profiles are properly cleaned up after logoff. You could also clear stuck profiles from the VMs to prevent login delays.
2) GPU-Related Issues: If your VMs are GPU-enabled, try disabling or adjusting GPU settings, as these can sometimes cause black screens.
3) Event Logs: Check for applications causing event log bloat. Disabling unnecessary services may reduce login times.
- marve435 Copper Contributor
We have been facing the exact same problems reported here, also since Monday.
Also, by any chance did you notice that session hosts also lost assigned ASR rules? (This also occurred for us around the same time.)
- JPlendo Brass Contributor
We have actually been through 2 of these 3:
1. FSLogix Profiles - We started seeing lots of profiles stuck on VMs, both local profiles and the FSLogix local_username profiles. We went through the VMs and removed everything, then started to delete VMs and allow new ones to spin up
2. Event Logs - We had a particular application (Aspen Multicase Web Service) that was causing repeated errors in the Event logs, every few seconds. We also saw some event logs talking about needing to catch up. We disabled that service and have seen much better logins since. We see a black screen here and there, lasts 90s - 2 min and eventually logs in.
So with brand new VMs, verifying profiles are working properly and being removed after logout and disabling that service, we are seeing much better results, but still get the occasional black screen issues
- KevHal Iron Contributor
We are seeing this with a couple of our clients now, Microsoft needs to get a grip of this!
- JPlendo Brass Contributor
We were on the phone with MS, who are playing dumb and have no answers for us. They wanted FSLogix logs, but came back with nothing. We sent them emails yesterday asking detailed questions. Have they responded? No....cause MS Support, even at Sev A, is a friggin joke. It's disgusting that their support is this bad now and that it's OK.
- henrikmc2 Copper Contributor
JPlendo The buggy component is the App Readiness service, and this is not the first time we have seen black screens with the same service responsible. Kill that service and you fly onto the desktop, but then SSO stops working, which causes other problems. They need to fix it.
But it's troublesome that they don't post any public articles on this.
Fix will be KB5045594
To help with a faster logon that doesn't wait for App Readiness, the reg keys below work (but they don't fix the AppXSvc service crash)
Windows Registry Editor Version 5.00

; --------------------------------------------------------------------------------------
; #Reason: Fix App Readiness with timeout
; --------------------------------------------------------------------------------------
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer]
"AppReadinessPreShellTimeoutMs"=dword:00060000
"AppReadinessGlobalTimeoutMs"=dword:00120000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server]
"fRunAppReadiness"=dword:00000000
; --------------------------------------------------------------------------------------
- JPlendo Brass Contributor
henrikmc2 You mentioned KB5045594 being released as a fix. Is that only for Windows 10? Do you know if they will release anything for Windows 11?
What does that reg fix you listed do if it's not helping with the App Readiness service crashing? Does it allow logins even if the service has hung?
- dit-chris Brass Contributor
Hi JPlendo
So as I understand it, that just says: if AppReadiness has run for 60 sec at PreShell, just kill it, and if it runs for 120 seconds in total, then again just kill it. In theory that should cut the black screen delay by simply stopping the hanging process... but that will then cause issues with AppX packages deploying, SSO etc. as a side effect. Another fix with the same effect is probably to just set the AppReadiness service to Disabled in services... yes, you'll probably get much quicker login times, but a load of stuff won't work, which might include the Start menu, as there is an AppX package for that, it appears!
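One thing worth double-checking about those numbers: in a `.reg` file, `dword:` values are written in hexadecimal, so `dword:00060000` is 0x60000 = 393,216 ms (about 6.5 minutes), not 60,000 ms. A minimal Python sketch of the conversion follows. The helper names are hypothetical, and whether 60 s / 120 s was the intended setting is an assumption based on the explanation above.

```python
def reg_dword_to_int(text: str) -> int:
    """Decode a .reg value such as 'dword:00060000' (hexadecimal) to an integer."""
    prefix, _, hex_digits = text.strip().partition(":")
    if prefix.lower() != "dword" or not hex_digits:
        raise ValueError(f"not a dword value: {text!r}")
    return int(hex_digits, 16)


def int_to_reg_dword(value: int) -> str:
    """Encode an integer as the 8-digit hexadecimal form used in .reg files."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("dword values must fit in 32 bits")
    return f"dword:{value:08x}"


if __name__ == "__main__":
    for name, raw in (
        ("AppReadinessPreShellTimeoutMs", "dword:00060000"),
        ("AppReadinessGlobalTimeoutMs", "dword:00120000"),
    ):
        ms = reg_dword_to_int(raw)
        print(f"{name}: {ms} ms (~{ms / 1000:.0f} s)")

    # What 60 s and 120 s would look like if that was the intent (assumption).
    print(int_to_reg_dword(60_000))
    print(int_to_reg_dword(120_000))
```

With the values as posted, the timeouts decode to roughly 393 s and 1,180 s; `dword:0000ea60` and `dword:0001d4c0` would correspond to 60 s and 120 s.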
- KristofH Brass Contributor
KB5045594 has now been released in preview/beta without any mention of the issue: https://blogs.windows.com/windows-insider/2024/10/14/releasing-windows-10-build-19045-5070-to-beta-and-release-preview-channels/
- ricooooo Copper Contributor
We also have a Microsoft ticket open, and we notice that after installing the black screen fix (KB5045594 Preview), SSO issues and loss of network connectivity between Office apps are still happening for our customers.
Microsoft provided us the following workaround/fix (see below) for the SSO issues regarding the AAD Broker plug-in, but we are seeing that the policy is not being applied. Do you guys have the same issue?
Workaround/Fix for SSO Failures
1. Install the October 22, 2024 update (KB5045594, OS Build 19045.5073, Preview).
2. Reboot the machine.
3. Open Group Policy Manager as administrator.
4. Navigate to User Configuration > Windows Settings > Scripts (Logon/Logoff) > Logon
5. Click on PowerShell Scripts > Add
6. Type a script name such as “AAD Broker plug-in fix” and on the Script Parameters section add the following command line “Add-AppxPackage -Register -Path "C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugincw5n1h2txyewy\AppxManifest.xml" -DisableDevelopmentMode”, then click OK.
7. Select Run Windows PowerShell scripts first and then Apply.
8. Reboot the machine and ensure the policy is active for all users.
After doing these steps, all users should get the fix applied at logon and the issues described in the previously indicated documentation should cease. If you detect a user that experiences the issue, first ensure that the logon script is being applied; if it isn't, enforce the logon script for the user and ask them to sign out and sign in.
- mgorton Copper Contributor
Check your system logs to see if there are logon script failures.
“Add-AppxPackage -Register -Path "C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugincw5n1h2txyewy\AppxManifest.xml" -DisableDevelopmentMode” is incorrect for us btw.
There's a missing underscore in the path:
"Add-AppxPackage -Register -Path "C:\Windows\SystemApps\Microsoft.AAD.BrokerPlugin_cw5n1h2txyewy\AppxManifest.xml" -DisableDevelopmentMode"
- PaulGMVP Steel Contributor
Hello,
you don't know how "happy" I am to find this post....
We are experiencing exactly the same issue: a black screen during login to AVD session hosts.
Problem started this week - Monday.
Some technicalities :
- We have 25 hostpools. 24 are located in West Europe and one in Central India, and both regions are affected by this.
- VMs are different sizes. We have mostly D4 but also D8, B4, B8; the black screen problem exists on all of them.
- We have FsLogix configured for multisession hostpools, and all of them are randomly experiencing this black screen. Looks like problem is not happening on Personal hostpools without FSL.
What about your pools ? Where are they located etc ?
Maybe there is some pattern here, or maybe MS is just hiding some service malfunction that happened recently...
We can also confirm that creating a brand new session host stops the black screens for some time, around 24 hours.
We suspect that it may have something to do with Defender for Endpoint. As soon as the Defender engine updates to the newest version, 1.1.24080.9, some of the ASR rules report an OFF status.
So far we have not found any permanent solution for this. We have a lot of end-user incidents related to this black screen on AVD, and we are shuffling users from one host to another using drain mode.... it's crazy....
- KevHal Iron Contributor
The only time I've seen issues like this, it was caused by not applying Defender exceptions; we have them all added and then some. I wonder if there was a definition update on Monday that is causing this. The first few logons are fine, but then one further user may get the black screen, the App Readiness service stalls, and this then causes a cascading effect with further logons. You can sometimes free or complete logons by restarting the App Readiness service.
I have no other workaround, it's just pot luck.
- JPlendo Brass Contributor
Yeah, this could be the culprit. We had a similar issue back in 2020. MS ended up making a patch for us specifically, which we applied and it cleared up the issues. They eventually released the patch in October 2020. The non-security patch was for Windows 10, so I wonder if they need a new one.
What really ticks me off is that this started after the outages, and when we asked MS if the outages could have an effect on this they of course said "no, the outages wouldn't do that".
MS support is so incompetent, they didn't announce the outages of the last two weeks until hours after people started complaining about them. Then when the outage was cleared, MS support called me to "work on the issue"....work on what issue? The issue was resolved by YOU. How does your support staff not know that? I will tell you how: MS couldn't care less about giving great support. They have such a hold on so many companies and they know no one is gonna leave. What are you gonna do, make all your non-IT workers learn how to use a Linux OS? Yeah, good luck with that. MS knows it and shows it by just not caring about support levels any longer. They don't stick to their SLAs, and the support staff you do get on the phone is normally not very good. And of course MS is the BEST at asking you questions you've already answered in the ticket you submitted, or asking you to run processes you already ran and showed them, again. It's sickening how little they seem to give a crap about users anymore.
- dub452 Copper Contributor
We're having the same issue. At the beginning of the week, we saw slow logons (5 to 20 minutes), mostly during the AppX-LoadPackages phase of the logon (according to the ControlUp AnalyseLogonDuration script). The situation escalated over the last two days: we now see black screens after a few users per AVD are able to log in successfully. Restarting the App Readiness service doesn't help; sometimes disconnecting and moving the logon process to another host helps.
Users whose logon process stalls seem to have issues loading some AppX packages (according to their Get-AppxPackage count). We had the same issue a few months ago and managed to stabilize by cleaning up unwanted packages (from Get-AppxProvisionedPackage -Online). We're now trying to do the same per-user.
JPlendo
- KevHal Iron Contributor
Any movement on this? I have just had a very angry customer on who has just been hit with the black screens. AVD credibility is getting hit pretty bad at the moment.