Nov 09 2021 12:28 PM
We are in the process of moving our .NET ERP system to Azure. We will be running it as a virtual desktop published app. We have it working perfectly when you log into the session desktop, but when you try to run it as a published app we get a lot of flickering, temporary black screens, and periodic complete app crashes. It seems the harder the app is working, the worse the artifacts are.
We really need this to work. Are there any known fixes or best practices to get published apps to run smoothly?
Nov 10 2021 09:32 PM - edited Nov 10 2021 09:33 PM
@dcline97 Does the ERP system happen to be SAP? Those sound exactly like issues I had with SAP GUI years ago in a Terminal Services environment.
If so, you may want to look in 3 places for lists of tweaks needed to keep SAP GUI as a published app from exhausting GDI handles, which is what it did back then without tweaks. I haven't used SAP in a long while, so I can't say how much it's changed since, but worth looking into:
(most important setting of these is setting the Classic or Enjoy theme instead of the Signature theme, but all apply. I know the article says Citrix, but it's all still Terminal Services at the core as far as session / GDI handle memory management goes, which is what those fix.)
2) SAP Note 200694 - "Notes on SAP GUI when used via terminal server." SAP support notes are behind a support paywall, so I can't post the whole PDF, but the key takeaways were:
3) SAP Note 138869 - "SAP GUI on Windows Terminal Server (WTS)" Again, behind a paywall, but the one takeaway that isn't in #2 is:
Again, the above only matters if you're using SAP, and SAP's desktop app may have changed to overcome these, but worth checking with SAP support to get the latest version of the equivalent documents to outline tweaks needed for SAP in Terminal Servers, if that's what you're using.
If not SAP, you may wish to look into troubleshooting GDI handle leaks using an app like Bear (https://the-sz.com/products/bear/index.php). Usually, in my experience, when an application has flickering that gets worse the harder it's working, the more instances there are, or the longer it's been running, that has been an issue with GDI handle exhaustion (which is what SAP was doing and the above settings fixed for that particular app).
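If you'd rather script the check than watch a GUI tool, here's a minimal sketch (not something from Bear or from the ERP vendor) that reads a process's GDI object count through the Win32 `GetGuiResources` call via `ctypes`. The helper names `gdi_pressure` and `gdi_object_count`, the 80% warning ratio, and the 10,000 default quota argument are my own assumptions for illustration; adjust them to whatever your hosts are actually configured with.

```python
import ctypes
import sys

GR_GDIOBJECTS = 0                   # GetGuiResources flag: count GDI objects
PROCESS_QUERY_INFORMATION = 0x0400  # access right GetGuiResources needs


def gdi_pressure(count, quota=10_000, warn_ratio=0.8):
    """Classify a GDI object count against a per-process quota.

    The quota default and warn_ratio are assumptions, not measured values.
    """
    if count >= quota:
        return "exhausted"
    if count >= quota * warn_ratio:
        return "warning"
    return "ok"


def gdi_object_count(pid):
    """Return the current GDI object count for a process (Windows only)."""
    if sys.platform != "win32":
        raise OSError("GetGuiResources is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32.OpenProcess.restype = ctypes.c_void_p
    kernel32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]
    user32.GetGuiResources.restype = ctypes.c_uint32
    user32.GetGuiResources.argtypes = [ctypes.c_void_p, ctypes.c_uint32]

    handle = kernel32.OpenProcess(PROCESS_QUERY_INFORMATION, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # Note: a return of 0 can also mean the call failed.
        return user32.GetGuiResources(handle, GR_GDIOBJECTS)
    finally:
        kernel32.CloseHandle(handle)
```

Run it from inside the user's session against the app's PID, e.g. `gdi_pressure(gdi_object_count(pid))`, and log the result every minute or so. A count that only ever climbs while the flickering gets worse points toward a leak.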
Are you seeing UI elements also show up in the wrong place, e.g. buttons out of place and such? If so, that's a surefire sign you're dealing with GDI handle exhaustion (though it doesn't _always_ happen when you're at the GDI handle cap, it's a pretty common occurrence in that situation). In RDS (and therefore AVD), a single process is limited to 10,000 GDI objects by default, and there is a theoretical limit of 65,536 total per session.
If Bear seems to indicate you're coming close to the 65,536 per session or 10K per process limit when it's going wonky, then the fix options are going to be either:
1) Tweaking the app to disable any themes / animation / graphical effects (to try to reduce the number of elements drawn and therefore able to be leaked)
2) Getting the developer to fix their app to not have GDI handle leak bugs
3) Moving to publishing full Win10 / 11 desktops with the app in them, instead of publishing the app (see below)
I've dealt with a major clinical care app that had a GDI exhaustion issue the developer couldn't / wouldn't fix, and the only solution, once I was able to prove it was GDI handle exhaustion, was to use a full Windows 10 desktop to provide access to the app, with the session limits set to 1 user per desktop (e.g. pooled 1:1). This then allowed me to increase the number of GDI handles per app by changing the registry setting noted here: https://docs.microsoft.com/en-us/windows/win32/sysinfo/gdi-objects
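For reference, a rough sketch of checking that setting from a script. As I understand the linked page, the value is `GDIProcessHandleQuota` under `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows`, with an allowed range of 256 to 65,536; treat the key path and range in this snippet as assumptions to confirm against the docs before relying on them. The helper names are made up for illustration, and this only reads the value, it doesn't change anything.

```python
import sys

# Assumed location and range of the per-process GDI quota; verify against
# the Microsoft "GDI Objects" documentation before relying on these.
QUOTA_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Windows"
QUOTA_VALUE = "GDIProcessHandleQuota"
QUOTA_MIN, QUOTA_MAX = 256, 65_536


def is_valid_quota(value):
    """True if value is an integer inside the assumed allowed range."""
    return isinstance(value, int) and QUOTA_MIN <= value <= QUOTA_MAX


def read_gdi_quota():
    """Read the configured per-process GDI handle quota (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # imported here so the module still loads elsewhere

    # KEY_WOW64_64KEY avoids registry redirection under 32-bit Python.
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, QUOTA_KEY, 0, access) as key:
        value, _value_type = winreg.QueryValueEx(key, QUOTA_VALUE)
    return value
```

Checking a planned value with `is_valid_quota()` before pushing it out via GPO or a deployment script is cheap insurance against a typo in the number.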
It didn't necessarily _fix_ the root problem, but it was a bandaid that let the (badly coded) app leak way more handles before going wonky, with only one user on each box running it, so that 99% of users could get through an 8 hour day before the app exhausted the handle limit with its leaks. I then had the boxes set up to automatically reboot on user logoff, and had idle / disconnect timers set up to ensure the users were logged off every night, so that the leaks would be cleaned up via a reboot on logout, getting the machines ready for the next day.
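To size the quota for a setup like that, the arithmetic is simple enough to sketch. The function name and every number in the example are hypothetical, not measurements from the app I was dealing with; plug in the baseline and leak rate you actually observe.

```python
def hours_until_exhaustion(baseline, leak_per_hour, quota):
    """Estimate hours before a steadily leaking app reaches its GDI quota.

    baseline: GDI objects in use shortly after launch
    leak_per_hour: observed net growth in GDI objects per hour
    quota: per-process GDI handle limit in effect
    Assumes a constant leak rate, which real apps rarely have exactly.
    """
    if leak_per_hour <= 0:
        return float("inf")  # no net growth, so no exhaustion from leaks
    return max(quota - baseline, 0) / leak_per_hour


# Hypothetical: 2,000 objects at launch, leaking 1,500 per hour.
at_default = hours_until_exhaustion(2_000, 1_500, 10_000)  # falls short of 8 hours
at_raised = hours_until_exhaustion(2_000, 1_500, 20_000)   # 12.0 hours
```

With those made-up numbers the default quota wouldn't survive a full shift, while a doubled quota would, which is the kind of headroom the registry change bought me.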