Ongoing FSLogix Profile Issues


Hi,

 

We have recently gone live with a WVD setup of 2 session hosts using FSLogix profiles stored on an Azure file server VM.

 

It's not a massive deployment: a maximum of 38 users, with typically 10-15 connected at any one time.

 

However, we are having never-ending issues of one sort or another, primarily relating to profiles. It seems like all is well for a few days, then everything falls apart: the profiles stop connecting, users are unable to log in, and we need to reboot the servers to resolve it.

 

The general error we are seeing is:

fslogix failed to open virtual disk the process cannot access the file because it is being used by another process 0x20

 

To try to simplify things, we have shut down one host altogether and loaded all users onto 1 VM because of the unreliability of the profiles connecting. However, while this ran fine for a few days we are back to the same issue. 

 

It is critical that we get some kind of stability before the customer relationship goes south altogether.

 

I would appreciate any help you might be able to provide. 

 

I would post more thorough logs but at this time I can't even RDP to the machine to get them! 
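
For reference, the `0x20` at the end of the error quoted above is a Win32 error code written in hexadecimal. It is decimal 32, which Windows defines as `ERROR_SHARING_VIOLATION`: another handle already has the file open in a conflicting mode. A minimal Python sketch of that decoding follows. The lookup table is hand-written and the function name `describe_win32_error` is illustrative only; neither is part of any FSLogix tooling.

```python
# Hand-written table of a few Win32 error codes (values from winerror.h).
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    32: "ERROR_SHARING_VIOLATION",
    33: "ERROR_LOCK_VIOLATION",
}


def describe_win32_error(code_text: str) -> str:
    """Turn a hex string such as '0x20' into 'decimal (NAME)'."""
    code = int(code_text, 16)  # base 16 accepts the '0x' prefix
    name = WIN32_ERRORS.get(code, "unknown")
    return f"{code} ({name})"


print(describe_win32_error("0x20"))  # prints: 32 (ERROR_SHARING_VIOLATION)
```

A sharing violation points at something still holding the VHD open rather than at permissions or a missing file.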

26 Replies

@R_Akers Can you post what registry keys you have under your FSLogix/Profiles reg key? There are several that should be considered here. Also, are you just using the profile container VHD, or are you using the Office container as well?

@jasonhand Hi Jason, 

 

Reg keys below:

 

clipboard_image_0.png

 

We are using Profile containers. 

Hi,

Can you tell me which FSLogix agent version you are using (I suppose the latest version?)

Which type of disk are you using for the profile storage on the Azure VM?

Any reason you added the DisableRegistryLocalRedirect entry?
I can't seem to find this in the current documentation, but I found that it may indeed help with stability.

Which Windows 10 version did you deploy?

I deployed an environment of similar size last year and they are working fine.
We could go over the environment together to find the differences.

@knowlite Thanks for the reply:

 

Can you tell me which FSLogix agent version you are using (I suppose the latest version?)

I've just updated to 2.9.7237.48865 after reading this post: https://techcommunity.microsoft.com/t5/windows-virtual-desktop/workaround-for-non-responsive-windows...

 

I can't remember which version it was on before, but I did update it a few weeks ago when we first started having issues. I'll see if I can find out.


Which type of disk are you using for the profile storage on the Azure VM?

127 GB Premium SSD

Any reason you added the DisableRegistryLocalRedirect entry? 
I can't seem to find this in the current documentation, but I found that it may indeed help with stability.

 

I read somewhere that on Windows 7 or later you can add DisableRegistryLocalRedirect and set it to 1 to force FSLogix to keep the NTUSER.dat and associated files in the profile container. It's apparently used to reduce profile corruption.

Which Windows 10 version did you deploy?

 

It's currently running Windows 10 Enterprise for Virtual Desktops, version 1903. If I remember right, it was the Marketplace image that comes with O365 ProPlus.

I deployed an environment of similar size last year and they are working fine.
We could go over the environment together to find the differences.

Are you using a 127 GB SSD to house 38 profiles? That is about 3.3 GB average per user, isn't that kinda small? However, I don't think this is the cause of your issue.
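
The per-user figure above is simply the disk size divided by the user count. A short Python sketch of that arithmetic, using the numbers from the thread; the 10 GB per-user target passed to `required_disk_gb` is a made-up illustration, not a recommendation from the thread:

```python
disk_gb = 127  # size of the profile disk mentioned in the thread
users = 38     # maximum number of users mentioned in the thread

per_user_gb = disk_gb / users
print(f"{per_user_gb:.2f} GB per user")


def required_disk_gb(user_count: int, target_gb_per_user: float) -> float:
    """Disk size needed if every user may grow to the given profile size."""
    return user_count * target_gb_per_user


# Hypothetical target of 10 GB per profile, for illustration only.
print(f"{required_disk_gb(users, 10):.0f} GB needed at 10 GB per user")
```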

Do I understand correctly that you updated your FSLogix agent just now after reading the post above?

What is the exact experience, do users get an error message or can't they login at all?

Did you also disable the BrokerInfrastructure scheduled task from the post above?
schtasks /change /tn "\Microsoft\Windows\BrokerInfrastructure\BgTaskRegistrationMaintenanceTask" /disable

@knowlite 

 

Are you using a 127 GB SSD to house 38 profiles? That is about 3.3 GB average per user, isn't that kinda small? However, I don't think this is the cause of your issue.

 

It is, but storage is OK at the moment. They are using it primarily to access a couple of applications rather than to store large amounts of data.

Do I understand correctly that you updated your FSLogix agent just now after reading the post above?

That's correct

What is the exact experience, do users get an error message or can't they login at all?

Either they log in with what looks like a local profile despite the reg keys, or they get a message advising that the FSLogix profile container is unavailable. The messages I see in the frxtray app are below:

clipboard_image_0.png

 

 

clipboard_image_1.png



Did you also disable the BrokerInfrastructure scheduled task from the post above?
schtasks /change /tn "\Microsoft\Windows\BrokerInfrastructure\BgTaskRegistrationMaintenanceTask" /disable

 

I did just this morning.

Please give feedback end of the week while testing the latest version, looking forward to the results!

Are your users working in a full desktop environment on WVD or launching Remote Apps?
I've seen this error when trying to launch a remote app when another app is already using the FSLogix profile.
I will do. I'm hoping desperately this resolves it! Thank you for helping.

It's a Remote App setup. These error messages seem to happen when it's trying to sign the user in.

@knowlite 

 

So, after a week of everything running pretty smoothly with one session host, something happened over the weekend that has sent it off in a huff again.

 

It seems to be working after a reboot of the profile server. Logs below:

 

clipboard_image_0.png

 

From what I can see so far, there were no issues with the domain controllers over the weekend.

Again, after running fine for a few days, the FSLogix service seems to have randomly stopped. I'm looking to see if I can find a reason now. The unreliability is really starting to cause issues.

Again, a recurrence of 'the process cannot access the file because it is being used by another process'.

A reboot of the profile server has resolved this issue. Does anyone have any suggestions as to what is going on here? I can't believe it's this unreliable for others.

Can you try to use Process Explorer to find out which process is using the VHD file that FSLogix is trying to open? (I know, this is an inconvenient time to troubleshoot.)
What AV solution are you using on the file server? Did you make any exclusions for FSLogix?
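
As a complement to Process Explorer, the built-in Windows `openfiles /query` command on the file server lists files that are open through its shares, which can show who still holds a profile disk. A rough Python sketch that filters that output for `.vhd`/`.vhdx` paths is below. It assumes the command runs on the file server from an elevated prompt, and it deliberately does not rely on particular column names; the function names are illustrative.

```python
import csv
import io
import subprocess
import sys


def find_open_vhds(openfiles_csv: str) -> list[list[str]]:
    """Return the CSV rows in which any field ends in .vhd or .vhdx."""
    rows = []
    for row in csv.reader(io.StringIO(openfiles_csv)):
        if any(f.strip().lower().endswith((".vhd", ".vhdx")) for f in row):
            rows.append(row)
    return rows


def query_open_files() -> str:
    """Run `openfiles /query /fo csv` and return its standard output."""
    result = subprocess.run(
        ["openfiles", "/query", "/fo", "csv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Only query the system when actually running on Windows.
if __name__ == "__main__" and sys.platform == "win32":
    for row in find_open_vhds(query_open_files()):
        print(row)
```

If a user's VHD still shows as open after that user has logged off everywhere, that would be consistent with the sharing violation in the error.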

@R_Akers One of the things we do to keep things running smoothly is a nightly reboot of the hosts. I have found that after a few days the hosts start to have strange issues, but a nightly reboot keeps that from happening.

 

I would suggest using a larger volume for the profile containers on your file server, dedicated to the profiles. I hope you aren't using the OS volume for profile VHDs.

 

Also, as someone else posted, make sure the AV on any of these machines excludes the .VHD files from being scanned. We also disable real-time scanning on the hosts and just do scheduled scans.

 

One more thing, I would use the disable concurrent logins reg key to keep it from allowing multiple logins.

Thanks for the response. We can extend the profile container volume to give it a go. It's a dedicated Azure VM only running the FSLogix profile storage. Does anyone know what would be involved in migrating the profiles to an Azure Files share? We started this during the public preview, before it was an option.

The VHDX files are excluded by the AV. The AV we are using is Webroot.

I had considered the nightly reboots, but I had hoped the need for such a thing had long gone! I will give it a go as a mitigation; however, a more permanent solution would be ideal.

@knowlite 

 

I'll have a look at it now. The log looks as follows:

 

clipboard_image_0.png

@R_Akers I don't believe using an Azure file share is ready yet. We also use a dedicated Azure VM running the file server role, with an OS volume and a separate data volume where the profiles are housed.

 

Here are the reg keys we are using although we are doing full desktops and not remote apps. 

@jasonhand Thank you for the reply,

 

The registry currently looks like the below: 

 

clipboard_image_0.png

 

Could I ask how you perform your nightly reboots? Do you just use a normal scheduled task or an Azure Automation runbook?

@R_Akers The nightly reboots are handled through a PowerShell script that runs from the Azure AD controller. It looks up the pool members by -like {name*} and then uses Test-Connection to verify each is online before issuing the reboot. It has been working well.

 

We use scale sets, so this lookup returns only the instances that currently exist, since differently named instances are created whenever they are scaled down and then back up.

 

If yours are static, a simple remote reboot would work, and you can log it to a file as well to keep track of it.
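
The flow described above (match hosts by name pattern, check each is online, then reboot and log) can be sketched roughly as follows. This is a Python illustration of the idea, not the PowerShell script mentioned in the post; the host names and the `reboot_pool` helper are hypothetical, and the default `reboot` function assumes the Windows `shutdown` command run with admin rights on the target.

```python
import fnmatch
import platform
import subprocess


def ping(host: str) -> bool:
    """Return True if a single-echo `ping` to the host exits with status 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(["ping", count_flag, "1", host], capture_output=True)
    return result.returncode == 0


def reboot(host: str) -> None:
    """Ask Windows to restart a remote machine immediately."""
    subprocess.run(["shutdown", "/r", "/m", rf"\\{host}", "/t", "0"], check=True)


def reboot_pool(all_hosts, pattern, is_online=ping, do_reboot=reboot, log=print):
    """Reboot every host whose name matches `pattern` and that is online."""
    rebooted = []
    for host in fnmatch.filter(all_hosts, pattern):
        if is_online(host):
            do_reboot(host)
            log(f"reboot issued: {host}")
            rebooted.append(host)
        else:
            log(f"skipped (offline): {host}")
    return rebooted


# Dry run with stand-in functions, so nothing is actually pinged or rebooted.
hosts = ["wvd-host-0", "wvd-host-1", "fileserver"]
reboot_pool(
    hosts,
    "wvd-host-*",
    is_online=lambda h: h != "wvd-host-1",
    do_reboot=lambda h: None,
)
```

Passing the online check and the reboot action in as parameters keeps the selection logic testable without touching real machines; to keep a record, `log` can be swapped for a function that appends to a file.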