ATA Client on a Server 2019 Domain Controller


We have noticed that after installing the ATA client on a Windows Server 2019 domain controller, the Lsass.exe process crashes every 10-25 minutes and causes the server to reboot. When we installed the client on multiple 2019 domain controllers, Lsass.exe crashed on all of them at the same time, and they rebooted within a few moments of each other.

28 Replies

Interesting.

As far as I am aware, ATA does not install anything that should affect lsass at all.

Please open a case with MS support ASAP, and ask the responding support engineer to add me to the email thread.

This will need to be investigated using crash dumps and similar data, which is not practical in this forum thread. Once we find the root cause, we can update this thread with the results.

Also, we might want to engage both an ATA engineer and a platform engineer to take a look at the crash dumps.

 

Questions:

Can you tell me if uninstalling the ATA gateway resolves the issue and lsass stops crashing?

Any 3rd party security apps installed on the machines?

Are those physical machines / VMs or both?

Do you have other DCs (< 2019) where everything works fine?

Do you have other DCs (< 2019) which experience the same problem?

If you have crash dumps already, please zip and upload to the secured workspace that will be provided by the support engineer.

Also, attach any logs and .blg files you can find from the gateway service on the crashing machine.

See these pages for how to collect the files:

https://docs.microsoft.com/en-us/advanced-threat-analytics/troubleshooting-ata-using-logs#ata-gatewa...

https://docs.microsoft.com/en-us/advanced-threat-analytics/troubleshooting-ata-using-logs#ata-deploy...

 

Eli

Adding some important info to set expectations:

Officially (and according to the ATA docs), ATA is not yet supported on Server 2019.

(When the latest ATA was released, Server 2019 was not GA yet.)

In spite of that, we are interested in this case because this is not something we thought was possible, so researching it is interesting, but ultimately the support on this will be "best effort".

Can you tell me if uninstalling the ATA gateway resolves the issue and lsass stops crashing?

            Yes, it did stop the reboots.

Any 3rd party security apps installed on the machines?

            None. These are dedicated AD controllers

Are those physical machines / VMs or both?

            We tried both. The interesting thing is they all seemed to reboot at the same time.

Do you have other DCs (< 2019) where everything works fine?

            Yes, we have other 2016 DCs on which the Azure ATA client works just fine.

Do you have other DCs (< 2019) which experience the same problem?

            No. The 2016 servers are acting as expected

If you have crash dumps already, please zip and upload to the secured workspace that will be provided by the support engineer.

            Not yet. I will work on this.

Also, attach any logs & blg files you can find from the gateway service on the crashing machine:

            I will work on this

To be clear, I'm talking about Azure Advanced Threat Protection (AATP), not the on-prem product, ATA.

Same procedure please. Let support know it's AATP and not ATA.

Same situation here: on our three 2019 domain controllers with the AATP Sensor installed, lsass also crashes and causes a reboot.

 

As for third party software, nothing else, config is:

 

DC as a VM in Hyper-V [Host is 2019], VM as Gen 2 ver 9

DC installed as Core with FoD + IE11

Roles: DNS on all 3. DHCP on 2 servers [with Scope failover in failover mode]

 

+ AATP Sensor on all 3

 

Nothing additional

Looking at the timing, it could be triggered by activity and the amount of it.

 

The DCs reboot more frequently during the day: the last reboot yesterday was around 19:05, the next one was at 07:05 this morning, and after that they rebooted roughly every 30-60 minutes.
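To put numbers on a pattern like this, the gaps between consecutive reboots can be computed from a list of boot times. This is only a minimal Python sketch under stated assumptions: the function name `reboot_gaps` is hypothetical, the sample dates and the third timestamp are illustrative (only the 19:05 and 07:05 times come from the observation above), and it assumes the boot times have already been exported as ISO 8601 strings.

```python
from datetime import datetime


def reboot_gaps(timestamps):
    """Return the gaps, in minutes, between consecutive reboot times.

    `timestamps` is an iterable of ISO 8601 strings (hypothetical input
    format; export these from whatever source records the boot times).
    """
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]


# Illustrative sample only: the dates and the last entry are made up.
sample = [
    "2018-11-26T19:05:00",  # last reboot of the evening
    "2018-11-27T07:05:00",  # first reboot the next morning
    "2018-11-27T07:50:00",  # hypothetical follow-up reboot
]

for gap in reboot_gaps(sample):
    print(f"{gap:.0f} minutes between reboots")
```

A long quiet gap overnight followed by short gaps during working hours would be consistent with the crash being load-related, though it does not prove it.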

Same in our '19 environment: the AATP sensor causes lsass to crash and the 2019 DCs to reboot. No 3rd party software installed.

Please open a support case, and ask the assigned engineer to add me to the thread as well.

We have made some progress with the initial case, and I want to make sure you are hitting the same failure.

Question: Do you have Windows Hello for Business installed there?

No, we do not have Windows Hello for Business configured yet.

We need a support case for this one so that we have a secured workspace where we can exchange instructions and dump files to check exactly what happened.

Thanks Eli, I have opened a support case via the Azure portal [there seems to be some issue with opening a case on the AATP portal].

If you can share the case # in a private message, I will do my best to get this case prioritized.

What timezone are you located in?

Forum has limitations on PM, so putting details directly here:

GMT +1 for me [Malta, EU]

Details of case from Azure Support Portal:

Case Name: 2019 DC's reboot after Sensor causes lsass.exe crash
Case ID: 118112725001888
Created: Tue, 27 Nov 2018 13:13:53

I can confirm this is still an issue two months later! We have an open case with MS support on this issue. We upgraded two dedicated DCs from 2016 to 2019 and they were fine until Monday morning, when they got user load and lsass became very unhappy.

 

It is very frustrating when MS tech breaks other MS tech, especially when it is tech specifically designed to run on a particular server role like this.

We have Windows Hello for Business deployed on our domain. The upgrade to 2019 broke that as well: WHfB works when off the network but not at the sites that have the upgraded DCs. Is that related somehow? We have an incident open with MS support on this but have not got anywhere with it yet.

Hi Doug,

There is a reason AATP does not yet state support for Windows Server 2019 domain controllers: it hasn't cleared testing yet.

Sadly, there is a bug in lsass.exe that gets triggered easily when the sensor is installed.

There is a private fix for it that has not been publicly released yet, so if you are already in this situation, support will be able to provide it to you as a mitigation. This is "best effort" support for now, as it is officially not yet a supported configuration.

Once the lsass fix is publicly released, and assuming AATP passes 2019 testing, we will work quickly to support it officially.

Doug,

 

Windows Hello and ATA are both broken on 2019 DCs. I have gotten word that a fix is coming in February, and I was able to open a Premier ticket and get a private hotfix to hold us over until the public fix comes out.

Just to be clear, in this case it is lsass that is broken. The Sensor just acts as a fast trigger for it; in theory the crash can happen even if the sensor is not installed. This is why we have to wait for the Windows fix cycle.