This is the first article in a series:
Stop Worrying and Love the Outage, Vol I: Group Policy and Sharing Violations
Stop Worrying and Love the Outage, Vol II: DCs, custom ports, and Firewalls/ACLs
Stop Worrying and Love the Outage, Vol III: Cached Logons
Hello! Chris Cartwright here from the Directory Services support team. Recently, we have seen an uptick in cases related to sharing violations when processing or editing group policies. Most of these issues are caused by locks on policy-related files within the SysVol share, from either security products or environmental conditions. Security product mitigations are already covered by exclusions and need not be repeated here. Our focus will be on the environmental conditions: latency and/or packet loss. I can save you some time by making the following recommendation. Failure to follow it may result in unexpected behavior in both group policy processing and group policy editing:
Make sure you only edit Group Policy from a single Domain Controller (the PDC by default). If you are in an environment with a high number of clients, or clients with high latency between them and the DC, put this DC in a separate Active Directory site that does not cover those clients' subnet(s). Additionally, make edits from computers with low network latency to that DC.
Digging in:
First, let's look at just some of what occurs during GPO creation using the Group Policy Management Console (GPMC). When we create a GPO, a folder structure and several files are created, including the Machine folder and gpt.ini file. We will look at a few of those pieces using Process Monitor to get detailed tracking information:
Direct Create GPT.ini
Machine folder:
Take note of ShareMode. These file objects are opened with a ShareMode of Read, Write, and sometimes delete.
Now, let's take a look at just some of what happens when a group policy client downloads group policy files from a DC. We will also take a look at registry.pol (Think settings in admin templates) and gpttmpl.inf (Think User Rights Assignments), files that are created not during GPO creation, but through modifications made to the GPO post creation.
Gpupdate GPT.ini:
Machine folder:
GptTmpl.inf:
Registry.pol:
So, what are these share modes anyway? Share modes come in 3 flavors: Delete (also allows rename), Read, and Write. (There is also a ShareMode of None, which means no one else can do anything with the file at all until the handle is closed.) Share modes are applied to something called handles. Handles are references to a resource, such as a file, a window, or something like an OK button on a dialog box.
When a file handle is acquired, it is acquired with one or more of these share modes. Any later attempt to acquire a handle to the same file fails if the request doesn't match the existing share mode. For example, you can't open a file with only ShareMode Read, keep it open, and then write to it through a separate handle.
That gets us to the crux of the problem: if a file is locked with ShareMode Read, you can't write to it. If a file is locked with ShareMode Write, and another entity wants to read the file and obtain a lock on it with a ShareMode of "Read" or "Read, Delete", it won't be able to, because the request does not match the existing lock. This means that if all your clients keep locking your group policy files, you won't be able to edit them. It also means that clients can be prevented from reading group policy files while you edit them.
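To make the matching rule concrete, here is a minimal Python sketch of the sharing check. It is a simplified model, not the Windows implementation: the `Handle` class and `can_open` function are hypothetical names, and only Read, Write, and Delete are modeled. A new open succeeds only if its requested access is allowed by every existing handle's share mode, and its own share mode tolerates the access those handles already hold.

```python
from dataclasses import dataclass

READ, WRITE, DELETE = "read", "write", "delete"


@dataclass(frozen=True)
class Handle:
    access: frozenset  # what this handle does with the file
    share: frozenset   # what this handle lets other handles do


def can_open(existing, access, share):
    """Return True if a new open is compatible with every handle already open."""
    for h in existing:
        # The new handle's access must be allowed by the existing share mode...
        if not access <= h.share:
            return False
        # ...and the new handle's share mode must tolerate the existing access.
        if not h.access <= share:
            return False
    return True


# A client reading a policy file while sharing only Read (hypothetical scenario).
client = Handle(access=frozenset({READ}), share=frozenset({READ}))

# An editor asking for Write access is refused while that handle stays open.
editor_write = can_open([client], frozenset({WRITE}), frozenset({READ}))

# A second reader that shares everything is compatible with the first reader.
second_reader = can_open(
    [client], frozenset({READ}), frozenset({READ, WRITE, DELETE})
)

print(editor_write)   # False
print(second_reader)  # True
```

In this model the refused open is the situation that surfaces as a sharing violation when the editor tries to save.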
Error examples
Here are some examples of how this might present when trying to modify a GPO that has files locked by clients.
An attempt to modify a security setting in the Group Policy Editor that must write to GptTmpl.inf:
Access is denied
Failed to save
\\<domain name>\SysVol\<domain name>\Policies\<GPO GUID>\Machine\Microsoft\Windows NT\SecEdit\GptTmpl.inf. Make sure that you have the right permissions to this object.
An attempt to modify an administrative template configuration in the Group Policy Editor that must write to Registry.pol:
Unhandled exception has occurred in a component in your application. If you click Continue, the application will ignore this error and attempt to continue.
Access is denied
Here are examples of Event Log errors on clients from files locked for writing:
GptTmpl.inf from Group Policy Operational Log
Event ID 7016
Completed Security Extension Processing in <value> milliseconds.
On the details tab for this same event (ErrorCode 1252):
| EventData | |
| --- | --- |
| CSEElaspedTimeInMilliSeconds | 453 |
| ErrorCode | 1252 |
| CSEExtensionName | Security |
| CSEExtensionId | {827d319e-6eac-11d2-a4ea-00c04f79f83a} |
Here’s a table of the most common Event Log errors you may observe on clients, including the one above.
| File Locked | Event Log | Event Source | Event ID | Error Code on General Tab | Error Code on Details Tab |
| --- | --- | --- | --- | --- | --- |
| GptTmpl.inf | Microsoft-Windows-GroupPolicy/Operational | GroupPolicy | 7016 | N/A | 1252 |
| GptTmpl.inf | Application | SceCli | 1001 | 32 | N/A |
| Gpt.ini | Microsoft-Windows-GroupPolicy/Operational | GroupPolicy | 7017 | N/A | N/A |
| Gpt.ini | System | GroupPolicy | 1058 | N/A | 32 |
| Registry.pol | Microsoft-Windows-GroupPolicy/Operational | GroupPolicy | 7016 | N/A | 2147500037 |
| Registry.pol | System | GroupPolicy | 1096 | N/A | N/A |
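Some of these codes are easier to recognize in hexadecimal. The short Python sketch below converts the decimal values from the table. Two mappings are well known: Win32 error 32 is ERROR_SHARING_VIOLATION, and 2147500037 is 0x80004005, the generic E_FAIL HRESULT. The `KNOWN` dictionary and `describe` helper are hypothetical names for illustration, and 1252 is deliberately left undecoded here.

```python
# Well-known codes only; anything else is reported as not decoded.
KNOWN = {
    32: "ERROR_SHARING_VIOLATION",
    0x80004005: "E_FAIL",
}


def describe(code: int) -> str:
    """Format a decimal error code alongside its hexadecimal form."""
    name = KNOWN.get(code, "not decoded in this sketch")
    return f"{code} = 0x{code:08X} ({name})"


for code in (32, 1252, 2147500037):
    print(describe(code))
```

The sharing violation shows up directly as 32 in two of the rows, which ties those events back to the share mode conflicts described earlier.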
Knowing about these sharing violation interactions, let's take a look at some example lock times…
The following tests for read file lock time length were done against a GPO with a registry.pol around 70 KB in size. I have chosen registry.pol as it tends to be order(s) of magnitude larger than either gpt.ini or GptTmpl.inf and thus more susceptible to problems.
| Network conditions | Read lock time |
| --- | --- |
| Latency 1 ms | 0.016 s |
| Latency 50 ms | 0.814 s |
| Latency 100 ms | 1.555 s |
| Latency 150 ms, 1% packet loss | 2.364 s |
| Latency 200 ms, 2% packet loss | 3.996 s |
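One rough way to read these measurements is to divide each lock time by the configured latency, which shows how many latency intervals the file stayed locked. The Python sketch below does that arithmetic on the figures above. Treating the latency value as a per-exchange delay is an assumption (the tests do not state whether it is one-way or round-trip), and `implied_intervals` is a hypothetical helper name.

```python
# (latency in ms, measured read lock time in seconds), copied from the tests above
measurements = [
    (1, 0.016),
    (50, 0.814),
    (100, 1.555),
    (150, 2.364),
    (200, 3.996),
]


def implied_intervals(latency_ms: float, lock_seconds: float) -> float:
    """Lock time expressed as a multiple of the configured latency."""
    return lock_seconds / (latency_ms / 1000)


for latency_ms, lock_s in measurements:
    ratio = implied_intervals(latency_ms, lock_s)
    print(f"{latency_ms:>3} ms latency: locked for ~{ratio:.1f} latency intervals")
```

The ratio stays near 16 for most of the tests and rises to about 20 for the 200 ms, 2% packet loss case. That fits a lock time that scales with latency and gets worse once retransmissions are involved.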
Theory time. Aka There be dragons
There’s no need to keep reading at this point. However, I wanted to try and provide some odds of these conflicts, but I’m unsure if the formulas below are correct.
Odds of read lock:
Odds of write lock:
Odds of either happening:
Given the file access times above, with 2000 clients accessing a single DC over the slowest link with the worst packet loss:

| GPO modifications | Chance of conflict |
| --- | --- |
| 1 | 71.8881% |
| 2 | 92.0972% |
| 3 | 97.7784% |
| 4 | 99.3755% |
| 5 | 99.8244% |
500 clients:
| GPO modifications | Chance of conflict |
| --- | --- |
| 1 | 27.18% |
| 2 | 46.98% |
| 3 | 61.39% |
| 4 | 71.89% |
| 5 | 79.53% |
The fastest link with 2000 clients:
| GPO modifications | Chance of conflict |
| --- | --- |
| 1 | 0.5069% |
| 2 | 1.0112% |
| 3 | 1.5130% |
| 4 | 2.0123% |
| 5 | 2.5090% |
Doing it the correct way (Assume 5 clients with lowest latency):
| GPO modifications | Chance of conflict |
| --- | --- |
| 1 | 0.0015% |
| 2 | 0.0031% |
| 3 | 0.0046% |
| 4 | 0.0061% |
| 5 | 0.0076% |
If my formulas are correct, "doing it the correct way" would take over 3800 GPO modifications in a single group policy refresh period to reach a 1% chance of a conflict preventing GPO modification. And if the formulas are incorrect, that’s okay too. The guidance in the second paragraph above still stands.
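As a rough cross-check, the Python sketch below uses one simple model that lands close to the figures quoted for the slow-link scenarios. It is an assumption and not necessarily the formula used above. Each client is assumed to read the file once, at a uniformly random moment, within a 105-minute window (the default 90-minute refresh plus an average 15-minute random offset). Clients and edits are treated as independent, and the write-lock term is ignored. The function names are hypothetical.

```python
# Assumption: 90-minute refresh interval plus a 15-minute average random offset.
REFRESH_WINDOW_S = 105 * 60


def p_read_lock(lock_seconds, clients, window_s=REFRESH_WINDOW_S):
    """Chance that at least one client holds the file at a random instant."""
    return 1 - (1 - lock_seconds / window_s) ** clients


def p_conflict(lock_seconds, clients, modifications):
    """Chance that at least one of several independent edits hits a read lock."""
    return 1 - (1 - p_read_lock(lock_seconds, clients)) ** modifications


# (clients, read lock time in seconds) for three of the scenarios above
for clients, lock_s in ((2000, 3.996), (500, 3.996), (2000, 0.016)):
    row = [f"{p_conflict(lock_s, clients, k):.2%}" for k in range(1, 6)]
    print(f"{clients} clients, {lock_s}s lock:", ", ".join(row))
```

Under these assumptions the 2000-client and 500-client slow-link rows come out within rounding distance of the percentages listed above. The fast-link rows come out slightly lower, presumably because the write-lock term is left out here.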
References
Introduction to Network Trace Analysis 3: TCP Performance - Microsoft Community Hub
CreateFileA function (fileapi.h) - Win32 apps | Microsoft Learn
Creating and Opening Files - Win32 apps | Microsoft Learn
Chris “Sharing isn’t always caring” Cartwright