First published on TechNet on Aug 24, 2012
Heya folks, Ned here again. Rather than continue the lie that this series comes out every Friday like it once did, I am taking the corporate approach and rebranding the mail sack. Maybe we’ll have the occasional Collector’s Edition versions.
This month, I answer your questions on:
- The semi-myth of Kerberos time skew
- Finding all DCs in a forest with Windows PowerShell
- Why RIDs
- Inter-forest folder redirection
- USMT and monitor settings migration
- Blacking out group policy refresh
- Collapsing snapshots into new disks for VDC
- Recover or rebuild
- Determining Windows Server 2012 Full, Core, Minimal Shell via WMI
- Other stuff
Let’s incentivize our value props!
Question
Everywhere I look, I find documentation saying that when Kerberos skew exceeds five minutes in a Windows forest, the sky falls and the four horsemen arrive.
I recall years ago at a Microsoft summit when I brought that time skew issue up and the developer I was speaking to said no, that isn't the case anymore, you can log on fine. I recently re-tested that and sure enough, no amount of skew on my member machine against a DC prevents me from authenticating.
Looking at the network trace, I see the KRB_AP_ERR_SKEW response to the AS REQ, followed by the Kerberos connection being torn down and immediately re-established, and then another AS REQ that works just fine and is answered with a proper AS REP.
My first question is.... Am I missing something?
My second question is... While I realize that third party Kerb clients may or may not have this functionality, are there instances where it doesn't work within Windows Kerb clients? Or could it affect other scenarios like AD replication?
Answer
Nope, you’re not missing anything. If I try to logon from my highly-skewed Windows client and apply group policy, the network traffic will look approximately like:
Frame | Source | Destination | Packet Data Summary
1 | Client | DC | AS Request Cname: client$ Realm: CONTOSO.COM Sname:
2 | DC | Client | KRB_ERROR - KRB_AP_ERR_SKEW (37)
3 | Client | DC | AS Request Cname: client$ Realm: CONTOSO.COM Sname: krbtgt/CONTOSO.COM
4 | DC | Client | AS Response Ticket[Realm: CONTOSO.COM, Sname: krbtgt/CONTOSO.COM]
5 | Client | DC | TGS Request Realm: CONTOSO.COM Sname: cifs/DC.CONTOSO.COM
6 | DC | Client | KRB_ERROR - KRB_AP_ERR_SKEW (37)
7 | Client | DC | TGS Request Realm: CONTOSO.COM Sname: cifs/DC.CONTOSO.COM
8 | DC | Client | TGS Response Cname: client$
When your client sends a time stamp that is outside the range of Maximum tolerance for computer clock synchronization, the DC comes back with that KRB_AP_ERR_SKEW error – but it also contains an encrypted copy of its own time stamp. The client uses that to create a valid time stamp to send back. This doesn’t decrease security in the design because we are still using encryption and requiring knowledge of the secrets, plus there is still only – by default – 5 minutes for an attacker to break the encryption and start impersonating the principal or attempt replay attacks. That is not feasible with even XP’s 11 year old cipher suites, much less Windows 8’s.
This isn’t some Microsoft wackiness either – RFC 4430 states:
If the server clock and the client clock are off by more than the policy-determined clock skew limit (usually 5 minutes), the server MUST return a KRB_AP_ERR_SKEW. The optional client's time in the KRB-ERROR SHOULD be filled out.
If the server protects the error by adding the Cksum field and returning the correct client's time, the client SHOULD compute the difference (in seconds) between the two clocks based upon the client and server time contained in the KRB-ERROR message.
The client SHOULD store this clock difference and use it to adjust its clock in subsequent messages. If the error is not protected, the client MUST NOT use the difference to adjust subsequent messages, because doing so would allow an attacker to construct authenticators that can be used to mount replay attacks.
Hmmm… SHOULD. Here’s where things get muddier and I address your second question. No one actually has to honor this skew correction:
1. Windows 2000 didn’t always honor it. But it’s dead as fried chicken, so who cares.
2. Not all third parties honor it.
3. Windows XP and Windows Server 2003 do honor it, but there were bugs that sometimes prevented it (long gone, AFAIK). Later Windows OSes do of course and I know of no regressions.
4. If the client computer’s clock is ahead of the domain controller’s clock by more than the lifetime of the Kerberos ticket (10 hours, by default), the Kerberos ticket is invalid and auth fails.
5. Some non-client logon application scenarios enforce the strict skew tolerance and don’t care to adjust, because of other time needs tied to Kerberos and security. AD replication is one of them – event LSASRV 40960 with extended error 0xC0000133 comes to mind in this scenario, as does trying to run DSSite.msc “replicate now” and getting back error 0x576 “There is a time and / or date difference between the client and the server.” I have recent case evidence of Dcpromo enforcing the 5 minutes with Kerberos strictly, even in Windows Server 2008 R2, although I have not personally tried to validate it. I’ve seen it with appliances and firewalls too.
With that RFC’s indecisiveness and the other caveats, we beat the “just make sure it’s no more than 5 minutes” drum in all of our docs and here on AskDS. It’s too much trouble to get into what-ifs.
We have a KB tucked away on this here but it is nearly un-findable.
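To make the retry behavior concrete, here is a minimal Python sketch of the exchange described above. It is a toy model, not Windows' or any real Kerberos library's implementation: the function names, the "FAIL" marker, and the way caveat 4 is reduced to a single comparison are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)        # default skew tolerance
TICKET_LIFETIME = timedelta(hours=10)  # default ticket lifetime (caveat 4)


def within_tolerance(stamp, server_now, max_skew=MAX_SKEW):
    """True when the time stamp in a request is acceptable to the server."""
    return abs(stamp - server_now) <= max_skew


def simulate_as_exchange(client_now, server_now,
                         honors_correction=True, error_is_protected=True):
    """Return the message sequence for one simplified AS exchange."""
    messages = ["AS-REQ"]
    if within_tolerance(client_now, server_now):
        return messages + ["AS-REP"]

    messages.append("KRB_AP_ERR_SKEW")
    # The client may only trust the server time if the error is protected,
    # and it is not obliged to correct at all (the RFC only says SHOULD).
    if not (honors_correction and error_is_protected):
        return messages + ["FAIL"]

    # Caveat 4 (simplified assumption): a client clock ahead of the DC by
    # more than the ticket lifetime is treated as unrecoverable here.
    if client_now - server_now > TICKET_LIFETIME:
        return messages + ["FAIL"]

    offset = server_now - client_now   # stored clock difference
    corrected = client_now + offset    # time stamp used for the retry
    messages.append("AS-REQ")
    messages.append("AS-REP" if within_tolerance(corrected, server_now) else "FAIL")
    return messages


if __name__ == "__main__":
    dc = datetime(2012, 8, 24, 12, 0)
    # Client three hours behind the DC: error, then a corrected retry.
    print(simulate_as_exchange(dc - timedelta(hours=3), dc))
```

Passing honors_correction=False models the third parties and strict scenarios in the list: the exchange stops at the skew error instead of retrying.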
Awesome question.
Question
I’ve found articles on using Windows PowerShell to locate all domain controllers in a domain, and even all GCs in a forest, but I can’t find one to return all DCs in a forest. Get-AdDomainController seems to be limited to a single domain. Is this possible?
Answer
It’s trickier than you might think. I can think of two ways to do this; perhaps commenters will have others. The first is to get the domains in the forest, then find one domain controller in each domain and ask it to list all the domain controllers in its own domain. This gets around the limitation of Get-AdDomainController for a single domain (single line wrapped).
(Get-ADForest).Domains | ForEach-Object {Get-ADDomainController -Discover -DomainName $_} | ForEach-Object {Get-ADDomainController -Filter * -Server $_} | Format-Table HostName
The second is to go directly to the native .NET AD DS forest class to return the domains for the forest, then loop through each one returning the domain controllers (single line wrapped).
[System.DirectoryServices.ActiveDirectory.Forest]::GetCurrentForest().Domains | ForEach-Object {$_.DomainControllers} | ForEach-Object {$_.Name}
This also led to updated TechNet content. Good work, Internet!
Question
Hi, I've been reading up on RID issuance management and the new RID Master changes in Windows Server 2012. They still leave me with a question, however: why are RIDs even needed in a SID? Can't the SID be incremented on its own? The domain identifier seems to be an adequately large number, larger than the 30-bit RID anyway. I know there's a good reason for it, but I just can't find any material that says why there are separate domain ID and relative ID in a SID.
Answer
The main reason is that a SID needs the domain identifier portion to have contextual meaning. By using the same domain identifier on all security principals from that domain, we can quickly and easily identify SIDs issued from one domain or another within a forest. This is useful for a variety of security reasons under the hood.
That also allows us a useful technique called “SID compression”, where we want to save space in a user’s security data in memory. For example, let’s say I am a member of five domain security groups:
DOMAINSID-RID1
DOMAINSID-RID2
DOMAINSID-RID3
DOMAINSID-RID4
DOMAINSID-RID5
With a constant domain identifier portion on all five, I now have the option to use one domain SID portion on all the other associated ones, without using all the memory up with duplicate data:
DOMAINSID-RID1
“-RID2
“-RID3
“-RID4
“-RID5
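The idea can be sketched in a few lines of Python. This only illustrates the space-saving trick on SID strings; it is not how Windows actually lays out SIDs in memory, and the function names and the sample domain identifier in the demo are made up for the example.

```python
def split_sid(sid):
    """Split 'S-1-5-21-A-B-C-RID' into (domain part, RID as an int)."""
    domain, _, rid = sid.rpartition("-")
    if not domain or not rid.isdigit():
        raise ValueError(f"not a domain-relative SID: {sid!r}")
    return domain, int(rid)


def compress_sids(sids):
    """Store each domain identifier once, followed by its list of RIDs."""
    compressed = {}
    for sid in sids:
        domain, rid = split_sid(sid)
        compressed.setdefault(domain, []).append(rid)
    return compressed


def expand_sids(compressed):
    """Rebuild the full SID strings from the compressed form."""
    return [f"{domain}-{rid}"
            for domain, rids in compressed.items()
            for rid in rids]


if __name__ == "__main__":
    # Hypothetical domain identifier, used only for the demo.
    domain = "S-1-5-21-1111111111-2222222222-3333333333"
    groups = [f"{domain}-{rid}" for rid in (1101, 1102, 1103, 1104, 1105)]
    packed = compress_sids(groups)
    print(packed)                       # one domain key, five RIDs
    print(expand_sids(packed) == groups)
```

With a per-SID random identifier instead of a shared domain portion, there would be nothing common to factor out, which is the point of the example.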
The consistent domain portion also fixes a big problem: if all of the SIDs held no special domain context, keeping track of where they were issued from would be a much bigger task. We’d need some sort of big master database (“The SID Master”?) in an environment that understood all forests and domains and local computers and everything. Otherwise we’d have a higher chance of duplication through differing parts of a company. Since the domain portion of the SID is unique and the RID portion is an unsigned integer that only climbs, it’s pretty easy for RID masters to take care of that case in each domain.
You can read more about this in coma-inducing detail here: http://technet.microsoft.com/en-us/library/cc778824.aspx .
Question
When I want to set folder and application redirection for our users in a different forest (with a forest trust) in our Remote Desktop Services server farm, I cannot find users or groups from the other domain. Is there a workaround?
Answer
The Object Picker in this case doesn’t allow you to select objects from the other forest – this is a limitation of the UI that the Folder Redirection folks put in place. They write their own FR GP management tools, not the GP team.
Windows, by default, does not process group policy from user logon across a forest—it automatically uses loopback Replace. Therefore, you can configure a Folder Redirection policy in the resource domain for users and link that policy to the OU in the domain where the Terminal Servers reside. Only users from a different forest should receive the folder redirection policy, which you can then base on a group in the local forest.
Question
Does USMT support migrating multi-monitor settings from Windows XP computers, such as which one is primary, the resolutions, etc.?
Answer
USMT 4.0 does not support migrating any monitor settings from any OS to any OS (screen resolution, monitor layout, multi-monitor, etc.). Migrating hardware settings and drivers from one computer to another is dangerous, so USMT does not attempt it. I strongly discourage you from trying to make this work through custom XML for the same reason – you may end up with unusable machines.
Starting in USMT 5.0, a new replacement manifest – Windows 7 to Windows 7, Windows 7 to Windows 8, or Windows 8 to Windows 8 only – named “DisplayConfigSettings_Win7Update.man” was added. For the first time in USMT, it migrates:
<pattern type="Registry">HKLM\System\CurrentControlSet\Control\GraphicsDrivers\Connectivity\* [*]</pattern>
<pattern type="Registry">HKLM\System\CurrentControlSet\Control\GraphicsDrivers\Configuration\* [*]</pattern>
This is OK on Win7 and Win8 because the OS itself knows what valid and invalid are in that context and discards/fixes things as necessary. I.e. this is safe only because USMT doesn’t actually do anything but copy some values and relies on the OS to fix things after migration is over.
Question
Our proprietary application is having memory pressure issues and it manifests when someone runs gpupdate or waits for GP to refresh; sometimes it’s bad enough to cause a crash. I was curious if there was a way to stop the policy refresh from occurring.
Answer
Only in Vista and later does preventing total refresh become vaguely possible; you could prevent the group policy service from running at all (no, I am not going to explain how). The internet is filled with thousands of people repeating a myth that preventing GP refresh is possible with an imaginary registry value on Win2003/XP – it isn’t.
What you could do here is prevent background refresh altogether. See the policies in the “administrative templates\system\group policy” section of GP:
1. You could enable policy “group policy refresh interval for computers” and apply it to that one server. You could set the background refresh interval to 45 days (the max). That way it will be far more likely to reboot in the meantime for a patch Tuesday or whatever and never have a chance to refresh automatically.
2. You could also enable each of the group policy extension policies (ex: “disk quota policy processing”, “registry policy processing”) and set the “do not apply during periodic background processing” option on each one. This may not actually prevent GPUPDATE /FORCE though – each CSE may decide to ignore your background refresh setting; you will have to test, as this sounds boring.
Keep in mind for #1 that there are two of those background refresh policies – one per user (“group policy refresh interval for users”), one per computer (“group policy refresh interval for computers”). They both operate in terms of each boot up or each interactive logon, on a per computer/per user basis respectively. I.e. if you logon as a user, you apply your policy. Policy will not refresh for 45 days for that user if you were to stay logged on that whole time. If you log off at 22 days and log back on, you apply policy again, because that is not a refresh – it’s interactive logon foreground policy application.
Ditto for computers, only replace “logon” with “boot up”. So it will apply the policy at every boot up, but since your computers reboot daily, never again until the next bootup.
After those thoughts… get a better server or a better app. 🙂
Question
I’m testing Virtualized Domain Controller cloning in Windows Server 2012 on Hyper-V and I have DCs with snapshots. Bad bad bad, I know, but we have our reasons and we at least know that we need to delete them when cloning.
Is there a way to keep the snapshots on the source computer, but not use VM exports? I.e. I just want the new copied VM to not have the old source machine’s snapshots.
Answer
Yes, through the new Hyper-V disk management Windows PowerShell cmdlets or through the management snap-in.
Graphical method
1. Examine the settings of your VM and determine which disk is the active one. When using snapshots, it will be an AVHD/X file.
2. Inspect that disk and you see the parent as well.
3. Now use the Edit Disk… option in the Hyper-V manager to select that AVHD/X file.
4. Merge the disk to a new copy.
Windows PowerShell method
Much simpler, although slightly counter-intuitive. Just use:
Convert-VHD
For example, to export the entire chain of a VM's disk snapshots and parent disk into a new single disk with no snapshots named DC4-CLONED.VHDX:
Violin!
You don’t actually have to convert the disk type in this scenario (note how I went from dynamic to dynamic). There is also Merge-VHD for more complex differencing disk and snapshot scenarios, but it requires some extra finagling and disk copying, and isn’t usually necessary. The graphical merge option works well there too.
As a side note, the original Understand And Troubleshoot VDC guide now redirects to TechNet. Coming soon(ish) is an RTM-updated version of the original guide, in web format, with new architecture, troubleshooting, and other info. I robbed part of my answer above from it – as you can tell by the higher quality screenshots than you usually see on AskDS – and I’ll be sure to announce it. Hard.
Question
It has always been my opinion that if a DC with a FSMO role went down, the best approach is to seize the role on another DC, rebuild the failed DC from scratch, then transfer the role back. It’s also been my opinion that as long as you have more than one DC, and there has not been any data loss, or corruption, it is better to not restore.
What is the Microsoft take on this?
Answer
This is one of those “it depends” scenarios:
1. The downside to restoring from (usually proprietary) backup solutions is that the restore process just isn’t something most customers test and work out the kinks on until it actually happens; tons of time is spent digging out the right tapes, finding the right software, looking up the restore process, contacting that vendor, etc. Oftentimes a restore doesn’t work at all, so all the attempts are just wasted effort. I freely admit that my judgment is tainted through my MS Support experience here – customers do not call us to say how great their backups worked, only that they have a down DC and they can’t get their backups to restore.
The upside is if your recent backup contained local changes that had never replicated outbound due to latency, restoring them (even non-auth) still means that those changes will have a chance to replicate out. E.g. if someone changed their password or some group was created on that server and captured by the backup, you are not losing any changes. It also includes all the other things that you might not have been aware of – such as custom DFS configurations, operating as a DNS server that a bunch of machines were solely pointed to, 3rd party applications pointed directly to the DC by IP/Name for LDAP or PDC or whatever (looking at you, Open Source software!), etc. You don’t have to be as “aware”, per se.
2. The downside to seizing the FSMO roles and cutting your losses is the converse of my previous point around latent changes; those objects and attributes that could not replicate out but were caught by the backup are gone forever. You also might miss some of those one-offs where someone was specifically targeting that server – but you will hear from them, don’t worry; it won’t be too hard to put things back.
The upside is you get back in business much faster in most cases; I can usually rebuild a Win2008 R2 server and make it a DC before you even find the guy that has the combo to the backup tape vault. You also don’t get the interruptions in service for Windows from missing FSMO roles, such as DCs that were low on their RID pool and now cannot retrieve more (this only matters with the default pool size, obviously; some customers raise their pool sizes to combat this effect). It’s typically a more reliable approach too – after all, your backup may contain the same time bomb of settings or corruption or whatever that made your DC go offline in the first place. Moreover, the backup is unlikely to contain the most recent changes regardless – backups usually run overnight, so any un-replicated originating updates made during the day are going to be nuked in both cases.
For all these reasons, we in MS Support generally recommend a rebuild rather than a restore, all things being equal. Ideally, you fix the actual server and do neither!
As a side note, restoring the RID master used to cause issues that we first fixed in Win2000 SP3. This unfortunately has lived on as a myth that you cannot safely restore the RID master. Nevertheless, if someone impatiently seizes that role, then someone else restores that backup, you get a new problem where you cannot issue RIDs anymore. Your DC will also refuse to claim role ownership with a restored RID Master (or any FSMO role) if your restored server has an AD replication problem that prevents at least one good replication with a partner. Keep those in mind for planning no matter how the argument turns out!
Question
I am trying out Windows Server 2012 and its new Minimal Server Interface. Is there a way to use WMI to determine if a server is running with a Full Installation, Core Installation, or a Minimal Shell installation?
Answer
Indeed, although it hasn’t made its way to MSDN quite yet. The Win32_ServerFeature class returns a few new properties in our latest operating system. You can use WMIC or Windows PowerShell to browse the installed ones. For example:
The “99” ID is Server Graphical Shell, which means, in practical terms, “Full Installation”. If 99 alone is not present, that means it’s a MinShell server. If the “478” ID is also missing, it’s a Core server.
E.g. if you wanted to apply some group policy that only applied to MinShell servers, you’d set your query to return true if 99 was not present but 478 was present.
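The decision logic is small enough to sketch in Python. This assumes you have already collected the feature IDs from Win32_ServerFeature by whatever means you like; the function and constant names are invented for the example, and only the two IDs named above are taken from the text.

```python
SERVER_GRAPHICAL_SHELL = 99   # "Server Graphical Shell" per the text above
MINSHELL_FEATURE = 478        # the second ID named in the text above


def classify_install(feature_ids):
    """Map the installed Win32_ServerFeature IDs to an installation type."""
    ids = set(feature_ids)
    if SERVER_GRAPHICAL_SHELL in ids:
        return "Full"
    if MINSHELL_FEATURE in ids:
        return "MinShell"
    return "Core"


def is_minshell(feature_ids):
    """True only when 99 is absent but 478 is present."""
    return classify_install(feature_ids) == "MinShell"


if __name__ == "__main__":
    # The other IDs in these lists are arbitrary filler for the demo.
    print(classify_install([2, 99, 478]))   # Full
    print(classify_install([2, 478]))       # MinShell
    print(classify_install([2]))            # Core
```

The is_minshell check mirrors the group policy filter idea: true when 99 is missing and 478 is there, false otherwise.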
Other Stuff
Speaking of which, Windows Server 2012 General Availability is September 4th. If you manage to miss the run up, you might want to visit an optometrist and/or social media consultant.
Stop worrying so much about the end of the world and think it through.
So awesome:
And so fake 😞
If you are married to a psychotic Solitaire player who poo-poo’ed switching totally to the Windows 8 Consumer Preview because they could not get their mainline fix of card games, we have you covered now in Windows 8 RTM. Just run the Store app and swipe for the Charms Bar, then search for Solitaire.
It’s free and exactly 17 times better than the old in-box version:
OMG Lisa, stop yelling at me!
Is this the greatest geek advert of all time?
Yes. Yes it is.
When people ask me why I stopped listening to Metallica after the Black Album, this is how I reply:
Ride the lightning Mercedes
We have quite a few fresh, youthful faces here in MS Support these days and someone asked me what “Mall Hair” was when I mentioned it. If you graduated high school between 1984 and 1994 in the Midwestern United States, you already know.
Finally – I am heading to Sydney in late September to yammer in-depth about Windows Server 2012 and Windows 8. Anyone have any good ideas for things to do? So far I’ve heard “bridge climb”, which is apparently the way Australians trick idiot tourists into paying for death. They probably follow it up with “funnel-web spider petting zoo” and “swim with the saltwater crocodiles”. Lunatics.
Until next time,
- Ned “I bet James Hetfield knows where I can get a tropical drink by the pool” Pyle