Unusual Kerberos Failure...User to User to What?
We get some really unique issues at times that strain patience and understanding. With Kerberos this is doubly true since it is already as complex and extensible as any person could ever ask for. This one may be particularly interesting to those who are creating new solutions using our Kerberos implantation, or are troubleshooting an already customized solution using it.
Recently we had an interesting scenario that I’ll relate here since it was both difficult to set up correctly, and even more difficult to troubleshoot with certainty. Specific applications, users, companies, and products have been removed to protect the innocent.
The scenario went like this. A partner company had a solution that would have a user connect to an IIS web server (call it Web Server A) from an Internet Explorer client. The connection was authenticated via a smartcard based certificate for client-authentication required HTTPS. Smartcards typically specify an universal principal name (UPN) for the user that will be using the smartcard, like billybob@craziness. This is “hard-coded” into the certificate in the smartcard for obvious reasons.
Once the HTTPS session was started, code running in the application pool on Web Server A would use Kerberos Protocol Transition to transition from TLS to Kerberos authentication. Then, since the identity the application running in Web Server A’s application pool was trusted for delegation, the app would contact Web Server B using HTTP (at this point, again, authenticated by Kerberos as the original IE calling user).
Keep in mind here that the smartcard/certificate is tied to the domain user’s account which is part of what enables us to transition from HTTPS- authenticated using that certificate as credential- to Kerberos as that domain user principal. In addition, in this scenario the customer had configured that Web Server A was constraining delegation from itself to just the HTTP service (web) running on Web Server B.
So, we’ve reached the Web Server B which also has an application running in the application pool. This application is intended to access a local resource (a file) on Web Server B on that users behalf and supply information or usage of that file resource back to the user in his IE browser window running on his client. It’s also intended to do that on the users behalf (that old “trusted for delegation” thing) and without having to get a ticket granting ticket for that users session prior to getting a service ticket to the local resource as that user.
Huh? you say. Exactly, I respond.
Here the customer was using a programmatic (available in our Solution Developers Kit, or SDK) method to do something called S4U2Proxy, a type of User to User authentication. Developers may wince when I describe this simply as a method to impersonate a user and request a service ticket on behalf of that user without ever getting a TGT to supply in the service ticket request. Normally, like when you logon to your domain member computer, you request a TGT from the KDC and then once you have that you request the service tickets (like for your local session) that you need using that TGT as your ‘credential’. That’s why TGT stands for Ticket Granting Ticket. You got TGT, KDC grant service ticket.
The “service for user to proxy” or “user to user” methods are a way to write a program to accept a service ticket for a local resource it has from a user and then turn around and use that ticket to request a new ticket to a resource on behalf of that user without having a TGT, having to request a TGT and without having any real access to the users “secrets” like password or private key. This is a good thing, from a security standpoint. Keep in mind that the application is running in a context that is trusted for this otherwise this wouldn’t happen.
In addition, this extension to our Kerberos implementation is often done in conjunction with constrained delegation, adding an additional bit of security.
A more general article about this type of thing is available at the URL below. It’s a good a good reference in that it goes over all of the points of configuration in some detail, as opposed to looking at it holistically like I do.
Exploring S4U Kerberos Extensions in Windows Server
http://msdn.microsoft.com/msdnmag/issues/03/04/SecurityBriefs/
The problem (that’s right, I just now got to it) was that the supplied information was not provided successfully to the user in his or her IE browser window, and they were instead prompted for credentials.
If you’ve read my prior blog posts regarding Kerberos then you know that I’m a big proponent of viewing all of these transaction “on-the-wire”-network captures. This scenario was not different in that respect and I was hopeful that it would tell us how far along in the scenario we got before things did not work as expected.
So we gathered network captures, all taken at the same time, of the client computer, Web Server A and Web Server B. In them we saw HTTPS traffic from client to Web Server A and no errors. Step one done, no problem.
We then looked at the network traffic from Web Server A to Web Server B. As previously mentioned, this traffic was also HTTP. In viewing the HTTP authentication information displayed in the packets we could tell that a service ticket was being supplied successfully there as well, and no errors were seen in the capture. This told us that our constrained delegation was not a factor otherwise we never would have sent that traffic with that service ticket.
So that suggested that there was something happening once we reached Web Server B successfully. At this point we did some head scratching and decided we needed to look a little deeper. We decided we needed to enable Kerberos debug logging on Web Server B and see what’s happening.
As an aside, Kerberos debug logging, when described as a debug log, is a registry value change that takes the debug entries in our Kerberos code and dumps them to a log file called %systemroot%system32lsass.log. This file can be opened and read using Notepad or Wordpad. I took a moment to explain this since some folks may be familiar with another similar technique where you can increase the verbosity (and frequency as a result) of the Kerberos specific events which appear in the System event log. That’s a different thing, and in my opinion, and usually not as helpful.
The link below, Troubleshooting Kerberos Errors, has the step on how to enable Kerberos debug logging. Just search it for “lsass.log”.
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/security/tkerberr.mspx
In that log is a lot of information that can be pretty confusing at first glance.
Here’s a snippet of what we saw:
1060.1188> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x396f98, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1204> Kerb-Warn: Trying to delegate but no forwardable TGT
1060.1204> Kerb-Warn: KerbBuildGssChecksum failed to get delegation TGT: 0x8009030e
1060.1156> Kerb-Error: KerbS4UToSelfLogon failed - c000000d
1060.1196> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x39eaa8, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1200> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x39ec94, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1188> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x39ecfa, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1196> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x39ed66, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1204> Kerb-Warn: Trying to delegate but no forwardable TGT
1060.1204> Kerb-Warn: KerbBuildGssChecksum failed to get delegation TGT: 0x8009030e
1060.1244> Kerb-Error: KerbS4UToSelfLogon failed - c000000d
1060.1196> Kerb-Error: KerbS4UToSelfLogon failed - c000000d
1060.1200> Kerb-LSess: KerbFindCommonPaEtype using current password of billybob@craziness@(null)
1060.1200> KSupp-Warning: KerbUnpackData failed to unpack typed data, trying error method data
1060.1200> Kerb-Error: KerbCallKdc failed: error 0x18. d:ntdssecurityprotocolskerberosclient2logonapi.cxx, line 1715
1060.1200> Kerb-Warn: KerbFindCommonPaEtype using old password of billybob@craziness@(null)
1060.1200> Kerb-LSess: KerbFindCommonPaEtype using current password of billybob@craziness@(null)
1060.1200> Kerb-Warn: KerbFindCommonPaEtype using old password of billybob@craziness@(null)
1060.1200> Kerb-Error: GetAuthenticationTicket: Failed to build pre-auth data: 0xc000006a. d:ntdssecurityprotocolskerberosclient2logonapi.cxx, line 1482
1060.1200> Kerb-Warn: LogonUser: Failed to get TGT for (null)billybob@craziness : 0xc000006a
1060.1188> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x3edc3f, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1244> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x3ee61a, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1156> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x3ee6a7, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1188> Kerb-LSess: KerbCreateLogonSessionFromTicket creating logon session for 0:0x3ee714, accepting 0:0x3e7, client billybob@craziness@DOMAIN1
1060.1204> Kerb-Warn: Trying to delegate but no forwardable TGT
1060.1204> Kerb-Warn: KerbBuildGssChecksum failed to get delegation TGT: 0x8009030e
1060.1200> Kerb-Error: KerbS4UToSelfLogon failed - c000000d
1060.1244> Kerb-Error: KerbS4UToSelfLogon failed - c000000d
A quick overview of this log. The left hand portion is not a time index (there isn’t one). This is process ID and thread. The type of entry which appears next is a general group of what kind of entry it is (Kerb-Warn, Kerb-Error, et cetera). Don’t be fooled by how scary things may look when you see Error and such. You will often see things like that in these logs, even when there is no problem at all.
The final entries are the actual debug entry information. These will contain the function (API) and the description entered in there. Some may contain some or all of the variable values that were being used. Therein lies some of the value you may find in looking at this log sometimes.
In the issue I’ve described this log was interesting for two general reasons. First, the list of calls being done and the order they are being done suggests that we are trying a S4ULogon using a service ticket that has been received. Second, the principal, or user, name being passed is some permutation of billybob@craziness, rather than deriving it from that user name.
What do I mean by that? A user of the UPN billybob@craziness may also be represented as crazinessbillybob or similar. You don’t typically see the principal represented like billybob@craziness@DOMAIN1 or (null)billybob@craziness. These are not names that can be parsed out and used to contact a domain controller and request services for. They are munged.
In this issue, knowing what the problem is half the battle (sorry for the GI Joe reference). For the above issue, it let our partner know that they needed to workaround this problem by adding a routine that does a name lookup on the user and then passes it to the authentication layer un-munged.
Why is this interesting to you? If you have a complex, customized single sign on scenario, and you are darned certain you’ve verified the configuration steps have been done well (outlined in the troubleshooting Kerberos delegation whitepaper) then review your network traces and lsass.log. Maybe you’re seeing a similar behavior. And knowing is half the battle.