It's been a while! Sorry for the delay since the last post. It has been a hectic few weeks. I've been temporarily assigned as a beta support person, which means that I have been working on Windows Vista and Longhorn Server, assisting with filed (and filing) bugs for seen behavior and design change requests. So I’ve been kept mucho busy.
But that’s no excuse! Let’s get started talking about some technical stuff…
The last few posts we talked about authentication, specifically Kerberos. Some of the most lengthy support incidents we can see are the ones where our Directory Services folks are contacted by our SQL support team to assist with authentication failures. So let’s talk about troubleshooting those types of issue. Now, for those out there that have looked on the web on this topic, you know that there is some good info out there. I will discuss some of that, add some more, and tie it together a bit to make troubleshooting these issues more linear, less confusion.
Let’s first discuss why troubleshooting Kerberos authentication may be different from one application to another. You would expect that if it uses Kerberos then you would approach the problem in the same way, regardless of the application which is involved, right?
Only partially. Applications may be designed to perform and authenticate a certain way. Microsoft provides extensive documentation on this in the Platform SDK, and online at MSDN:
When the application is developed there is code that deals with how it will work with the Security Support Provider Interface (SSPI), which works as an intermediary section of code (layer) which passes the authentication request to the appropriate authentication package (Kerberos, NTLM, SSL…).
Here’s a quick snippet from MSDN on SSPI:
Security Support Provider Interface
(SSPI) A common interface between transport-level applications, such as Microsoft Remote Procedure Call (RPC), and security providers, such as Windows Distributed Security. SSPI allows a transport application to call one of several security providers to obtain an authenticated connection. These calls do not require extensive knowledge of the security protocol's details.
They don’t require extensive knowledge or configuration, but if the application needs it to get its job done, it can get very granular in it’s authentication requests. Most if this takes place as the API InitializeSecurityContext() is called in one of several ways. More detail on that (we won’t get too much more specific here) is also available at MSDN:
Why have a put fifty percent or more of my audience to sleep in this post talking about this stuff? The reason is that it’s an important segue into troubleshooting SQL authentication failures. SQL is one of those applications which has some uniqueness in how it can authenticate, what is needed for its authentication request to succeed, and how it treats its needs to reauthenticate or delegate that authentication during an active authenticated session.
Let me take a quick moment to add a disclaimer: I am not a SQL specialist. I am someone who specializes in DS technology, including authentication, so I’ve had a good amount of experience talking with folks regarding these issues, usually conferenced in by a SQL specialist.
Typical Kerberos Troubleshooting Steps
-serviceprincipalname in the format of MSSQLSvc/<FQDN> <SQL_Service_Account>
-service account trusted for delegation (if delegating), and not “too sensitive to delegate”
-network trace as authentication is requested (look for what was requested and how)
As a recap, the serviceprincipalname is a multi string value of data which is stored with Active Directory on a principal. This principal, in SQL’s case, would be the one being used for the SQL service account. If a service is using the local system account (or network service) then that principal would be the computer object for that server in AD. This can be viewed and edited via ADSIEDIT.MSC or LDP.EXE.
A key point that is commonly overlooked is that SPNs are constructed on the fly-meaning on demand, and configuration issues may cause that to fail. What you may see in a failure of this kind is that the DNS portion of the MSSQLSvc/<FQDN> <SQL_Service_Account string is taken case of by the client side network on that computer. If there is a HOST entry that is wrong, or a record in DNS which is wrong, it will skew your SPN. A trace will catch this a good portion of the time.
In a similar manner, that principal, or others which will be used for the context of a SQL action, must be “trusted for delegation” (an option on the object in AD) and not “sensitive and cannot be delegated”. DSA.MSC is the best place to view and alter this.
For the directory service minded person, a network trace is a very definitive method of finding what happened behind the scenes as SQL attempted to authenticate. A trace taken as you reproduce the problem can give answers to the following questions:
-Was there any traffic on the wire as we had this problem?
-If Kerberos traffic is seen, what service ticket was requested, and was it given by the KDC?
-If there was a Kerberos error packet in the trace, what was it? It will certainly be obvious if there was one, and they are usually informative.
The article below goes over SQL auth in some more detail on this as it pertains to interoperability with IIS:
How to use Kerberos authentication in SQL Server
But now we get to the money portion. I’ve gone on at length (ad nauseam?) in this post about how some applications authenticate based on their own design needs.
Wouldn’t it be nice if there was a tool to simulate the creation and usage of the SPN, and the back and forth steps of how SQL interfaces with the SSPI layer?
Well, there is: SSPIClient.exe. This GUI tool emulates your SQL server and tests the various authentication request sand re-requests SQL needs to do. It provides a log in which you can examine what took place. This tool is very useful in spotting configuration issues, and some other issues, early on. The Read Me document is very informative-I recommend that you don’t use the tool without reading that document first.
I would have attached the tool to the site for download, unfortunately our blog management software is not that fast and loose with the hosting server space. Until it is posted to our external web site I would recommend contacting Microsoft Customer Services and Support and asking for it from the tool box :)
But I hope it helps!
Some additional, applicable and more technically adept SQL links:
I’ve glossed over some details, and actual scenarios, so if you have some questions, or want to throw your scenario up for discussion/review please do.
I’d promise another post real soon, but I don’t want to potentially lie to you all. But hopefully soon! Take care out there…