Incorrect DNS can lead to various network connectivity issues. In this post, I explain how it affects your connection to SQL Server. In some cases, you may see the well-known "Cannot Generate SSPI Context" error message.
Suppose your SQL Server is located on a machine called HostA. Its IP address is A.B.C.D and its FQDN is HostA.mydomain.com. However, your DNS is poisoned and wrong results are returned to your DNS query. Then, you may have problems to connect to your server. There are two forms of DNS poison.
HostA --> A1.B1.C1.D1, i.e, clients get a wrong IP address for HostA or HostA.mydomain.com
In this case, a client may fail to find the server when connecting to the server using the hostname and/or FQDN. If it happens that another server is listening on A1.B1.C1.D1 on another machine, you may see more confusing error messages, usually, logon will fail. This situation is not very difficult to identify and the reason is obvious.
HostA --> A.B.c.D, i.e. clients can get correct IP address for HostA or HostA.mydomain.com
A.B.C.D --> HostB.mydomain.com, i.e. The IP address is poisoned.
In this case, client can locate the server correctly and make a connection to the server. However, under the hood, SQL Server client stack will do a reverse lookup and build up a SPN based on the result. In this case, the client get something like this:
If such a SPN happens exist on the network (more specifically Active Directory), then the client will try to connect to the server using Kerberos. However, the service account of the target SQL Server is not the correct container of the SPN and Kerberos will fail. Clients will see the "Cannot Generate SSPI Context" error message. Imagining the SQL Server on the HostB is turned on and off periodically, the client will see occasional logon failure on the SSPI error message. Note that all these only happen when TCP is used to connect to the server.
For Case 2, if the client is on the same machine with the server (i.e. local connection), it should be OK unless the poisoned DNS coming from Host file. The reason is that, for local connection, NTLM is usually used. There is one special situation you may see a different logon failure. If the server is a SQL cluster and the cluster IP address is poisoned, the connection from the local machine will fail with the following error message:
Login failed for user ''. Reason: Not associated with a trusted SQL Server connection.
One source for DNS poison is that user put an entry in hosts file (c:WINDOWSsystem32driversetchosts) on the client machine and forgot to remove it when they do not need it anymore. Over the time, this entry may not match rhe dynamic record on DNS server and user will see connectivity issue.
Another good source I can think of: Nowadays, more people use laptop on corp network. Most people don't shutdown their laptop when they take the laptop off the network. This will leave the DNS entry on the DNS server for the specific IP address the laptop was using. Sometime later, another machine jumps in and obtained that IP address and ofcourse will register its own DNS entry for the IP. Then that IP will have two entries on the network. Actually, there might be more records (e.g. SPN) left on the networks if the machine did not properly logoff. All this can cause connectivity issues.
Here are the suggestions on trouble-shooting SQL connectivity issue related to DNS. (Actually, I suggest you always do the following check up the ensure the issue is not because of poisoned DNS)
Under command windows, run the following commands:
1) ping -a HostA (and HostA.mydomain.com)
2) ping -a A.B.C.D (If this only returns HostA, but not FQDN, usually means no DNS entry was found, and HostA is from Netbios)
You need input FQDN and IP address at least twice in nslookup as you may see different results if there are more than one entry for the IP or FQDN.
If you find any mismatch between server's IP and FQDN, that could be the cause of your connectivity issue.
Please read more posts on this blog for related info: