Unable to connect to a SQL Server named instance on a cluster
Published Mar 23 2019 04:14 AM
First published on MSDN on Feb 27, 2006

We have seen several cases where users failed to connect to a SQL Server named instance on a cluster. The error messages are usually as follows:

For SNAC:

C:>osql -E -S clusterinst

[SQL Native Client]SQL Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF].

[SQL Native Client]Login timeout expired

For MDAC:

C:>osql -E -S clusterinst

[DBNETLIB]Specified SQL server not found.

[DBNETLIB]ConnectionOpen (Connect()).

In all occurrences, customers had already tried the basic steps for solving similar issues, such as enabling the TCP protocol, enabling remote connections, and adding the SQL Server named instance to the firewall exception list. After tracking down the issue, we found that it is caused by a combination of the specifics of Windows clustering and the way we discover SQL Server named instances. When connecting to a SQL Server named instance, our client components rely on SQL Browser to discover the server and its parameters. The discovery process is:

The client sends a UDP packet to SQL Browser on the target machine. When the named instance is on a Windows cluster, the packet is sent to the cluster IP address (more specifically, the IP address of the virtual SQL Server). However, SQL Browser is not cluster-aware and listens on IP ANY. When SQL Browser receives the UDP request packet, it sends a UDP response packet back to the client. The destination IP address is the client's IP address, but the source IP address has changed: it is now the IP address of the NIC on the physical machine rather than the virtual SQL Server IP address. The source IP address of the response packet is chosen by the Windows OS based on the routing table. Because the virtual SQL Server IP address and the physical NIC's IP address are usually on the same subnet (and therefore share a route), the physical IP address is preferred. Depending on the security settings on the client and server machines, this response packet may be dropped because the peer IP address has changed. We have seen the response packet dropped by firewalls and/or IPSec.
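The request and response in this exchange are small enough to sketch in a few lines. The following Python is a minimal illustration, assuming the message layout described in the published SQL Server Resolution Protocol specification (a 0x04 single-instance request and a 0x05 response carrying a semicolon-delimited string) and assuming ASCII names. The function names are hypothetical and are not part of any Microsoft client library.

```python
import struct

CLNT_UCAST_INST = 0x04  # request type: ask about one named instance
SVR_RESP = 0x05         # response type sent back by SQL Browser


def build_instance_request(instance_name: str) -> bytes:
    """Build the discovery request that a client sends to UDP port 1434."""
    # Assumption: the instance name is plain ASCII.
    return bytes([CLNT_UCAST_INST]) + instance_name.encode("ascii") + b"\x00"


def parse_browser_response(payload: bytes) -> dict:
    """Turn a SQL Browser reply into a {key: value} dict for one instance."""
    if len(payload) < 3 or payload[0] != SVR_RESP:
        raise ValueError("not a SQL Browser response")
    # Bytes 1-2 hold the length of the text that follows (little-endian).
    (length,) = struct.unpack_from("<H", payload, 1)
    text = payload[3:3 + length].decode("ascii", errors="replace")
    fields = text.split(";")
    # The text ends with ";;", which leaves empty trailing fields.
    while fields and fields[-1] == "":
        fields.pop()
    # Remaining fields alternate key, value, key, value, ...
    return dict(zip(fields[0::2], fields[1::2]))
```

For a reply whose text is `ServerName;VSQL1;InstanceName;INST1;IsClustered;Yes;Version;9.00.1399.06;tcp;50000;;` (made-up sample values), `parse_browser_response(...)["tcp"]` gives the port the client would then connect to. Note that nothing in the payload depends on which source IP address the reply was sent from; the drop described above happens below this layer.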

Windows Firewall does not drop the packet. However, a third-party firewall may drop the packet. In addition, IPSec may also drop the packet if an IPSec policy is enabled on the client and it cannot establish a trusted connection between the client and server. (Important update: If your client is a Vista machine, you will see this issue. A workaround is to specify the TCP port or pipe name directly in your connection string.)
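One way to check whether a particular client is affected is to send the discovery request yourself and look at where the reply comes from. The sketch below is a diagnostic illustration only, not how our client components are implemented: `probe_browser` and `reply_matches_target` are hypothetical names, the request bytes assume the 0x04 single-instance request of the SQL Server Resolution Protocol, and the instance name is assumed to be ASCII.

```python
import ipaddress
import socket


def reply_matches_target(target_ip: str, reply_source_ip: str) -> bool:
    """True if the reply came from the same address the request was sent to."""
    return ipaddress.ip_address(target_ip) == ipaddress.ip_address(reply_source_ip)


def probe_browser(target_ip: str, instance: str, timeout: float = 2.0):
    """Send one discovery request; return the reply's source IP, or None."""
    request = b"\x04" + instance.encode("ascii") + b"\x00"
    # The socket is deliberately left unconnected so that the OS delivers
    # replies from any source address, including the node's physical IP.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (target_ip, 1434))
        try:
            _payload, (source_ip, _port) = sock.recvfrom(65535)
        except socket.timeout:
            # No reply arrived: SQL Browser is unreachable, or the reply
            # was dropped before it got to this socket.
            return None
    return source_ip
```

If `probe_browser` returns an address for which `reply_matches_target` is false, the reply left the node from its physical IP address as described above. If it returns `None` from a client with a third-party firewall or IPSec policy but succeeds from an unrestricted client, a dropped reply is a likely cause, though a timeout alone does not prove it.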

We decided not to fix this minor issue because it is inherent in the nature of the UDP protocol. A UDP socket can respond to multiple senders, and the socket layer never knows which one it is actually replying to. We could consider letting SQL Browser listen on individual IP addresses, but the cost would be high. A workaround is to specify the TCP port number in the connection string, in which case we bypass the discovery process.
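As an illustration of the workaround, the sketch below builds a server value in the `tcp:host,port` form, which names the port explicitly so that no SQL Browser lookup is needed. The helper names and the port number 50000 are assumptions made for the example; use the port your instance actually listens on, and verify that your client version accepts the `tcp:` prefix.

```python
def server_with_port(host: str, port: int) -> str:
    """Return a server value that names the TCP port explicitly."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return f"tcp:{host},{port}"


def osql_command(host: str, port: int) -> list:
    """Build an osql command line that uses a trusted connection (-E)."""
    return ["osql", "-E", "-S", server_with_port(host, port)]


# Hypothetical port; this is equivalent to typing:
#   osql -E -S tcp:clusterinst,50000
print(" ".join(osql_command("clusterinst", 50000)))
```

The trade-off is that the connection string now depends on the port number, so this works best when the instance is configured with a static port rather than a dynamic one.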

Please refer to the following links for additional information. The articles discuss the issue for SQL Server 2000, but it also applies to SQL Server 2005 because the fundamentals did not change.

http://support.microsoft.com/?kbid=888228

http://support.microsoft.com/default.aspx?scid=kb;[LN];318432


*********Important update 2: regarding SQL Server 2008. ***************


We made a fix for this issue in SQL Server 2008. Unfortunately, the fix is only partial: we identified another issue that invalidates the fix on x64 machines. Other than that, the issue is fixed if the server is SQL Server 2008 running on Vista/Windows Server 2008 on x86/IA64. Nothing needs to be done on the client side for these scenarios. Note: the issue still applies to all versions of SQL Server 2005.



Update3: (Mar/2009)


If you upgrade your OS to Vista SP2 or Windows Server 2008 SP2 and your SQL Server 2008 is at SP1, the remaining issue on x64 is fixed. Meanwhile, we identified another related issue that affects the ability to enumerate SQL instances on Vista/Windows Server 2008 over the network. The fix for that is also in SQL Server 2008 SP1.



Xinwei Hong, SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

