Client application using .NET Framework 4.6.1 fails to connect to SQL Server and the error we get is something like following.
'Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=11977; handshake=5138;’
One more behavior to observer is, the connection works well with lower versions of .NET Framework.
What is the issue?
In short, the issue is because of a new property introduced in .NET Framework 4.6.1. A parameter in the connection string called ‘
’. This parameter is by default is set to
A little background on why we introduced this parameter,
There is a known design limitation with the way the SQL Server connections work on a SQL Server Availability Groups/ AlwaysOn inside a MultiSubnet Environment. Whenever the Availability Group Listener resolves to 2 different IP addresses of different Subnets based on the DNS lookup resolution the connection will go to the first IP for first attempt. If that is not the primary, then the connection will timeout.
By default, the behavior of the SQL client libraries is to try all IP addresses returned by the DNS lookup – one after another (serially) until the all of the IP addresses have been exhausted and either a connection is made, or a connection timeout threshold has been reached. This can become an issue when the first IP address returned is not the primary. Connection may timeout before it can attempt to connect to the next IP address.
To overcome this behavior, we had introduced a connection property called MultiSubnetFailover which we had to manually set it to true in such scenarios.
In .NET Framework 4.6.1, a new parameter “TransparentNetworkIPResolution” is introduced which would eliminate the need of this MultiSubnetFailover property in connection string. With this property being set, an initial connection attempt to the first-returned IP address is still made, but that attempt is timed-out after only 500ms, and then connection attempts to all the IP addresses are attempted in parallel. By default, this property is set to true.
In this scenario, where the connection is failing with .NET Framework 4.6.1, the DNS lookup is taking longer than 500ms. As this property is by default set in the connection string (Doesn’t check if this is Always On or not) it was timing out at 500ms. This made the connection timeout irrespective of it is Always On or not.
Tracert result will be something like following:
Tracing route to test.xyz.com [126.96.36.199]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 188.8.131.52
2 1 ms <1 ms <1 ms test1.xyz.com [184.108.40.206]
3 555 ms 552 ms 576 ms test2.xyz.com [220.127.116.11]
4 604 ms 595 ms 559 ms test3.xyz.com [18.104.22.168]
5 560 ms 559 ms 554 ms 22.214.171.124
6 553 ms 553 ms 559 ms 126.96.36.199
7 765 ms 783 ms 780 ms 188.8.131.52
8 776 ms 813 ms 842 ms 184.108.40.206
9 775 ms 854 ms 783 ms 220.127.116.11
10 832 ms 777 ms 772 ms 18.104.22.168
11 791 ms 819 ms 801 ms 22.214.171.124
12 773 ms 807 ms 771 ms test.xyz.com [126.96.36.199]
One of the sample network communication for this scenario:
This issue can be resolved by modifying the connection string to set the parameter ‘TransparentNetworkIPResolution’ to false.