SSIS package fails with 'Communication link failure'
Published Oct 20 2020 01:28 AM 27.3K Views
Microsoft

 

“Communication Link Failure” is often due to outdated drivers or poor network configuration settings.

 

Please note OLEDB driver it is not maintained anymore and it is not recommended to use this driver for new development. If however, your business require the use of OLEDB, you should download and install the recent OLEDB provider and make sure you change the provider name in the connection string. Please read more about this here.

 

It is highly recommended to use ADO.NET whenever is possible as it provides more reliability.

 

In this scenario, SSIS packages were used to load data from SQL on-premises to Azure SQL DB. After a while, the bulk insert failed with below errors:

 

Error: 2020-08-25 10:44:08.16
   Code: 0xC0202009
   Source: Extract Location Hierarchy Data to Datamart OLE DB Command [156]
   Description: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available.  Source: "Microsoft OLE DB Driver for SQL Server"  Hresult: 0x80004005  Description: "Communication link failure".
An OLE DB record is available.  Source: "Microsoft OLE DB Driver for SQL Server"  Hresult: 0x80004005  Description: "TCP Provider: An existing connection was forcibly closed by the remote host.
Error: 2020-08-25 10:44:08.23
   Code: 0xC0047022
   Source: Extract Location Hierarchy Data to Datamart SSIS.Pipeline
   Description: SSIS Error Code DTS_E_PROCESSINPUTFAILED.  The ProcessInput method on component "OLE DB Command" (156) failed with error code 0xC0202009 while processing input "OLE DB Command Input" (161). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.  There may be error messages posted before this with more information about the failure.
End Error"

 

Note that the same error message might appear also when reconfiguration happen on the SQL DB and SSIS is not aware of retry logic.

 

There are several options to be considered in order to address the above error on client side.

 

1 - Disable TLS_DHE ciphers on client machine

 

Unless you have a specific requirement for a legacy application, disable all the TLS_DHE ciphers, as they are outdated and not recommended to be used.  

 

Option #1 - Follow the steps described in the documentation below:

Applications experience forcibly closed TLS connection errors when connecting SQL Servers in Windows

 

Option #2 - Alternatively, you can download IIS Crypto Tool from here and configure the best practices on your client and server. 

 

Steps to configure best practices:

 

  1. Open IIS Crypto
  2. Click on Best Practices
  3. Go to Cipher Suits > Click on Best Practices Again
  4. Click Apply
  5. Restart the server (so that the changes are applied in the registries)

MelaniaNitu_1-1602280192941.jpeg

 

 

2 - Configure client machine 

 

Step #1 Ensure NIC Drivers are up to date

Step #2 Disable TCP chimney at NIC Level

 

TCP Chimney Offload is a feature that provides capability to offload TCP/IP packet processing from the processor to the network adapters and some other balancing options.  This has been known to cause issues on the SQL Server side causing network errors.  Further information on this can be found in the following blog post as well.

 

TCP Chimney Offload – Possible Performance and Concurrency Impacts to SQL Server Workloads 

 

 Make sure that under Advanced Settings of NIC below options are DISABLED:

  • IPv4 Checksum Offload
  • IPv4 Large Send Offload
  • Checksum Offload
  • Large Send Offload

Steps to disable TCP chimney features at NIC level:

 

  1. Click Start -> Run -> type ncpa.cpl -> click OK
  2. Right-click a network adapter object -> click Properties
  3. Click Configure -> Advanced tab
  4. In the Property list: click receive side scaling -> Disable in the Value list -> OK
  5. In the Property list: click large send offload ->Disable in the Value list-> OK
  6. In the Property list: click ipv4 checksum offload -> Disable in the Value list -> OK
  7. In the Property list: click TCP checksum offload (IPv4) -> Disable in the Value list -> OK

 

The other properties to look for and disable are:

  • Offload Receive IP Checksum
  • Offload Receive TCP Checksum
  • Offload TCP Segmentation
  • Offload Transmit IP Checksum
  • Offload Transmit TCP Checksum

 

Step #3 Configure TCP Chimney Offload, RSS and NetDMA in the operating system.

 

Open an elevated cmd and type below commands.

 

To determine the current status of TCP Chimney Offload and RSS:

netsh int tcp show global

  

To disable TCP Chimney Offload:  

netsh int tcp set global chimney=disabled 

 

 To disable RSS:

netsh int tcp set global rss=disabled

  

To disable NetDMA: 

netsh int tcp set global netdma=disabled 

 

Restart the machine.

 

 3 - Tune SSIS Package

 

Another recommendation to alleviate the connection issue is to actually tune your SSIS package to perform more efficiently. There’s lots of articles around this aspect, but the top settings are DefaultBufferMaxRows, DefaultBufferSize, and Rows per batch settings. However, this needs to be performed over many trials, while capturing logging data and adjusted accordingly. We usually see General Network Errors associated with poor performing packages, that degrade over time because of the data/transactions involved. 

 

This blog gives a quick rundown about tuning your SSIS package for a better performance.

 

Here’s our documentation on top 10 SSIS Best Practices which will also go over different settings you can test.

 

5 Comments
Brass Contributor

We ran into infrequent connections errors from/to our PaaS databases with SSIS. There is not much information to be found on the internet, but after switching to the newest MSOLEDBSQL drivers, we've not seen the connection errors anymore.

 

The connection error due to scaling the Azure SQL DB was of course still there. Any idea how to work around that? The SSIS packages are still iaas, thus via the job agent we simply wait 1 minute, then start again. But with long running packages this is a time loss.

Microsoft

@Johannes_Vink  so after updating the driver the connection errors are gone, but then you’ve mentioned other errors are present as a result of a database scaling operation? What is the error message and what is the link with the SSIS running packages? I would recommend opening a support incident for further investigation.

Brass Contributor

@MelaniaNitu The transient network errors are gone, very content with that. Never found out where that came from, part of the deal when using cloud resources I guess (or something internal on our side, who knows).

 

However when scaling a database, at a certain point the open connections are killed and SSIS cannot recover from that...

 

... but while writing this I realize that I should test this also with the new drivers. I am not sure I saw that error after we started using the new drivers. And since we have batch windows, we often scale before and after the SSIS executions, so it does not happen often that a package is running during a scale operation.

Microsoft

@Johannes_Vink changing the service tier i.e. scaling a database creates a replica of the original database at the desired performance level and then switches connections over to the replica. No data is lost during this process but during the brief moment when we switch over to the replica, connections to the database are disabled, so some transactions in flight may be rolled back. The length of time for the switch-over varies, but is less than 30 seconds 99% of the time. If there are large numbers of transactions in flight at the moment connections are disabled, the length of time for the switch-over may be longer. Please check the following article on Best practice recommendation for SSIS package execution retries.

 
Brass Contributor

@MelaniaNitu thanks for that. We are following the retry approach via the job agent. I was hoping that another setting within the SSIS package or driver would solve it, but it is what it is :)

Version history
Last update:
‎Oct 20 2020 08:52 AM
Updated by: