I think we are due for a reminder on best practices related to Windows features collectively known as “Microsoft Scalable Networking Pack (SNP)”, as it seems difficult to counter some of the “tribal knowledge” on the subject. Please also see our previous post on the subject.
Recently a customer called in on the subject, and this particular case underscored the importance of the Scalable Networking Pack features. The customer was running Exchange 2010 SP2 RU6 on Windows 2008 R2 SP1 and had multiple physical sites with a stretched DAG. This customer had followed our guidance from Windows 2003 times and disabled all of the relevant options on all of their servers, similar to below:
Receive-Side Scaling State : disabled
Chimney Offload State : disabled
NetDMA State : disabled
Direct Cache Acess (DCA) : disabled
Receive Window Auto-Tuning Level : disabled
Add-On Congestion Control Provider : ctcp
ECN Capability : disabled
RFC 1323 Timestamps : disabled
The current problem was that the customer was trying to add copies of all their databases to a new physical site, so that they could retire an old disaster recovery site. The majority of these databases were around 1 to 1.5TB in size, with some ranging up to 3TB. The customer stated that the databases took five days to reseed, which was unacceptable to him, especially since he had to decommission the old site in two weeks. After digging into the case a little more and referencing this article, we started by looking at the network drivers. With any latency or transport issue over a WAN or LAN, we should always make sure that the network drivers are up to date. Since the majority of the servers in this customer’s environment were virtual servers running the latest version of the virtualization software, we switched our focus to the physical machines. When we looked at them, we saw they had a network driver with a publishing date of December 17, 2009.
At this point I recommended updating the network driver to a version with a driver date of 2012 or newer. We then tested again and still saw transfer speeds roughly similar to those before the driver update. At that point I asked the customer to change the Scalable Networking Pack items from above to:
Receive-Side Scaling State : enabled
Chimney Offload State : automatic
NetDMA State : enabled
(Here is how you change these items.)
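As a rough illustration of the change described above, the following Python sketch reads settings in the "Name : value" layout shown earlier and prints the netsh commands that would bring RSS, TCP Chimney, and NetDMA to the values recommended in this post. The function names and the RECOMMENDED table are hypothetical helpers made up for this example; the commands themselves use the "netsh int tcp set global" syntax available on Windows 2008 R2 and need to be run from an elevated prompt. The sketch only prints the commands and does not execute anything.

```python
# Recommended values taken from this post. The second element of each pair is
# the keyword/value used with "netsh int tcp set global" on Windows 2008 R2.
# The table and helper names below are illustrative, not part of any product.
RECOMMENDED = {
    "Receive-Side Scaling State": ("rss", "enabled"),
    "Chimney Offload State": ("chimney", "automatic"),
    "NetDMA State": ("netdma", "enabled"),
}


def parse_global_settings(text):
    """Parse 'Name : value' lines, as printed by 'netsh int tcp show global'."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Skip headers, separators, and blank lines that carry no value.
        if sep and value.strip():
            settings[name.strip()] = value.strip()
    return settings


def commands_to_fix(settings):
    """Return the netsh commands needed to reach the recommended values."""
    commands = []
    for name, (keyword, wanted) in RECOMMENDED.items():
        if settings.get(name, "").lower() != wanted:
            commands.append(f"netsh int tcp set global {keyword}={wanted}")
    return commands


if __name__ == "__main__":
    # Sample input mirroring the customer's original (disabled) configuration.
    sample = """\
Receive-Side Scaling State : disabled
Chimney Offload State : disabled
NetDMA State : disabled
"""
    for command in commands_to_fix(parse_global_settings(sample)):
        print(command)
```

Printing the commands rather than running them keeps the sketch safe to try anywhere; as in the case described here, update the network drivers first and plan for a reboot after changing the settings.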
The customer changed the SNP features and then rebooted the machines in question. He then started to reseed a 2.2TB database across their WAN at around 12 p.m. The customer sent me an email later that night stating that the database would now take around 12 hours to reseed. The next morning he sent me another email: the logs had been copied over before he showed up for work at 7 a.m. This time the reseed took 19 hours to complete, compared to 100+ hours with the SNP features disabled. The customer stated that he was very happy and started planning how to upgrade network drivers on all other physical machines in his environment. Once that was done, he was going to change RSS, TCP Chimney, and NetDMA to the recommended values on all of his other Windows 2008 R2 SP1 machines.
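To put those reseed times in perspective, here is a small back-of-the-envelope calculation in Python. It assumes 1TB means 10^12 bytes and treats "100+ hours" as exactly 100 hours for the same 2.2TB database, so the "before" figure is an upper bound; the function name is made up for this example.

```python
def average_mbit_per_s(size_tb, hours):
    """Average throughput in Mbit/s for copying size_tb terabytes in hours."""
    bits = size_tb * 1e12 * 8   # decimal terabytes -> bits (assumption)
    seconds = hours * 3600
    return bits / seconds / 1e6


if __name__ == "__main__":
    before = average_mbit_per_s(2.2, 100)  # SNP features disabled (100+ hours)
    after = average_mbit_per_s(2.2, 19)    # SNP features enabled
    print(f"before: {before:.1f} Mbit/s")
    print(f"after:  {after:.1f} Mbit/s")
    print(f"speedup: {after / before:.1f}x")
```

Under these assumptions the average rate goes from at most about 49 Mbit/s to roughly 257 Mbit/s, a bit more than a fivefold improvement.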
The following two articles show the current recommendations for the Scalable Networking Pack features:
- Here is the document referenced above, which shows the correct settings for each version
- Even though this article specifies SQL, it is still relevant to the operating system that Exchange runs on.
So, what exactly is our point?
Friends don’t let friends run their modern OS-servers with old network drivers and SNP features turned off! As mentioned in our previous blog post on the subject, please make sure that you update network-level drivers first, as many vendors made various fixes in their driver stacks to make sure that SNP features function correctly. The above is just one illustration of issues that incorrect settings in this area can bring to your environment.
David Dockter
You Had Me at EHLO.