Microsoft Secure Tech Accelerator
Apr 03 2024, 07:00 AM - 11:00 AM (PDT)
Microsoft Tech Community
Introduction to Network Trace Analysis 3: TCP Performance
Published Feb 08 2023 06:17 AM 25.6K Views
Microsoft

# Introduction to Network Trace Analysis 3: TCP Performance

Introduction to Network Trace Analysis 3: TCP Performance

Hello everyone, we are back with TCP performance. If you are reading this post I am going to assume you have read the previous post Introduction to Network Trace Analysis 2: Jumping into TCP Connectivity.

There are a ton of caveats and “yes but…”s when these things are considered in the context of virtualization and offloading. I’ll save that can of worms for another post.

So let’s get going.

What affects TCP performance

A lot of variables affect TCP performance but not all of them are easily configurable. Let’s start with our absolute constraint.  The speed of light.

Unfortunately, we need to abide by the pesky laws of physics. Let’s say I am communicating with a server on the exact opposite site of the earth 6,357km away.

Assuming that I was able to send my packet directly to that location at the speed of light it would take 21.2ms giving us a round trip time of 42ms.

`299792458 m/s/1000 299792.458 km/s 6357 / 299792.458 0.0212046695317465 seconds * 1000 21.2046695317465 ms `

These numbers are calculated on the back of a napkin using physically perfect conditions. But this will never be the case. In the real world we will need to deal with:

• Routers
• Switches

When we are working with real world constraints geographic distance is a non-insignificant variable.

From an OS perspective we have a few questions to ask yourself when considering TCP performance.

• How much data can I put onto the wire at a time?
• How much data can I have unacknowledged at a time?
• Data in the receive buffer that I haven’t ACK’d and data still in transit
• How quickly am I pulling data out of the TCP receive buffer?
• How often do I need to retransmit data?

What can we do to adjust TCP performance?

In general, leave the settings as default. While you may be able to see performance improvements by making adjustments to the OS network configuration you could just as easily make things worse.

We highly recommend you Stick with Well-Known and Proven Solutions.

If you are going to proceed down the dark treacherous road of adjusting settings to improve TCP performance through configuration here are a few recommendations:

• If you are trying to improve TCP performance, focus on TCP. Upper layer protocols may obfuscate your testing
• Remove as many variables as possible with:
• Consistent network paths
• Consistent machines
• Confirm behavior with single stream vs multistream
• 1 TCP Connection vs 2+ TCP connections to transfer the payload

But keep in mind that when you go back to real workloads you will be subject to these variables again and your performance will never meet the absolute maximum throughput possible.

Understanding TCP performance through network traces

Now that we have gone through what affects TCP performance let’s start looking at this in some network traces.

Let’s begin with some important TCP options found within the TCP 3-way handshake.

`Frame 1: 66 bytes on wire (528 bits), 66 bytes captured (528 bits) on interface \Device\NPF_{EB2F0E58-D6B7-4B7F-A292-376EA49BDD03}, id 0 Ethernet II, Src: Microsof_08:04:00 (00:15:5d:08:04:00), Dst: Microsof_08:04:01 (00:15:5d:08:04:01) Internet Protocol Version 4, Src: 172.27.136.253, Dst: 172.27.135.44 Transmission Control Protocol, Src Port: 50006, Dst Port: 5001, Seq: 0, Len: 0     Source Port: 50006     Destination Port: 5001     [Stream index: 0]     [Conversation completeness: Incomplete, DATA (15)]     [TCP Segment Len: 0]     Sequence Number: 0    (relative sequence number)     Sequence Number (raw): 1906168047     [Next Sequence Number: 1    (relative sequence number)]     Acknowledgment Number: 0     Acknowledgment number (raw): 0     1000 .... = Header Length: 32 bytes (8)     Flags: 0x0c2 (SYN, ECE, CWR)     Window: 64240     [Calculated window size: 64240]     Checksum: 0x6887 [unverified]     [Checksum Status: Unverified]     Urgent Pointer: 0     Options: (12 bytes), Maximum segment size, No-Operation (NOP), Window scale, No-Operation (NOP), No-Operation (NOP), SACK permitted         TCP Option - Maximum segment size: 1460 bytes         TCP Option - No-Operation (NOP)         TCP Option - Window scale: 8 (multiply by 256)         TCP Option - No-Operation (NOP)         TCP Option - No-Operation (NOP)         TCP Option - SACK permitted     [Timestamps] `

From these headers we can see two important options, the first being the maximum segment size (MSS). The MSS is an indication of the largest TCP segment size that can be transmitted without being fragmented.

Side note: People often conflate the MSS with maximum transmission unit (MTU). They are related in that the MSS = MTU - (IP Header + Ethernet Header).

The larger this number, the more data that is sent per packet. You can think about this as how much data you can fit in an envelope before you need a new one. This is configurable through the OS, however; it is still limited by the network path. You cannot send a packet larger than the smallest MTU in the network path. The packet will be fragmented, aka the post office will move the letter into two envelopes.

Another option we can see is window scaling. The “window” is in reference to the TCP receive window. The option you see listed below:

`        TCP Option - Window scale: 8 (multiply by 256) `

This states that we are using a window scale of 8 and that we should multiply by 256? The idea is that we multiply our window size (1026 bytes by default) by 2^s where s is the scale factor. Given that within Wireshark we can see the window scale factor is set to 8, we know that our window size multiplier is 2^8 or 256.

The TCP receive window is a sliding window to determine the maximum amount of unacknowledged data for a given connection. In contract to the TCP receive buffer, the receive window includes data that is still in flight. As data is sent the TCP receive window will shrink. As data is acknowledged, the TCP receive window will grow. Pretty straight forward.

The last option we will discuss is selective acknowledgement (SACK).

`    TCP Option - SACK permitted `

SACK is an option that helps reduce some of the overhead of TCP communications by allowing blocks of TCP data to be sent and acknowledged as the group. If a packet is dropped only the missing data is requested for retransmission instead of the entire block. This option has been enabled by default on Windows 7 and newer OSs since late 2012.

Let’s take a look at TCP data flow in action.

`// TCP 3-way handshake 1   19:47:39.510548 172.27.136.253  172.27.135.44   TCP 66  50006 → 5001 [SYN, ECE, CWR] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM 2   19:47:39.511748 172.27.135.44   172.27.136.253  TCP 66  5001 → 50006 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=8960 WS=256 SACK_PERM 3   19:47:39.511807 172.27.136.253  172.27.135.44   TCP 54  50006 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=0 // Sending data 15  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=1460 16  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=1461 Ack=1 Win=262656 Len=1460 17  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=2921 Ack=1 Win=262656 Len=1460 18  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=4381 Ack=1 Win=262656 Len=1460 19  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=5841 Ack=1 Win=262656 Len=1460 20  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=7301 Ack=1 Win=262656 Len=1460 21  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=8761 Ack=1 Win=262656 Len=1460 22  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=10221 Ack=1 Win=262656 Len=1460 23  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=11681 Ack=1 Win=262656 Len=1460 24  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=13141 Ack=1 Win=262656 Len=1460 // Acknowledging sent data 25  19:47:40.034547 172.27.135.44   172.27.136.253  TCP 54  5001 → 50006 [ACK] Seq=1 Ack=14601 Win=2097920 Len=0 `

Data is sent and acknowledged but let’s see how the sequence numbers and acknowledgement numbers change.

`15  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=1460Frame 15: 1514 bytes on wire (12112 bits), 1514 bytes captured (12112 bits) on interface \Device\NPF_{EB2F0E58-D6B7-4B7F-A292-376EA49BDD03}, id 0 Ethernet II, Src: Microsof_08:04:00 (00:15:5d:08:04:00), Dst: Microsof_08:04:01 (00:15:5d:08:04:01) Internet Protocol Version 4, Src: 172.27.136.253, Dst: 172.27.135.44 Transmission Control Protocol, Src Port: 50006, Dst Port: 5001, Seq: 1, Ack: 1, Len: 1460     Source Port: 50006     Destination Port: 5001     [Stream index: 0]     [Conversation completeness: Incomplete, DATA (15)]     [TCP Segment Len: 1460]     Sequence Number: 1    (relative sequence number)     Sequence Number (raw): 1906168048     [Next Sequence Number: 1461    (relative sequence number)]     Acknowledgment Number: 1    (relative ack number)     Acknowledgment number (raw): 670872652     0101 .... = Header Length: 20 bytes (5)     Flags: 0x010 (ACK)     Window: 1026     [Calculated window size: 262656]     [Window size scaling factor: 256]     Checksum: 0x6e2f [unverified]     [Checksum Status: Unverified]     Urgent Pointer: 0     [Timestamps]     [SEQ/ACK analysis]     TCP payload (1460 bytes) Data (1460 bytes) `

Wireshark will provide a relative sequence number; it uses after the 3-way handshake as sequence number 1 and increments the data from there. For the sake of being concise I will be referring to this relative sequence number as the sequence number.

We start with our sequence number of 1 and send 1460 bytes of TCP payload. This the starting point of the next TCP data packet:

`16  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=1461 Ack=1 Win=262656 Len=1460 `

And this process of sending data, filling the TCP receive window, repeats over and over until the server finally acknowledges the data that was sent:

`24  19:47:40.032755 172.27.136.253  172.27.135.44   TCP 1514    50006 → 5001 [ACK] Seq=13141 Ack=1 Win=262656 Len=1460 25  19:47:40.034547 172.27.135.44   172.27.136.253  TCP 54  5001 → 50006 [ACK] Seq=1 Ack=14601 Win=2097920 Len=0 `

With the ACK that was sent out, the server is acknowledging it has received those data packets, pulling them out of the TCP receive buffer, increasing the size of the TCP receive window.

Now that is all well and good, but let’s take a look at this process when the wheels start coming off.

`// TCP 3-way handshake 1   18:00:06.375549 172.23.151.160  172.23.157.78   TCP 66  51039 → 5001 [SYN, ECE, CWR] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM 3   18:00:07.380461 172.23.157.78   172.23.151.160  TCP 66  5001 → 51039 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=8960 WS=256 SACK_PERM 4   18:00:07.380581 172.23.151.160  172.23.157.78   TCP 54  51039 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=0 // Sending data 15  17:10:13.525670 172.23.151.160  172.23.157.78   TCP 1514    51001 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=1460 4   18:00:07.380581 172.23.151.160  172.23.157.78   TCP 54  51039 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=0 14  18:00:07.386768 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=1 Ack=1 Win=262656 Len=1460 15  18:00:07.386768 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=1461 Ack=1 Win=262656 Len=1460 16  18:00:07.386768 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=2921 Ack=1 Win=262656 Len=1460 17  18:00:07.386768 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=4381 Ack=1 Win=262656 Len=1460 18  18:00:07.388096 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=2921 Win=2097920 Len=0 19  18:00:07.388096 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=5841 Win=2097920 Len=0 20  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=5841 Ack=1 Win=262656 Len=1460 21  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=7301 Ack=1 Win=262656 Len=1460 22  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=8761 Ack=1 Win=262656 Len=1460 23  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=10221 Ack=1 Win=262656 Len=1460 24  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=11681 Ack=1 Win=262656 Len=1460 25  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=13141 Ack=1 Win=262656 Len=1460 26  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=14601 Ack=1 Win=262656 Len=1460 27  18:00:07.388122 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=16061 Ack=1 Win=262656 Len=1460 32  18:00:07.389123 172.23.151.160  172.23.157.78   TCP 1514    51039 → 5001 [ACK] Seq=17521 Ack=1 Win=262656 Len=1460 // Acknowledging data up to seq 8761 28  18:00:07.389418 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=8761 Win=2097920 Len=0 // Acknowledging data up to seq 11681 29  18:00:07.389418 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=11681 Win=2097920 Len=0 // Acknowledging data up to seq 14601 30  18:00:07.389418 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=14601 Win=2097920 Len=0 // 3 ACKs in a row for the same SEQ # 31  18:00:07.389418 172.23.157.78   172.23.151.160  TCP 54  5001 → 51039 [ACK] Seq=1 Ack=17521 Win=2097920 Len=0 48  18:00:07.391073 172.23.157.78   172.23.151.160  TCP 66  [TCP Dup ACK 31#1] 5001 → 51039 [ACK] Seq=1 Ack=17521 Win=2097920 Len=0 SLE=18981 SRE=20441 49  18:00:07.391073 172.23.157.78   172.23.151.160  TCP 66  [TCP Dup ACK 31#2] 5001 → 51039 [ACK] Seq=1 Ack=17521 Win=2097920 Len=0 SLE=18981 SRE=21901 50  18:00:07.391073 172.23.157.78   172.23.151.160  TCP 66  [TCP Dup ACK 31#3] 5001 → 51039 [ACK] Seq=1 Ack=17521 Win=2097920 Len=0 SLE=18981 SRE=23361 // Fast transmission of packet with seq 17521 63  18:00:07.391117 172.23.151.160  172.23.157.78   TCP 1514    [TCP Fast Retransmission] 51039 → 5001 [ACK] Seq=17521 Ack=1 Win=262656 Len=1460 `

This data flow looks a little different than the previous one when it comes to data acknowledgement. Near the end of the data flow we can see the server sending out 3 ACKs in a row with the same ACK number 17521.

This pattern is known as the request for fast retransmission. The idea is because the TCP data is sent sequentially, if we find that the sequence isn’t complete, we can request the packet that would complete the sequence. And it signals this by sending 3 duplicate ACKs.

Now that you’ve got the basics, let’s look at some scenarios.

One of your colleagues mentioned that downloads from the web server are slow and they can’t figure out why.

With that little bit of information, you start asking questions:

• Q: When did this start happening?
• A: Dunno
• Q: Is this all files or just one file?
• A: All files
• Q: What protocol is the traffic using?
• A: It’s a webserver and it’s serving HTTP (which uses TCP as the transport)
• Q: How fast is it normally? (Great question)
• A: It normally takes a second or two to download a 476MB file
• Q: How slow is it now? (Another great question)
• A: Dunno. Slower.

To get a better sense of just how slow it is you put together a quick PowerShell script to test the performance.

``````for(\$i = 0; \$i -lt 5; \$i++) {
\$Result = Measure-Command { Invoke-WebRequest "http://172.28.18.74/BigFile.txt" -OutFile BigFile\$i.txt}
Write-Host "\$i Total s: \$(\$Result.TotalSeconds) Total ms: \$(\$Result.TotalMilliseconds)"
} ``````

And you see the following results:

• 0 Total s: 13.1842173 Total ms: 13184.2173
• 1 Total s: 13.1933978 Total ms: 13193.3978
• 2 Total s: 13.3275595 Total ms: 13327.5595
• 3 Total s: 13.3114786 Total ms: 13311.4786
• 4 Total s: 13.3064738 Total ms: 13306.4738

Compared to a second or two this is pretty bad… Let’s get a two sided network trace and see what we can see.

Starting with server side:

`1   15:41:16.157828 172.28.18.214   172.28.18.74    TCP 66  49713 → 80 [SYN, ECE, CWR] Seq=0 Win=62720 Len=0 MSS=8960 WS=256 SACK_PERM 2   15:41:16.157994 172.28.18.74    172.28.18.214   TCP 66  80 → 49713 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM 3   15:41:17.164414 172.28.18.74    172.28.18.214   TCP 66  [TCP Retransmission] 80 → 49713 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM 4   15:41:17.165840 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=1 Ack=1 Win=262656 Len=0 `

Already off to a bad start. The server sent out its TCP SYN ACK at 15:41:16.157994 and didn’t get a response for a little over a second so it retransmitted the data.

And on the client side:

`1   15:41:16.157828 172.28.18.214   172.28.18.74    TCP 66  49713 → 80 [SYN, ECE, CWR] Seq=0 Win=62720 Len=0 MSS=8960 WS=256 SACK_PERM 3   15:41:17.164414 172.28.18.74    172.28.18.214   TCP 66  80 → 49713 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM 4   15:41:17.165840 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=1 Ack=1 Win=262656 Len=0 `

We only see one TCP SYN ACK and it is at 15:41:17.1644. Meaning that the TCP SYN ACK really was dropped somewhere along the way.

Let’s see if things get better or worse as we progress in the trace. Again, on the server side:

`// Server happily sending data 33  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=30661 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 34  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=32121 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 35  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=33581 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 36  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=35041 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 37  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=36501 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 38  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=37961 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 39  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=39421 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] // Client acknowledging data 40  15:41:17.168352 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=168 Ack=20441 Win=262656 Len=0 // Sending more data 41  15:41:17.168364 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=40881 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 42  15:41:17.168364 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=42341 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 43  15:41:17.168364 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=43801 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 44  15:41:17.168364 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=45261 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] // Acknowledging more data 45  15:41:17.168532 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=168 Ack=23361 Win=262656 Len=0 46  15:41:17.168532 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=168 Ack=26281 Win=262656 Len=0 47  15:41:17.168532 172.28.18.214   172.28.18.74    TCP 54  49713 → 80 [ACK] Seq=168 Ack=29201 Win=262656 Len=0 // Sending more data 48  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=46721 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 49  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=48181 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 50  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=49641 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 51  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=51101 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 52  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=52561 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 53  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=54021 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 54  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=55481 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 55  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=56941 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 56  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=58401 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 57  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=59861 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 58  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=61321 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 59  15:41:17.168547 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=62781 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] `

Uh oh. We see some DUP ACKs

`60  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=33581 61  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#1] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=35041 62  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#2] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=36501 63  15:41:17.168742 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=64241 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 64  15:41:17.168747 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=65701 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 65  15:41:17.168753 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#3] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=37961 66  15:41:17.168757 172.28.18.74    172.28.18.214   TCP 1514    [TCP Fast Retransmission] 80 → 49713 [ACK] Seq=30661 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] `

Breaking down the packet capture above:

`60  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=33581 61  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#1] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=35041 62  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#2] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=36501 <snip> 65  15:41:17.168753 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#3] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=37961 `

The client is requesting a TCP fast retransmission for a packet with sequence number 30661. But that packet was already sent! We can clearly see that in frame 33:

`33  15:41:17.168116 172.28.18.74    172.28.18.214   TCP 1514    80 → 49713 [ACK] Seq=30661 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] <snip> 61  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#1] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=35041 62  15:41:17.168722 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#2] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=36501 <snip> 65  15:41:17.168753 172.28.18.214   172.28.18.74    TCP 66  [TCP Dup ACK 60#3] 49713 → 80 [ACK] Seq=168 Ack=30661 Win=262656 Len=0 SLE=32121 SRE=37961 66  15:41:17.168757 172.28.18.74    172.28.18.214   TCP 1514    [TCP Fast Retransmission] 80 → 49713 [ACK] Seq=30661 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] `

Again, we have another packet dropped. So what? Looking in Wireshark and filtering on the tcp.port == 49713 && tcp.analysis.retransmission we see that only 8041 out of 554433 packets were retransmitted this works out to 1.5%.

After mentioning to your colleague that you are seeing packet loss, they remember that they forgot to turn one of the routers back on after last night’s upgrades. This caused all traffic to go through a single router overwhelming it.

Not sure if that is the issue, you run your test again.

• 0 Total s: 2.0704767 Total ms: 2070.4767
• 1 Total s: 1.4600527 Total ms: 1460.0527
• 2 Total s: 1.2229278 Total ms: 1222.9278
• 3 Total s: 1.2822072 Total ms: 1282.2072
• 4 Total s: 1.3285967 Total ms: 1328.5967

You take a look in the packet capture and see no packet loss. This is a great example of how impactful such small amounts of packet loss can be. As a general rule, anything over 0.5% packet loss you will see severe performance degradation.

The long pipe

Your IT manager is working on migrating their on-premises deployment into Azure. The company is located on the California and the data center is located in Ireland.

Users are reporting that after the migration, downloading files from the webserver is much slower than before. The IT manager has put you in charge of determining why and doing something about it.

You start off with a handful of questions:

• Q: When did this start happening?
• A: When we migrated to Azure!
• Q: Is this all files or just one file?
• A: All files
• Q: What protocol is the traffic using?
• A: It’s a webserver and it’s serving HTTP (which uses TCP as the transport)
• Q: How fast is it normally?
• A: It normally takes a second or two to download a 476MB file
• Q: How slow is it now?

Using the sample PowerShell script as before you run your tests:

• 0 Total s: 150.5332241 Total ms: 150533.2241
• 1 Total s: 101.4933428 Total ms: 101493.3428
• 2 Total s: 139.1269624 Total ms: 139126.9624
• 3 Total s: 131.8244205 Total ms: 131824.4205
• 4 Total s: 114.3400698 Total ms: 114340.0698

Yeah, wow, that is much slower but why? Let’s look at our capture to find out.

Starting with the server side we can see the following:

`1   20:30:35.836154 172.25.0.71 172.25.0.182    TCP 66  50497 → 80 [SYN, ECE, CWR] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM 2   20:30:35.836306 172.25.0.182    172.25.0.71 TCP 66  80 → 50497 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM 3   20:30:36.005644 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=1 Ack=1 Win=262656 Len=0 4   20:30:36.005644 172.25.0.71 172.25.0.182    HTTP    221 GET /BigFile.txt HTTP/1.1  5   20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=1 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 6   20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=1461 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 7   20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=2921 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 8   20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=4381 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 9   20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=5841 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 10  20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=7301 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 11  20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=8761 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 12  20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=10221 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 13  20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=11681 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] 14  20:30:36.006750 172.25.0.182    172.25.0.71 TCP 1514    80 → 50497 [ACK] Seq=13141 Ack=168 Win=2097664 Len=1460 [TCP segment of a reassembled PDU] // Whoa this is a lifetime of doing nothing in the world of networking 15  20:30:36.175081 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=168 Ack=2921 Win=262656 Len=0 16  20:30:36.175081 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=168 Ack=5841 Win=262656 Len=0 17  20:30:36.175081 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=168 Ack=8761 Win=262656 Len=0 18  20:30:36.175081 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=168 Ack=11681 Win=262656 Len=0 19  20:30:36.175081 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=168 Ack=14601 Win=262656 Len=0 `

It takes roughly 170ms for data send by the server to be acknowledge by the client and have that TCP ACK to arrive back on the server indicating that the client has acknowledged the data.

Wow, that is wild. Why does it take so long for the client to acknowledge the data? Well… Let’s stop for a second and think about why the TCP ACK’s might be taking so long? How about we take a look at the TCP 3-way handshake.

I’ll be truncating the time to just seconds and milliseconds for the sake of concision.

`1   20:30:35.836154 172.25.0.71 172.25.0.182    TCP 66  50497 → 80 [SYN, ECE, CWR] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM `

The TCP SYN arrives on the server at 35.836154. Okay, great! The server receives and processes that SYN and sends out its TCP SYN ACK:

`2   20:30:35.836306 172.25.0.182    172.25.0.71 TCP 66  80 → 50497 [SYN, ACK, ECE] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM `

And that TCP SYN ACK is sent out at 35.836306. Meaning that going from receiving the TCP SYN to sending out the TCP SYN ACK took (35.836306 - 35.836154) ~= 0.000152 seconds. Or 0.1ms pretty quick right?

`3   20:30:36.005644 172.25.0.71 172.25.0.182    TCP 54  50497 → 80 [ACK] Seq=1 Ack=1 Win=262656 Len=0 `

But, when we take a look at the TCP ACK, we see that it arrives back on the server at 36.005644. Looking at the time delta from the TCP SYN ACK, (36.005644 - 35.836306) ~= 0.169338. That’s a significant delay.

Where could this delay be coming from??? Thinking about what the final TCP ACK what does it take for this packet to arrive back on the server?

• The TCP SYN ACK from the server needs to traverse the network
• The TCP SYN ACK needs to be processed by the client
• A TCP ACK needs to be created by the client
• A TCP ACK from the client needs to traverse the network

Now it is certainly possibly that we are seeing slowness in the Windows TCPIP stack… but I wouldn’t count on it. So why does this take so long? Let’s remove TCP from the situation and try to just understand the network path, and we can do this with something as simple as ping which will send ICMP request and response traffic. Small, lightweight, simple.

Here are our results:

``````PS C:\Temp> ping 172.25.0.182

Pinging 172.25.0.182 with 32 bytes of data:
Reply from 172.25.0.182: bytes=32 time=164ms TTL=128
Reply from 172.25.0.182: bytes=32 time=177ms TTL=128
Reply from 172.25.0.182: bytes=32 time=173ms TTL=128
Reply from 172.25.0.182: bytes=32 time=144ms TTL=128``````

Yep, it looks like the network path is seeing some slowness. With that in mind why might that be? Thinking back to what we learn when asking questions:

The company is located in California and the data center is located in Ireland.

• Q: When did this start happening?
• A: When we migrated to Azure!

Well if the deployment moved from the same site to across the Atlantic Ocean we will see a delay in the network path…

Given how simple the problem is you reach out to your colleague for a simple solution.

• Q: Can we test the deployment in a North America Azure Region?
• A: Sure!

Starting with our ping test:

``````PS C:\Temp> ping 172.25.0.182

Pinging 172.25.0.182 with 32 bytes of data:
Reply from 172.25.0.182: bytes=32 time=66ms TTL=128
Reply from 172.25.0.182: bytes=32 time=63ms TTL=128
Reply from 172.25.0.182: bytes=32 time=78ms TTL=128
Reply from 172.25.0.182: bytes=32 time=80ms TTL=128 ``````

Great so our latency is much lower, how about the actual file transfer?

After:

• 0 Total s: 45.079023 Total ms: 45079.023
• 1 Total s: 66.2950938 Total ms: 66295.0938
• 2 Total s: 69.9207062 Total ms: 69920.7062
• 3 Total s: 62.4024251 Total ms: 62402.4251
• 4 Total s: 69.3996179 Total ms: 69399.6179

Better but still not as fast, but unfortunately as I mentioned at the beginning of the post we are bound by that pesky speed of light.

There may be some additional TCP tuning that can be done to potentially squeeze more performance out of the connection but your mileage may vary. In general Windows does a pretty good job with the addition of CUBIC TCP congestion provider added in the Windows 10 Fall 2017 creators update.

Wrapping things up

Compared to some of the other topics we will cover TCP performance is really up to the infrastructure. Packet loss and geographic distance are going to be your primary causes of the issue.

In my next post we have one of my favorite topics, DNS! See you then.

Previous posts in series: