Network Oddity

This is… strange. Two machines, connected through cat5 and gigabit adaptors/hub.

$ iperf -c melchett.local -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to melchett.local, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.7 port 35197 connected with 192.168.1.10 port 5001
[  5] local 192.168.1.7 port 5001 connected with 192.168.1.10 port 33692
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.08 GBytes   926 Mbits/sec
[  5]  0.0-10.0 sec  1.05 GBytes   897 Mbits/sec

Simultaneous transfers get ~900 Mbits/sec in each direction.

$ iperf -c melchett.local -r
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to melchett.local, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[  5] local 192.168.1.7 port 35202 connected with 192.168.1.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec   210 MBytes   176 Mbits/sec
[  4] local 192.168.1.7 port 5001 connected with 192.168.1.10 port 33693
[  4]  0.0-10.0 sec  1.10 GBytes   941 Mbits/sec

Testing each direction independently results in only 176 Mbits/sec on the transfer to the iperf server (melchett). This is 100% reproducible, and the same results appear if I swap the iperf client and server roles.

I’ve swapped one of the cables involved; the other is harder to get to, but I don’t see how physical damage could cause this sort of performance issue. Oh Internet, any ideas?

6 thoughts on “Network Oddity”

  1. Two thoughts:
    Could this be related to TCP window scaling?
    Weird buffering in the switch? (try removing it from the equation)

  2. Perhaps it’s due to the bandwidth-delay product. Try increasing the TCP window size, or use UDP for the measurement.

  3. My guess is that it’s caused by the really low TCP window size (22.9 kB) and interrupt coalescing. Interrupt coalescing keeps the OS from seeing received data until a buffer on the NIC fills or a timeout occurs. Either melchett’s OS is only seeing 22.9 kilobytes of data every timeout period (usually ~1 ms), or the client’s OS only sees the ACKs for that data once every millisecond. Either way:

    (22.9 kilobytes) / (1 millisecond) = 178.90625 megabits per second

    …which is very close to the 176 Mbit/s you’re seeing.

    Try increasing the TCP window with iperf’s --window option.

  4. I saw something like this 5 or so years ago; it had something to do with cpuidle/cpufreq. You could temporarily “fix” it by running a stupid “while true ; do true ; done” in a terminal to keep the CPU awake.

  5. Bad cables can lead to packet loss, which would cause retransmissions, and the attendant delays.

    Yea, bad cables can cause a big drop in throughput, but I can’t tell for sure if this was bad cables without looking at some network captures.

  6. If, as it appears, these machines are on the same LAN segment, then latency, and therefore window size, is very unlikely to be the culprit here.
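
The window arithmetic in comments 2, 3 and 6 can be checked with a rough sketch like the one below. It assumes iperf's "KByte" means 1024 bytes and that the sender gets at most one window through per effective round trip; the 1 ms figure is the coalescing timeout guessed in comment 3, and the 0.1 ms LAN round trip is purely a hypothetical value for illustration. The function names are made up for this example. Note that 178.9 in comment 3 comes from dividing by 2^20 bits; with iperf's 10^6 bits per Mbit the same bound is about 187.6, still close to the observed 176.

```python
KIB = 1024  # assumption: iperf's "KByte" is 1024 bytes


def window_limited_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Throughput ceiling in bits/s if only one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Window needed (in bytes) to keep a link of link_bps busy at this delay."""
    return link_bps * rtt_seconds / 8


window = 22.9 * KIB        # window reported by iperf in the slow run
effective_rtt = 1e-3       # assumption: ~1 ms coalescing timeout (comment 3)
lan_rtt = 1e-4             # hypothetical 0.1 ms round trip on an idle LAN
gigabit = 1e9              # nominal link rate in bits/s

ceiling = window_limited_bps(window, effective_rtt)
print(f"ceiling at 1 ms:  {ceiling / 1e6:.1f} Mbit/s (10^6 bits)")
print(f"same, 2^20 bits:  {ceiling / 2**20:.1f}")
print(f"window to fill gigabit at 1 ms:   "
      f"{bandwidth_delay_product_bytes(gigabit, effective_rtt) / KIB:.1f} KiB")
print(f"window to fill gigabit at 0.1 ms: "
      f"{bandwidth_delay_product_bytes(gigabit, lan_rtt) / KIB:.1f} KiB")
```

Under these assumptions, a 22.9 KByte window is ample if the round trip really is a fraction of a millisecond (comment 6), but falls well short if something adds about a millisecond of effective delay (comment 3), which is why trying a larger window is a cheap test.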
