Bug #5003

Feature #1624: Design and Implement Congestion Control

Congestion Marking too aggressive

Added by Anonymous about 5 years ago. Updated 12 months ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Faces
Target version: v0.7
Start date:
Due date:
% Done: 100%
Estimated time:

Description

The current implementation of the active queue management in GenericLinkService is too aggressive, which can cause a drop in throughput.

Example with 2 NFD nodes connected via a UDP tunnel (10 ms of added RTT):

klaus@consumer:~/work$ ndncatchunks /100m

All segments have been received.
Time elapsed: 43.3606 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 19.346137 Mbit/s
Congestion marks: 13 (caused 13 window decreases)
Timeouts: 144 (caused 17 window decreases)
Retransmitted segments: 107 (0.446969%), skipped: 37
RTT min/avg/max = 10.520/22.440/427.402 ms 

With congestion marks ignored:

klaus@consumer:~/work$ ndncatchunks --ignore-marks /100m


All segments have been received.
Time elapsed: 20.561 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 40.798586 Mbit/s
Congestion marks: 48 (caused 0 window decreases)
Timeouts: 1459 (caused 19 window decreases)
Retransmitted segments: 1389 (5.50732%), skipped: 70
RTT min/avg/max = 10.574/33.479/1006.184 ms

The queuing (and congestion marking) happens mostly inside NFD, since the links are faster than NFD can process packets (often the case in real networks too).

The solution is to implement a more faithful version of CoDel (see https://tools.ietf.org/html/rfc8289) than what was done in #4362.

There are two simplifications in the current code:

  1. We measure the queue size (in bytes) rather than the queuing delay (in ms).
  2. The first mark happens immediately after the threshold is exceeded, rather than after the threshold has been exceeded for a given time period (100 ms).

I think (2) is more significant, and I will look at it first.
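
To illustrate what (2) means in practice, here is a rough sketch of the RFC 8289 marking rule. This is not NFD's actual code; the class and member names are invented for this example. The point is that a packet is marked only after the queuing delay has stayed above the target for a full interval (100 ms), instead of immediately when a byte threshold is crossed.

#include <chrono>
#include <optional>

// Illustrative only: not the actual GenericLinkService implementation.
class CoDelMarker
{
public:
  using Clock = std::chrono::steady_clock;

  // Decide whether the packet currently leaving the queue should be marked,
  // given how long it spent in the queue (its sojourn time).
  bool
  shouldMark(std::chrono::nanoseconds sojournTime)
  {
    auto now = Clock::now();

    if (sojournTime < TARGET) {
      // Queuing delay dropped back below the target: reset the state.
      m_firstAboveTime.reset();
      return false;
    }

    if (!m_firstAboveTime) {
      // Delay just went above the target: start the 100 ms grace interval.
      m_firstAboveTime = now;
      return false;
    }

    // Mark only if the delay has stayed above the target for a full interval.
    return now - *m_firstAboveTime >= INTERVAL;
  }

private:
  static constexpr std::chrono::milliseconds TARGET{5};     // target queuing delay
  static constexpr std::chrono::milliseconds INTERVAL{100}; // grace period before the first mark
  std::optional<Clock::time_point> m_firstAboveTime;
};

The full algorithm in RFC 8289 additionally spaces subsequent marks closer and closer together (interval divided by the square root of the mark count) while the delay stays above the target; that part is omitted here for brevity.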


Files

retx.pdf (7.82 KB) – Timeouts per packet number – Anonymous, 10/02/2019 09:40 PM
Actions #1

Updated by Anonymous about 5 years ago

  • Assignee set to Anonymous
Actions #2

Updated by Anonymous about 5 years ago

  • Category set to Faces
  • Parent task set to #1624
Actions #3

Updated by Anonymous about 5 years ago

We wanted to find out how many of these timeouts correspond to real packet drops vs. packets that merely exceeded the consumer's RTO limit.

Some measurements over Unix sockets, retrieving a 50 MB file:

klaus@localhost:~$ ndncatchunks --ignore-marks /50

All segments have been received.
Time elapsed: 2.04337 seconds
Segments received: 11916
Transferred size: 52428.8 kB
Goodput: 205.264109 Mbit/s
Congestion marks: 0 (caused 0 window decreases)
Timeouts: 3248 (caused 2 window decreases)
Retransmitted segments: 2650 (18.1931%), skipped: 598
RTT min/avg/max = 0.088/192.217/473.843 ms

With a higher RTO and Interest lifetime (values in ms):

klaus@localhost:~$ ndncatchunks --ignore-marks --min-rto 40000 --lifetime 100000 /50

All segments have been received.
Time elapsed: 5.9546 seconds
Segments received: 11916
Transferred size: 52428.8 kB
Goodput: 70.438089 Mbit/s
Congestion marks: 0 (caused 0 window decreases)
Timeouts: 0 (caused 0 window decreases)
Retransmitted segments: 0 (0%), skipped: 0
RTT min/avg/max = 0.370/2139.468/4118.719 ms

Actions #4

Updated by Anonymous about 5 years ago

Some more local (Unix socket) measurements with a larger file (500 MB), CS disabled:

With congestion marks ignored:

klaus@localhost:~$ ndncatchunks --ignore-marks /500

All segments have been received.
Time elapsed: 23.4087 seconds
Segments received: 119157
Transferred size: 524288 kB
Goodput: 179.177290 Mbit/s
Congestion marks: 11098 (caused 0 window decreases)
Timeouts: 2660 (caused 2 window decreases)
Retransmitted segments: 1905 (1.57357%), skipped: 755
RTT min/avg/max = 0.062/317.786/528.805 ms

With congestion marks honored:

klaus@localhost:~$ ndncatchunks /500

All segments have been received.
Time elapsed: 7.3627 seconds
Segments received: 119157
Transferred size: 524288 kB
Goodput: 569.669132 Mbit/s
Congestion marks: 30 (caused 16 window decreases)
Timeouts: 3545 (caused 2 window decreases)
Retransmitted segments: 3019 (2.47103%), skipped: 526
RTT min/avg/max = 0.298/9.548/465.668 ms

Increasing the minimum RTO:

klaus@localhost:~$ ndncatchunks --min-rto 1000 /500

All segments have been received.
Time elapsed: 6.50988 seconds
Segments received: 119157
Transferred size: 524288 kB
Goodput: 644.298549 Mbit/s
Congestion marks: 32 (caused 18 window decreases)
Timeouts: 0 (caused 0 window decreases)
Retransmitted segments: 0 (0%), skipped: 0
RTT min/avg/max = 0.208/7.186/236.157 ms
Actions #5

Updated by Anonymous about 5 years ago

A run in the scenario described above (UDP tunnel, 20 ms RTT):

klaus@consumer:~/work$ ndncatchunks /200

All segments have been received.
Time elapsed: 75.5894 seconds
Segments received: 47663
Transferred size: 209715 kB
Goodput: 22.195196 Mbit/s
Congestion marks: 110 (caused 2 window decreases)
Timeouts: 4678 (caused 42 window decreases)
Retransmitted segments: 4605 (8.81036%), skipped: 73
RTT min/avg/max = 20.878/49.811/805.386 ms

Consumer UDP Receive/Buffer Errors: 325
Producer UDP Receive/Buffer Errors: 1974

Actions #6

Updated by Anonymous about 5 years ago

The same scenario run again with higher timeout settings:

klaus@consumer:~/work$ ndncatchunks --min-rto 2000 --lifetime 10000 /200

All segments have been received.
Time elapsed: 56.7384 seconds
Segments received: 47663
Transferred size: 209715 kB
Goodput: 29.569444 Mbit/s
Congestion marks: 10 (caused 2 window decreases)
Timeouts: 870 (caused 37 window decreases)
Retransmitted segments: 870 (1.79259%), skipped: 0
RTT min/avg/max = 20.752/57.494/918.524 ms

Consumer UDP Receive/Buffer Errors: 193
Producer UDP Receive/Buffer Errors: 677
Total: 870

With sufficiently large timeout settings, all remaining timeouts are caused by UDP buffer overflows/drops.
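
(Assumption about the measurement method: the "UDP Receive/Buffer Errors" counters above are presumably the kernel's RcvbufErrors/SndbufErrors UDP statistics, as reported by e.g. netstat -su. Below is a minimal sketch of reading them programmatically on Linux from /proc/net/snmp, for reference only; this is not necessarily how the numbers in this thread were collected.)

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int
main()
{
  // /proc/net/snmp contains a "Udp:" header line followed by a "Udp:" value line.
  std::ifstream snmp("/proc/net/snmp");
  std::string line, headerLine, valueLine;
  while (std::getline(snmp, line)) {
    if (line.rfind("Udp:", 0) == 0) {
      if (headerLine.empty())
        headerLine = line;
      else
        valueLine = line;
    }
  }

  std::istringstream headers(headerLine), values(valueLine);
  std::string name, value;
  headers >> name; // skip the "Udp:" prefix
  values >> value;
  while (headers >> name && values >> value) {
    if (name == "RcvbufErrors" || name == "SndbufErrors")
      std::cout << name << " = " << value << "\n";
  }
  return 0;
}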

Actions #7

Updated by Anonymous about 5 years ago

Moreover, we're getting the same number (or even more) of timeouts on a file 10x smaller (20 MB vs 200 MB):

ndncatchunks --min-rto 2000 --lifetime 10000 /20

All segments have been received.
Time elapsed: 6.94108 seconds
Segments received: 4767
Transferred size: 20971.5 kB
Goodput: 24.170902 Mbit/s
Congestion marks: 7 (caused 2 window decreases)
Timeouts: 1231 (caused 2 window decreases)
Retransmitted segments: 1231 (20.5235%), skipped: 0
RTT min/avg/max = 21.276/100.755/215.389 ms

That means that most of those timeouts likely happen during the slow-start phase.

Actions #8

Updated by Anonymous about 5 years ago

Some iperf results for comparison. Same scenario (3 nodes, 20 ms RTT):

klaus@consumer:~/work$ iperf -e -l 256K -c 10.2.0.3
------------------------------------------------------------
Client connecting to 10.2.0.3, TCP port 5001 with pid 15837
Write buffer size:  256 KByte
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 10.1.0.1 port 52334 connected with 10.2.0.3 port 5001
[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry    Cwnd/RTT
[  3] 0.00-10.05 sec   125 MBytes   105 Mbits/sec  1/0        142      448K/25883 us

Actions #9

Updated by Anonymous about 5 years ago

Actions #10

Updated by Anonymous about 5 years ago

  • Description updated (diff)
Actions #11

Updated by Davide Pesavento about 5 years ago

  • Subject changed from Congestion Marking too agressive to Congestion Marking too aggressive
Actions #12

Updated by Davide Pesavento about 5 years ago

  • Status changed from New to Code review
  • Target version set to v0.7
Actions #13

Updated by Davide Pesavento about 5 years ago

I don't understand the context of these numbers you posted. Are they measured after applying the change? How do they compare with the behavior before?

Actions #14

Updated by Anonymous about 5 years ago

Yes, all the numbers in #note-3 and later are for the new design.

The original post contains the comparison between before and after the change: 19.346137 Mbit/s vs. 40.798586 Mbit/s

Actions #15

Updated by Anonymous about 5 years ago

Basically, the later comments are a follow-up discussion I had with Beichuan to determine:

  • Whether timeouts only/mostly happen during the start of the connection.
  • Whether those are real packet drops vs. just timeouts caused by the RTO setting.
  • Whether NFD + Unix sockets will drop any packets or queue them indefinitely.
Actions #16

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

The original post contains the comparison between before and after the change: 19.346137 Mbit/s vs. 40.798586 Mbit/s

I'm still confused. The description compares a run with cong marks vs one without. I don't see any "before vs after". Please clarify.
Moreover, it'd be great to see the behavior in a few more scenarios, e.g. different link delays (1/10/100 ms or 1/20/200 ms or something like that) and different transfer sizes (say, 1 MB and 100 MB).

Actions #17

Updated by Anonymous about 5 years ago

Note: the default UDP buffer capacity seems to be quite low, at only 106 KB.

1570305326.735465 TRACE: [nfd.GenericLinkService] [id=264,local=udp4://10.1.0.1:6363,remote=udp4://10.2.0.3:6363] txqlen=768 threshold=65536 capacity=106496

Actions #18

Updated by Davide Pesavento about 5 years ago

But in any case, if the capacity is so low, it still doesn't make sense to slow down the throughput even more by adding congestion marks.

Maybe we should consider increasing the defaultCongestionThreshold? (currently at 64 KB)

Actions #19

Updated by Davide Pesavento about 5 years ago

Or change that std::min to std::max? Or have both an upper bound and a lower bound?
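
For illustration, "both an upper bound and a lower bound" could look roughly like this. This is only a sketch: the constant and function names are invented for this example and are not the actual identifiers in GenericLinkService.

#include <algorithm>
#include <cstdint>

// Hypothetical constants for this sketch.
constexpr uint64_t MIN_CONGESTION_THRESHOLD = 16 * 1024;     // lower bound, e.g. 16 KiB
constexpr uint64_t DEFAULT_CONGESTION_THRESHOLD = 64 * 1024; // current default, 64 KiB

uint64_t
computeCongestionThreshold(uint64_t sendQueueCapacity)
{
  // The existing code apparently takes std::min of the default threshold and
  // something derived from the buffer capacity (yielding the 64 KiB threshold
  // seen in the trace above for a ~106 KB buffer). The alternative below keeps
  // that cap but never lets the threshold fall under a floor, so a tiny buffer
  // does not also throttle throughput via aggressive marking.
  return std::max(MIN_CONGESTION_THRESHOLD,
                  std::min(DEFAULT_CONGESTION_THRESHOLD, sendQueueCapacity));
}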

Actions #20

Updated by Anonymous about 5 years ago

If the buffer is too small, there is very little you can do via congestion marking.

  • Lower threshold: too many packets get marked, leading to poor throughput.
  • Higher threshold: very few packets get marked, leading to lots of packet drops from the queue.

I think the default threshold is fine and, if anything, should be tuned (via nfdc) to the link capacity and the desired queuing delay. A higher threshold leads to a higher average queuing delay, but also higher throughput.

Ideally, the threshold would approximate 5 ms of queuing delay, i.e., a higher threshold should be used for faster links.
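
As a back-of-the-envelope check of the "5 ms of queuing delay" idea (purely illustrative, assuming the link rate is known or configured rather than auto-detected):

#include <cstdint>

// threshold_bytes ≈ link_rate_bits_per_s * delay_s / 8
constexpr uint64_t
thresholdForDelay(uint64_t linkRateBitsPerSec, double delaySec = 0.005)
{
  return static_cast<uint64_t>(linkRateBitsPerSec * delaySec / 8);
}

// Examples:
//   100 Mbit/s -> 100e6 * 0.005 / 8 = 62,500 bytes (~61 KiB, close to the 64 KB default)
//   1 Gbit/s   -> 625,000 bytes (~610 KiB)
static_assert(thresholdForDelay(100'000'000) == 62'500, "100 Mbit/s at 5 ms");

The resulting value would then be configured per face (e.g. via nfdc, as mentioned above), with faster links getting proportionally larger thresholds.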

Actions #21

Updated by Anonymous about 5 years ago

Okay, here's a proper comparison between the old and new code: 10 ms RTT, UDP tunnel, 100 MB file.

I used 3 runs, since there's a lot of variance.

Old code (100 MB):


All segments have been received.
Time elapsed: 64.6259 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 12.980265 Mbit/s
Congestion marks: 12 (caused 9 window decreases)
Timeouts: 242 (caused 34 window decreases)
Retransmitted segments: 176 (0.733089%), skipped: 66
RTT min/avg/max = 11.236/24.991/1229.038 ms
klaus@consumer:~/work$ ndncatchunks /100

All segments have been received.
Time elapsed: 26.9613 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 31.113460 Mbit/s
Congestion marks: 15 (caused 11 window decreases)
Timeouts: 788 (caused 14 window decreases)
Retransmitted segments: 772 (3.1377%), skipped: 16
RTT min/avg/max = 10.894/24.501/542.259 ms

klaus@consumer:~/work$ ndncatchunks /100


All segments have been received.
Time elapsed: 21.4059 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 39.188221 Mbit/s
Congestion marks: 14 (caused 14 window decreases)
Timeouts: 12 (caused 1 window decreases)
Retransmitted segments: 12 (0.0503271%), skipped: 0
RTT min/avg/max = 10.974/18.564/59.997 ms

New code (100 MB):


All segments have been received.
Time elapsed: 21.2545 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 39.467509 Mbit/s
Congestion marks: 98 (caused 4 window decreases)
Timeouts: 1943 (caused 16 window decreases)
Retransmitted segments: 1900 (7.3838%), skipped: 43
RTT min/avg/max = 11.060/50.522/772.800 ms
klaus@consumer:~/work$ ndncatchunks /100


All segments have been received.
Time elapsed: 14.9753 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 56.016126 Mbit/s
Congestion marks: 103 (caused 4 window decreases)
Timeouts: 1139 (caused 13 window decreases)
Retransmitted segments: 1076 (4.3199%), skipped: 63
RTT min/avg/max = 11.243/79.103/790.922 ms
klaus@consumer:~/work$ ndncatchunks /100


All segments have been received.
Time elapsed: 14.0394 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 59.750446 Mbit/s
Congestion marks: 33 (caused 5 window decreases)
Timeouts: 889 (caused 10 window decreases)
Retransmitted segments: 815 (3.30669%), skipped: 74
RTT min/avg/max = 10.952/27.316/459.914 ms

General result: far fewer window decreases caused by congestion marks, and higher throughput.

Actions #22

Updated by Anonymous about 5 years ago

Same measurement, but with a 200 MB file:

Old:


klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 35.9728 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 23.319323 Mbit/s
Congestion marks: 19 (caused 14 window decreases)
Timeouts: 130 (caused 12 window decreases)
Retransmitted segments: 90 (0.376223%), skipped: 40
RTT min/avg/max = 11.299/23.571/433.105 ms
klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 47.5256 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 17.650725 Mbit/s
Congestion marks: 10 (caused 9 window decreases)
Timeouts: 308 (caused 25 window decreases)
Retransmitted segments: 257 (1.06688%), skipped: 51
RTT min/avg/max = 10.840/25.944/509.276 ms
klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 24.11 seconds
Segments received: 23832
Transferred size: 104858 kB
Goodput: 34.793028 Mbit/s
Congestion marks: 17 (caused 17 window decreases)
Timeouts: 0 (caused 0 window decreases)
Retransmitted segments: 0 (0%), skipped: 0
RTT min/avg/max = 11.209/17.683/58.173 ms

New:

klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 31.3761 seconds
Segments received: 47663
Transferred size: 209715 kB
Goodput: 53.471389 Mbit/s
Congestion marks: 31 (caused 4 window decreases)
Timeouts: 523 (caused 17 window decreases)
Retransmitted segments: 453 (0.941475%), skipped: 70
RTT min/avg/max = 11.083/35.429/909.116 ms
klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 27.983 seconds
Segments received: 47663
Transferred size: 209715 kB
Goodput: 59.955015 Mbit/s
Congestion marks: 1 (caused 1 window decreases)
Timeouts: 54 (caused 8 window decreases)
Retransmitted segments: 44 (0.0922297%), skipped: 10
RTT min/avg/max = 10.834/17.465/353.567 ms
klaus@consumer:~/work$ ndncatchunks /200


All segments have been received.
Time elapsed: 32.4897 seconds
Segments received: 47663
Transferred size: 209715 kB
Goodput: 51.638518 Mbit/s
Congestion marks: 2 (caused 1 window decreases)
Timeouts: 521 (caused 24 window decreases)
Retransmitted segments: 390 (0.811604%), skipped: 131
RTT min/avg/max = 11.039/22.453/754.475 ms

Actions #23

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

I used 3 runs, since there's a lot of variance.

Did you disable the content store?

Actions #24

Updated by Anonymous about 5 years ago

Yes, I set "cs_max_packets 0" on both NFDs.
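
(For reference, disabling the content store is done in the tables section of nfd.conf; roughly:)

tables
{
  ; disable the Content Store so repeated runs are not served from cache
  cs_max_packets 0
}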

Actions #25

Updated by Anonymous about 5 years ago

The log level is "INFO", by the way, and the output is piped to a text file:

sudo nfd 2>log.txt

Actions #26

Updated by Anonymous about 5 years ago

  • Description updated (diff)
Actions #27

Updated by Davide Pesavento about 5 years ago

  • Status changed from Code review to Closed
  • % Done changed from 0 to 100
Actions #28

Updated by Davide Pesavento 12 months ago

  • Start date deleted (09/21/2019)