Bug #5003

Updated by Anonymous over 4 years ago

The current implementation of active queue management in GenericLinkService is too aggressive, which can cause a drop in throughput. 

 Example with two NFD nodes connected via a UDP tunnel (added 10ms 20ms RTT):  

 ~~~ 
 klaus@consumer:~/work$ ndncatchunks /100m 

 All segments have been received. 
 Time elapsed: 43.3606 seconds 
 Segments received: 23832 
 Transferred size: 104858 kB 
 Goodput: 19.346137 Mbit/s 
 Congestion marks: 13 (caused 13 window decreases) 
 Timeouts: 144 (caused 17 window decreases) 
 Retransmitted segments: 107 (0.446969%), skipped: 37 
 RTT min/avg/max = 10.520/22.440/427.402 ms  
 ~~~ 


 With congestion marks ignored: 

 ~~~ 
 klaus@consumer:~/work$ ndncatchunks --ignore-marks /100m 


 All segments have been received. 
 Time elapsed: 20.561 seconds 
 Segments received: 23832 
 Transferred size: 104858 kB 
 Goodput: 40.798586 Mbit/s 
 Congestion marks: 48 (caused 0 window decreases) 
 Timeouts: 1459 (caused 19 window decreases) 
 Retransmitted segments: 1389 (5.50732%), skipped: 70 
 RTT min/avg/max = 10.574/33.479/1006.184 ms 
 ~~~ 

 Most of the queuing (and therefore the congestion marking) happens inside NFD, because the links can deliver packets faster than NFD can process them (this is often the case in real networks too). 


 The solution is to implement CoDel more faithfully (see https://tools.ietf.org/html/rfc8289) than was done in #4362. 

 There are two simplifications in the current code: 

 1. We measure the queue size (in bytes) rather than the queuing delay (in ms). 
 2. The first mark happens as soon as the threshold is exceeded, rather than only after the threshold has been exceeded continuously for a given interval (100ms).  

 I think (2) is more significant, and I will look at it first.   
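
 To illustrate what fixing (2) would change, here is a minimal Python sketch of CoDel-style marking. It is not the GenericLinkService code: the class `CoDelMarker`, the method `on_send`, and the assumption that a per-packet sojourn time in ms is available (i.e. that (1) is also addressed) are all hypothetical. The 5 ms target and 100 ms interval are the defaults from RFC 8289. The sketch marks instead of dropping and resets the mark count whenever the queue drains, which is simpler than the RFC's state handling.

```python
import math
from dataclasses import dataclass
from typing import Optional

TARGET_MS = 5.0      # acceptable standing queue delay (RFC 8289 default)
INTERVAL_MS = 100.0  # how long the delay must stay above target (RFC 8289 default)


@dataclass
class CoDelMarker:
    """Hypothetical CoDel-style marker; all times are in milliseconds."""
    target: float = TARGET_MS
    interval: float = INTERVAL_MS
    first_above_time: Optional[float] = None  # deadline for entering marking state
    next_mark_time: float = 0.0
    marking: bool = False
    count: int = 0

    def on_send(self, now: float, sojourn: float) -> bool:
        """Return True if the packet sent at `now` should carry a congestion mark."""
        if sojourn < self.target:
            # Queue delay is back below target: leave the marking state.
            self.first_above_time = None
            self.marking = False
            return False
        if self.first_above_time is None:
            # First packet above target: start the clock, but do not mark yet.
            self.first_above_time = now + self.interval
            return False
        if now < self.first_above_time:
            return False
        if not self.marking:
            # Delay stayed above target for a full interval: first mark.
            self.marking = True
            self.count = 1
            self.next_mark_time = now + self.interval / math.sqrt(self.count)
            return True
        if now >= self.next_mark_time:
            # Control law: marks get closer together while congestion persists.
            self.count += 1
            self.next_mark_time += self.interval / math.sqrt(self.count)
            return True
        return False


if __name__ == "__main__":
    marker = CoDelMarker()
    # 50 ms burst above target, one packet below target, then persistent queue.
    for t in range(0, 50, 10):
        marker.on_send(t, sojourn=20)
    marker.on_send(50, sojourn=1)
    print([t for t in range(60, 400, 10) if marker.on_send(t, sojourn=20)])
```

 With this sketch, a 50 ms burst produces no marks at all, and a persistent queue starting at t = 60 ms is first marked at t = 160 ms, then at 260, 340 and 390 ms. Marking as soon as the threshold is exceeded would have marked the short burst as well.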
