Bug #5082

ndncatchunks low throughput even when retrieving from local content store

Added by susmit shannigrahi 4 months ago. Updated 4 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Here is the setup: 2 VMs, 50Mbps link (using tc), transferring a 250MB file.

First time - the transfer goes to the producer, as expected. The throughput is around 32Mbps.

Second time - the transfer happens from the local cache (we verified this). However, the throughput is around 40Mbps - I'd expect this to be much higher.
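For scale, the two reported rates can be turned into transfer times for the 250MB file. This is only a back-of-the-envelope sketch; it assumes 1 MB = 10^6 bytes and 1 Mbit = 10^6 bits, and the function name is made up for illustration.

```python
def transfer_time_s(size_mb: float, rate_mbps: float) -> float:
    """Seconds needed to move size_mb megabytes at rate_mbps megabits/s."""
    # Assumption: decimal units (1 MB = 10^6 bytes, 1 Mbit = 10^6 bits).
    size_mbit = size_mb * 8
    return size_mbit / rate_mbps


FILE_MB = 250  # file size from the setup above

for label, rate_mbps in [
    ("first run, from producer", 32),
    ("second run, from local cache", 40),
    ("shaped link rate", 50),
]:
    seconds = transfer_time_s(FILE_MB, rate_mbps)
    print(f"{label}: {rate_mbps} Mbps -> {seconds:.1f} s")
```

Under these assumptions the cached run saves about 12.5 s over the first run (50 s versus 62.5 s), which is a modest gain for a transfer that should not be limited by the 50Mbps link at all.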

I see some of the issues described here: https://redmine.named-data.net/issues/4362/
Specifically, I am seeing "Send queue length dropped below congestion threshold". See the attached file.

Any suggestions on how we can get higher throughput from the cache?


Files

ndncatchunks_contentstore.txt (21.1 KB), susmit shannigrahi, 02/14/2020 10:07 AM
#1

Updated by Davide Pesavento 4 months ago

  • Start date deleted (02/14/2020)

I see many congestion marks... what version of NFD are you using? The marking algorithm was made less aggressive in this commit.

#2

Updated by susmit shannigrahi 4 months ago

Davide Pesavento wrote:

> I see many congestion marks... what version of NFD are you using? The marking algorithm was made less aggressive in this commit.

0.7.x for ndn-cxx, NFD, and tools.

#3

Updated by susmit shannigrahi 4 months ago

Forgot to mention, we tried with --ignore-marks (and with version discovery disabled). Still, the throughput did not increase when fetching from the cache.

#4

Updated by Klaus Schneider 4 months ago

Probably CPU bound/general NFD performance issue?

Check the CPU utilization on the VMs. If it's 100%, then that's the issue.
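One possible way to check per-process CPU usage on a Linux VM without extra tools is to sample `/proc/<pid>/stat` twice and compare the CPU ticks. The sketch below is an assumption about how one might do this, not something used in this thread; the helper names are hypothetical, and `top` or `pidstat` would give the same information.

```python
import os
import sys
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15 of the line
    return utime + stime


def cpu_percent(before: str, after: str, interval_s: float, ticks_per_s: int) -> float:
    """CPU usage between two samples; 100.0 means one fully busy core."""
    used_s = (cpu_ticks(after) - cpu_ticks(before)) / ticks_per_s
    return 100.0 * used_s / interval_s


def sample_pid(pid: int, interval_s: float = 1.0) -> float:
    """Sample a live process twice (Linux only) and return its CPU usage."""
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return cpu_percent(before, after, interval_s, os.sysconf("SC_CLK_TCK"))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python cpu_check.py <pid of the process to watch>
    print(f"{sample_pid(int(sys.argv[1])):.0f}% CPU")
```

The percentage is relative to a single core, which is the same convention `top` uses by default on Linux.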

#5

Updated by Klaus Schneider 4 months ago

> Any suggestions on how we can get higher throughput from the cache?

Disabling logging should help if it's the CPU issue.

> I see some of the issues described here: https://redmine.named-data.net/issues/4362/

Not sure what you refer to? The throughput from the cache on my local machine was much higher than 40 Mbps.

> Specifically, I am seeing "Send queue length dropped below congestion threshold". See the attached file.

You can safely ignore this message. It's just some trace output about how the congestion marking works, and not a warning.

#6

Updated by susmit shannigrahi 4 months ago

ndncatchunks is not utilizing 100% - however, NFD's CPU usage is consistently over 100%. This is probably why the throughput is low.

I played with the init-cwnd (at around 25-30) and am now getting higher throughput, around 250-300Mbps, from the local cache. This is still low in my opinion, but it looks like you were getting similar numbers, so maybe that's the upper bound.
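As a rough way to see why the window size matters: a window-based consumer can have at most `cwnd` segments in flight per round trip, so its throughput is bounded by cwnd x segment size / RTT. The sketch below only illustrates that bound. The segment size and RTT are hypothetical placeholders, not values measured in this thread, and the real window also changes over time as the congestion control adapts.

```python
def window_limited_mbps(cwnd_segments: int, segment_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput (Mbps) when cwnd_segments are in flight per RTT."""
    bits_per_rtt = cwnd_segments * segment_bytes * 8
    # bits per millisecond equals kbit/s; divide by 1000 to get Mbit/s.
    return bits_per_rtt / rtt_ms / 1000


# Hypothetical values chosen only for illustration (not from the thread).
SEGMENT_BYTES = 4000
RTT_MS = 2.0

for cwnd in (1, 5, 25, 30):
    bound = window_limited_mbps(cwnd, SEGMENT_BYTES, RTT_MS)
    print(f"cwnd={cwnd:2d}: at most {bound:.0f} Mbps")
```

The bound scales linearly with the window, so a larger initial window removes the early window-limited phase; once the window is no longer the limit, something else (here, NFD's CPU usage) caps the rate.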

#7

Updated by Klaus Schneider 4 months ago

> ndncatchunks is not utilizing 100% - however, NFD's CPU usage is consistently over 100%. This is probably why the throughput is low.

Yeah, that's what I saw in my earlier measurements as well.

#8

Updated by Davide Pesavento 4 months ago

  • Status changed from New to Rejected

susmit shannigrahi wrote:

> ndncatchunks is not utilizing 100% - however, NFD's CPU usage is consistently over 100%. This is probably why the throughput is low.

Yeah, this sounds like the usual "NFD is slow" problem... not much we can do about it (in the short term).
