Project

General

Profile

Actions

Bug #2626

open

CS lookup takes long time if Interest Name covers many entries

Added by John DeHart about 9 years ago. Updated about 6 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Tables
Target version:
-
Start date:
03/09/2015
Due date:
% Done:

0%

Estimated time:

Description

We have been noticing odd spikes in ndnping RTTs and have been trying to track down the root cause of it.

I believe I have narrowed it down to an interaction between the ndnping Interests and the Interests we send out to collect data
from nfd for the ndnmap.arl.wustl.edu bandwidth display.

I have some very specific data now on something strange that is happening with nfd.

I include a portion of the combined log files from nfd and an ndndump I was running on the WU Testbed node at the time of such a spike. The ndnping in question has a reference number of: 1744606568
I have modified the log messages to insert “nfd” and “ndndump” to help keep straight which messages
are from which log. Please note that this is all from the WU Testbed node so the times should be
directly comparable.

At time 1425669071.230902, nfd performs a [ContentStore] find /localhost/nfd/faces/list R but it doesn't register [ContentStore] no-match until 1425669071.253650, which is
22.748 ms later. There were NO nfd log messages in between.
During that time, ndndump registers the arrival of 7 new Interests.

From this it would appear that the search of the CS took 22.748 ms.

1425669071.226706 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=3582782466
1425669071.226759 nfd DEBUG: [Forwarder] onIncomingInterest face=266 interest=/ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186
1425669071.227090 nfd DEBUG: [ContentStore] find /ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186 L
1425669071.229658 nfd DEBUG: [ContentStore]   no-match
1425669071.229736 nfd DEBUG: [AccessStrategy] /ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=3582782466 interestFrom 266 new-interest mi=/ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173
1425669071.230136 nfd DEBUG: [AccessStrategy] /ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=3582782466 interestTo 269 last-nexthop rto=111227
1425669071.230395 nfd DEBUG: [Forwarder] onOutgoingInterest face=269 interest=/ndn/edu/wustl/ndnmap/stats/128.196.203.36/72.36.112.82/141.225.11.173/193.147.51.186
1425669071.230794 nfd DEBUG: [Forwarder] onIncomingInterest face=269 interest=/localhost/nfd/faces/list
1425669071.230902 nfd DEBUG: [ContentStore] find /localhost/nfd/faces/list R
1425669071.231738 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=2821820197
1425669071.236867 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/edu/arizona/ndnmap/stats/192.172.226.248/129.82.138.48/141.225.11.173/133.9.73.66/128.187.81.12?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=2335882350
1425669071.241170 ndndump From: 172.16.21.176, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/edu/wustl/ping/1744606568?ndn.MustBeFresh=1&ndn.Nonce=1744606568
1425669071.241297 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/org/caida/ndnmap/stats/128.195.4.36/202.120.188.176?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=2532105913
1425669071.244815 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/ch/unibas/ndnmap/stats/193.147.51.186/162.105.146.26/137.194.54.81?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=3200941563
1425669071.249054 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/cn/edu/pku/ndnmap/stats/202.120.188.176/114.247.165.44?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=338471191
1425669071.253186 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/edu/neu/ndnmap/stats/162.105.146.26?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=763898399
1425669071.253650 nfd DEBUG: [ContentStore]   no-match
1425669071.253702 nfd DEBUG: [Forwarder] onOutgoingInterest face=1 interest=/localhost/nfd/faces/list
1425669071.253788 nfd DEBUG: [BestRouteStrategy2] /localhost/nfd/faces/list?ndn.ChildSelector=1&ndn.MustBeFresh=1&ndn.Nonce=2666790337 from=269 newPitEntry-to=1
1425669071.253909 nfd DEBUG: [InternalFace] received Interest: /localhost/nfd/faces/list
1425669071.253986 nfd DEBUG: [InternalFace] found Interest filter for /localhost/nfd/faces (previous match)
1425669071.254052 nfd DEBUG: [FaceManager] command result: processing verb: list
1425669071.257714 nfd DEBUG: [Forwarder] onIncomingData face=1 data=/localhost/nfd/faces/list/%FD%00%00%01K%F0%7F%A1%96/%00%00
1425669071.257882 nfd DEBUG: [ContentStore] insert /localhost/nfd/faces/list/%FD%00%00%01K%F0%7F%A1%96/%00%00
1425669071.258040 ndndump From: 128.252.153.28, To: 128.252.153.194, Tunnel Type: UDP, INTEREST: /ndn/it/unipd/ndnmap/stats/72.36.112.82/192.43.193.111/193.147.51.186/161.105.195.18?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=738760142
1425669071.258043 nfd DEBUG: [Forwarder] onIncomingData matching=/localhost/nfd/faces/list
1425669071.258123 nfd DEBUG: [Strategy] beforeSatisfyInterest pitEntry=/localhost/nfd/faces/list inFace=1 data=/localhost/nfd/faces/list/%FD%00%00%01K%F0%7F%A1%96/%00%00
1425669071.258333 nfd DEBUG: [Forwarder] onOutgoingData face=269 data=/localhost/nfd/faces/list/%FD%00%00%01K%F0%7F%A1%96/%00%00
1425669071.258514 nfd DEBUG: [Forwarder] onIncomingInterest face=266 interest=/ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36
1425669071.258815 nfd DEBUG: [ContentStore] find /ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36 L
1425669071.261579 nfd DEBUG: [ContentStore]   no-match
1425669071.261652 nfd DEBUG: [Forwarder] onOutgoingInterest face=260 interest=/ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36
1425669071.261920 ndndump From: 128.252.153.194, To: 72.36.112.82, Tunnel Type: UDP, INTEREST: /ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=2821820197
1425669071.261945 nfd DEBUG: [BestRouteStrategy2] /ndn/edu/ucla/ndnmap/stats/192.172.226.248/128.97.98.7/129.82.138.48/162.105.146.26/128.195.4.36?ndn.MustBeFresh=1&ndn.InterestLifetime=700&ndn.Nonce=2821820197 from=266 newPitEntry-to=260
1425669071.262232 nfd DEBUG: [Forwarder] onIncomingInterest face=303 interest=/ndn/edu/wustl/ping/1744606568

Here are some other examples of CS 'find' operations:

1425669061.723974 DEBUG: [ContentStore] find /localhost/nfd/faces/events R
1425669061.725418 DEBUG: [ContentStore]   no-match

1425669105.051430 DEBUG: [ContentStore] find /localhost/nfd/faces/list R
1425669105.074600 DEBUG: [ContentStore]   no-match

1425669001.213923 DEBUG: [ContentStore] find /localhost/nfd/status R
1425669001.214097 DEBUG: [ContentStore]   no-match

1425669001.222970 DEBUG: [ContentStore] find /localhost/nfd/faces/channels R
1425669001.223182 DEBUG: [ContentStore]   no-match

1425669001.639833 DEBUG: [ContentStore] find /localhost/nfd/faces/query/%96%1Cr%1Audp4%3A%2F%2F128.252.153.28%3A6363 R
1425669001.640152 DEBUG: [ContentStore]   no-match

1425669105.099019 DEBUG: [ContentStore] find /ndn/it/unipd/ndnmap/stats/72.36.112.82/192.43.193.111/193.147.51.186/161.105.195.18 L
1425669105.101716 DEBUG: [ContentStore]   no-match

1425669105.242469 DEBUG: [ContentStore] find /ndn/edu/wustl/ping/1744606602 L
1425669105.242597 DEBUG: [ContentStore]   no-match

Analyzing these times, I get this:

   Time Diff       find log msg      no-match log mst               name
0.174045562744   1425669001.213923   1425669001.214097   find   /localhost/nfd/status   R   no-match
0.211954116821   1425669001.222970   1425669001.223182   find   /localhost/nfd/faces/channels   R   no-match
0.319004058838   1425669001.639833   1425669001.640152   find   /localhost/nfd/faces/query/%96%1Cr%1Audp4%3A%2F%2F128.252.153.28%3A6363   R   no-match
1.44410133362    1425669061.723974   1425669061.725418   find   /localhost/nfd/faces/events   R   no-match
23.1699943542    1425669105.051430   1425669105.074600   find   /localhost/nfd/faces/list   R   no-match
2.6969909668     1425669105.099019   1425669105.101716   find   /ndn/it/unipd/ndnmap/stats/72.36.112.82/192.43.193.111/193.147.51.186/161.105.195.18   L   no-match
0.128030776978   1425669105.242469   1425669105.242597   find   /ndn/edu/wustl/ping/1744606602   L   no-match

So, it appears that the find for /localhost/nfd/faces/list takes a lot longer than
any of the other find operations.

Actions #1

Updated by Davide Pesavento about 9 years ago

  • Description updated (diff)
Actions #2

Updated by Junxiao Shi about 9 years ago

  • Subject changed from Long delays for some Content Store Lookups to CS lookup takes long time if Interest Name covers many entries
  • Category set to Tables
  • Target version set to v0.4

As stated in #2254 note-13:

ContentStore has worst case performance: all Data packets in the ContentStore will be visited.

The worst case scenario happens when Interest Name=ndn:/, simple selectors reject all Data packets.

This Bug is an observation of this worst case scenario:

  1. ndnmap client requests FaceStatus dataset every 1300ms.
    FaceStatus dataset has FreshnessPeriod set to 1000ms, which means every request from ndnmap client would generate a new version
  2. Data packets of all versions are cached in ContentStore under ndn:/localhost/nfd/faces/list prefix.
    Without CS eviction, this namespace grows by 2769 children per hour.
  3. CS lookup for an Interest with ndn:/localhost/nfd/faces/list prefix visits all entries under this prefix until a Data matching Selectors is found.
    However, the Interest from ndnmap client has MustBeFresh (as required by StatusDataset) and no cached Data can satisfy it.
    This causes all entries to be visited, and therefore the duration of this lookup grows linearly as more entries are inserted.

This problem is confirmed without ndnmap client:

  1. enable DEBUG log for ContentStore
  2. for I in $(seq 36000); do nfd-status -f; sleep 0.1; done
  3. parse NFD logs with:

    awk '
    BEGIN {
      T = 0
    }
    $3=="[ContentStore]" && $4=="find" && $5=="/localhost/nfd/faces/list" {
      T = $1
      next
    }
    $3=="[ContentStore]" && T>0 {
      if ($4=="no-match") {
        print int(($1 - T) * 1000000)
      }
      T = 0
    }
    ' nfd.log
    
  4. visualize the trend of parsed dataset

    X-axis is number of requests; Y-axis is duration of lookup

    trend

One possible solution is to relax system requirements: limit the number of visited entries on each CS lookup, and return "no-match" when exceeding this limit.

Actions #3

Updated by John DeHart about 9 years ago

What is the difference in how the CS is searched for these two:

1425669061.723974 DEBUG: [ContentStore] find /localhost/nfd/faces/events R
1425669061.725418 DEBUG: [ContentStore]   no-match

1425669105.051430 DEBUG: [ContentStore] find /localhost/nfd/faces/list R
1425669105.074600 DEBUG: [ContentStore]   no-match

Neither finds a match. One takes under 2ms and the other over 20ms.

Actions #4

Updated by Junxiao Shi about 9 years ago

/localhost/nfd/faces/lists grows by 2769 children per hour, assuming one dataset request per 1300ms;

/localhost/nfd/faces/events grows by 2 children per face create+destroy, which is much slower than the namespace above.

It's worth noting that this Bug isn't introduced in #2254. Ilya's CS is expected to have same problem, because algorithm is similar.

Actions #5

Updated by John DeHart about 9 years ago

Junxiao Shi wrote:

However, the Interest from ndnmap client has MustBeFresh (as required by StatusDataset) and no cached Data can satisfy it.

This causes all entries to be visited, and therefore the duration of this lookup grows linearly as more entries are inserted.

So, if we have MustBeFresh, don't we know ahead of time that we are not and by definition can not find
data in the CS that will satisfy this Interest? If so, then why not skip the CS lookup?
Is the thought that MustBeFresh should be rare and we don't want to spend the processing time
checking for that on all Interests?

Actions #6

Updated by John DeHart about 9 years ago

One additional question. Is there a way to mark data so it does not get cached?
If we could do that with the ndnmap and nfd-status requests then they
would not interfere with application interests and data.

Actions #7

Updated by Davide Pesavento about 9 years ago

John DeHart wrote:

So, if we have MustBeFresh, don't we know ahead of time that we are not and by definition can not find
data in the CS that will satisfy this Interest?

Uhm no, you can have data in the CS that is fresh (i.e. whose FreshnessPeriod hasn't expired yet).

One additional question. Is there a way to mark data so it does not get cached? If we could do that with the ndnmap and nfd-status requests then they would not interfere with application interests and data.

IIUC, the NoCache caching policy should help in this case, see http://redmine.named-data.net/projects/nfd/wiki/LocalControlHeader#CachingPolicy-feature

Actions #8

Updated by John DeHart about 9 years ago

Ahh. Ok. For some reason I had interpretted MustBeFresh to mean must come from the original producer.

IIUC, the NoCache caching policy should help in this case, see http://redmine.named-data.net/projects/nfd/wiki/LocalControlHeader#CachingPolicy-feature

I see NoCache in ndn-cxx but I don't see it used anywhere in NFD except in one test. Has it been implemented?

Actions #9

Updated by Anonymous about 9 years ago

I don't think so. It should be feature #2184, but the implementation is "blocked" by the management refactoring (in the sense of avoiding repeat work). The management refactoring is blocked by a question about extracting about identities from signed Interests.

I'm on the hook for #2184. If it's useful to you, and Alex + Junxiao are willing to approve code, I'm willing to implement it for you ASAP.

Actions #10

Updated by John DeHart about 9 years ago

I am not sure NoCache makes sense for this. I suspect that this data is handled
the way it is to avoid a DOS type attack and if we change the caching policy for it so
that all requests go all the way to nfd management code then we have probably opened
that doorway for attack.

Actions #11

Updated by Junxiao Shi about 9 years ago

One possible solution is to relax system requirements: limit the number of visited entries on each CS lookup, and return "no-match" when exceeding this limit.

NoCache can mask this Bug, but it cannot solve the general problem in note-2.

The following issues are needed to use NoCache in StatusDataset from NFD Management: #2185, #2182.

Without #2182, StatusDataset with multiple segments won't work.

Actions #12

Updated by John DeHart about 9 years ago

I think limiting the number of visited entries makes sense. What number do you propose?

I'm also not sure what the difference is for these requests if we specify the ChildSelector or not.

We currently set the ChildSelector to 1 for the ndnmap requests which gives us a RightMost find.
We do this because that is what nfd-status does.

What happens if we set it to 0 and use a LeftMost find?

I'm trying that right now and the worst case time for the find is about 2.5 ms.

Actions #13

Updated by Junxiao Shi about 9 years ago

I think limiting the number of visited entries makes sense. What number do you propose?

I don't have a specific number in mind. This needs benchmark.

What happens if we set it to 0 and use a LeftMost find?

I'm trying that right now and the worst case time for the find is about 2.5 ms.

Count of visited entries is the same when there's no match.

Rightmost lookup procedure calls lower_bound many times (linear to count of unique next-components under Interest Name).
Leftmost lookup procedure calls lower_bound only once, so it is faster.

"Visiting an entry" means calling the procedure that determines whether the Interest including simple Selectors matches the cached Data.
Merely comparing the Name to another entry doesn't count as a visit.


Given these stored entries:

  • /P/v0/s0, /P/v0/s1, .., /P/v0/s99
  • /P/v1/s0, /P/v1/s1, .., /P/v1/s99
  • ..
  • /P/v9/s0, /P/v9/s1, .., /P/v9/s99

Current rightmost lookup procedure "PROC-B" is optimized for the expected case that there's a match near the rightmost end (eg. /P/v9/s0), and minimizes count of visited entries in this expected case.

However, when there's no match due to simple Selectors (such as MustBeFresh), the procedure would visit all entries, and also perform 10 lower_bounds (to locate /P/v9/s0, /P/v8/s0, .., /P/v0/s0).

An alternate procedure "PROC-A" has slightly better worst case performance, but its expected case performance is worse: all entries under Interest Name are always visited, regardless of whether there's a match; but it does not call lower_bound.

Yet another procedure "PROC-C" can visit the entries backwards.
It shall be better than PROC-A in expected case, and has same worst case performance as PROC-A.

When there's a match (at /P/v9/s0):

count of visited entries count of lower_bounds
PROC-A 1000 1
PROC-B 1 2
PROC-C 100 1

When there's no match:

count of visited entries count of lower_bounds
PROC-A 1000 1
PROC-B 1000 11
PROC-C 1000 1

Back to this Bug: the namespace of stored entries after one hour of ndnmap client execution looks like:
(suppose FaceStatus dataset has 2 segments)

  • /localhost/nfd/faces/list/v0/s0, /localhost/nfd/faces/list/v0/s1
  • /localhost/nfd/faces/list/v1/s0, /localhost/nfd/faces/list/v1/s1
  • ..
  • /localhost/nfd/faces/list/v2768/s0, /localhost/nfd/faces/list/v2768/s1

When there's a match (at /localhost/nfd/faces/list/v2768/s0):

count of visited entries count of lower_bounds
PROC-A 5536 1
PROC-B 1 2
PROC-C 2 1

When there's no match:

count of visited entries count of lower_bounds
PROC-A 5536 1
PROC-B 5536 2770
PROC-C 5536 1

I wish the ContentStore could select a procedure based on "what kind of namespace it is".
In the examples above, PROC-C has better average performance if there are many versions but few segments per version, while PROC-B has better average performance is there are few versions but many segments per version.

However, the ContentStore index is an ordered container, which doesn't allow the algorithm to make this choice.

Actions #14

Updated by John DeHart about 9 years ago

That helped a lot. I am now convinced that the ndnmap interests should ask for LeftMost
since we know that the expected result is no-match, we don't what to spend the worst-case
time of RightMost.

I'll run some more tests to make sure nothing else weird happens.

Actions #15

Updated by Junxiao Shi over 8 years ago

20150727 conference call discussed this problem.

Lixia points out that this would be a DDoS attack.

Alex proposes that ContentStore could consider Name+Exclude only: Name must be an exact match of Name without implicit digest, and Exclude can only reject implicit digests.

This shall allow the ContentStore to visit only a small number of entries (all having the same Name without implicit digest), but require all other Interests to be processed by the producer.

Actions #16

Updated by Alex Afanasyev over 8 years ago

  • Target version changed from v0.4 to v0.5
Actions #17

Updated by Davide Pesavento about 6 years ago

  • Target version deleted (v0.5)
Actions

Also available in: Atom PDF