Bug #5149

Error 10060 when fetching RIB dataset: Timeout exceeded

Added by Manar Aldaoud 7 months ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: RIB
Target version: -
Start date: 02/26/2021
Due date:
% Done: 0%
Estimated time:
Description

Hello,
I'm getting the error in the subject line when I try to list the RIB table using "nfdc route".
With up to 30,000 route entries, the command works fine, and so do other commands such as "nfdc status report".
However, once I reach 40,000 entries, NFD is still up, but "nfdc route" returns the error in the subject line, and commands like "nfdc status report" return something like this (notice the empty RIB):

md@dev:~/Repo$ nfdc status report | more
Error while collecting status report (4010060).
General NFD status:
version=0.7.1-6-g264af773
startTime=20210213T114742.442000
currentTime=20210213T122315.908000
uptime=2133 seconds
nNameTreeEntries=40015
nFibEntries=40002
nPitEntries=2
nMeasurementsEntries=0
nCsEntries=65536
nInInterests=201902
nOutInterests=201902
nInData=214339
nOutData=200958
nInNacks=0
nOutNacks=0
nSatisfiedInterests=200958
nUnsatisfiedInterests=932
...
FIB:

RIB:

CS information:
capacity=65536
admit=on
serve=on
nEntries=65536
nHits=0
nMisses=203449
Strategy choices:
prefix=/ strategy=/localhost/nfd/strategy/best-route/%FD%05
prefix=/ndn/broadcast strategy=/localhost/nfd/strategy/multicast/%FD%03
prefix=/localhost/nfd strategy=/localhost/nfd/strategy/best-route/%FD%05
prefix=/localhost strategy=/localhost/nfd/strategy/multicast/%FD%03

Increasing DEFAULT_INTEREST_LIFETIME in ndn-cxx and ExecuteContext::getTimeout() in nfd didn't solve the problem.

The associated tcpdump:
No. Time Source Destination Protocol Length Info
4 0.000099166 127.0.0.1 127.0.0.1 TCP (NDN) 113 Interest /localhost/nfd/rib/list
6 0.141838378 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%00

8 0.142271036 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%01
10 0.142307401 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%02
12 0.570984767 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%01
14 0.571044007 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%02

The expected behaviour would look something like this:
No. Time Source Destination Protocol Length Info
4 0.000099349 127.0.0.1 127.0.0.1 TCP (NDN) 113 Interest /localhost/nfd/rib/list
6 0.004395822 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%00
8 0.004806216 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%01
10 0.005039007 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%02
12 0.005201735 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%01
14 0.005320291 127.0.0.1 127.0.0.1 TCP (NDN) 4630 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%02
16 0.005460251 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%03
18 0.005794301 127.0.0.1 127.0.0.1 TCP (NDN) 4630 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%03
20 0.005873704 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%04
22 0.006165100 127.0.0.1 127.0.0.1 TCP (NDN) 4629 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%04

The issue can be reproduced on both 0.7.1-ppa1~focal and 0.7.1-6-g264af773 by running the following loop:

for i in {1..40000}
do
    nfdc route add prefix /routeA/$i nexthop 256 > /dev/null
done

One final note: if I remove the surplus routes, the "nfdc route" command works again. In other words, there is a threshold, which depends on the prefix length, beyond which "nfdc route" no longer retrieves the RIB table. For example, for "/routeA/$i" the threshold is 33483 entries; adding one more triggers the issue.
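
For anyone who wants to locate the threshold for a different prefix length, a bisection along these lines might save time compared to adding routes one by one. This is only a rough sketch: find_threshold and nfd_probe are hypothetical helper names, it assumes face 256 exists as in the loop above, that no /routeA routes are installed when it starts, that success and failure are monotonic in the route count, and that nfdc exits with a non-zero status when the dataset fetch times out.

```shell
# Sketch only: find_threshold and nfd_probe are hypothetical helper names.

# Binary search for the largest count at which the probe still succeeds.
# Assumes the probe succeeds at $1 (lo) and fails at $2 (hi).
find_threshold() {
    local lo=$1 hi=$2 probe=$3 mid
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid"; then
            lo=$mid     # listing still works with mid routes
        else
            hi=$mid     # listing fails with mid routes
        fi
    done
    echo "$lo"
}

# Number of /routeA/<i> routes currently installed by nfd_probe.
# Assumption: the RIB holds no /routeA routes when the search starts.
installed=0

# Bring the number of /routeA/<i> routes to $1, then try to list the RIB.
# Assumption: nfdc returns a non-zero exit status on the 10060 timeout.
nfd_probe() {
    local n=$1
    while [ "$installed" -lt "$n" ]; do
        installed=$((installed + 1))
        nfdc route add prefix "/routeA/$installed" nexthop 256 > /dev/null
    done
    while [ "$installed" -gt "$n" ]; do
        nfdc route remove prefix "/routeA/$installed" nexthop 256 > /dev/null
        installed=$((installed - 1))
    done
    nfdc route list > /dev/null 2>&1
}
```

Since 30,000 entries work and 40,000 do not, running "find_threshold 30000 40000 nfd_probe" should print the largest route count at which the listing still succeeds.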


Related issues

Related to NFD - Bug #2174: Multiple register prefix gives NFD error "request timed out (code: 10060)" (New, 11/13/2014)

#1

Updated by Junxiao Shi 7 months ago

  • Related to Bug #2174: Multiple register prefix gives NFD error "request timed out (code: 10060)" added
#2

Updated by Junxiao Shi 7 months ago

  • Category changed from Tables to RIB

This issue is similar to #2174, but differs in:

  • #5149 is a timeout in rib/list dataset
  • #2174 is a timeout in rib/register command
