Bug #5149
open
Error 10060 when fetching RIB dataset: Timeout exceeded
Description
Hello,
I get the error in the subject line when listing the RIB table with "nfdc route".
With up to 30,000 route entries, the command works fine, as do other commands such as "nfdc status report".
However, once I reach 40,000 entries, NFD stays up, but "nfdc route" returns the error in the subject line, and "nfdc status report" returns something like the following (note the empty RIB):
md@dev:~/Repo$ nfdc status report | more
Error while collecting status report (4010060).
General NFD status:
version=0.7.1-6-g264af773
startTime=20210213T114742.442000
currentTime=20210213T122315.908000
uptime=2133 seconds
nNameTreeEntries=40015
nFibEntries=40002
nPitEntries=2
nMeasurementsEntries=0
nCsEntries=65536
nInInterests=201902
nOutInterests=201902
nInData=214339
nOutData=200958
nInNacks=0
nOutNacks=0
nSatisfiedInterests=200958
nUnsatisfiedInterests=932
...
FIB:
RIB:
CS information:
capacity=65536
admit=on
serve=on
nEntries=65536
nHits=0
nMisses=203449
Strategy choices:
prefix=/ strategy=/localhost/nfd/strategy/best-route/%FD%05
prefix=/ndn/broadcast strategy=/localhost/nfd/strategy/multicast/%FD%03
prefix=/localhost/nfd strategy=/localhost/nfd/strategy/best-route/%FD%05
prefix=/localhost strategy=/localhost/nfd/strategy/multicast/%FD%03
Increasing DEFAULT_INTEREST_LIFETIME in ndn-cxx and the timeout returned by ExecuteContext::getTimeout() in NFD did not solve the problem.
The associated tcpdump:
No. Time Source Destination Protocol Length Info
4 0.000099166 127.0.0.1 127.0.0.1 TCP (NDN) 113 Interest /localhost/nfd/rib/list
6 0.141838378 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%00
8 0.142271036 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%01
10 0.142307401 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%02
12 0.570984767 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%01
14 0.571044007 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9BTbP/%00%02
The expected behaviour would look something like this:
No. Time Source Destination Protocol Length Info
4 0.000099349 127.0.0.1 127.0.0.1 TCP (NDN) 113 Interest /localhost/nfd/rib/list
6 0.004395822 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%00
8 0.004806216 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%01
10 0.005039007 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%02
12 0.005201735 127.0.0.1 127.0.0.1 TCP (NDN) 4628 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%01
14 0.005320291 127.0.0.1 127.0.0.1 TCP (NDN) 4630 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%02
16 0.005460251 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%03
18 0.005794301 127.0.0.1 127.0.0.1 TCP (NDN) 4630 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%03
20 0.005873704 127.0.0.1 127.0.0.1 TCP (NDN) 124 Interest /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%04
22 0.006165100 127.0.0.1 127.0.0.1 TCP (NDN) 4629 Data /localhost/nfd/rib/list/%FD%00%00%01w%9Bg%B6v/%00%04
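To compare the two captures more easily, the Interests that never received a Data packet can be listed with a small awk filter. This is only a sketch: it assumes the text layout shown above (the last two fields of each line are the packet type and the name), and the function name `unanswered_interests` is made up for illustration.

```shell
#!/bin/sh
# Print "<times sent> <name>" for every Interest in a text capture that has no
# Data packet anywhere in the capture whose name equals or extends its name.
# Assumes the layout shown above: the last two fields are "Interest|Data <name>".
unanswered_interests() {
    awk '
        NF < 2 { next }                            # skip blank lines
        $(NF - 1) == "Interest" {
            if (!($NF in sent)) order[++n] = $NF   # remember first-seen order
            sent[$NF]++                            # count retransmissions too
        }
        $(NF - 1) == "Data" { data[$NF] = 1 }
        END {
            for (i = 1; i <= n; i++) {
                name = order[i]; matched = 0
                # An Interest counts as answered if a Data name is equal to it
                # or has it as a prefix at a component boundary.
                for (d in data)
                    if (d == name || index(d, name "/") == 1) matched = 1
                if (!matched) print sent[name], name
            }
        }
    '
}
```

Usage, with `capture.txt` as a hypothetical file holding one of the captures: `unanswered_interests < capture.txt`. For the failing capture, this should report the Interests for segments %00%01 and %00%02, each sent twice; for the expected capture, it should print nothing.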
The issue can be reproduced on both 0.7.1-ppa1~focal and 0.7.1-6-g264af773 by running the following loop:
for i in {1..40000}
do
    nfdc route add prefix /routeA/$i nexthop 256 > /dev/null
done
One final note: if I remove the surplus routes, the "nfdc route" command works again. In other words, there is a threshold, dependent on the prefix length, beyond which "nfdc route" no longer retrieves the RIB table. For example, for "/routeA/$i" the threshold is 33483 entries; adding one more triggers the issue.
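A rough way to locate this threshold for another prefix is to add routes in batches and probe the listing after each batch. The sketch below rests on several assumptions: that `nfdc route list` exits with a non-zero status when the dataset fetch fails, that the RIB holds few other routes when the script starts, and that the nexthop face already exists. The function name `probe_rib_threshold` and the `NFDC` override variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: add routes in batches and probe "nfdc route list" after each batch,
# reporting the last route count that still listed and the first that failed.
# Assumption: nfdc exits non-zero when the RIB dataset cannot be fetched.
NFDC=${NFDC:-nfdc}   # overridable so the logic can be exercised without NFD

# usage: probe_rib_threshold PREFIX NEXTHOP MAX STEP
probe_rib_threshold() {
    prefix=$1 nexthop=$2 max=$3 step=$4
    count=0 last_ok=0
    while [ "$count" -lt "$max" ]; do
        count=$((count + 1))
        "$NFDC" route add prefix "$prefix/$count" nexthop "$nexthop" > /dev/null || return 2
        # Probe only at batch boundaries; listing after every add would be slow.
        [ $((count % step)) -eq 0 ] || continue
        if "$NFDC" route list > /dev/null 2>&1; then
            last_ok=$count
        else
            echo "last_ok=$last_ok first_failed=$count"
            return 0
        fi
    done
    # Reached MAX without a failed listing.
    echo "last_ok=$last_ok first_failed=none"
    return 1
}
```

For example, `probe_rib_threshold /routeA 256 40000 1000` would narrow the threshold down to a window of 1000 routes. Rerunning against a fresh NFD with a smaller STEP narrows it further, at the cost of more listings.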