Bug #2757

Gateway RIB crashes after remote unregistration

Added by Yanbiao Li almost 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: RIB
Target version:
Start date: 04/17/2015
Due date:
% Done: 100%
Estimated time:
Description

When the RIB manager on a gateway processes a remote unregistration command from a laptop, the nfd process on the gateway crashes after sending a successful response.

Environment: two machines; A acts as the laptop and B acts as the gateway.

Steps to reproduce:

  1. Run nfd on A with rib.remote-register enabled, and set the log level to INFO.
  2. Generate an identity /Z/A on A and install its certificate.
  3. Run nfd on B with rib.localhop-security enabled, and configure the trust anchor as type any.
  4. Run nfdc register /localhop/nfd udp4://<ip_address_of_B> on A to gain connectivity to B.
  5. Run ndnpingserver ndn:/Z/A/H on A to register the prefix /Z/A/H/ping locally. Confirm that /Z/A is successfully registered in B's RIB (run nfd-status -r on B).
  6. Stop ndnpingserver on A. Confirm that /Z/A/H/ping is unregistered from A's RIB and that no other entry in A's RIB starts with the prefix /Z/A, so that /Z/A will be unregistered from B's RIB.

Actual: A's nfd receives the response reporting successful unregistration of /Z/A from B, but B's nfd has exited unexpectedly.

Expected: B's nfd does not crash.


Files

ok.png (93.8 KB): without a route toward the laptop. Yanbiao Li, 04/21/2015 04:47 PM
crash.png (80.4 KB): with a route toward the laptop. Yanbiao Li, 04/21/2015 04:47 PM
laptop.sh (642 Bytes). Yanbiao Li, 04/22/2015 01:52 PM
laptop.nfd.conf (10.8 KB). Yanbiao Li, 04/22/2015 01:52 PM
server.nfd.conf (10.8 KB). Yanbiao Li, 04/22/2015 01:52 PM
system.inf (129 Bytes). Yanbiao Li, 04/22/2015 01:52 PM
server.sh (377 Bytes). Yanbiao Li, 04/22/2015 01:52 PM

Actions #1

Updated by Junxiao Shi almost 10 years ago

  • Subject changed from nfd process will exit unexpectedly after the rib manager performed remote prefix unregistration. to Gateway RIB crashes after remote unregistration
  • Description updated (diff)
  • Target version set to v0.4

Offending commit is probably commit:76c751ce80109cd429cd45d32a04015f7715546b.

Actions #2

Updated by Vince Lehman almost 10 years ago

I tried to reproduce this bug but was unable to. Could you check that I did not miss a step and that I generated the cert correctly?

On laptop:

rib.remote-register is enabled; default_level is INFO

$ sudo nfd-start

$ ndnsec-keygen /Z/A > key.req
$ ndnsec-certgen -N /Z/A key.req > tmp.cert
$ ndnsec-cert-install tmp.cert
$ ndnsec-set-default /Z/A

On server:

Uncomment rib.localhop-security; change trust anchor to type “any”
$ sudo nfd-start

On laptop:

$ nfdc register /localhop/nfd udp4://server-IP

$ ndnpingserver ndn:/Z/A/H

FIB:
  /localhost/nfd nexthops={faceid=1 (cost=0)}
  /Z/A/H/ping nexthops={faceid=261 (cost=0)}
  /localhop/nfd nexthops={faceid=260 (cost=0)}
  /localhost/nfd/rib nexthops={faceid=258 (cost=0)}
RIB:
  /localhost/nfd/rib route={faceid=258 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd route={faceid=260 (origin=255 cost=0 ChildInherit)}
  /Z/A/H/ping route={faceid=261 (origin=0 cost=0 ChildInherit)}

On server:

1429567713.335752 DEBUG: [RibManager] Parameters parsed OK
1429567713.335786 DEBUG: [RibManager] command result: processing verb: register
1429567713.346025 INFO: [RibManager] Adding route /Z/A nexthop=260 origin=65 cost=15
1429567713.368157 INFO: [RemoteRegistrator] no hub connected when registering /Z/A
1429567713.368395 DEBUG: [RibManager] RIB update succeeded for RibUpdate {
  Name: /Z/A
  Action: REGISTER
  Route(faceid: 260, origin: 65, cost: 15, flags: 1, never expires)
}

FIB:
  /Z/A nexthops={faceid=260 (cost=15)}
  /localhost/nfd nexthops={faceid=1 (cost=0)}
  /localhop/nfd/rib nexthops={faceid=259 (cost=0)}
  /localhost/nfd/rib nexthops={faceid=259 (cost=0)}
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /Z/A route={faceid=260 (origin=65 cost=15 ChildInherit)}

On laptop:

Ctrl+C to quit ndnpingserver

FIB:
  /localhost/nfd nexthops={faceid=1 (cost=0)}
  /localhop/nfd nexthops={faceid=260 (cost=0)}
  /localhost/nfd/rib nexthops={faceid=258 (cost=0)}
RIB:
  /localhost/nfd/rib route={faceid=258 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd route={faceid=260 (origin=255 cost=0 ChildInherit)}

On server:

1429567782.676441 DEBUG: [RibManager] Parameters parsed OK
1429567782.676485 DEBUG: [RibManager] command result: processing verb: unregister
1429567782.686422 INFO: [RibManager] Removing route /Z/A nexthop=260 origin=65
1429567782.707531 INFO: [RemoteRegistrator] no hub connected when unregistering /Z/A
1429567782.707796 DEBUG: [RibManager] RIB update succeeded for RibUpdate {
  Name: /Z/A
  Action: UNREGISTER
  Route(faceid: 260, origin: 65, cost: 0, flags: 0, expires in: 9216282403347378533 nanoseconds)
}

FIB:
  /localhost/nfd nexthops={faceid=1 (cost=0)}
  /localhop/nfd/rib nexthops={faceid=259 (cost=0)}
  /localhost/nfd/rib nexthops={faceid=259 (cost=0)}
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}

After the server received the unregister command, it continued to run and seemed to function properly. I was able to retrieve the nfd-status and add a route to the RIB.

Nfd version on laptop:

$ nfd -V
0.1.0-316-ge06b627

Nfd version on server:

$ nfd -V
0.3.1-12-ge8f4246

Both versions are after the FibUpdater commit.

Actions #3

Updated by Junxiao Shi almost 10 years ago

  • Assignee set to Vince Lehman

Vince agreed to work on this bug at the 20150420 conference call.

Actions #4

Updated by Yanbiao Li almost 10 years ago

Thanks for your testing.

The only difference between our configurations is that, in my testing environment, there is a route on the server toward the laptop. (I need this route for other test purposes.)

I tested again. As long as there is a route toward the laptop on the server, nfd crashes after remote unregistration (see the attached pictures).

Actions #5

Updated by Junxiao Shi almost 10 years ago

As shown in crash.png, this appears to be a problem with route inheritance.
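
One way to picture route inheritance in this scenario is the minimal Python sketch below. It is a simplified model, not NFD's implementation: the `Route` class, the `prefixes` and `inherited_routes` helpers, and the dictionary-based RIB are hypothetical, and the rule that a route registered directly on a name shadows an inherited route with the same face ID is an assumption made for illustration. The face ID, origins, and costs are copied from the listings in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    face_id: int
    origin: int
    cost: int
    child_inherit: bool = True  # every route in this thread carries ChildInherit


def prefixes(name):
    """Yield the proper prefixes of a name, shortest first (starting with '/')."""
    components = [c for c in name.split("/") if c]
    for i in range(len(components)):
        yield "/" + "/".join(components[:i])


def inherited_routes(rib, name):
    """Return the ChildInherit routes that `name` picks up from shorter prefixes.

    Assumption: a route registered directly on `name` shadows an inherited
    route that uses the same face ID.
    """
    own_faces = {route.face_id for route in rib.get(name, [])}
    result = []
    for prefix in prefixes(name):
        for route in rib.get(prefix, []):
            if route.child_inherit and route.face_id not in own_faces:
                result.append(route)
    return result


if __name__ == "__main__":
    # Server RIB while ndnpingserver is running on the laptop.
    rib = {
        "/": [Route(face_id=263, origin=255, cost=0)],
        "/Z/A": [Route(face_id=263, origin=65, cost=15)],
    }
    print("while registered:", inherited_routes(rib, "/Z/A"))

    # After the remote unregistration, /Z/A has no route of its own.
    del rib["/Z/A"]
    print("after unregistration:", inherited_routes(rib, "/Z/A"))
```

In this model, the route on `/` only becomes visible for `/Z/A` once `/Z/A` loses its own route, which is also the moment `/Z/A` leaves the RIB. Without a route on `/`, nothing is inherited and the unregistration has no follow-up work to do.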

Actions #6

Updated by Vince Lehman almost 10 years ago

I've added the step to register a route from the server back to the laptop:

On server:

$ nfd-status -r
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}

$ nfdc register / udp4://<laptop-IP>
Successful in name registration: ControlParameters(Name: /, FaceId: 263, Origin: 255, Cost: 0, Flags: 1, )

$ nfd-status -r
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  / route={faceid=263 (origin=255 cost=0 ChildInherit)}

On laptop:

$ ndnpingserver /Z/A/H

On server:

$ nfd-status -r
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /Z/A route={faceid=263 (origin=65 cost=15 ChildInherit)}
  / route={faceid=263 (origin=255 cost=0 ChildInherit)}

On laptop:

Ctrl+C to kill ndnpingserver

$ nfd-status -r 
RIB:
  /localhost/nfd/rib route={faceid=258 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd route={faceid=261 (origin=255 cost=0 ChildInherit)}

On server:

$ nfd-status -r
RIB:
  /localhost/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  /localhop/nfd/rib route={faceid=259 (origin=0 cost=0 ChildInherit)}
  / route={faceid=263 (origin=255 cost=0 ChildInherit)}

Which OS/versions are you using for the laptop and the gateway?

Actions #7

Updated by Yanbiao Li almost 10 years ago

Both the laptop and the server run Ubuntu 14.04. The latest NFD is running on both ends.

I uploaded my test scripts and corresponding config files:

laptop.sh; server.sh; laptop.nfd.conf; server.nfd.conf

1/ on the server: sh server.sh LAPTOP_IP

2/ on the laptop: sh laptop.sh SERVER_IP

3/ run nfd-status -r on the server. I get "ERROR: error while connecting to the forwarder (Connection refused)"

Actions #8

Updated by Vince Lehman almost 10 years ago

  • Status changed from New to In Progress

I am able to reproduce the bug using the command line on a single machine:

$ sudo nfd-start
$ nfdc register / 258
$ nfdc register -o 65 -c 15 /Z/A 258
$ nfdc unregister -o 65 /Z/A 258
$ nfd-status
ERROR: error while connecting to the forwarder (Connection refused)

This is a problem caused by a FibUpdate being generated for a namespace that is removed from the RIB.

The RIB searches for the namespace and tries to apply the FibUpdate to it. A BOOST_ASSERT in the code checks for the case where the namespace does not exist, but no code stops the RibEntry from being dereferenced in that case. When I compile with the --debug flag and run the above commands, I see that the assertion fails.

I will push a patch to stop the FibUpdater from generating FibUpdates for a namespace that will be removed.
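
The failure mode and the proposed direction of the fix can be illustrated with a small Python model. This is a sketch under stated assumptions, not NFD's C++ code: `RibEntry`, `FibUpdate`, `is_proper_prefix`, `unregister`, and `apply_update` are hypothetical names, the RIB is a plain dictionary, and only face IDs are tracked. Python's `assert` stands in for `BOOST_ASSERT`; both are skipped in optimized builds (`python -O` here, `NDEBUG` for the default `BOOST_ASSERT`), which leaves the dereference unprotected.

```python
from dataclasses import dataclass, field


@dataclass
class RibEntry:
    routes: set = field(default_factory=set)     # face IDs registered on this name
    inherited: set = field(default_factory=set)  # face IDs inherited from shorter prefixes


@dataclass(frozen=True)
class FibUpdate:
    name: str
    face_id: int


def is_proper_prefix(prefix, name):
    """True if `prefix` is a strictly shorter, component-wise prefix of `name`."""
    p = [c for c in prefix.split("/") if c]
    n = [c for c in name.split("/") if c]
    return len(p) < len(n) and n[:len(p)] == p


def unregister(rib, name, face_id, skip_removed_namespaces):
    """Drop a route and return the inherited-route updates that follow from it."""
    entry = rib[name]
    entry.routes.discard(face_id)
    if not entry.routes:
        del rib[name]  # the namespace leaves the RIB entirely
    parent_has_face = any(
        face_id in e.routes for prefix, e in rib.items() if is_proper_prefix(prefix, name)
    )
    if not parent_has_face:
        return []
    if skip_removed_namespaces and name not in rib:
        return []  # guarded behaviour: no update for a namespace that was removed
    return [FibUpdate(name, face_id)]


def apply_update(rib, update):
    """Apply an update in the style described above: assert, then dereference."""
    entry = rib.get(update.name)
    assert entry is not None, "FibUpdate targets a namespace missing from the RIB"
    entry.inherited.add(update.face_id)  # fails on None when asserts are disabled


if __name__ == "__main__":
    def fresh_rib():
        # Same shape as the single-machine reproduction: / and /Z/A both use face 258.
        return {"/": RibEntry({258}), "/Z/A": RibEntry({258})}

    rib = fresh_rib()
    print("without the guard:", unregister(rib, "/Z/A", 258, skip_removed_namespaces=False))

    rib = fresh_rib()
    print("with the guard:", unregister(rib, "/Z/A", 258, skip_removed_namespaces=True))
```

Passing the unguarded update to `apply_update` raises `AssertionError` in a normal run and `AttributeError` under `python -O`, loosely mirroring a failed assertion in a debug build versus a crash in a release build. The guard sits in the function that generates updates rather than in `apply_update`, matching the stated plan to stop the updates from being generated in the first place.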

Actions #9

Updated by Vince Lehman almost 10 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 0 to 90

Actions #10

Updated by Yanbiao Li almost 10 years ago

I ran the tests with the latest commit. According to the results, this bug has been resolved.

Actions #11

Updated by Junxiao Shi almost 10 years ago

  • Status changed from Code review to Closed
  • % Done changed from 90 to 100