Bug #3362

closed

NFD crash after setInterestFilter for same prefix after app restart

Added by Alex Afanasyev almost 9 years ago. Updated almost 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
RIB
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:

Description

I'm experiencing a reproducible crash of NFD in the following scenario:

  • A simple app calls setInterestFilter
  • Ctrl-C to stop the app
  • Immediately call the same app to setInterestFilter for the same name

Here is the log output:

  • This is from the first time app starts
1450041822.524297 INFO: [UnixStreamTransport] [id=0,local=unix:///private/tmp/nfd.sock,remote=fd://31] Creating transport
1450041822.524420 INFO: [FaceTable] Added face id=261 remote=fd://31 local=unix:///private/tmp/nfd.sock
1450041822.530105 INFO: [RibManager] Adding route /hello/world nexthop=261 origin=0 cost=0
1450041822.536433 INFO: [AutoPrefixPropagator] no hub connected to propagate /
  • This is from the second time app starts
1450041828.586769 INFO: [Transport] [id=261,local=unix:///private/tmp/nfd.sock,remote=fd://31] setState UP -> FAILED
1450041828.587135 INFO: [Transport] [id=261,local=unix:///private/tmp/nfd.sock,remote=fd://31] setState FAILED -> CLOSED
1450041828.591469 INFO: [FaceTable] Removed face id=261 remote=fd://31 local=unix:///private/tmp/nfd.sock (LpFace failed)
1450041828.592871 INFO: [AutoPrefixPropagator] should be kept for another RIB entry: /localhost/nfd/rib
1450041829.380194 INFO: [UnixStreamTransport] [id=0,local=unix:///private/tmp/nfd.sock,remote=fd://31] Creating transport
1450041829.380317 INFO: [FaceTable] Added face id=262 remote=fd://31 local=unix:///private/tmp/nfd.sock
1450041829.385987 INFO: [RibManager] Adding route /hello/world nexthop=262 origin=0 cost=0
Assertion failed: (!entryIt->second.isNew()), function afterInsertRibEntry, file ../rib/auto-prefix-propagator.cpp, line 179.

This could be some effect of Ctrl-C.

I have tested only on OS X 10.11.2.


Simple app that I'm using

#include <ndn-cxx/face.hpp>
#include <ndn-cxx/security/key-chain.hpp>

#include <iostream>
#include <memory>

int
main()
{
  auto data = std::make_shared<ndn::Data>("/hello/world");
  ndn::KeyChain keyChain;
  keyChain.sign(*data);

  std::cout << data->getFullName() << std::endl;

  ndn::Face face;
  face.setInterestFilter("/hello/world",
                         [&] (const ndn::InterestFilter&, const ndn::Interest& i) {
                           std::cerr << " << i " << i << std::endl;
                           if (i.getName() == data->getFullName()) {
                             face.put(*data);
                           }
                         },
                         nullptr);

  face.processEvents();
}

Related issues 1 (0 open, 1 closed)

Has duplicate NFD - Bug #3429: nrd assertion failure in rib/auto-prefix-propagator.cpp:179 (Duplicate, 01/25/2016)

Actions #1

Updated by Alex Afanasyev almost 9 years ago

  • Assignee set to Yanbiao Li
Actions #2

Updated by Alex Afanasyev almost 9 years ago

What I'm observing is that the sample app face is not destructed when I press Ctrl-C. It is only destructed when the app is started for the second time. Another observation is that running another app that connects to NFD's unix socket does not cause the first app's face to be destroyed... I'm a little puzzled.

@Yanbiao. We need to update RIB manager's logic to not issue the assert and ensure that the processing is correct.

Actions #3

Updated by Alex Afanasyev almost 9 years ago

I have more input on this issue. I tracked down the problem to having root identity configured on my machine.

To complete the scenario in this bug report, run

ndnsec-keygen / | ndnsec-install-cert -

The problem seems to be entirely inside the Automatic Prefix Propagation module. Based on the logs, the face is properly destroyed when Ctrl-C is pressed. Here is what I see in the log:

  • after app started
1450334272.471824 INFO: [UnixStreamTransport] [id=0,local=unix:///private/tmp/nfd.sock,remote=fd://31] Creating transport
1450334272.471950 INFO: [FaceTable] Added face id=261 remote=fd://31 local=unix:///private/tmp/nfd.sock
1450334272.477663 INFO: [RibManager] Adding route /hello/world nexthop=261 origin=0 cost=0
1450334272.483901 INFO: [AutoPrefixPropagator] no hub connected to propagate /
  • after Ctrl-C pressed
1450334275.311978 INFO: [Transport] [id=261,local=unix:///private/tmp/nfd.sock,remote=fd://31] setState UP -> FAILED
1450334275.312506 INFO: [Transport] [id=261,local=unix:///private/tmp/nfd.sock,remote=fd://31] setState FAILED -> CLOSED
1450334275.317282 INFO: [FaceTable] Removed face id=261 remote=fd://31 local=unix:///private/tmp/nfd.sock (LpFace failed)
1450334275.318667 INFO: [AutoPrefixPropagator] should be kept for another RIB entry: /localhost/nfd/rib
  • after app started again
1450334312.483872 INFO: [UnixStreamTransport] [id=0,local=unix:///private/tmp/nfd.sock,remote=fd://31] Creating transport
1450334312.484001 INFO: [FaceTable] Added face id=262 remote=fd://31 local=unix:///private/tmp/nfd.sock
1450334312.489580 INFO: [RibManager] Adding route /hello/world nexthop=262 origin=0 cost=0
Assertion failed: (!entryIt->second.isNew()), function afterInsertRibEntry, file ../rib/auto-prefix-propagator.cpp, line 179.
Abort trap: 6

I think the automatic prefix propagation record is incorrectly kept for the /localhost/nfd/rib prefix, or for something related to this prefix.

Actions #4

Updated by Yanbiao Li almost 9 years ago

  • Status changed from New to Code review
  • % Done changed from 0 to 90

When a prefix was about to be propagated but there was no connectivity, its state may stay in NEW. When the same prefix is about to be propagated again, we cannot assert that the state must be PROPAGATED or PROPAGATE_FAIL. This is the key cause of the bug.
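
To illustrate the point in this note, here is a minimal sketch of the situation. It is a simplified, hypothetical model and not NFD's actual code: the `Propagator` struct, its `hubConnected` flag, and the `afterInsert` function are names invented for illustration, and prefixes are plain strings. It shows an entry that stays in NEW because no hub is connected, and a second insert for the same prefix that must tolerate the NEW state instead of asserting against it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified model of propagated-entry states (not NFD's classes).
enum class State { NEW, PROPAGATING, PROPAGATED, PROPAGATE_FAIL };

struct Propagator {
  bool hubConnected = false;            // assumption: a single flag stands in for hub connectivity
  std::map<std::string, State> entries; // propagated entries keyed by prefix

  // Called when a RIB entry asks for `prefix` to be propagated.
  // Returns true if a propagation attempt was started.
  bool afterInsert(const std::string& prefix) {
    auto result = entries.emplace(prefix, State::NEW);
    State& state = result.first->second;

    // An existing entry may legitimately still be NEW (no hub was available
    // earlier), so asserting "state != NEW" here would be wrong.
    if (!result.second && state != State::NEW) {
      return false; // already being handled by an earlier propagation
    }
    if (!hubConnected) {
      return false; // nothing to do yet; the entry stays in NEW
    }
    state = State::PROPAGATING;
    return true;
  }
};
```

In this model, calling `afterInsert("/")` twice while `hubConnected` is false leaves the entry in NEW both times without aborting; once `hubConnected` is set, the next call moves it to PROPAGATING.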

Actions #5

Updated by Junxiao Shi almost 9 years ago

Reply to note-4:

Is this a problem that originates in the design, or is the implementation inconsistent with the design?

If there's a design problem, the state transition table in devguide should be updated as well.

Actions #6

Updated by Yanbiao Li almost 9 years ago

I think it's just an implementation issue. According to the design, it can indeed happen that an entry goes into the NEW state but does not become PROPAGATING at once.

Actions #7

Updated by Junxiao Shi almost 9 years ago

The assertion failure occurs in AutoPrefixPropagator::afterInsertRibEntry.

This function is invoked when a RIB entry (not route) is inserted, which means one of the following:

  • prefix == LINK_LOCAL_NFD_PREFIX branch: "hub connect" event for all existing propagated entries
  • LOCAL_REGISTRATION_PREFIX.isPrefixOf(prefix) or !propagateParameters.isValid branch: no event because the prefix shouldn't be propagated
  • entryIt != m_propagatedEntries.end() branch: no event because propagateParameters.parameters.getName() refers to an existing propagated entry created by another RIB entry which has the same prefix propagation parameters
  • the last case: "rib insert" event on a new propagated entry

Based on this analysis, I agree that this is an implementation problem, and the assertion should be deleted.
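
The four branches above can be sketched as a small classification function. This is only an illustration under stated assumptions, not the real `AutoPrefixPropagator` code: names are modeled as plain strings, `classify`, `Event`, and `isPrefixOf` are hypothetical helpers, the two constant values are assumed, and a null `propagateName` stands in for `!propagateParameters.isValid`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Assumed values for the two constants named in the analysis.
const std::string LINK_LOCAL_NFD_PREFIX = "/localhop/nfd";
const std::string LOCAL_REGISTRATION_PREFIX = "/localhost";

// Hypothetical labels for the outcome of each branch.
enum class Event { HUB_CONNECT, NONE_NOT_PROPAGATABLE, NONE_EXISTING_ENTRY, RIB_INSERT };

// String stand-in for a name-prefix check that respects component boundaries.
bool isPrefixOf(const std::string& prefix, const std::string& name) {
  if (prefix == "/") return true;
  if (name.compare(0, prefix.size(), prefix) != 0) return false;
  return name.size() == prefix.size() || name[prefix.size()] == '/';
}

// propagateName == nullptr models invalid propagation parameters.
// `propagated` models the set of names that already have a propagated entry.
Event classify(const std::string& prefix, const std::string* propagateName,
               std::set<std::string>& propagated) {
  if (prefix == LINK_LOCAL_NFD_PREFIX) {
    return Event::HUB_CONNECT;            // hub connected
  }
  if (isPrefixOf(LOCAL_REGISTRATION_PREFIX, prefix) || propagateName == nullptr) {
    return Event::NONE_NOT_PROPAGATABLE;  // prefix should not be propagated
  }
  if (!propagated.insert(*propagateName).second) {
    return Event::NONE_EXISTING_ENTRY;    // entry exists; it may be in any state, including NEW
  }
  return Event::RIB_INSERT;               // new propagated entry
}
```

In this sketch, the third branch makes no claim about the state of the existing entry, which matches the conclusion that the assertion should be deleted.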

Actions #8

Updated by Alex Afanasyev almost 9 years ago

  • Has duplicate Bug #3429: nrd assertion failure in rib/auto-prefix-propagator.cpp:179 added
Actions #9

Updated by Junxiao Shi almost 9 years ago

  • Status changed from Code review to Closed
  • % Done changed from 90 to 100