Project

General

Profile

Bug #1816

NFD quits if any connected application is interrupted using Ctrl-C.

Added by Syed Amin about 5 years ago. Updated about 5 years ago.

Status:
Closed
Priority:
High
Category:
Tables
Target version:
Start date:
08/03/2014
Due date:
% Done:

100%

Estimated time:

Description

I tried this on MacOSX(10.9.3), with following commits:

ndn-cxx: d1f34e864909bd07ce46e4906b4fa0cd06f0b6ca
nfd: 68bc1e0dce68c0a0ef98fe5e3fee15a4d8b21fc8

Steps to reproduce:

1: Start nfd:

$ sudo nfd 

2: On another terminal start nrd:

$ sudo nrd

3: Start any application that connects to NFD (I've tried with nlsr and the sample producer of ndn-cxx).

$./build/examples/producer 

4: Stop the application using Ctrl-C and nfd will quit as well.

Note: To check that I am using right versions and everything is properly installed, I did fresh install of ndn-cxx and nfd. After that for few run I didn't face this problem. But, when I quit NRD(which also terminated NFD, as expected) the problem started to happen again after following the steps mentioned above.

History

#1 Updated by Alex Afanasyev about 5 years ago

I confirm this on OSX 10.9.

Last log from NFD:

407127551.737573 TRACE: [UnixStreamFace] [id:262,uri:fd://28] Successfully sent: 806 bytes
1407127557.513826 INFO: [UnixStreamFace] [id:262,uri:fd://28] Connection closed
1407127557.518586 DEBUG: [Forwarder] onIncomingData face=1 data=/localhost/nfd/faces/events/%0A
1407127557.518793 DEBUG: [Forwarder] onIncomingData matching=/localhost/nfd/faces/events/%0A
1407127557.518919 DEBUG: [Strategy] beforeSatisfyPendingInterest pitEntry=/localhost/nfd/faces/events/%0A inFace=1 data=/localhost/nfd/faces/events/%0A
1407127557.519101 DEBUG: [Forwarder] onOutgoingData face=261 data=/localhost/nfd/faces/events/%0A
1407127557.519245 INFO: [FaceTable] Removed face id=262 remote=fd://28 local=unix:///private/var/run/nfd.sock

Here is backtrace

(gdb) bt
#0  0x00000001001730a8 in shared_ptr (this=0x7fff5fbfe030) at shared_ptr.hpp:328
#1  0x000000010017307d in shared_ptr (this=0x7fff5fbfe030) at shared_ptr.hpp:328
#2  0x00000001001c03f8 in boost::shared_ptr<nfd::name_tree::Entry>::operator= (this=0x7fff5fbfe470, r=@0x2000000000000) at shared_ptr.hpp:507
#3  0x00000001001ccb4c in nfd::NameTree::const_iterator::operator++ (this=0x7fff5fbfe468) at ../daemon/table/name-tree.cpp:610
#4  0x00000001001c0007 in nfd::Fib::removeNextHopFromAllEntries (this=0x10310e1d0, face=<value temporarily unavailable, due to optimizations>) at ../daemon/table/fib.cpp:147
#5  0x00000001000e38ab in nfd::FaceTable::remove (this=0x10310e0c0, face=<value temporarily unavailable, due to optimizations>) at ../daemon/fw/face-table.cpp:104
#6  0x00000001000e5e16 in boost::_mfi::mf1<void, nfd::FaceTable, boost::shared_ptr<nfd::Face> >::operator() (this=0x103113250, p=0x10310e0c0, a1=<value temporarily unavailable, due to optimizations>) at mem_fn_template.hpp:165
#7  0x00000001000e5d0c in boost::_bi::list2<boost::_bi::value<nfd::FaceTable*>, boost::_bi::value<boost::shared_ptr<nfd::Face> > >::operator()<boost::_mfi::mf1<void, nfd::FaceTable, boost::shared_ptr<nfd::Face> >, boost::_bi::list1<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&> > (this=0x103113260, f=@0x103113250, a=@0x7fff5fbfe708) at bind.hpp:313
#8  0x00000001000e5c82 in boost::_bi::bind_t<void, boost::_mfi::mf1<void, nfd::FaceTable, boost::shared_ptr<nfd::Face> >, boost::_bi::list2<boost::_bi::value<nfd::FaceTable*>, boost::_bi::value<boost::shared_ptr<nfd::Face> > > >::operator()<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > (this=0x103113250, a1=@0x7fff5fbfeaa8) at bind_template.hpp:47
#9  0x00000001000e5998 in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf1<void, nfd::FaceTable, boost::shared_ptr<nfd::Face> >, boost::_bi::list2<boost::_bi::value<nfd::FaceTable*>, boost::_bi::value<boost::shared_ptr<nfd::Face> > > >, void, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&>::invoke (function_obj_ptr=@0x1031208d8, a0=@0x7fff5fbfeaa8) at function_template.hpp:153
#10 0x000000010003c2a0 in boost::function1<void, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&>::operator() (this=0x1031208d0, a0=@0x7fff5fbfeaa8) at function_template.hpp:766
#11 0x000000010003b2fa in nfd::EventEmitter<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, nfd::empty, nfd::empty, nfd::empty>::operator() (this=0x103838698, a1=@0x7fff5fbfeaa8) at event-emitter.hpp:234
#12 0x000000010003a1c7 in nfd::Face::fail (this=0x103838620, reason=@0x7fff5fbfeaa8) at ../daemon/face/face.cpp:123
#13 0x000000010020b221 in nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>::processErrorCode (this=0x103838620, error=@0x7fff5fbff2e8) at stream-face.hpp:247
#14 0x000000010020cb99 in nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>::handleReceive (this=0x103838620, error=@0x7fff5fbff2e8, nBytesReceived=0) at stream-face.hpp:283
#15 0x000000010020b962 in boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>::operator() (this=0x7fff5fbff2d0, p=0x103838620, a1=@0x7fff5fbff2e8, a2=0) at mem_fn_template.hpp:280
#16 0x000000010020b8b8 in boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> >::operator()<boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list2<boost::system::error_code const&, unsigned long const&> > (this=0x7fff5fbff2e0, f=@0x7fff5fbff2d0, a=@0x7fff5fbff0a8) at bind.hpp:392
#17 0x000000010020b82a in boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > >::operator()<boost::system::error_code, unsigned long> (this=0x7fff5fbff2d0, a1=@0x7fff5fbff2e8, a2=@0x7fff5fbff2f8) at bind_template.hpp:102
#18 0x000000010020df9e in boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > >, boost::system::error_code, unsigned long>::operator() (this=0x7fff5fbff2d0) at bind_handler.hpp:127
#19 0x000000010020df59 in boost::asio::asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > >, boost::system::error_code, unsigned long> > (function=@0x7fff5fbff2d0) at handler_invoke_hook.hpp:69
#20 0x000000010020de22 in boost_asio_handler_invoke_helpers::invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > > > (function=@0x7fff5fbff2d0, context=@0x7fff5fbff2d0) at handler_invoke_helpers.hpp:37
#21 0x000000010020dd28 in boost::asio::detail::reactive_socket_recv_op<boost::asio::mutable_buffers_1, boost::_bi::bind_t<void, boost::_mfi::mf2<void, nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<nfd::StreamFace<boost::asio::local::stream_protocol, nfd::LocalFace>*>, boost::arg<1>, boost::arg<2> > > >::do_complete (owner=0x103204180, base=0x103114ae0) at reactive_socket_recv_op.hpp:110
#22 0x00000001000206c7 in boost::asio::detail::task_io_service_operation::complete (this=0x103114ae0, owner=@0x103204180, ec=@0x7fff5fbff578, bytes_transferred=0) at task_io_service_operation.hpp:38
#23 0x000000010001fa10 in boost::asio::detail::task_io_service::do_run_one (this=0x103204180, lock=@0x7fff5fbff480, this_thread=@0x7fff5fbff4f0, ec=@0x7fff5fbff578) at task_io_service.ipp:384
#24 0x000000010001f539 in boost::asio::detail::task_io_service::run (this=0x103204180, ec=@0x7fff5fbff578) at task_io_service.ipp:153
#25 0x000000010000ba71 in boost::asio::io_service::run (this=0x103203d20) at io_service.ipp:59
#26 0x0000000100002309 in main (argc=1, argv=0x7fff5fbffb68) at main.cpp:396

#2 Updated by Alex Afanasyev about 5 years ago

  • Category set to Tables
  • Assignee set to Junxiao Shi
  • Priority changed from Normal to High
  • Target version set to v0.2

I believe the error is because for the NameTree cleanup introduced in commit:ee5a4443442e4648ff9217e3ebd0341e6a2d4426. Reverting it fixes the problem, though we are getting back the memory leak.

#3 Updated by Alex Afanasyev about 5 years ago

Ok. I think i know the problem, though not quite sure how to solve.

In Fib::removeNextHopFromAllEntries

for (NameTree::const_iterator it = m_nameTree.fullEnumerate(
     &predicate_NameTreeEntry_hasFibEntry); it != m_nameTree.end(); ++it) {
  shared_ptr<fib::Entry> entry = it->getFibEntry();
  entry->removeNextHop(face);
  if (!entry->hasNextHops()) {
    this->erase(*entry);
  }
}

this->erase can invalidate it iterator. This causes segfault at the point of ++it.

#4 Updated by Alex Afanasyev about 5 years ago

A simple solution would be to restart enumeration if nodes are deleted.

Correct solution would be to change void erase(shared_ptr<name-tree-entry>) to name-tree-iterator erase(name-tree-iterator) and use it in removeNextHopFromAllEntries

#5 Updated by Alex Afanasyev about 5 years ago

  • Status changed from New to In Progress
  • Assignee changed from Junxiao Shi to Alex Afanasyev

#6 Updated by Junxiao Shi about 5 years ago

I plan to work on this one hour later, but you are taking it before I start.

I'll still write my idea here:

  • Fib::removeNextHopFromAllEntries has a std::vector<shared_ptr<fib::Entry> > entriesToDelete variable.
  • During the enumeration, when a FIB entry is eligible to be deleted, it's inserted to this vector.
  • All entries in the vector are deleted after the enumeration completes.

The memory overhead of this solution is relative to number of FIB entries, and is small in the expected case that a face has a small number of nexthop records.

#7 Updated by Alex Afanasyev about 5 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 0 to 100

I realized that we may not need such a procedure. While we are going through the list of items in the tree, the only way the entry will be deleted, if we are in the "leaf" node of the tree. Therefore, the next iterator will remain valid no matter what.

#8 Updated by Junxiao Shi about 5 years ago

The assumption in note-7 is risky, because it relies on the implementation details of NameTree.

NameTree never guarantees anything about the validity of iterators.
By default, after any non-const operation on a container, all iterators/pointers/references to structure members should be assumed invalidated.
If we go with the assumption in note-7, this requirement should be claimed in NameTree documentation, and relevant test cases should be added for NameTree implementation.

#9 Updated by Alex Afanasyev about 5 years ago

Iterators for NameTree elements has to survive removal of other branches of the tree. In my opinion, this definitely should be a guarantee for the NameTree abstraction.

#10 Updated by Junxiao Shi about 5 years ago

I agree that this requirement on NameTree is reasonable.

I will push an update to add necessary documentation and test cases for NameTree.

#11 Updated by Junxiao Shi about 5 years ago

  • Status changed from Code review to Closed

Also available in: Atom PDF