Project

General

Profile

Actions

Bug #3691

closed

Scheduler::cancelAllEvents causes memory error in ScopedEventId

Added by Junxiao Shi over 8 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Utils
Target version:
Start date:
08/01/2016
Due date:
% Done:

100%

Estimated time:
3.00 h

Description

Snippet to reproduce:

// g++ -o x -std=c++0x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <boost/asio.hpp>
#include <ndn-cxx/util/scheduler-scoped-event-id.hpp>

int main()
{
  boost::asio::io_service io;
  ndn::util::scheduler::Scheduler sched(io);
  ndn::util::scheduler::ScopedEventId eid(sched);

  eid = sched.scheduleEvent(ndn::time::milliseconds(10), []{});
  sched.cancelAllEvents();
  eid.cancel();

  return 0;
}

There's a double-free error on eid.cancel() line.
Same error also occurs if eid goes out of scope.

Actions #1

Updated by Junxiao Shi over 8 years ago

This was found by Weiwei Liu.

valgrind report (partial):

==5887== Invalid read of size 8
==5887==    at 0x58B6D60: std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)
==5887==    by 0x513D5FA: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase_aux(std::_Rb_tree_const_iterator<ndn::util::scheduler::Scheduler::EventInfo>) (stl_tree.h:1745)
==5887==    by 0x513C6B4: erase (stl_tree.h:809)
==5887==    by 0x513C6B4: erase (stl_multiset.h:538)
==5887==    by 0x513C6B4: ndn::util::scheduler::Scheduler::cancelEvent(std::shared_ptr<ndn::util::scheduler::EventIdImpl> const&) (scheduler.cpp:137)
==5887==    by 0x513C0CD: ndn::util::scheduler::ScopedEventId::cancel() (scheduler-scoped-event-id.cpp:63)
==5887==    by 0x405B26: main (in /home/vagrant/x)
==5887==  Address 0x9ad4860 is 16 bytes inside a block of size 88 free'd
==5887==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5887==    by 0x408BBB: __gnu_cxx::new_allocator<std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo> >::deallocate(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*, unsigned long) (in /home/vagrant/x)
==5887==    by 0x408AF5: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_put_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887==    by 0x4089E9: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_destroy_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887==    by 0x4086F6: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887==    by 0x513C50D: clear (stl_tree.h:860)
==5887==    by 0x513C50D: clear (stl_multiset.h:617)
==5887==    by 0x513C50D: ndn::util::scheduler::Scheduler::cancelAllEvents() (scheduler.cpp:158)
==5887==    by 0x405B17: main (in /home/vagrant/x)

The root cause is: EventId is declared as shared_ptr<EventIdImpl>, and Scheduler::cancelAllEvents does not call EventIdImpl::invalidate so that Scheduler::cancelEvent is trying to cancel the event again, which involves erasing an STL container item with a non-dereferencable iterator.

Actions #2

Updated by Junxiao Shi over 8 years ago

  • Status changed from New to In Progress
  • Assignee set to Junxiao Shi
Actions #3

Updated by Junxiao Shi over 8 years ago

  • % Done changed from 0 to 100

https://gerrit.named-data.net/3035 redesigns EventId and fixes this bug.

EventIdImpl is eliminated. Instead, EventInfo is created as shared_ptr and added into the event queue, and EventId is a weak_ptr from that shared_ptr.

When an event expires or is cancelled, its shared_ptr<EventInfo> is deallocated, which invalidates all weak_ptrs so that it cannot be cancelled again; when an event expires and is being executed, its shared_ptr<EventInfo> is not yet deallocated but it should not be cancelled again (from within the callback), and this situation is distinguished by setting the queue iterator to end().

EventId is an opaque type so callers are disallowed to lock() it as shared_ptr; various asserts are added to check this violation.

Actions #4

Updated by Junxiao Shi over 8 years ago

  • Related to Bug #3722: Scheduler stops if a callback throws added
Actions #5

Updated by Junxiao Shi over 8 years ago

Other projects assume EventId is a shared_ptr, and https://gerrit.named-data.net/3035 patchset2 breaks this assumption.
https://travis-ci.org/yoursunny/ndn-cxx-breaks/builds/150304474.

I've identified some common usage patterns in other projects:

  • eid == eid2
  • construct EventId from nullptr
  • static_cast<bool>(eid)
  • eid.reset()
  • os << eid

Most of them are reasonable, so https://gerrit.named-data.net/3035 patchset3 creates a EventId class whose only field is weak_ptr<EventInfo> (so that the memory overhead remains the same), but supports the APIs used by other projects.

eid.reset() is somewhat unreasonable because it's equivalent to eid = nullptr, so I marked that one as \deprecated.

Actions #6

Updated by Junxiao Shi over 8 years ago

  • Related to deleted (Bug #3722: Scheduler stops if a callback throws)
Actions #7

Updated by Junxiao Shi over 8 years ago

  • Status changed from In Progress to Closed
Actions

Also available in: Atom PDF