Bug #3691
closedScheduler::cancelAllEvents causes memory error in ScopedEventId
100%
Description
Snippet to reproduce:
// g++ -o x -std=c++0x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <boost/asio.hpp>
#include <ndn-cxx/util/scheduler-scoped-event-id.hpp>
int main()
{
boost::asio::io_service io;
ndn::util::scheduler::Scheduler sched(io);
ndn::util::scheduler::ScopedEventId eid(sched);
eid = sched.scheduleEvent(ndn::time::milliseconds(10), []{});
sched.cancelAllEvents();
eid.cancel();
return 0;
}
There's a double-free error on eid.cancel() line.
Same error also occurs if eid goes out of scope.
Updated by Junxiao Shi over 9 years ago
This was found by Weiwei Liu.
valgrind report (partial):
==5887== Invalid read of size 8
==5887== at 0x58B6D60: std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)
==5887== by 0x513D5FA: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase_aux(std::_Rb_tree_const_iterator<ndn::util::scheduler::Scheduler::EventInfo>) (stl_tree.h:1745)
==5887== by 0x513C6B4: erase (stl_tree.h:809)
==5887== by 0x513C6B4: erase (stl_multiset.h:538)
==5887== by 0x513C6B4: ndn::util::scheduler::Scheduler::cancelEvent(std::shared_ptr<ndn::util::scheduler::EventIdImpl> const&) (scheduler.cpp:137)
==5887== by 0x513C0CD: ndn::util::scheduler::ScopedEventId::cancel() (scheduler-scoped-event-id.cpp:63)
==5887== by 0x405B26: main (in /home/vagrant/x)
==5887== Address 0x9ad4860 is 16 bytes inside a block of size 88 free'd
==5887== at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5887== by 0x408BBB: __gnu_cxx::new_allocator<std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo> >::deallocate(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*, unsigned long) (in /home/vagrant/x)
==5887== by 0x408AF5: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_put_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x4089E9: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_destroy_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x4086F6: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x513C50D: clear (stl_tree.h:860)
==5887== by 0x513C50D: clear (stl_multiset.h:617)
==5887== by 0x513C50D: ndn::util::scheduler::Scheduler::cancelAllEvents() (scheduler.cpp:158)
==5887== by 0x405B17: main (in /home/vagrant/x)
The root cause is: EventId is declared as shared_ptr<EventIdImpl>, and Scheduler::cancelAllEvents does not call EventIdImpl::invalidate so that Scheduler::cancelEvent is trying to cancel the event again, which involves erasing an STL container item with a non-dereferencable iterator.
Updated by Junxiao Shi over 9 years ago
- Status changed from New to In Progress
- Assignee set to Junxiao Shi
Updated by Junxiao Shi over 9 years ago
- % Done changed from 0 to 100
https://gerrit.named-data.net/3035 redesigns EventId and fixes this bug.
EventIdImpl is eliminated. Instead, EventInfo is created as shared_ptr and added into the event queue, and EventId is a weak_ptr from that shared_ptr.
When an event expires or is cancelled, its shared_ptr<EventInfo> is deallocated, which invalidates all weak_ptrs so that it cannot be cancelled again; when an event expires and is being executed, its shared_ptr<EventInfo> is not yet deallocated but it should not be cancelled again (from within the callback), and this situation is distinguished by setting the queue iterator to end().
EventId is an opaque type so callers are disallowed to lock() it as shared_ptr; various asserts are added to check this violation.
Updated by Junxiao Shi over 9 years ago
- Related to Bug #3722: Scheduler stops if a callback throws added
Updated by Junxiao Shi over 9 years ago
Other projects assume EventId is a shared_ptr, and https://gerrit.named-data.net/3035 patchset2 breaks this assumption.
https://travis-ci.org/yoursunny/ndn-cxx-breaks/builds/150304474.
I've identified some common usage patterns in other projects:
eid == eid2- construct
EventIdfromnullptr static_cast<bool>(eid)eid.reset()os << eid
Most of them are reasonable, so https://gerrit.named-data.net/3035 patchset3 creates a EventId class whose only field is weak_ptr<EventInfo> (so that the memory overhead remains the same), but supports the APIs used by other projects.
eid.reset() is somewhat unreasonable because it's equivalent to eid = nullptr, so I marked that one as \deprecated.
Updated by Junxiao Shi over 9 years ago
- Related to deleted (Bug #3722: Scheduler stops if a callback throws)
Updated by Junxiao Shi over 9 years ago
- Status changed from In Progress to Closed