Bug #3691
closedScheduler::cancelAllEvents causes memory error in ScopedEventId
100%
Description
Snippet to reproduce:
// g++ -o x -std=c++0x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <boost/asio.hpp>
#include <ndn-cxx/util/scheduler-scoped-event-id.hpp>
int main()
{
boost::asio::io_service io;
ndn::util::scheduler::Scheduler sched(io);
ndn::util::scheduler::ScopedEventId eid(sched);
eid = sched.scheduleEvent(ndn::time::milliseconds(10), []{});
sched.cancelAllEvents();
eid.cancel();
return 0;
}
There's a double-free error on eid.cancel()
line.
Same error also occurs if eid
goes out of scope.
Updated by Junxiao Shi over 8 years ago
This was found by Weiwei Liu.
valgrind report (partial):
==5887== Invalid read of size 8
==5887== at 0x58B6D60: std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)
==5887== by 0x513D5FA: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase_aux(std::_Rb_tree_const_iterator<ndn::util::scheduler::Scheduler::EventInfo>) (stl_tree.h:1745)
==5887== by 0x513C6B4: erase (stl_tree.h:809)
==5887== by 0x513C6B4: erase (stl_multiset.h:538)
==5887== by 0x513C6B4: ndn::util::scheduler::Scheduler::cancelEvent(std::shared_ptr<ndn::util::scheduler::EventIdImpl> const&) (scheduler.cpp:137)
==5887== by 0x513C0CD: ndn::util::scheduler::ScopedEventId::cancel() (scheduler-scoped-event-id.cpp:63)
==5887== by 0x405B26: main (in /home/vagrant/x)
==5887== Address 0x9ad4860 is 16 bytes inside a block of size 88 free'd
==5887== at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5887== by 0x408BBB: __gnu_cxx::new_allocator<std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo> >::deallocate(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*, unsigned long) (in /home/vagrant/x)
==5887== by 0x408AF5: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_put_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x4089E9: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_destroy_node(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x4086F6: std::_Rb_tree<ndn::util::scheduler::Scheduler::EventInfo, ndn::util::scheduler::Scheduler::EventInfo, std::_Identity<ndn::util::scheduler::Scheduler::EventInfo>, std::less<ndn::util::scheduler::Scheduler::EventInfo>, std::allocator<ndn::util::scheduler::Scheduler::EventInfo> >::_M_erase(std::_Rb_tree_node<ndn::util::scheduler::Scheduler::EventInfo>*) (in /home/vagrant/x)
==5887== by 0x513C50D: clear (stl_tree.h:860)
==5887== by 0x513C50D: clear (stl_multiset.h:617)
==5887== by 0x513C50D: ndn::util::scheduler::Scheduler::cancelAllEvents() (scheduler.cpp:158)
==5887== by 0x405B17: main (in /home/vagrant/x)
The root cause is: EventId
is declared as shared_ptr<EventIdImpl>
, and Scheduler::cancelAllEvents
does not call EventIdImpl::invalidate
so that Scheduler::cancelEvent
is trying to cancel the event again, which involves erasing an STL container item with a non-dereferencable iterator.
Updated by Junxiao Shi over 8 years ago
- Status changed from New to In Progress
- Assignee set to Junxiao Shi
Updated by Junxiao Shi over 8 years ago
- % Done changed from 0 to 100
https://gerrit.named-data.net/3035 redesigns EventId
and fixes this bug.
EventIdImpl
is eliminated. Instead, EventInfo
is created as shared_ptr and added into the event queue, and EventId
is a weak_ptr from that shared_ptr.
When an event expires or is cancelled, its shared_ptr<EventInfo>
is deallocated, which invalidates all weak_ptrs so that it cannot be cancelled again; when an event expires and is being executed, its shared_ptr<EventInfo>
is not yet deallocated but it should not be cancelled again (from within the callback), and this situation is distinguished by setting the queue iterator to end()
.
EventId
is an opaque type so callers are disallowed to lock()
it as shared_ptr; various asserts are added to check this violation.
Updated by Junxiao Shi over 8 years ago
- Related to Bug #3722: Scheduler stops if a callback throws added
Updated by Junxiao Shi over 8 years ago
Other projects assume EventId
is a shared_ptr, and https://gerrit.named-data.net/3035 patchset2 breaks this assumption.
https://travis-ci.org/yoursunny/ndn-cxx-breaks/builds/150304474.
I've identified some common usage patterns in other projects:
eid == eid2
- construct
EventId
fromnullptr
static_cast<bool>(eid)
eid.reset()
os << eid
Most of them are reasonable, so https://gerrit.named-data.net/3035 patchset3 creates a EventId
class whose only field is weak_ptr<EventInfo>
(so that the memory overhead remains the same), but supports the APIs used by other projects.
eid.reset()
is somewhat unreasonable because it's equivalent to eid = nullptr
, so I marked that one as \deprecated
.
Updated by Junxiao Shi over 8 years ago
- Related to deleted (Bug #3722: Scheduler stops if a callback throws)
Updated by Junxiao Shi over 8 years ago
- Status changed from In Progress to Closed