Project

General

Profile

Actions

Bug #2653

closed

UtilFaceUri/CanonizeEmptyCallback triggers use-after-free in dns::Resolver

Added by Davide Pesavento almost 10 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Category:
Utils
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:

Description

CanonizeEmptyCallback uses very short (1ms) timeouts. This means that sometimes, when the Resolver::onResolveSuccess callback is invoked, the timeout has already expired and its callback scheduled for execution. At this point, canceling the event/deadline_timer has no effect. At the end of the execution of onResolveSuccess, the Resolver refcount drops to zero and it's deallocated together with the scheduler instance. Eventually the timer callback in the scheduler is also invoked, which causes a use-after-free as soon as any scheduler instance variable is accessed.

Boost documentation for basic_deadline_timer confirms the above theory:

If the timer has already expired when cancel() is called, then the handlers for asynchronous wait operations will:

  • have already been invoked; or
  • have been queued for invocation in the near future. These handlers can no longer be cancelled, and therefore are passed an error code that indicates the successful completion of the wait operation.

(http://www.boost.org/doc/libs/1_48_0/doc/html/boost_asio/reference/basic_deadline_timer/cancel/overload1.html)

==23852==ERROR: AddressSanitizer: heap-use-after-free on address 0x61200000ee50 at pc 0xa928e3 bp 0x7fff9a977b80 sp 0x7fff9a977b70
WRITE of size 1 at 0x61200000ee50 thread T0
    #0 0xa928e2 in ndn::Scheduler::onEvent(boost::system::error_code const&) ../src/util/scheduler.cpp:168
    #1 0xa93dc6 in operator()<const boost::system::error_code&, void> /usr/include/c++/4.9/functional:569
    #2 0xa93dc6 in __call<void, const boost::system::error_code&, 0ul, 1ul> /usr/include/c++/4.9/functional:1264
    #3 0xa93dc6 in operator()<const boost::system::error_code&, void> /usr/include/c++/4.9/functional:1323
    #4 0xa93dc6 in boost::asio::detail::binder1<std::_Bind<std::_Mem_fn<void (ndn::Scheduler::*)(boost::system::error_code const&)> (ndn::Scheduler*, std::_Placeholder<1>)>, boost::system::error_code>::operator()() /usr/include/boost/asio/detail/bind_handler.hpp:47
    #5 0xa93dc6 in asio_handler_invoke<boost::asio::detail::binder1<std::_Bind<std::_Mem_fn<void (ndn::Scheduler::*)(const boost::system::error_code&)>(ndn::Scheduler*, std::_Placeholder<1>)>, boost::system::error_code> > /usr/include/boost/asio/handler_invoke_hook.hpp:69
    #6 0xa93dc6 in invoke<boost::asio::detail::binder1<std::_Bind<std::_Mem_fn<void (ndn::Scheduler::*)(const boost::system::error_code&)>(ndn::Scheduler*, std::_Placeholder<1>)>, boost::system::error_code>, std::_Bind<std::_Mem_fn<void (ndn::Scheduler::*)(const boost::system::error_code&)>(ndn::Scheduler*, std::_Placeholder<1>)> > /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
    #7 0xa93dc6 in boost::asio::detail::wait_handler<std::_Bind<std::_Mem_fn<void (ndn::Scheduler::*)(boost::system::error_code const&)> (ndn::Scheduler*, std::_Placeholder<1>)> >::do_complete(boost::asio::detail::task_io_service*, boost::asio::detail::task_io_service_operation*, boost::system::error_code const&, unsigned long) /usr/include/boost/asio/detail/wait_handler.hpp:70
    #8 0x54b087 in boost::asio::detail::task_io_service_operation::complete(boost::asio::detail::task_io_service&, boost::system::error_code const&, unsigned long) /usr/include/boost/asio/detail/task_io_service_operation.hpp:38
    #9 0x54b087 in boost::asio::detail::task_io_service::do_run_one(boost::asio::detail::scoped_lock<boost::asio::detail::posix_mutex>&, boost::asio::detail::task_io_service_thread_info&, boost::system::error_code const&) /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
    #10 0x54b087 in boost::asio::detail::task_io_service::run(boost::system::error_code&) /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
    #11 0x6ccf34 in boost::asio::io_service::run() /usr/include/boost/asio/impl/io_service.ipp:59
    #12 0x6ccf34 in ndn::util::UtilTestFaceUri::CanonizeEmptyCallback::test_method() ../tests/unit-tests/util/face-uri.cpp:404
    #13 0x6cd2cf in CanonizeEmptyCallback_invoker ../tests/unit-tests/util/face-uri.cpp:380
    #14 0x44d2de in invoke<void (*)()> /usr/include/boost/test/utils/callback.hpp:56
    #15 0x44d2de in boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke() /usr/include/boost/test/utils/callback.hpp:89
    #16 0x7f7220b855a0 (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x685a0)
    #17 0x7f7220b60865 in boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x43865)
    #18 0x7f7220b610a2 in boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x440a2)
    #19 0x7f7220b856a1 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x686a1)
    #20 0x7f7220b6f2f3 in boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x522f3)
    #21 0x7f7220b9e2d2 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x812d2)
    #22 0x7f7220b9e2d2 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x812d2)
    #23 0x7f7220b6a819 in boost::unit_test::framework::run(unsigned long, bool) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x4d819)
    #24 0x7f7220b83283 in boost::unit_test::unit_test_main(bool (*)(), int, char**) (/usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0+0x66283)
    #25 0x81f1cc in main /usr/include/boost/test/unit_test.hpp:59
    #26 0x7f721f674ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #27 0x43a6b8 (/home/davide/ndn-cxx/build/unit-tests+0x43a6b8)

0x61200000ee50 is located 272 bytes inside of 280-byte region [0x61200000ed40,0x61200000ee58)
freed by thread T0 here:
    #0 0x7f722194663f in operator delete(void*) (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5863f)
    #1 0xa570b2 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> >::deallocate(std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/include/c++/4.9/ext/new_allocator.h:110
    #2 0xa570b2 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> > >::deallocate(std::allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> >&, std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/include/c++/4.9/bits/alloc_traits.h:383
    #3 0xa570b2 in std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2>::_M_destroy() /usr/include/c++/4.9/bits/shared_ptr_base.h:535

previously allocated by thread T0 here:
    #0 0x7f722194613f in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x5813f)
    #1 0xa54880 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) /usr/include/c++/4.9/ext/new_allocator.h:104
    #2 0xa54880 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> > >::allocate(std::allocator<std::_Sp_counted_ptr_inplace<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, (__gnu_cxx::_Lock_policy)2> >&, unsigned long) /usr/include/c++/4.9/bits/alloc_traits.h:357
    #3 0xa54880 in __shared_count<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, const std::function<void(const boost::asio::ip::address&)>&, const std::function<void(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>&, const std::function<bool(const boost::asio::ip::address&)>&, std::reference_wrapper<boost::asio::io_service> > /usr/include/c++/4.9/bits/shared_ptr_base.h:616
    #4 0xa54880 in __shared_ptr<std::allocator<ndn::dns::Resolver>, const std::function<void(const boost::asio::ip::address&)>&, const std::function<void(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>&, const std::function<bool(const boost::asio::ip::address&)>&, std::reference_wrapper<boost::asio::io_service> > /usr/include/c++/4.9/bits/shared_ptr_base.h:1090
    #5 0xa54880 in shared_ptr<std::allocator<ndn::dns::Resolver>, const std::function<void(const boost::asio::ip::address&)>&, const std::function<void(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>&, const std::function<bool(const boost::asio::ip::address&)>&, std::reference_wrapper<boost::asio::io_service> > /usr/include/c++/4.9/bits/shared_ptr.h:316
    #6 0xa54880 in allocate_shared<ndn::dns::Resolver, std::allocator<ndn::dns::Resolver>, const std::function<void(const boost::asio::ip::address&)>&, const std::function<void(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>&, const std::function<bool(const boost::asio::ip::address&)>&, std::reference_wrapper<boost::asio::io_service> > /usr/include/c++/4.9/bits/shared_ptr.h:588
    #7 0xa54880 in make_shared<ndn::dns::Resolver, const std::function<void(const boost::asio::ip::address&)>&, const std::function<void(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)>&, const std::function<bool(const boost::asio::ip::address&)>&, std::reference_wrapper<boost::asio::io_service> > /usr/include/c++/4.9/bits/shared_ptr.h:604
    #8 0xa54880 in ndn::dns::asyncResolve(std::string const&, std::function<void (boost::asio::ip::address const&)> const&, std::function<void (std::string const&)> const&, boost::asio::io_service&, std::function<bool (boost::asio::ip::address const&)> const&, boost::chrono::duration<long, boost::ratio<1l, 1000000000l> > const&) ../src/util/dns.cpp:136

SUMMARY: AddressSanitizer: heap-use-after-free ../src/util/scheduler.cpp:168 ndn::Scheduler::onEvent(boost::system::error_code const&)
Shadow bytes around the buggy address:
  0x0c247fff9d70: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff9d80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c247fff9d90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c247fff9da0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff9db0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c247fff9dc0: fd fd fd fd fd fd fd fd fd fd[fd]fa fa fa fa fa
  0x0c247fff9dd0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff9de0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c247fff9df0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c247fff9e00: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff9e10: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Contiguous container OOB:fc
  ASan internal:           fe
==23852==ABORTING

Related issues 1 (0 open1 closed)

Blocks NFD - Task #2589: CI: enable AddressSanitizer for unit testsClosedDavide Pesavento

Actions
Actions #1

Updated by Davide Pesavento almost 10 years ago

  • Blocks Task #2589: CI: enable AddressSanitizer for unit tests added
Actions #2

Updated by Junxiao Shi almost 10 years ago

This appears to be an issue with Scheduler, not Resolver or its test suite.

Actions #3

Updated by Davide Pesavento almost 10 years ago

I believe the proper solution is to use a io_service::strand and an empty callback to keep the Resolver object alive until all handlers have been invoked.

Actions #4

Updated by Davide Pesavento almost 10 years ago

Junxiao Shi wrote:

This appears to be an issue with Scheduler, not Resolver or its test suite.

The scheduler is just an innocent victim I believe, it has no control over its lifetime. But I'm open for suggestions on how to best fix this. I'm worried that similar instances of this bug might lurk in several places inside NFD.

Actions #5

Updated by Junxiao Shi almost 10 years ago

Scheduler destructor is supposed to cancel all scheduled events. Failure to cancel deadline_timer isn't an excuse for Scheduler to cause a crash.

Actions #6

Updated by Davide Pesavento almost 10 years ago

That is not the problem. The event does get canceled, but canceling it does not prevent the associated callback from being invoked at some unspecified point in the future. And the object must stay alive until that point in time.

Actions #7

Updated by Junxiao Shi almost 10 years ago

The scheduled event (eg. resolver timeout callback) is cancelled, but Scheduler destructor fails to cancel the execution of Scheduler::onEvent.

One possible but ugly solution is to bind a destroyed flag to onEvent, and check this flag before accessing any member field:

class Scheduler {
private:
  shared_ptr<bool> m_isDestroyed;
public:
  Scheduler() : m_isDestroyed(make_shared<bool>(false)) {}
  ~Scheduler() { *m_isDestroyed = true; }
  onEvent(shared_ptr<bool> isDestroyed) { if (*isDestroyed) return; ... }
}
Actions #8

Updated by Davide Pesavento almost 10 years ago

Junxiao Shi wrote:

The scheduled event (eg. resolver timeout callback) is cancelled, but Scheduler destructor fails to cancel the execution of Scheduler::onEvent.

It doesn't because it can't. The callback is posted and eventually executed by boost and we have no control over that. This is just how it's supposed to work (according to boost doc).

One possible but ugly solution is to bind a destroyed flag to onEvent, and check this flag before accessing any member field:

Ugly and inefficient. Using a weak_ptr for this would be slightly better. But I still don't think it's a scheduler issue.

Actions #9

Updated by Alex Afanasyev almost 10 years ago

Which exactly "heap" is used in that function? When I wrote scheduler, I was fully aware that onEvent can be executed when scheduler instance is destroyed and there is logic in there to handle this. In other words, when error is set to something (e.g., cancelled), then there is no access to the deleted "this".

I'm not sure what exactly is the problem here. We can rewrite it to be static function with "this" as a bound parameter, though it would be kind of ugly and will have exactly the same effect.

Actions #10

Updated by Davide Pesavento almost 10 years ago

Alex Afanasyev wrote:

Which exactly "heap" is used in that function?

this->m_isEventExecuting https://github.com/named-data/ndn-cxx/blob/master/src/util/scheduler.cpp#L170

When I wrote scheduler, I was fully aware that onEvent can be executed when scheduler instance is destroyed and there is logic in there to handle this. In other words, when error is set to something (e.g., cancelled), then there is no access to the deleted "this".

The assumption that canceling an event always causes its callback to be invoked with an error is incorrect. As I said in the description, the event is canceled after the timer has expired, and thus after the callback has been posted to the queue for execution. At that point the canceling has no effect, i.e. it does not change the error code of the posted callback from success to operation_aborted.

Actions #11

Updated by Davide Pesavento almost 10 years ago

Have we reached any consensus on how to solve this issue?

Actions #12

Updated by Alex Afanasyev over 9 years ago

I don't think I see a clear way to solve this problem... I don't really like using additional weak_ptr in scheduler, as it would incur unnecessary overhead in most of the cases. I'm not yet sure what strand is... If it can help, then may be.

Actions #13

Updated by Davide Pesavento over 9 years ago

Using a strand is just a way to guarantee that all handlers are called in the proper order. The real fix is using an empty callback to keep the shared_ptr to the Resolver alive until there's no more work pending.

Actions #14

Updated by Davide Pesavento about 9 years ago

  • Status changed from New to In Progress
  • Assignee set to Davide Pesavento
  • % Done changed from 0 to 50
Actions #15

Updated by Davide Pesavento about 9 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 50 to 90

http://gerrit.named-data.net/2542

In fact, the strand is not needed if we're not doing multithreaded stuff.

Actions #16

Updated by Davide Pesavento about 9 years ago

  • % Done changed from 90 to 100
Actions #17

Updated by Davide Pesavento about 9 years ago

  • Status changed from Code review to Closed
Actions

Also available in: Atom PDF