Task #1922

EthernetFace: workaround with kqueue and Boost 1.56.0

Added by Alex Afanasyev about 5 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Category: Build
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:

Description

EthernetFace obtains a file descriptor from the pcap_get_selectable_fd API and passes it to Boost.Asio.

On OS X and FreeBSD, Boost.Asio uses kqueue as an alternative to the poll syscall in order to gain better performance.

Since Boost 1.56, Boost.Asio's usage of kqueue has been incompatible with libpcap's file descriptor.
Attempting to create an EthernetFace causes a runtime error.

Although it's possible to disable kqueue and revert to the poll syscall, the performance would be worse than with kqueue.

Therefore, EthernetFace is forcibly disabled on platforms with kqueue, when NFD is compiled with Boost 1.56 or above.

This Task is to find a workaround, so that EthernetFace can work with kqueue and Boost 1.56.0.


Related issues

Related to NFD - Bug #1877: EthernetFace creation fails on OS X with Boost 1.56.0 (Closed, 2014-08-17)

History

#1 Updated by Alex Afanasyev about 5 years ago

  • Related to Bug #1877: EthernetFace creation fails on OS X with Boost 1.56.0 added

#2 Updated by Junxiao Shi about 5 years ago

  • Subject changed from EthernetFace doesn't work on platforms with kqueue (OS X, FreeBSD) with Boost 1.56.0 to EthernetFace: workaround with kqueue and Boost 1.56.0
  • Description updated (diff)
  • Start date deleted (08/24/2014)

#4 Updated by Davide Pesavento about 5 years ago

I think we should investigate whether it's feasible to replace libpcap with asio::generic::raw_protocol::socket, which has been available since Boost 1.54.

#5 Updated by Junxiao Shi about 5 years ago

The OS X poll(2) manpage doesn't mention a limit on how many file descriptors can be monitored in one call, so I believe the limit on file descriptors isn't a problem.

It would be useful to know how much overhead we would incur if we disabled kqueue and reverted to the poll syscall.

Currently I do not know how to design a benchmark that suits the specific use case of NFD.

http://www.kegel.com/dkftpbench/Poller_bench.html has a generic benchmark with results from FreeBSD 4.x on a 600MHz CPU:

  • with 100 socketpairs: poll takes 50us, kqueue takes 8us
  • with 1000 socketpairs: poll takes 552us, kqueue takes 8us
  • with 10000 socketpairs: poll takes 11559us, kqueue takes 8us

Typically NFD would operate with fewer than 200 sockets, and today's CPUs run at 1.8GHz or faster.
100 polls per second should be reasonable for an end node. That would take 1.7ms per second, or 0.17% of total time.

A busy router node has more packets to process and more events in the scheduler, so there could be 5000 polls per second.
That's 83ms per second, or 8.3% of total time. This seems too much.
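
The estimate above can be reproduced with a short calculation. This is only a sketch: it assumes the figures come from taking the 100-socketpair poll time (50us) and scaling it linearly by clock speed from 600MHz to 1.8GHz, which matches the quoted numbers but is not stated explicitly. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope estimate of poll() overhead.
# Assumption: per-poll cost scales linearly with CPU clock speed.
BENCH_POLL_US = 50.0     # poll with 100 socketpairs on the benchmark machine
BENCH_CPU_MHZ = 600.0    # benchmark machine clock
TARGET_CPU_MHZ = 1800.0  # "1.8GHz or faster"

# Scaled cost of one poll call on the target CPU, in microseconds (about 16.7us).
PER_POLL_US = BENCH_POLL_US * BENCH_CPU_MHZ / TARGET_CPU_MHZ


def poll_overhead(polls_per_second, per_poll_us=PER_POLL_US):
    """Return (ms spent in poll per second, fraction of total time)."""
    ms_per_second = polls_per_second * per_poll_us / 1000.0
    return ms_per_second, ms_per_second / 1000.0


end_node = poll_overhead(100)   # end node: 100 polls per second
router = poll_overhead(5000)    # busy router: 5000 polls per second

for name, (ms, fraction) in (("end node", end_node), ("router", router)):
    print("%s: %.1f ms per second, %.2f%% of total time" % (name, ms, fraction * 100))
```

Under a different scaling assumption (for example, interpolating towards 200 sockets instead of 100) the percentages would roughly double, so these numbers are an order-of-magnitude guide rather than a measurement.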

#6 Updated by Junxiao Shi almost 5 years ago

  • Target version changed from v0.3 to Unsupported

During the 20141006 conference call, Alex concluded that it's impossible to work around this issue.

The only solution is asking users to install a different version of Boost.

#7 Updated by Alex Afanasyev almost 5 years ago

Just a heads-up: Asio 1.10.5, which targets Boost 1.57 (not yet included in the recently released 1.57 beta), claims to fix the regression.

#8 Updated by Davide Pesavento almost 5 years ago

boost 1.57.0 has been released. Can someone try and see if the regression has been fixed? If so, the check in boost-kqueue.py should be changed from >= to ==.
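
To illustrate the difference between the two comparisons, here is a minimal sketch. It is not the actual contents of boost-kqueue.py; the helper names are hypothetical, and the only fact relied on is the BOOST_VERSION encoding (major * 100000 + minor * 100 + patch, so 1.56.0 is 105600).

```python
# Hypothetical helpers for illustration; not taken from boost-kqueue.py.

def boost_version_tuple(boost_version):
    """Decode a BOOST_VERSION integer into (major, minor, patch)."""
    return (boost_version // 100000, boost_version // 100 % 1000, boost_version % 100)


def ethernet_face_disabled(boost_version, has_kqueue, exact=True):
    """Decide whether EthernetFace should be disabled at build time.

    exact=True  models the proposed check (Boost == 1.56.0).
    exact=False models the earlier check (Boost >= 1.56.0).
    """
    if not has_kqueue:
        # Platforms without kqueue are not affected by the regression.
        return False
    version = boost_version_tuple(boost_version)
    if exact:
        return version == (1, 56, 0)
    return version >= (1, 56, 0)


print(ethernet_face_disabled(105600, has_kqueue=True))               # Boost 1.56.0
print(ethernet_face_disabled(105700, has_kqueue=True))               # Boost 1.57.0, == check
print(ethernet_face_disabled(105700, has_kqueue=True, exact=False))  # Boost 1.57.0, >= check
```

With the == comparison, only Boost 1.56.0 on a kqueue platform disables EthernetFace; with >=, Boost 1.57.0 and later would stay disabled even if the regression is fixed.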

#9 Updated by Davide Pesavento almost 5 years ago

  • Status changed from New to Code review
  • Target version changed from Unsupported to v0.3
  • % Done changed from 0 to 100

http://gerrit.named-data.net/1444

If disabling for boost == 1.56.0 is an acceptable resolution for this task, good. Otherwise it should be rejected because we're not going to implement any other workarounds.

#10 Updated by Alex Afanasyev almost 5 years ago

Yes. Just checking for 1.56 is acceptable to me. In my opinion, there is nothing that we can or should do.

#11 Updated by Davide Pesavento almost 5 years ago

  • Category changed from Faces to Build
  • Assignee set to Davide Pesavento

#12 Updated by Junxiao Shi almost 5 years ago

  • Status changed from Code review to Resolved

Code is on http://gerrit.named-data.net/1444 but it's untested for OSX + boost-1.57.

I've sent a call for volunteers to nfd-dev for this test.

#13 Updated by Davide Pesavento almost 5 years ago

  • Status changed from Resolved to Feedback

Then we have no idea whether it's solved or not.

#14 Updated by Junxiao Shi almost 5 years ago

@Josh has volunteered to test this patch.

@Davide, please advise @Josh on how to test the patch.

#15 Updated by Davide Pesavento over 4 years ago

  • Status changed from Feedback to Code review

Josh has kindly verified that boost 1.57 fixed the regression. The posted patch can be merged.

#16 Updated by Davide Pesavento over 4 years ago

  • Status changed from Code review to Closed
