Task #1922
closed
EthernetFace: workaround with kqueue and Boost 1.56.0
Added by Alex Afanasyev about 10 years ago.
Updated almost 10 years ago.
Description
EthernetFace obtains a file descriptor from the pcap_get_selectable_fd API and passes it to Boost.Asio.
On OS X and FreeBSD, Boost.Asio uses kqueue as an alternative to the poll syscall, in order to gain better performance.
Since Boost 1.56, Boost.Asio's usage of kqueue has been incompatible with libpcap's file descriptor.
Attempting to create an EthernetFace causes a runtime error.
Although it's possible to disable kqueue and revert to the poll syscall, the performance would be worse than using kqueue.
Therefore, EthernetFace is forcibly disabled on platforms with kqueue, when NFD is compiled with Boost 1.56 or above.
This Task is to find a workaround, so that EthernetFace can work with kqueue and Boost 1.56.0.
- Related to Bug #1877: EthernetFace creation fails on OS X with Boost 1.56.0 added
- Subject changed from EthernetFace doesn't work on platforms with kqueue (OS X, FreeBSD) with Boost 1.56.0 to EthernetFace: workaround with kqueue and Boost 1.56.0
- Description updated (diff)
- Start date deleted (08/24/2014)
I think we should investigate whether it's feasible to replace libpcap with asio::generic::raw_protocol::socket, available since Boost 1.54.
The OS X poll(2) manpage doesn't mention a limit on how many file descriptors can be monitored in this syscall, so I believe the limit on file descriptors isn't a problem.
It's useful to know how much overhead we would incur if we disable kqueue and revert to the poll syscall.
Currently I do not know how to design a benchmark that is suited to the specific use case of NFD.
http://www.kegel.com/dkftpbench/Poller_bench.html has a generic benchmark with results from FreeBSD 4.x on 600MHz CPU:
- with 100 socketpairs: poll takes 50us, kqueue takes 8us
- with 1000 socketpairs: poll takes 552us, kqueue takes 8us
- with 10000 socketpairs: poll takes 11559us, kqueue takes 8us
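For a rough local measurement of the poll side only, something like the sketch below could be used. It is not the Poller_bench methodology and it does not model NFD's workload; the function name `time_poll` and its parameters are made up for illustration. It registers a number of socketpairs with `select.poll`, makes one of them readable, and averages the time of repeated non-blocking `poll` calls.

```python
import select
import socket
import time


def time_poll(num_pairs, rounds=1000):
    """Return the mean seconds per poll() call with num_pairs sockets registered.

    Hypothetical helper: one socketpair is made readable so that every
    poll() call has exactly one event to report; the rest stay idle.
    select.poll is not available on Windows.
    """
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    try:
        poller = select.poll()
        for reader, _writer in pairs:
            poller.register(reader, select.POLLIN)

        # Leave one unread byte so the first reader is always ready.
        pairs[0][1].send(b"x")

        start = time.perf_counter()
        for _ in range(rounds):
            poller.poll(0)  # timeout 0: return immediately
        elapsed = time.perf_counter() - start
    finally:
        for reader, writer in pairs:
            reader.close()
            writer.close()
    return elapsed / rounds


if __name__ == "__main__":
    # Each socketpair uses two file descriptors, so large counts may
    # require raising the per-process descriptor limit first.
    for n in (10, 100):
        print(f"{n} socketpairs: {time_poll(n) * 1e6:.1f} us per poll")
```

The numbers from such a script are only comparable to each other on the same machine; they say nothing about kqueue, which would need a separate measurement on OS X or FreeBSD.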
Typically NFD would operate with fewer than 200 sockets, and today's CPUs run at 1.8GHz or faster.
100 polls per second should be reasonable for an end node. That would take 1.7ms per second, or 0.17% of total time.
A busy router node has more packets to process and more events in the scheduler, so there could be 5000 polls per second.
That's 83ms per second, or 8.3% of total time. This seems too much.
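The figures above appear to follow from taking the 100-socketpair poll time (50us at 600MHz) and scaling it linearly to a 1.8GHz CPU. Both the choice of the 100-socketpair figure and the linear scaling with clock speed are assumptions made here to reproduce the numbers, not something the benchmark page states. A small sketch of the arithmetic:

```python
def poll_overhead(polls_per_second,
                  poll_us_at_bench=50.0,  # 100 socketpairs, from the benchmark
                  bench_mhz=600.0,
                  target_mhz=1800.0):
    """Estimate time spent in poll() per second of wall-clock time.

    Assumes poll() cost scales inversely with CPU clock speed, which is
    a simplification. Returns (milliseconds per second, percent of time).
    """
    poll_us = poll_us_at_bench * bench_mhz / target_mhz  # about 16.7us
    busy_us = poll_us * polls_per_second
    return busy_us / 1000.0, busy_us / 1_000_000.0 * 100.0


if __name__ == "__main__":
    for label, rate in (("end node", 100), ("busy router", 5000)):
        ms, pct = poll_overhead(rate)
        print(f"{label}: {ms:.1f} ms per second, {pct:.2f}% of total time")
```

With fewer than 200 sockets rather than exactly 100, the real cost per poll could be up to roughly twice as high, so these values are best read as a lower estimate.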
- Target version changed from v0.3 to Unsupported
During the 20141006 conference call, Alex concluded that it's impossible to work around this issue.
The only solution is asking users to install a different version of Boost.
Just a heads up. Asio 1.10.5, which targets Boost 1.57 (not yet included in the recently released beta of 1.57), claims to fix the regression.
Boost 1.57.0 has been released. Can someone try and see if the regression has been fixed? If so, the check in boost-kqueue.py should be changed from >= to ==.
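To make the proposed change concrete, here is a minimal sketch of the two variants of the check. It is not the contents of boost-kqueue.py, which is not shown in this thread; the function names and the `only_1_56` flag are hypothetical. The only outside fact relied on is that Boost encodes its version as major * 100000 + minor * 100 + patch, so 1.56.0 is 105600.

```python
def boost_version_number(major, minor, patch=0):
    """Encode a version the way the BOOST_VERSION macro does."""
    return major * 100000 + minor * 100 + patch


BROKEN_BOOST = boost_version_number(1, 56, 0)  # 105600


def ethernet_face_disabled(boost_version, has_kqueue, only_1_56):
    """Decide whether EthernetFace should be disabled at build time.

    Hypothetical helper. only_1_56=False models the original ">=" check;
    only_1_56=True models the proposed "==" check.
    """
    if not has_kqueue:
        return False  # platforms without kqueue are unaffected
    if only_1_56:
        return boost_version == BROKEN_BOOST
    return boost_version >= BROKEN_BOOST


if __name__ == "__main__":
    v157 = boost_version_number(1, 57, 0)
    print(ethernet_face_disabled(v157, has_kqueue=True, only_1_56=False))
    print(ethernet_face_disabled(v157, has_kqueue=True, only_1_56=True))
```

Under the ">=" variant a kqueue platform with Boost 1.57.0 still has EthernetFace disabled; under the "==" variant it is enabled again, which is only appropriate if 1.57.0 really fixes the regression.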
- Status changed from New to Code review
- Target version changed from Unsupported to v0.3
- % Done changed from 0 to 100
http://gerrit.named-data.net/1444
If disabling for boost == 1.56.0 is an acceptable resolution for this task, good. Otherwise it should be rejected because we're not going to implement any other workarounds.
Yes. Just checking for 1.56 is acceptable to me. In my opinion, there is nothing else that we can or should do.
- Category changed from Faces to Build
- Assignee set to Davide Pesavento
- Status changed from Code review to Resolved
- Status changed from Resolved to Feedback
Then we have no idea whether it's solved or not.
@Josh has volunteered to test this patch.
@Davide, please advise @Josh on how to test the patch.
- Status changed from Feedback to Code review
Josh has kindly verified that Boost 1.57 fixed the regression. The posted patch can be merged.
- Status changed from Code review to Closed