Bug #1984
closedEthernetFace crash upon NIC down
100%
Description
Run nfd with at least one Ethernet face, e.g.:
faceid=257 remote=ether://[01:00:5e:00:17:aa] local=dev://eth0
Run
ip link set eth0 down
Watch nfd committing suicide:
1411055327.657042 FATAL: [NFD] pcap_next_ex(): The interface went down
Updated by Junxiao Shi over 10 years ago
- Description updated (diff)
- Category set to Faces
- Target version set to v0.3
This shall be expected behavior. Operator is supposed to setup ifupdown scripts to restart NFD.
Updated by Davide Pesavento over 10 years ago
The presence of an ifupdown script that restarts NFD is orthogonal to this issue. NFD should not crash, period.
If I remember correctly, some time ago we decided that--in case of network connectivity changes--an external entity would send a HUP signal to NFD, which in turn would reload its config file and reinitialize the affected channels and multicast faces. This implies that NFD must survive the "interface down" event, otherwise using SIGHUP is pointless.
Updated by Junxiao Shi over 10 years ago
- Subject changed from nfd dies when the network interface used by an EthernetFace goes down to EthernetFace crash upon NIC down
- Assignee set to Davide Pesavento
- Estimated time set to 2.00 h
Confirmed.
This bug is due to improper error handling in EthernetFace.
EthernetFace::handleRead
throws an exception when pcap_next_ex
returns an error code.
It should invoke .fail
instead.
EthernetFace::handleRead
also throws exceptions when captured frame is too short or has incorrect Ethertype.
These exceptions are acceptable, because libpcap should never return such unexpected frames.
Updated by Davide Pesavento over 10 years ago
Junxiao Shi wrote:
EthernetFace::handleRead
also throws exceptions when captured frame is too short or has incorrect Ethertype.
These exceptions are acceptable, because libpcap should never return such unexpected frames.
Are you sure that libpcap never returns short frames? For robustness we should call fail
(or ignore the captured frame and continue) in such cases as well.
Updated by Junxiao Shi over 10 years ago
Are you sure that libpcap never returns short frames? For robustness we should call
fail
(or ignore the captured frame and continue) in such cases as well.
libpcap is expected not to return anything that does not match the bpf_program.
This is part of the contract, and it's unnecessary to double check.
Updated by Davide Pesavento over 10 years ago
Junxiao Shi wrote:
libpcap is expected not to return anything that does not match the bpf_program.
There is nothing in the BPF filter about too-short frames (so called runt frames). They should be discarded simply because they violate IEEE 802.3
This is part of the contract, and it's unnecessary to double check.
Assuming that you're talking about the ethertype, the check is already there. What I'm saying is that we should just ignore the packet or call fail
, instead of throwing an exception and dying. Are you arguing that we should remove the check altogether?
Updated by Junxiao Shi over 10 years ago
We should look at libpcap documentation to understand whether it should discard runt frames.
If yes, ASSERT that runt frames are not coming to our code.
If no, check for runt frames and ignore them.
Since bpf_program guarantees correct Ethertype, we can ASSERT Ethertype is correct.
In production builds, this check is effectively disabled because it's unnecessary.
Updated by Davide Pesavento about 10 years ago
- Status changed from New to In Progress
Updated by Davide Pesavento about 10 years ago
- Status changed from In Progress to Code review
- % Done changed from 0 to 80
Updated by Davide Pesavento about 10 years ago
- Status changed from Code review to Closed