Bug #5155
closed
"bind: Cannot assign requested address" immediately after a new IPv6 address appears on an interface
Added by Alexander Lane over 3 years ago. Updated over 3 years ago.
100%
Description
Mini-NDN has had a persistent IPv6-related NFD crash for several months that we have had trouble solving, although we do have workarounds. The issue appears to occur when NFD tries to create a multicast IPv6 face on a Mini-NDN-Wifi interface while NetworkManager is enabled. To mitigate it, either NetworkManager has to be disabled and NFD given an additional wait time (~3s), or the node must have no IPv6 address assigned at all. Attached are a Mini-NDN script I used to consistently reproduce the issue (check the logs of node a) and a log from one of these runs.
Mini-NDN can be found at https://github.com/named-data/mini-ndn; the installation scripts should include all needed dependencies.
Files
example_log.txt (6.75 KB) | Alexander Lane, 04/05/2021 10:13 PM
example.py (1.86 KB) | Alexander Lane, 04/05/2021 10:14 PM
example_log_after_patch.txt (9.17 KB) | Request logs | Alexander Lane, 04/08/2021 05:08 PM
Updated by Davide Pesavento over 3 years ago
- Tracker changed from Task to Bug
- Subject changed from NFD Crashes with IPv6 in Mini-NDN-Wifi to "bind: Cannot assign requested address" with IPv6 in Mini-NDN-Wifi
- Category set to Faces
- Start date deleted (04/06/2021)
Updated by Alexander Lane over 3 years ago
Davide Pesavento wrote in #note-3:
is this on Ubuntu? what version?
I've been working on 18.04 LTS in a VirtualBox VM. I can try to reproduce on 20.04/20.10 if that would be helpful.
Updated by Junxiao Shi over 3 years ago
I cannot reproduce this issue with the given script on Ubuntu 20.04.
The experiment was conducted on the Virtual Wall 1 testbed, on a pcgen02-1p instance.
Version and network information are pasted below.
mini-ndn> a head -2 log/nfd.log
NFD version 0.7.1-21-g7249fb4d starting
Built with GNU C++ version 9.3.0, with GNU libstdc++ version 20200808, with Boost version 1.71.0, with libpcap version 1.9.1 (with TPACKET_V3), with WebSocket++ version 0.8.1, with ndn-cxx version 0.7.1-22-gc25e463e
mini-ndn> a nfdc face list scheme udp6
faceid=259 remote=udp6://[ff02::1234%a-wlan0]:56363 local=udp6://[2001::1]:35670 congestion={base-marking-interval=100ms default-threshold=65536B} mtu=8800 counters={in={0i 0d 0n 0B} out={0i 0d 0n 0B}} flags={non-local permanent multi-access congestion-marking}
mini-ndn> a ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
26: a-wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
link/ether 02:00:00:00:0a:00 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/8 brd 10.255.255.255 scope global a-wlan0
valid_lft forever preferred_lft forever
inet6 2001::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::ff:fe00:a00/64 scope link
valid_lft forever preferred_lft forever
mini-ndn> b nfdc face list scheme udp6
faceid=259 remote=udp6://[ff02::1234%b-eth0]:56363 local=udp6://[fe80::88ea:d1ff:fe54:acce%b-eth0]:45768 congestion={base-marking-interval=100ms default-threshold=65536B} mtu=8800 counters={in={0i 0d 0n 0B} out={0i 0d 0n 0B}} flags={non-local permanent multi-access congestion-marking}
mini-ndn> b ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: b-eth0@if28: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
link/ether 8a:ea:d1:54:ac:ce brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.0.0.2/8 brd 10.255.255.255 scope global b-eth0
valid_lft forever preferred_lft forever
inet6 fe80::88ea:d1ff:fe54:acce/64 scope link
valid_lft forever preferred_lft forever
My installation steps were:
# enable IPv4 NAT to reach GitHub
wget -O - -nv --ciphers DEFAULT@SECLEVEL=1 https://www.wall2.ilabt.iminds.be/enable-nat.sh | sudo bash
# install NFD nightly packages
echo "deb [trusted=yes] https://nfd-nightly-apt.ndn.today/ubuntu focal main" | sudo tee /etc/apt/sources.list.d/nfd-nightly.list
sudo apt-get update
sudo apt-get install -qq infoedit libndn-cxx-dev ndnchunks ndnpeek ndnping nfd nlsr
sudo systemctl stop nfd
sudo systemctl stop ndnping
# set Python 3 as default
sudo update-alternatives --remove-all python || true
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
# install Mininet and Mini-NDN
git clone https://github.com/named-data/mini-ndn.git
cd mini-ndn
./install.sh -wmi
# experiment crashes without this file
sudo mkdir -p /etc/NetworkManager/conf.d/
sudo touch /etc/NetworkManager/conf.d/unmanaged.conf
# run the experiment script
wget https://redmine.named-data.net/attachments/download/946/example.py
sudo python example.py
Updated by Davide Pesavento over 3 years ago
Alex, can you please apply this patch https://gerrit.named-data.net/c/NFD/+/6400 and run the experiment again until it crashes, then attach the log file?
Updated by Alexander Lane over 3 years ago
Davide Pesavento wrote in #note-6:
Alex, can you please apply this patch https://gerrit.named-data.net/c/NFD/+/6400 and run the experiment again until it crashes, then attach the log file?
I have done so; the log should be attached. I will try 20.04 and see if my local behavior is consistent with what Junxiao is experiencing.
Updated by Davide Pesavento over 3 years ago
- Subject changed from "bind: Cannot assign requested address" with IPv6 in Mini-NDN-Wifi to "bind: Cannot assign requested address" immediately after a new IPv6 address appears on an interface
Updated by Davide Pesavento over 3 years ago
Alexander Lane wrote in #note-7:
I will try 20.04 and see if my local behavior is consistent with what Junxiao is experiencing.
No need. I have all the info I need. And I can easily reproduce the crash outside Mini-NDN.
Updated by Davide Pesavento over 3 years ago
TL;DR: it appears that the kernel doesn't like it if you try to bind a socket to a tentative IPv6 address.
Whenever an IPv6 address is added to an interface, it will first appear with the IFA_F_TENTATIVE flag for a short period of time (1-2 seconds), after which the flag is dropped and the address can be used. NFD reacts to NetworkMonitor's onAddressAdded signal to create new multicast faces and does not check the tentative flag on the new address. So, if NFD is quick enough, it may attempt to bind the MulticastUdpTransport's TX socket to the new address while it's still tentative, causing the system call to fail.
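On Linux, one way to see whether an address is still tentative is the flags column of /proc/net/if_inet6. The following is a minimal sketch, not NFD or ndn-cxx code: the function name tentative_addresses and the sample text are made up for illustration, and 0x40 is the value of IFA_F_TENTATIVE in <linux/if_addr.h>.

```python
import ipaddress

IFA_F_TENTATIVE = 0x40  # value from <linux/if_addr.h>


def tentative_addresses(if_inet6_text):
    """Return (interface, address) pairs whose flags include IFA_F_TENTATIVE.

    Expects text in the /proc/net/if_inet6 layout:
    address ifindex prefixlen scope flags devname (all but devname in hex).
    """
    result = []
    for line in if_inet6_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        addr_hex, _ifindex, _prefixlen, _scope, flags_hex, ifname = fields
        if int(flags_hex, 16) & IFA_F_TENTATIVE:
            addr = ipaddress.IPv6Address(bytes.fromhex(addr_hex))
            result.append((ifname, str(addr)))
    return result


# Illustrative sample (made up): the global address is still tentative
# (flags c0), the link-local address is already usable (flags 80).
SAMPLE = """\
20010000000000000000000000000001 1a 40 00 c0 a-wlan0
fe80000000000000000000fffe000a00 1a 40 20 80 a-wlan0
"""

print(tentative_addresses(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/net/if_inet6 shortly after adding an address to watch the flag disappear.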
I suppose NetworkMonitor can be changed to ignore tentative addresses and only emit onAddressAdded when the address becomes non-tentative (i.e., valid). I don't see a use case for reporting tentative addresses to "normal" applications (including NFD) as they can't effectively be used for communication.
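The filtering idea can be sketched roughly as follows. This is an illustration in Python, not the ndn-cxx patch: the class AddressFilter and its method names are hypothetical, and it assumes the monitor receives a second notification for the same address once the tentative flag clears.

```python
IFA_F_TENTATIVE = 0x40  # value from <linux/if_addr.h>


class AddressFilter:
    """Forward an address to on_added only once it is no longer tentative."""

    def __init__(self, on_added):
        self._on_added = on_added
        self._announced = set()  # (ifname, address) pairs already reported

    def handle_update(self, ifname, address, flags):
        # Hypothetical entry point, called for every address notification.
        if flags & IFA_F_TENTATIVE:
            return  # not usable yet, so stay silent
        key = (ifname, address)
        if key not in self._announced:
            self._announced.add(key)
            self._on_added(ifname, address)

    def handle_removal(self, ifname, address):
        # Forget the address so that a later re-add is reported again.
        self._announced.discard((ifname, address))


events = []
monitor = AddressFilter(lambda ifname, addr: events.append((ifname, addr)))
monitor.handle_update("a-wlan0", "2001::1", 0xC0)  # tentative: ignored
monitor.handle_update("a-wlan0", "2001::1", 0x80)  # flag cleared: reported
print(events)
```

With this shape, a consumer such as NFD would only ever see addresses it can bind to, at the cost of learning about them a second or two later.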
On top of that, the UdpFactory code can be made more robust against unexpected exceptions from the lower levels.
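One possible form of that robustness is to treat "Cannot assign requested address" (EADDRNOTAVAIL) as a transient condition and retry instead of crashing. The sketch below is an assumption about how this could look, not what UdpFactory does: bind_with_retry and its parameters are hypothetical, and the bind step is passed in as a callable so the logic can be exercised without a real socket.

```python
import errno
import time


def bind_with_retry(bind, attempts=5, delay=0.5, sleep=time.sleep):
    """Call bind() until it succeeds, retrying only on EADDRNOTAVAIL.

    Any other error, or EADDRNOTAVAIL on the last attempt, is re-raised.
    """
    for attempt in range(attempts):
        try:
            return bind()
        except OSError as e:
            if e.errno != errno.EADDRNOTAVAIL or attempt == attempts - 1:
                raise
            sleep(delay)  # give the address time to leave the tentative state


# Fake bind for illustration: fails twice, then succeeds.
calls = []

def fake_bind():
    calls.append(1)
    if len(calls) < 3:
        raise OSError(errno.EADDRNOTAVAIL, "Cannot assign requested address")
    return "bound"

print(bind_with_retry(fake_bind, sleep=lambda _: None))
```

With a real socket, the callable could be something like `lambda: sock.bind((address, port))`. Retrying only masks the race, so it complements rather than replaces filtering out tentative addresses.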
Updated by Davide Pesavento over 3 years ago
- Status changed from New to In Progress
- Assignee set to Davide Pesavento
- Target version set to 22.02
- % Done changed from 0 to 50
Untested patch for ndn-cxx: https://gerrit.named-data.net/c/ndn-cxx/+/6424
Updated by Davide Pesavento over 3 years ago
- Project changed from NFD to ndn-cxx
- Category changed from Faces to Network
- Status changed from In Progress to Code review
- Target version changed from 22.02 to 0.8.0
- % Done changed from 50 to 100
Updated by Davide Pesavento over 3 years ago
- Status changed from Code review to Closed