Bug #5155

"bind: Cannot assign requested address" immediately after a new IPv6 address appears on an interface

Added by Alexander Lane 2 months ago. Updated 21 days ago.

Status: Closed
Priority: Normal
Category: Network
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Description

Mini-NDN has had a persistent IPv6-related NFD crash for several months that we have had trouble solving, although we have workarounds. The issue appears to occur when NFD tries to create a multicast IPv6 face on a Mini-NDN-Wifi interface while NetworkManager is enabled. To mitigate it, either NetworkManager has to be disabled and NFD given an additional wait time (~3s), or the node must have no IPv6 address assigned at all. Attached are a Mini-NDN script that I used to consistently reproduce the issue (check the logs of node a) and a log from one of these runs.

Mini-NDN can be found at https://github.com/named-data/mini-ndn; the installation scripts should include all needed dependencies.


Files

example_log.txt (6.75 KB) Alexander Lane, 04/05/2021 10:13 PM
example.py (1.86 KB) Alexander Lane, 04/05/2021 10:14 PM
example_log_after_patch.txt (9.17 KB) Request logs, Alexander Lane, 04/08/2021 05:08 PM
#1

Updated by Davide Pesavento 2 months ago

  • Tracker changed from Task to Bug
  • Subject changed from NFD Crashes with IPv6 in Mini-NDN-Wifi to "bind: Cannot assign requested address" with IPv6 in Mini-NDN-Wifi
  • Category set to Faces
  • Start date deleted (04/06/2021)
#2

Updated by Alexander Lane 2 months ago

  • Description updated (diff)
#3

Updated by Davide Pesavento 2 months ago

Is this on Ubuntu? What version?

#4

Updated by Alexander Lane 2 months ago

Davide Pesavento wrote in #note-3:

Is this on Ubuntu? What version?

I've been working on 18.04 LTS in a VirtualBox VM. I can try to reproduce on 20.04/20.10 if that would be helpful.

#5

Updated by Junxiao Shi 2 months ago

I cannot reproduce this issue with the given script on Ubuntu 20.04.
The experiment was conducted on the Virtual Wall 1 testbed, on a pcgen02-1p instance.
Version and network information are pasted below.

mini-ndn> a head -2 log/nfd.log
NFD version 0.7.1-21-g7249fb4d starting
Built with GNU C++ version 9.3.0, with GNU libstdc++ version 20200808, with Boost version 1.71.0, with libpcap version 1.9.1 (with TPACKET_V3), with WebSocket++ version 0.8.1, with ndn-cxx version 0.7.1-22-gc25e463e

mini-ndn> a nfdc face list scheme udp6
faceid=259 remote=udp6://[ff02::1234%a-wlan0]:56363 local=udp6://[2001::1]:35670 congestion={base-marking-interval=100ms default-threshold=65536B} mtu=8800 counters={in={0i 0d 0n 0B} out={0i 0d 0n 0B}} flags={non-local permanent multi-access congestion-marking}

mini-ndn> a ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
26: a-wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 02:00:00:00:0a:00 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.1/8 brd 10.255.255.255 scope global a-wlan0
       valid_lft forever preferred_lft forever
    inet6 2001::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::ff:fe00:a00/64 scope link
       valid_lft forever preferred_lft forever

mini-ndn> b nfdc face list scheme udp6
faceid=259 remote=udp6://[ff02::1234%b-eth0]:56363 local=udp6://[fe80::88ea:d1ff:fe54:acce%b-eth0]:45768 congestion={base-marking-interval=100ms default-threshold=65536B} mtu=8800 counters={in={0i 0d 0n 0B} out={0i 0d 0n 0B}} flags={non-local permanent multi-access congestion-marking}

mini-ndn> b ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: b-eth0@if28: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 8a:ea:d1:54:ac:ce brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.0.2/8 brd 10.255.255.255 scope global b-eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::88ea:d1ff:fe54:acce/64 scope link
       valid_lft forever preferred_lft forever

My installation steps were:

# enable IPv4 NAT to reach GitHub
wget -O - -nv --ciphers DEFAULT@SECLEVEL=1 https://www.wall2.ilabt.iminds.be/enable-nat.sh | sudo bash

# install NFD nightly packages
echo "deb [trusted=yes] https://nfd-nightly-apt.ndn.today/ubuntu focal main" | sudo tee /etc/apt/sources.list.d/nfd-nightly.list
sudo apt-get update
sudo apt-get install -qq infoedit libndn-cxx-dev ndnchunks ndnpeek ndnping nfd nlsr
sudo systemctl stop nfd
sudo systemctl stop ndnping

# set Python 3 as default
sudo update-alternatives --remove-all python || true
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1

# install Mininet and Mini-NDN
git clone https://github.com/named-data/mini-ndn.git
cd mini-ndn
./install.sh -wmi

# experiment crashes without this file
sudo mkdir -p /etc/NetworkManager/conf.d/
sudo touch /etc/NetworkManager/conf.d/unmanaged.conf

# run the experiment script
wget https://redmine.named-data.net/attachments/download/946/example.py
sudo python example.py
#6

Updated by Davide Pesavento 2 months ago

Alex, can you please apply this patch https://gerrit.named-data.net/c/NFD/+/6400 and run the experiment again until it crashes, then attach the log file?

#7

Updated by Alexander Lane 2 months ago

Davide Pesavento wrote in #note-6:

Alex, can you please apply this patch https://gerrit.named-data.net/c/NFD/+/6400 and run the experiment again until it crashes, then attach the log file?

I have done so; the log should be attached. I will try 20.04 and see if my local behavior is consistent with what Junxiao is experiencing.

#8

Updated by Davide Pesavento 2 months ago

  • Subject changed from "bind: Cannot assign requested address" with IPv6 in Mini-NDN-Wifi to "bind: Cannot assign requested address" immediately after a new IPv6 address appears on an interface
#9

Updated by Davide Pesavento 2 months ago

Alexander Lane wrote in #note-7:

I will try 20.04 and see if my local behavior is consistent with what Junxiao is experiencing.

No need. I have all the info I need. And I can easily reproduce the crash outside Mini-NDN.

#10

Updated by Davide Pesavento 2 months ago

TL;DR: it appears that the kernel doesn't like it if you try to bind a socket to a tentative IPv6 address.

Whenever an IPv6 address is added to an interface, it will first appear with the IFA_F_TENTATIVE flag for a short period of time (1-2 seconds), after which the flag is dropped and the address can be used. NFD reacts to NetworkMonitor's onAddressAdded signal to create new multicast faces and does not check the tentative flag on the new address. So, if NFD is quick enough, it may attempt to bind the MulticastUdpTransport's TX socket to the new address while it's still tentative, causing the system call to fail.
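One way to observe the tentative state described above is to inspect the per-address flags that Linux exposes in /proc/net/if_inet6. The following is a minimal sketch, not NFD or ndn-cxx code: the function names are hypothetical, and it assumes the usual six-column layout of that file (address, interface index, prefix length, scope, flags, interface name, with the numeric fields in hexadecimal). The flag value 0x40 is IFA_F_TENTATIVE from <linux/if_addr.h>.

```python
import ipaddress

IFA_F_TENTATIVE = 0x40  # value from <linux/if_addr.h>


def tentative_addresses(lines):
    """Return (ifname, address) pairs whose flags include IFA_F_TENTATIVE.

    Each line is assumed to follow the /proc/net/if_inet6 layout:
    address(32 hex digits) ifindex prefixlen scope flags ifname
    """
    result = []
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        addr_hex, _ifindex, _prefixlen, _scope, flags_hex, ifname = fields
        if int(flags_hex, 16) & IFA_F_TENTATIVE:
            addr = ipaddress.IPv6Address(int(addr_hex, 16))
            result.append((ifname, str(addr)))
    return result


def read_tentative(path="/proc/net/if_inet6"):
    """Hypothetical convenience wrapper: parse the live kernel table."""
    with open(path) as f:
        return tentative_addresses(f)
```

Calling read_tentative() inside the node's network namespace right after the address is assigned should list the address for as long as the flag is set; binding to it during that window is what is expected to fail.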

I suppose NetworkMonitor can be changed to ignore tentative addresses and only emit onAddressAdded when the address becomes non-tentative (i.e., valid). I don't see a use case for reporting tentative addresses to "normal" applications (including NFD) as they can't effectively be used for communication.
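The filtering idea can be sketched as follows. This is an illustrative model only, not the actual ndn-cxx NetworkMonitor change: the class and method names are hypothetical, and it assumes the kernel delivers a second notification for the same address once the tentative flag has been cleared.

```python
IFA_F_TENTATIVE = 0x40  # value from <linux/if_addr.h>


class AddressFilter:
    """Hypothetical helper that reports an address only once it is usable."""

    def __init__(self, on_address_added):
        self._on_address_added = on_address_added
        self._announced = set()

    def handle_new_address(self, ifname, address, flags):
        """Process one 'address added or updated' notification."""
        key = (ifname, address)
        if flags & IFA_F_TENTATIVE:
            return  # still tentative: not usable for bind() yet
        if key in self._announced:
            return  # flag update on an address that was already reported
        self._announced.add(key)
        self._on_address_added(ifname, address)

    def handle_deleted_address(self, ifname, address):
        """Forget the address so that a later re-add is reported again."""
        self._announced.discard((ifname, address))
```

With this shape, a consumer such as NFD would receive the callback exactly once per address, and only after the address has left the tentative state.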

On top of that, the UdpFactory code can be made more robust against unexpected exceptions from the lower levels.
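As a rough illustration of the robustness point, the sketch below shows face creation that logs and skips an address when the lower layer raises, instead of letting the error propagate and terminate the process. It is written in Python for brevity and does not reflect the real UdpFactory code; create_multicast_faces and the create_face callback are hypothetical names.

```python
import errno
import logging

log = logging.getLogger("udp_factory_sketch")


def create_multicast_faces(local_addresses, create_face):
    """Try to create one face per address; skip addresses that fail.

    create_face is a caller-supplied callable (hypothetical) that binds a
    socket to the given address and may raise OSError, for example with
    errno.EADDRNOTAVAIL ("Cannot assign requested address").
    """
    faces = []
    for addr in local_addresses:
        try:
            faces.append(create_face(addr))
        except OSError as e:
            # Log and continue so that one bad address cannot crash the caller.
            log.warning("Skipping %s: %s", addr, e)
    return faces
```

Skipping rather than retrying keeps the sketch simple; a skipped address would still need to be picked up later, for instance by a subsequent address notification.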

#11

Updated by Davide Pesavento 28 days ago

  • Status changed from New to In Progress
  • Assignee set to Davide Pesavento
  • Target version set to v0.8
  • % Done changed from 0 to 50
#12

Updated by Davide Pesavento 27 days ago

  • Project changed from NFD to ndn-cxx
  • Category changed from Faces to Network
  • Status changed from In Progress to Code review
  • Target version changed from v0.8 to v0.8
  • % Done changed from 50 to 100
#13

Updated by Davide Pesavento 21 days ago

  • Status changed from Code review to Closed
