FaceTable::remove called twice when UnixStreamFace is destroyed
I'm seeing the following in the log
1397777209.441760 INFO: [UnixStreamFace] [id:10,endpoint:/private/tmp/nfd.sock] Connection closed 1397777209.445571 INFO: [FaceTable] Removed face id=10 remote=fd://23 local=unix:///private/tmp/nfd.sock 1397777209.448699 INFO: [FaceTable] Removed face id=-1 remote=fd://23 local=unix:///private/tmp/nfd.sock
This is not critical, but we should find this second remove and eliminate it (if we can).
#2 Updated by Davide Pesavento over 5 years ago
I cannot reproduce on Linux.
1397842660.815849 INFO: [FaceTable] Added face id=4 remote=fd://17 local=unix:///run/nfd.sock 1397842660.826540 INFO: [UnixStreamFace] [id:4,endpoint:/run/nfd.sock] Connection closed 1397842660.829236 INFO: [FaceTable] Removed face id=4 remote=fd://17 local=unix:///run/nfd.sock
What are your steps?
#5 Updated by Davide Pesavento over 5 years ago
This is what I believe is happening:
- the peer (nrd) closes his side of the connection
StreamFace::handleReceive) is invoked with an error code (EOF), which in turn calls
closeSocket, which shuts down and closes the underlying socket
- then the
onFailnotification chain is started
FaceTable's handler is run first, that in turns fires its own event
onRemove, to which
FaceManager::onRemoveFaceexecutes and sends a
FaceEventNotificationvia the internal face
- >> the forwarder dispatches the Data packet coming from the internal face to the same unix stream face to which nrd was connected
- >>> the
StreamFaceschedules the transmission and returns immediately
- back in
FaceTable::remove, the face is finally erased from the table, its id set to -1, and the INFO message printed; note that the
onReceive(Interest|Data)callback chains are cleared, but
- when control goes back to boost::asio event loop, the previously scheduled Data sending is eventually attempted but fails because the socket is closed
processErrorCodeis called again, which triggers the second
onFailevent for the same face
- the whole
onRemovechain of handlers is repeated again, but this time the
FaceEventNotificationis not propagated to the (closing) unix face because it's no longer present in the face table (otherwise we would have had an infinite loop)
- the second
FaceTableINFO message, with
id=-1, is printed
- the whole
#7 Updated by Anonymous over 5 years ago
I can reproduce this using apache jmeter and the nfd-status-http-server.
- Start NFD (sudo ./build/bin/nfd)
- Start nfd-status-http-server (./build/bin/nfd-status-http-server -v)
- Load attached jmeter configuration and select "run"
NFD will crash with the following
tail of output (default log level=WARN, FaceTable TRACE. I also tried doing default at DEBUG, but I didn't notice anything particularly interesting):
1398458023.296022 INFO: [FaceTable] Added face id=52 remote=fd://26 local=unix:///private/var/run/nfd.sock 1398458023.397617 INFO: [FaceTable] Removed face id=47 remote=fd://28 local=unix:///private/var/run/nfd.sock 1398458023.398223 INFO: [FaceTable] Added face id=53 remote=fd://27 local=unix:///private/var/run/nfd.sock 1398458023.497706 INFO: [FaceTable] Removed face id=48 remote=fd://22 local=unix:///private/var/run/nfd.sock 1398458023.498518 INFO: [FaceTable] Added face id=54 remote=fd://28 local=unix:///private/var/run/nfd.sock 1398458023.522723 WARNING: [UnixStreamFace] [id:53,endpoint:/private/var/run/nfd.sock] Send or receive operation failed, closing socket: Broken pipe 1398458023.544633 INFO: [FaceTable] Removed face id=53 remote=fd://27 local=unix:///private/var/run/nfd.sock 1398458023.569262 INFO: [FaceTable] Removed face id=-1 remote=fd://27 local=unix:///private/var/run/nfd.sock 1398458023.645803 INFO: [FaceTable] Removed face id=49 remote=fd://23 local=unix:///private/var/run/nfd.sock 1398458023.647216 FATAL: [NFD] call to empty boost::function
Credit to Chengyu for finding this while testing the http server :)
The number of clients jmeter uses impacts reproducing the bug. The attached configuration uses 100, but adjust as necessary.
#8 Updated by Davide Pesavento over 5 years ago
- Category changed from Faces to Tables
- Priority changed from Low to Normal
Raising priority because it can apparently trigger a crash in nfd.
1398458023.647216 FATAL: [NFD] call to empty boost::function
Can you find out what function is it and/or who is trying to call it?