Bug #1360: producers cannot connect to running nfd if by mistake another instance of nfd is started. - NFD - NDN project issue tracking system

Added by Syed Amin over 11 years ago. Updated over 11 years ago.

Status: Closed
Priority: Low
Assignee: Davide Pesavento
Category: Faces
Target version: v0.1
Start date: 03/17/2014
Due date:
% Done: 90%
Estimated time:

Description

I mistakenly started a second instance of nfd in another terminal, which, as expected, quit after printing the following messages:

$ sudo NFD=1 NRD=1 NFD_LOG=all ~/nfd/build/nfd --config /usr/local/etc/ndn/nfd.conf.sample
DEBUG: [NameTree] lookup /
DEBUG: [NameTree] insert /
DEBUG: [NameTree] Name / hash value = 2654435816 location = 488
DEBUG: [NameTree] Did not find /, need to insert it to the table
INFO: [StrategyChoice] setDefaultStrategy(/localhost/nfd/strategy/best-route) new entry
DEBUG: [FaceUri] URI [internal://] parsed into: internal, , ,
INFO: [FaceTable] addFace id=1
INFO: [InternalFace] registering callback for /localhost/nfd/fib
INFO: [InternalFace] registering callback for /localhost/nfd/faces
INFO: [InternalFace] registering callback for /localhost/nfd/control-header
INFO: [InternalFace] registering callback for /localhost/nfd/strategy-choice
DEBUG: [CommandValidator] generated certfile path: /usr/local/etc/ndn/keys/default.ndncert
INFO: [CommandValidator] Giving privilege "control-header" to identity /obaid/ksk-1395076550007
INFO: [CommandValidator] Giving privilege "faces" to identity /obaid/ksk-1395076550007
INFO: [CommandValidator] Giving privilege "fib" to identity /obaid/ksk-1395076550007
INFO: [CommandValidator] Giving privilege "strategy-choice" to identity /obaid/ksk-1395076550007
DEBUG: [TcpFactory] Channel [0.0.0.0:6363] created
DEBUG: [TcpFactory] Channel [[::]:6363] created
ERROR: [Main] Error: bind: Address already in use

However, after this I could no longer connect any producer to the already running nfd. To connect a producer again, I had to restart the running nfd as well.
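The "bind: Address already in use" error in the log above is the ordinary result of two processes trying to listen on the same TCP port. The following minimal Python sketch reproduces that failure in isolation. It is an illustration only, not NFD code: `try_bind` is a hypothetical helper, and the sketch uses an OS-assigned loopback port rather than 6363.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Bind and listen on host:port; return (socket, None) or (None, errno)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


# First "instance": let the OS pick a free port, then keep listening on it.
first, err = try_bind(0)
port = first.getsockname()[1]

# Second "instance": binding the same port fails while the first is listening.
second, err2 = try_bind(port)
print(second is None, err2 == errno.EADDRINUSE)

first.close()
```

Note that this failure by itself does not disturb the first listener, which suggests the breakage described here comes from something the second instance does before it reaches the failing bind.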

Steps to reproduce the error:

On Terminal 1 start nfd:

$ sudo NFD=1 NRD=1 NFD_LOG=all ~/nfd/build/nfd --config /usr/local/etc/ndn/nfd.conf.sample

On Terminal 2 start another instance of nfd:

$ sudo NFD=1 NRD=1 NFD_LOG=all ~/nfd/build/nfd --config /usr/local/etc/ndn/nfd.conf.sample

(This prints the errors shown above and quits.)

Start producer:

$ NFD=1 ~/ndn-cpp-dev/build/examples/producer

ERROR: error while connecting to the forwarder (Connection refused)

Related issues 1 (0 open — 1 closed)

Updated by Junxiao Shi over 11 years ago

ndnd solves this problem as follows:

(old process) periodically checks whether the UNIX socket still exists, and quits if the socket is gone
(new process) checks during initialization whether the UNIX socket already exists; if so, deletes the socket and waits 8 seconds for the old process to stop
deletes the UNIX socket during normal shutdown

NFD doesn't always have a UNIX socket. Some script (or an upstart job, not NFD itself) should record the PID in a file and kill the old process before starting a new one.
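A minimal sketch of the PID-file approach suggested here, written in Python for illustration. All function names and the PID file location are hypothetical; nothing below is taken from NFD, ndnd, or their start/stop scripts.

```python
import os
import signal
import tempfile


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_running(pid):
    """Signal 0 checks that a process exists without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_old_instance(pid_file):
    """Send SIGTERM to the recorded process, if any; return True if signalled."""
    pid = read_pid(pid_file)
    if pid is None or not is_running(pid):
        return False
    os.kill(pid, signal.SIGTERM)
    return True


def record_pid(pid_file, pid):
    """Write pid to pid_file so a later start can find this process."""
    with open(pid_file, "w") as f:
        f.write(f"{pid}\n")


# Demonstration with a temporary, hypothetical PID file location.
pid_file = os.path.join(tempfile.mkdtemp(), "nfd-demo.pid")
print(stop_old_instance(pid_file))   # nothing recorded yet
record_pid(pid_file, os.getpid())
print(read_pid(pid_file) == os.getpid(), is_running(os.getpid()))
```

A wrapper script would call `stop_old_instance` before launching the daemon and `record_pid` right after. A stale PID file can point at an unrelated process that reused the PID, so a real script would likely also verify the process name.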

Updated by Alex Afanasyev over 11 years ago

But this particular problem occurs because the second run of NFD wasn't prevented at an early stage. Should we fix this in NFD itself, or defer it to upstart-like tools?

Updated by Junxiao Shi over 11 years ago

NFD itself doesn't need to prevent this. It should be left to upstart.

The code repository should have a bash script similar to ndndstart and ndndstop, for use on platforms without upstart.

Updated by Alex Afanasyev over 11 years ago

But it is kind of bad that we remove the UNIX socket file if NFD is accidentally run a second time... In any case, I agree to defer this to upstart.
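The effect of removing the socket file can be reproduced outside NFD. The Python sketch below assumes the second instance unlinks the existing UNIX socket path, binds its own socket there, and then exits without cleaning up. That exact sequence is an assumption made for illustration; the thread only states that the file is removed. The socket path is hypothetical.

```python
import os
import socket
import tempfile

path = os.path.join(tempfile.mkdtemp(), "nfd-demo.sock")  # hypothetical path

# "Old" instance: bind and listen on the UNIX socket.
old = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
old.bind(path)
old.listen(1)

# "New" instance (assumed behavior): remove the existing file, bind its own
# socket at the same path, then exit on a later error without cleaning up.
os.unlink(path)
new = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
new.bind(path)
new.close()

# A client now reaches the stale file left by the new instance, not the old
# instance's socket, which is still listening but no longer has a path.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    client.connect(path)
    result = "connected"
except ConnectionRefusedError:
    result = "refused"
finally:
    client.close()
    old.close()

print(result)
```

Under these assumptions the old process keeps running normally but can no longer be reached through the path, which is consistent with the "Connection refused" error the producer reports.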

Updated by Junxiao Shi over 11 years ago

Category set to Faces
Target version set to v0.1

Updated by Davide Pesavento over 11 years ago

What exactly are the requirements here? Do we really want to support running multiple instances of nfd on the same machine concurrently?

Also, I wouldn't focus too much on upstart for this: it's dying, since even Debian and Ubuntu will soon abandon it in favor of systemd, and almost all other major distros have switched or are switching to systemd as well.

Updated by Junxiao Shi over 11 years ago

See #1367 for the solution to this bug.

Updated by Davide Pesavento over 11 years ago

Assignee set to Davide Pesavento

Updated by Davide Pesavento over 11 years ago

Status changed from New to In Progress
% Done changed from 0 to 90

commit:e22d8c84 fixed this bug in almost all cases. However, there is still a small window for a race condition between two nfd processes that start up at roughly the same time in the presence of a stale socket file. I plan to fix the race condition in a subsequent patch.
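One common way to close this kind of startup race is to serialize the "check for a stale socket, remove it, bind" sequence behind an exclusive advisory lock. The Python sketch below shows that general technique using `fcntl.flock`. It is not what commit:e22d8c84 or any later NFD patch does; `acquire_startup_lock` and the lock file path are hypothetical.

```python
import fcntl
import os
import tempfile


def acquire_startup_lock(lock_path):
    """Return an open fd holding an exclusive lock, or None if already held."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


lock_path = os.path.join(tempfile.mkdtemp(), "nfd-demo.lock")  # hypothetical

# The first starter gets the lock and may safely clean up a stale socket.
first = acquire_startup_lock(lock_path)

# A concurrent starter fails to get the lock and should exit early.
second = acquire_startup_lock(lock_path)

print(first is not None, second is None)
```

The kernel releases the lock when the holder exits, even after a crash, so no stale lock needs cleaning up, unlike the socket file itself.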

Updated by Junxiao Shi over 11 years ago

Status changed from In Progress to Closed

Updated by Davide Pesavento over 11 years ago

This bug is not completely fixed, why did you close it?
