Bug #2121: nfd-status-http-server subprocesses hang with data size over 65536 - NFD - NDN project issue tracking system

Bug #2121

closed

Added by John DeHart over 10 years ago. Updated over 10 years ago.

Status: Closed
Priority: Urgent
Assignee: Alex Afanasyev
Category: Tools
Target version: v0.3
Start date: 11/03/2014
Due date:
% Done: 100%
Estimated time: 1.00 h

Description

nfd-status-http-server.py spawns 'nfd-status -x' subprocesses to gather
NFD status information. If the output of such a subprocess exceeds
65536 bytes, the subprocess hangs. I believe this is a limitation of
the subprocess.PIPE mechanism.

There is some discussion of this issue here:
http://thraxil.org/users/anders/posts/2008/03/13/Subprocess-Hanging-PIPE-is-your-enemy/

This is a rather serious problem because our Testbed nodes are now limited
in the number of links they can have. For example, when I tried to add the WASEDA
node, I wanted to add a link from WASEDA to ARIZONA but was unable to, because
the ARIZONA node would eventually hang with all the stuck subprocesses caused
by the testbed status page trying to monitor its status.

Updated by Davide Pesavento over 10 years ago

Pipes are not buffers. The reading side of a pipe is expected to consume available data as soon as possible. The 64K limit actually comes from the kernel, and it is intentional. From pipe(7):

A pipe has a limited capacity. If the pipe is full, then a write(2) will block or fail, depending on whether the O_NONBLOCK flag is set (see below). Different implementations have different limits for the pipe capacity. Applications should not rely on a particular capacity: an application should be designed so that a reading process consumes data as soon as it is available, so that a writing process does not remain blocked.
In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11, the pipe capacity is 65536 bytes. Since Linux 2.6.35, the default pipe capacity is 65536 bytes, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations. See fcntl(2) for more information.
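The capacity described in the pipe(7) excerpt can be observed empirically. The following is a minimal sketch, assuming a POSIX system and Python 3.5 or later; the function name measure_pipe_capacity is hypothetical and not part of NFD. It puts the write end of a pipe into non-blocking mode and writes until the kernel refuses more data, which is the point where a blocking writer such as 'nfd-status -x' would stall.

```python
import os


def measure_pipe_capacity(chunk=4096):
    """Return how many bytes fit into an unread pipe before a write would block.

    Hypothetical helper for illustration only; the result depends on the
    operating system and its configuration.
    """
    read_fd, write_fd = os.pipe()
    try:
        # With O_NONBLOCK set, a write to a full pipe fails with EAGAIN
        # (BlockingIOError in Python) instead of blocking forever.
        os.set_blocking(write_fd, False)
        block = b"\0" * chunk
        total = 0
        while True:
            try:
                total += os.write(write_fd, block)
            except BlockingIOError:
                # Nobody is reading, so the pipe is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe capacity in bytes:", measure_pipe_capacity())
```

On a Linux kernel with the default settings quoted above, the reported value should match the 65536-byte figure, but applications should not rely on it.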

Python's documentation has several warnings about incorrect usage of subprocess.PIPE:

Note: Do not use stdout=PIPE or stderr=PIPE with this function as that can deadlock based on the child process output volume. Use Popen with the communicate() method when you need pipes.

and on .wait():

Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
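The deadlock described in that warning can be reproduced safely by giving wait() a timeout. This is only an illustrative sketch: the child process is a stand-in for 'nfd-status -x' (it simply prints a given number of bytes), the function name wait_then_read is hypothetical, and it is an assumption that the original server code followed a similar wait-before-read pattern.

```python
import subprocess
import sys


def wait_then_read(nbytes, timeout=2.0):
    """Spawn a child that writes nbytes to stdout, wait for it, then read.

    Returns (stalled, bytes_read). 'stalled' is True when wait() timed out,
    i.e. the child was blocked writing into a full pipe.
    """
    # Stand-in child process; not the real nfd-status tool.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('x' * %d)" % nbytes]
    proc = subprocess.Popen(child, stdout=subprocess.PIPE)
    try:
        # Nothing reads the pipe while we wait. Without the timeout this
        # call would never return once the output exceeds the pipe capacity.
        proc.wait(timeout=timeout)
        stalled = False
    except subprocess.TimeoutExpired:
        stalled = True
    # Draining the pipe unblocks the child and lets it exit.
    out, _ = proc.communicate()
    return stalled, len(out)


if __name__ == "__main__":
    print(wait_then_read(1000, timeout=30.0))
    print(wait_then_read(1000000, timeout=1.0))
```

A small output fits in the pipe, so the child exits and wait() returns. An output far above the pipe capacity leaves the child blocked in write(2) until the parent starts reading.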

So the solution is to use Popen.communicate(), which keeps reading from the pipe until end-of-file while waiting for the child to exit, so the child never blocks on a full pipe.
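A minimal sketch of the communicate() approach follows. It is not the patch that was applied to nfd-status-http-server.py; the function name run_and_capture and the stand-in child command are assumptions made for illustration.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv and return (returncode, stdout_bytes) without a pipe deadlock."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    # communicate() reads stdout until EOF and then reaps the child, so the
    # child can write more than the pipe capacity without blocking forever.
    out, _ = proc.communicate()
    return proc.returncode, out


# Stand-in for ['nfd-status', '-x']: a child that prints 200000 bytes,
# well above the 65536-byte pipe capacity discussed in this issue.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 200000)"]

if __name__ == "__main__":
    code, data = run_and_capture(CHILD)
    print(code, len(data))
```

Note that communicate() holds the entire output in memory, which is acceptable for status reports of this size but would not suit unbounded streams.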

Updated by Junxiao Shi over 10 years ago

Category set to Tools
Assignee set to Alex Afanasyev
Target version set to v0.3
Estimated time set to 3.00 h

Updated by Alex Afanasyev over 10 years ago

Status changed from New to Code review
% Done changed from 0 to 100
Estimated time changed from 3.00 h to 1.00 h

Updated by Junxiao Shi over 10 years ago

To trigger this bug:

for I in $(seq 0 200); do echo $I | nc -u localhost 6363 & done
curl http://localhost:8080

Updated by Junxiao Shi over 10 years ago

Status changed from Code review to Closed
