Bug #4410
Closed: `run_tests.py test_ndnping` fails to terminate
Added by Junxiao Shi almost 7 years ago. Updated over 6 years ago.
100%
Description
Environment: Ubuntu 16.04 single node
Steps to reproduce:
./install_apps.py install_all
./run_tests.py test_ndnping
Expected: script terminates after testing completes
Actual: script prints "Ran 1 test in 6.022s OK" but fails to terminate
Diagnostics:
$ pstree -p $(pgrep run_tests.py | head -1)
run_tests.py(9466)───run_tests.py(9474)───sudo(9477)───nfd(9479)─┬─{nfd}(9481)
└─{nfd}(9482)
The same issue occurs with `./run_tests.py test_cs_freshness`.
Files
4833-3.txz (23.9 KB) | Junxiao Shi, 07/02/2018 06:15 AM
4833-4.txz (23.5 KB) | Junxiao Shi, 07/05/2018 09:30 AM
4833-5.txz (23.4 KB) | Junxiao Shi, 07/09/2018 09:00 AM
Updated by Davide Pesavento almost 7 years ago
- Related to Bug #4379: integration tests: fix broken tests added
Updated by Eric Newberry almost 7 years ago
In the Vagrant environment, this script terminates when running all tests together (`test_all`). I have not tried running the single test, but I plan to test this shortly.
Updated by Eric Newberry almost 7 years ago
I am unable to replicate this issue in the Vagrant environment.
Updated by Eric Newberry almost 7 years ago
Junxiao, what particular environment are you encountering this issue on? Emulab?
Updated by Junxiao Shi almost 7 years ago
what particular environment are you encountering this issue on?
It’s a node on a private Emulab system.
Updated by Eric Newberry almost 7 years ago
- Start date deleted (12/21/2017)
I'm actually seeing this issue when I run the integration tests on Ubuntu 16.04 64-bit in Virtualbox (Vagrant uses 14.04 64-bit). It appears with test_ndnping, but not test_cs_freshness in my environment.
Updated by Junxiao Shi almost 7 years ago
The cause of this issue is that `ProcessManager.killProcess` is using the wrong PID to kill the process.
def killProcess(self, processKey):
    if processKey not in self.results and processKey in self.subprocesses:
        subprocess.call(['sudo', 'kill', str(self.subprocesses[processKey].pid)])
When a process is started, Python's subprocess module assigns `self.subprocesses[processKey].pid` to be the PID of the started process.
Since NFD daemonizes itself, the PID does not match the `nfd` process.
I confirmed this by inserting `print` into the Python code and comparing with `pgrep`.
To fix this issue, `ProcessManager.startNfd` and `ProcessManager.killNfd` should use the `nfd-start` and `nfd-stop` scripts.
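A minimal sketch of what that proposal could look like, not the actual `ProcessManager` code. It assumes `nfd-start` and `nfd-stop` are on `PATH` and take care of privileges themselves; the trimmed-down class and the `run` parameter are hypothetical and exist only so the commands can be checked without starting NFD.

```python
import subprocess


class ProcessManager:
    """Hypothetical, trimmed-down stand-in for the integration-test helper."""

    def __init__(self, run=subprocess.call):
        # 'run' is an assumption added for illustration: it lets a caller
        # substitute a fake runner instead of executing real commands.
        self._run = run

    def startNfd(self):
        # Delegate to the helper script instead of tracking a PID that
        # belongs to an intermediate process.
        return self._run(['nfd-start'])

    def killNfd(self):
        # nfd-stop locates the running nfd on its own, so no PID is needed.
        return self._run(['nfd-stop'])
```

With this shape, stopping NFD no longer depends on `self.subprocesses[processKey].pid` at all.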
Updated by Eric Newberry almost 7 years ago
- Status changed from New to In Progress
Updated by Eric Newberry almost 7 years ago
- Status changed from In Progress to Code review
- % Done changed from 0 to 100
Updated by Davide Pesavento almost 7 years ago
Junxiao Shi wrote:
Since NFD daemonizes itself, the PID does not match `nfd` process.
What are you talking about? NFD does not daemonize itself.
Updated by Eric Newberry almost 7 years ago
Junxiao, what I believe you're seeing in the issue description are two threads of nfd (in the curly braces). I believe (and may be wrong) that terminating the parent process would terminate both threads and that the existing code should work to terminate NFD.
Updated by Eric Newberry almost 7 years ago
Another thought: Perhaps `kill` is terminating `sudo`, leaving the `nfd` process as an orphan.
Updated by Davide Pesavento almost 7 years ago
Eric Newberry wrote:
Another thought: Perhaps `kill` is terminating `sudo`, leaving the `nfd` process as an orphan.
Yes, `kill` is signaling the `sudo` process, but `sudo` should propagate the signal to its child process.
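To make the PID question concrete, here is a small sketch that does not involve `sudo` or `nfd`. A hypothetical Python wrapper stands in for `sudo`, and a sleeping child stands in for `nfd`. It only shows that the PID returned by `subprocess.Popen` is the wrapper's, and that the child survives when the wrapper dies from a signal it cannot forward (SIGKILL is used here because it cannot be caught). It says nothing about how `sudo` itself relays signals.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for "sudo nfd": start a child, report the child's
# PID on stdout, then wait for the child to exit.
WRAPPER = (
    "import subprocess, sys\n"
    "child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])\n"
    "print(child.pid, flush=True)\n"
    "child.wait()\n"
)


def spawn_wrapped():
    """Start the wrapper and return (wrapper_popen, child_pid)."""
    wrapper = subprocess.Popen([sys.executable, "-c", WRAPPER],
                               stdout=subprocess.PIPE, text=True)
    child_pid = int(wrapper.stdout.readline())
    return wrapper, child_pid


def is_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True


def demo():
    """Return (pids_differ, child_survived_wrapper_kill)."""
    wrapper, child_pid = spawn_wrapped()
    pids_differ = wrapper.pid != child_pid
    wrapper.kill()   # SIGKILL: the wrapper gets no chance to forward it
    wrapper.wait()
    wrapper.stdout.close()
    survived = is_alive(child_pid)
    os.kill(child_pid, signal.SIGKILL)  # clean up the orphaned child
    return pids_differ, survived
```

Calling `demo()` on a POSIX system reports whether the two PIDs differ and whether the child outlived the wrapper.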
Updated by Eric Newberry almost 7 years ago
We could also just rewrite this test case as a Bash script, like most other test cases. These other test cases do similar things, but do not seem to encounter this issue.
Updated by Junxiao Shi almost 7 years ago
We could also just rewrite this test case as a Bash script, like most other test cases.
Once all tests have moved to Bash, there's no need for a Python wrapper for each test. Use a Bash script to invoke each test instead.
These other test cases do similar things, but do not seem to encounter this issue.
They mostly use `nfd-stop` or `killall nfd`, as they should.
Updated by Eric Newberry over 6 years ago
- Blocks Task #4380: Run integration tests for every Jenkins build added
Updated by Eric Newberry over 6 years ago
I ran each test individually on the Vagrant environment and found that the following ones failed to terminate:
- test_interest_aggregation
- test_ndnpeekpoke
- test_ndnping
- test_ndntraffic
Updated by Eric Newberry over 6 years ago
I pushed a change to rewrite test_ndnpeekpoke, test_ndnping, and test_ndntraffic as Bash-based tests, which resolves this issue. I decided to leave test_interest_aggregation to #4379, since it's broken anyway.
Updated by Junxiao Shi over 6 years ago
- File 4833-3.txz 4833-3.txz added
Change 4833,3 fails to terminate in `./run-vagrant-tests.sh`.
Node A has the following process tree when stuck. The test proceeds after executing `nfd-stop` on node A.
vagrant@vagrant:~/integration-tests$ pstree -p 2694
run_tests.py(2694)---run_tests.py(5858)---sudo(5861)---nfd(5863)-+-{nfd}(5864)
`-{nfd}(5865)
Updated by Eric Newberry over 6 years ago
Junxiao Shi wrote:
Change 4833,3 fails to terminate in `./run-vagrant-tests.sh`.
Node A has the following process when stuck. Test proceeds after executing `nfd-stop` in node A.
vagrant@vagrant:~/integration-tests$ pstree -p 2694
run_tests.py(2694)---run_tests.py(5858)---sudo(5861)---nfd(5863)-+-{nfd}(5864)
`-{nfd}(5865)
As I said in note 18, I'm not planning to fix test_interest_aggregation, so this is probably the cause.
Updated by Junxiao Shi over 6 years ago
- File 4833-4.txz 4833-4.txz added
Change 4833,3 fails to terminate in `./run-vagrant-tests.sh`.
I also see error messages during execution:
../permanent-face-test.sh: line 34: [[: 0
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 52: [[: 100
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 74: [[: 0
10: syntax error in expression (error token is "10")
Updated by Eric Newberry over 6 years ago
Junxiao Shi wrote:
I also see error messages during execution:
../permanent-face-test.sh: line 34: [[: 0
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 52: [[: 100
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 74: [[: 0
10: syntax error in expression (error token is "10")
Fixing the test these are occurring in is not part of this issue, but rather #4379.
Updated by Junxiao Shi over 6 years ago
Fixing the test these are occurring in is not part of this issue, but rather #4379.
I'm not judging which issue the error messages belong to. I'm stating the fact that these error messages appear, just as Jenkins fails the build whenever a test case fails, regardless of whether it relates to the current commit.
Updated by Junxiao Shi over 6 years ago
- File 4833-5.txz 4833-5.txz added
Change 4833,5 terminates in `./run-vagrant-tests.sh`. Please continue fixing other errors, in this or other commits.
Updated by Junxiao Shi over 6 years ago
- Blocks Task #4656: Eliminate Python wrappers added
Updated by Eric Newberry over 6 years ago
- Status changed from Code review to Closed
Updated by Davide Pesavento almost 6 years ago
- Blocks deleted (Task #4656: Eliminate Python wrappers)