Bug #4410

closed

`run_tests.py test_ndnping` fails to terminate

Added by Junxiao Shi about 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Integration Tests
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:

Description

Environment: Ubuntu 16.04 single node

Steps to reproduce:

  1. ./install_apps.py install_all
  2. ./run_tests.py test_ndnping

Expected: script terminates after testing completes
Actual: script prints "Ran 1 test in 6.022s OK" but fails to terminate

Diagnostics:

$ pstree -p $(pgrep run_tests.py | head -1)
run_tests.py(9466)───run_tests.py(9474)───sudo(9477)───nfd(9479)─┬─{nfd}(9481)
                                                                 └─{nfd}(9482)

Same issue occurs with `./run_tests.py test_cs_freshness`.


Files

4833-3.txz (23.9 KB) Junxiao Shi, 07/02/2018 06:15 AM
4833-4.txz (23.5 KB) Junxiao Shi, 07/05/2018 09:30 AM
4833-5.txz (23.4 KB) Junxiao Shi, 07/09/2018 09:00 AM

Related issues 2 (0 open, 2 closed)

Related to NFD - Bug #4379: integration tests: fix broken tests (Abandoned)
Blocks NFD - Task #4380: Run integration tests for every Jenkins build (Abandoned)

Actions #1

Updated by Davide Pesavento about 7 years ago

  • Related to Bug #4379: integration tests: fix broken tests added
Actions #2

Updated by Eric Newberry about 7 years ago

In the Vagrant environment, this script terminates when running all tests together (test_all). I have not tried when running the single test, but I plan to test this shortly.

Actions #3

Updated by Eric Newberry about 7 years ago

I am unable to replicate this issue in the Vagrant environment.

Actions #4

Updated by Eric Newberry about 7 years ago

Junxiao, what particular environment are you encountering this issue on? Emulab?

Actions #5

Updated by Junxiao Shi about 7 years ago

what particular environment are you encountering this issue on?

It’s a node on a private Emulab system.

Actions #6

Updated by Eric Newberry almost 7 years ago

  • Start date deleted (12/21/2017)

I'm actually seeing this issue when I run the integration tests on Ubuntu 16.04 64-bit in VirtualBox (Vagrant uses 14.04 64-bit). In my environment it appears with test_ndnping, but not with test_cs_freshness.

Actions #7

Updated by Junxiao Shi almost 7 years ago

The cause of this issue is that ProcessManager.killProcess uses the wrong PID to kill the process.

def killProcess(self, processKey):
    if processKey not in self.results and processKey in self.subprocesses:
        subprocess.call(['sudo', 'kill', str(self.subprocesses[processKey].pid)])

When a process is started, Python's subprocess module assigns self.subprocesses[processKey].pid to be the PID of started process.
Since NFD daemonizes itself, the PID does not match nfd process.
I confirmed this by inserting print statements into the Python code and comparing the output with pgrep.

To fix this issue, ProcessManager.startNfd and ProcessManager.killNfd should use nfd-start and nfd-stop scripts.
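
The proposal above could look roughly like the following. This is only a sketch, not the change that was eventually merged: the class name `NfdController` and the injectable `run` parameter are hypothetical, and it assumes the `nfd-start` and `nfd-stop` scripts are on PATH and take care of privileges themselves.

```python
import subprocess


class NfdController:
    """Hypothetical helper: start and stop NFD through its scripts, not a stored PID."""

    def __init__(self, run=subprocess.call):
        # `run` is injectable so the command lines can be checked without NFD installed.
        self._run = run

    def start(self):
        # Assumption: nfd-start is on PATH.
        return self._run(['nfd-start'])

    def stop(self):
        # Assumption: nfd-stop finds the running nfd itself, so no PID is tracked here.
        return self._run(['nfd-stop'])
```

Because no PID is recorded, a mismatch between the PID returned by the subprocess module and the actual nfd process cannot affect shutdown.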

Actions #8

Updated by Eric Newberry almost 7 years ago

  • Status changed from New to In Progress
Actions #9

Updated by Eric Newberry almost 7 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 0 to 100
Actions #10

Updated by Davide Pesavento almost 7 years ago

Junxiao Shi wrote:

Since NFD daemonizes itself, the PID does not match nfd process.

What are you talking about? NFD does not daemonize itself.

Actions #11

Updated by Eric Newberry almost 7 years ago

Junxiao, what I believe you're seeing in the issue description are two threads of nfd (in the curly braces). I believe (and may be wrong) that terminating the parent process would terminate both threads and that the existing code should work to terminate NFD.

Actions #12

Updated by Eric Newberry almost 7 years ago

Another thought: Perhaps kill is terminating sudo, leaving the nfd process as an orphan.

Actions #13

Updated by Davide Pesavento almost 7 years ago

Eric Newberry wrote:

Another thought: Perhaps kill is terminating sudo, leaving the nfd process as an orphan.

Yes, kill is signaling the sudo process, but sudo should propagate the signal to its child process.
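
One way to avoid depending on a wrapper relaying signals is to start the command in its own process group and signal the whole group. The sketch below is an alternative technique, not something adopted in this issue; the function names are hypothetical, it is POSIX-only, and for a child launched through sudo the signal would have to be sent with matching privileges.

```python
import os
import signal
import subprocess


def start_in_own_group(argv):
    # start_new_session=True makes the child a session and process-group leader,
    # so its PID is also the process-group ID.
    return subprocess.Popen(argv, start_new_session=True)


def terminate_group(proc, sig=signal.SIGTERM):
    # Signal every process in the group: the wrapper and anything it spawned.
    os.killpg(proc.pid, sig)
    # Reap the direct child and return its exit status.
    return proc.wait()
```

For example, `terminate_group(start_in_own_group(['sh', '-c', 'sleep 30 & wait']))` signals both the shell and the `sleep` it started.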

Actions #14

Updated by Eric Newberry almost 7 years ago

We could also just rewrite this test case as a Bash script, like most other test cases. These other test cases do similar things, but do not seem to encounter this issue.

Actions #15

Updated by Junxiao Shi almost 7 years ago

We could also just rewrite this test case as a Bash script, like most other test cases.

Once all tests are moved to Bash, there’s no need for a Python wrapper for each test. Use a Bash script to invoke each test instead.

These other test cases do similar things, but do not seem to encounter this issue.

They use nfd-stop or killall nfd mostly, as they should have.

Actions #16

Updated by Eric Newberry over 6 years ago

  • Blocks Task #4380: Run integration tests for every Jenkins build added
Actions #17

Updated by Eric Newberry over 6 years ago

I ran each test individually on the Vagrant environment and found that the following ones failed to terminate:

  • test_interest_aggregation
  • test_ndnpeekpoke
  • test_ndnping
  • test_ndntraffic
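
A survey like the one above can be automated by running each test under a timeout. This is a minimal sketch, not the procedure actually used; the function name `find_hanging` and the timeout value are assumptions.

```python
import subprocess


def find_hanging(commands, timeout):
    """Return the names of commands that do not exit within `timeout` seconds."""
    hanging = []
    for name, argv in commands.items():
        try:
            subprocess.run(argv, timeout=timeout)
        except subprocess.TimeoutExpired:
            # run() kills only the direct child on timeout; grandchildren may survive.
            hanging.append(name)
    return hanging
```

A call such as `find_hanging({'test_ndnping': ['./run_tests.py', 'test_ndnping']}, timeout=600)` would report the test if it is still running after ten minutes. Any leftover nfd process would still need to be stopped separately.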
Actions #18

Updated by Eric Newberry over 6 years ago

I pushed a change to rewrite test_ndnpeekpoke, test_ndnping, and test_ndntraffic as Bash-based tests, which resolves this issue. I decided to leave test_interest_aggregation to #4379, since it's broken anyway.

Actions #19

Updated by Junxiao Shi over 6 years ago

Change 4833,3 fails to terminate in ./run-vagrant-tests.sh.

Node A has the following process when stuck. Test proceeds after executing nfd-stop in node A.

vagrant@vagrant:~/integration-tests$ pstree -p 2694
run_tests.py(2694)---run_tests.py(5858)---sudo(5861)---nfd(5863)-+-{nfd}(5864)
                                                                 `-{nfd}(5865)
Actions #20

Updated by Eric Newberry over 6 years ago

Junxiao Shi wrote:

Change 4833,3 fails to terminate in ./run-vagrant-tests.sh.

Node A has the following process when stuck. Test proceeds after executing nfd-stop in node A.

vagrant@vagrant:~/integration-tests$ pstree -p 2694
run_tests.py(2694)---run_tests.py(5858)---sudo(5861)---nfd(5863)-+-{nfd}(5864)
                                                                 `-{nfd}(5865)

As I said in note 18, I'm not planning to fix test_interest_aggregation, so this is probably the cause.

Actions #21

Updated by Junxiao Shi over 6 years ago

Change 4833,3 fails to terminate in ./run-vagrant-tests.sh.

I also see error messages during execution:

../permanent-face-test.sh: line 34: [[: 0
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 52: [[: 100
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 74: [[: 0
10: syntax error in expression (error token is "10")
Actions #22

Updated by Eric Newberry over 6 years ago

Junxiao Shi wrote:

I also see error messages during execution:

../permanent-face-test.sh: line 34: [[: 0
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 52: [[: 100
0: syntax error in expression (error token is "0")
./permanent-face-test.sh: line 74: [[: 0
10: syntax error in expression (error token is "10")

Fixing the test these are occurring in is not part of this issue, but rather #4379.

Actions #23

Updated by Junxiao Shi over 6 years ago

Fixing the test these are occurring in is not part of this issue, but rather #4379.

I'm not judging which issue the error messages belong to. I'm stating the fact that these error messages appear, just as Jenkins fails the build whenever a test case fails, regardless of whether it relates to the current commit.

Actions #24

Updated by Junxiao Shi over 6 years ago

Change 4833,5 terminates in ./run-vagrant-tests.sh. Please continue fixing other errors, in this or other commits.

Actions #25

Updated by Junxiao Shi over 6 years ago

  • Blocks Task #4656: Eliminate Python wrappers added
Actions #26

Updated by Eric Newberry over 6 years ago

  • Status changed from Code review to Closed
Actions #27

Updated by Davide Pesavento almost 6 years ago

  • Blocks deleted (Task #4656: Eliminate Python wrappers)