Skip to content

bpo-31173: Rewrite WSTOPSIG test of test_subprocess#3055

Merged
vstinner merged 1 commit into
python:masterfrom
vstinner:stopped
Aug 10, 2017
Merged

bpo-31173: Rewrite WSTOPSIG test of test_subprocess#3055
vstinner merged 1 commit into
python:masterfrom
vstinner:stopped

Conversation

@vstinner

@vstinner vstinner commented Aug 10, 2017

Copy link
Copy Markdown
Member

The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.

https://bugs.python.org/issue31173

The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
@vstinner vstinner requested a review from gpshead August 10, 2017 10:04
@vstinner vstinner merged commit 7b7c6dc into python:master Aug 10, 2017
@vstinner vstinner deleted the stopped branch August 10, 2017 10:37
@gpshead

gpshead commented Aug 10, 2017

Copy link
Copy Markdown
Member

I think your mocking test makes sense here. LGTM.

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

@gpshead gpshead self-assigned this Aug 10, 2017
@vstinner

Copy link
Copy Markdown
Member Author

Somewhat unfortunate to lose the integration test that triggers the actual OS behavior that led to the bug and confirms it is fixed, but if it is fragile and leaking resources that is not good.

The leaking resource is a blocker point for me. When I ran "./python -m test -m test_child_terminated_in_stopped_state -F test_subprocess", my laptop started to die: the total memory usage was close to 100%!

Is there no way to unwedge the stuck killed but stopped for tracing process using a ptrace API?

I hate ptrace() because it makes processes enter very weird states. For example, sending SIGKILL to a traced process is not enough to kill it... "traced process": a process which just called ptrace(PTRACE_ME) with no other process attached to it ...

I spent some time working on a debugger written in pure Python: https://github.com/haypo/python-ptrace/

I was always amazed by the weird ptrace bugs I had. Sometimes, I just had to reboot my computer. It was impossible to kill some processes, my computer was stuck.

So yeah, usually I prefer functional tests over unit tests, but here I think that's an unit test is an acceptable compromise.

What do you think?

--

To be honest, I didn't try hard to workaround the weird zombie process. I tried two additional os.waitpid() on macOS, but I had another behaviour on Linux... So I just decided to give up and try something completely different ;-)

@bedevere-bot

Copy link
Copy Markdown

GH-3070 is a backport of this pull request to the 3.6 branch.

@gpshead

gpshead commented Aug 11, 2017

Copy link
Copy Markdown
Member

btw for reference, what platform (os and version) were you on where you saw the problem?

@vstinner

Copy link
Copy Markdown
Member Author

btw for reference, what platform (os and version) were you on where you saw the problem?

The increasing memory and https://bugs.python.org/issue31173#msg300066 message was on Linux.

My attempt to use additional os.waitpid() to workaround the issue was tried on macOS.

vstinner added a commit that referenced this pull request Aug 11, 2017
The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
(cherry picked from commit 7b7c6dc)
vstinner added a commit that referenced this pull request Aug 11, 2017
The current test_child_terminated_in_stopped_state() function test
creates a child process which calls ptrace(PTRACE_TRACEME, 0, 0) and
then crash (SIGSEGV). The problem is that calling os.waitpid() in the
parent process is not enough to close the process: the child process
remains alive and so the unit test leaks a child process in a
strange state. Closing the child process requires non-trivial code,
maybe platform specific.

Remove the functional test and replaces it with an unit test which
mocks os.waitpid() using a new _testcapi.W_STOPCODE() function to
test the WIFSTOPPED() path.
(cherry picked from commit 7b7c6dc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants