rootwrap returns wrong exit code of subprocess

Bug #1364822 reported by Jakub Libosvar
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
oslo.rootwrap
Fix Released
Medium
Jakub Libosvar

Bug Description

If child process of rootwrap is terminated by signal, exit code should be 128 + N where N is number of signal. According [1] subprocess.Popen.returncode returns -N. sys.exit interprets negative signed byte as unsigned thus original exit code is different.

e.g.
./rootwrap process # starts two processes
kill -9 $(pidof process) # Popen.returncode is -9, rootwrap calls sys.exit(-9)
echo $?
247 # instead of expected 137 (-9 signed char is binary 11110111 which is 247 in unsigned char)

 [1] https://docs.python.org/2/library/subprocess.html#subprocess.Popen.returncode

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.rootwrap (master)

Fix proposed to branch: master
Review: https://review.openstack.org/118569

Changed in oslo.rootwrap:
assignee: nobody → Jakub Libosvar (libosvar)
status: New → In Progress
Revision history for this message
Yuriy Taraday (yorik-sar) wrote :

Can you please point out why should return value be equal to 128 + signal number?

From what I see this value is used by shells when their children die because of signals, but I can't see any pointers if this is expected behavior for process wrappers.

Revision history for this message
Jakub Libosvar (libosvar) wrote :

http://www.tldp.org/LDP/abs/html/exitcodes.html

If you send SIGKILL to a child of sudo, sudo will exit with 137 (in BASH).

Revision history for this message
Thierry Carrez (ttx) wrote :

That sounds like the expected behavior in bash... but the convention (default) in subprocess.Popen is to return -N. Why should we prefer the former ? If there is no strong case for it (like an actual bug that would fix) I think I prefer to follow Python stdlib convention over bash convention...

Revision history for this message
Jakub Libosvar (libosvar) wrote :

I would agree with "I think I prefer to follow Python stdlib convention over bash convention" if rootwrap was meant to be a Python library. But it's rather a CLI tool and thus imo return code should be according bash convention.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

Python's subprocess module converts the exit code to something easier for python programs calling subprocess to interpret: http://hg.python.org/cpython/file/7b0fdc1e917a/Lib/subprocess.py#l1343

We should probably have rootwrap exit with the same value as its child, since we want to treat it as a thin wrapper and transmit as much information as possible to the caller. When called from another python program via subprocess, that exit status will be re-converted to the negative signal value and when called from a shell script, the exit code could be interpreted using normal POSIX conventions of having the "exit code" in the low-order bits and the signal in the high-order bits.

Unfortunately, the actual exit status value is lost in the process because the raw value is never stored anywhere. I see two ways to fix that.

1. We could manually reverse the conversion, as in the proposed patch. I think this is probably safe, but it seems a little clunky.

2. We could subclass Popen() to replace the _handle_exitstatus() method with a version that saves the raw exit code and then calls the parent class implementation. The method is private to Popen, so this implementation is brittle.

I'm leaning toward keeping the implementation in the proposed patch (approach 1).

Changed in oslo.rootwrap:
importance: Undecided → Medium
milestone: none → juno-rc1
Revision history for this message
Yuriy Taraday (yorik-sar) wrote :

Now that I know that's POSIX(-ish) behavior, I think we should do that, too. But only for standalone rootwrap, daemon mode already passes raw Popen.return_status to caller. So we'll have a bit of a difference between how we handle standalone and daemon calls in the caller.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

Isn't the return_status passed to the caller in daemon mode already passing through the -1*SIGNUM conversion? If so, I think the net effect will be the same for python programs using subprocess to invoke the command line tool and using the daemon mode.

Revision history for this message
Yuriy Taraday (yorik-sar) wrote :

I mean we'll need to pass return value from daemon as is and add some preprocessing to return value of standalone rootwrap call.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

OK, I see what you mean. That sounds like the right thing to do.

Changed in oslo.rootwrap:
milestone: juno-rc1 → next-juno
Thierry Carrez (ttx)
Changed in oslo.rootwrap:
milestone: next-juno → next-kilo
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.rootwrap (master)

Reviewed: https://review.openstack.org/118569
Committed: https://git.openstack.org/cgit/openstack/oslo.rootwrap/commit/?id=56e9cb5b076ae2b5a687e50f940e459b506aa128
Submitter: Jenkins
Branch: master

commit 56e9cb5b076ae2b5a687e50f940e459b506aa128
Author: Jakub Libosvar <email address hidden>
Date: Wed Sep 3 10:34:57 2014 +0200

    Fix exit of subprocess in case it was terminated by signal

    If child process of rootwrap is terminated by signal, exit code should
    be 128 + N, where N is number of signal. According [1]
    subprocess.Popen.returncode returns -N. sys.exit interprets negative
    signed byte as unsigned thus original exit code is 256 - N instead of
    128 + N.

    [1]
    https://docs.python.org/2/library/subprocess.html#subprocess.Popen.returncode

    Change-Id: I43b4c959a98f8f620466c77de4cd5b724b09e5a8
    Closes-Bug: #1364822

Changed in oslo.rootwrap:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in oslo.rootwrap:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.