su: kill child process group on signal, not just immediate child
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
shadow (Debian) | Fix Released | Unknown | |
shadow (Ubuntu) | Fix Released | High | Colin Watson |
Raring | Fix Released | High | Colin Watson |
Bug Description
[Impact] Operational pain on Ubuntu builders every time a build hangs.
[Test Case] See below, starting with 'pgrep sleep'.
[Regression Potential] It's su; we should be pretty careful. Running saucy builds with this for a while will help, as will seeing whether anyone objects to the broader process-killing.
Original report follows:
Imported from Debian bug http://
Package: shadow
Version: 1:4.1.5.1-1
Severity: normal
User: <email address hidden>
Usertags: origin-ubuntu saucy
For some time I've noticed that, when an Ubuntu build times out (150
minutes with no output), sbuild tries to terminate it, and I see a
"Session terminated, terminating shell... ...terminated." message in the
log (which is from su), but the build does not actually terminate
properly. Now, in both Debian and Ubuntu, sbuild invokes builds using
something like this simplified command:
  sudo chroot $chroot su $username -s sh -c "cd $dir && exec dpkg-buildpackage"
When su receives a signal, it passes it on to its child process (it has
to go to unusual lengths here because it starts new sessions). However,
it only kills its immediate child, not the associated process group.
This means that you can do something like this:
  $ pgrep sleep
  $ su cjwatson -c 'sh -c "sleep 1h"'
  Password:
  [wait a few seconds]
  ^C
  Session terminated, terminating shell...Sessions still open, not unmounting
   ...killed.
  $ pgrep sleep
  32421
This is inconvenient; in this case it means we often have to ask
sysadmins to manually kill processes for us. I don't have much
visibility into Debian buildds but I suspect there are similar problems
there from time to time.
Could su please kill the process group associated with its immediate
child process instead? This should just be a matter of negating the pid
passed to kill. If it did that, then I think it would do a much better
job of cleaning up after itself.
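
To illustrate the difference between signalling a pid and signalling its process group, here is a minimal Python sketch. It is not su's code (su is written in C), and the function name orphans_survive is hypothetical. It assumes a POSIX system with sh and sleep available, and uses start_new_session=True to stand in for su starting a new session for its child.

```python
import os
import select
import signal
import subprocess


def orphans_survive(kill_group, timeout=2.0):
    """Start sh with a background sleep, signal it, and report whether
    any process from its group is still alive afterwards."""
    proc = subprocess.Popen(
        ["sh", "-c", "sleep 60 & echo started; wait"],
        stdout=subprocess.PIPE,
        bufsize=0,
        start_new_session=True,  # child calls setsid(), so its pgid equals its pid
    )
    pgid = proc.pid
    try:
        # Wait until the shell has forked the background sleep.
        assert proc.stdout.readline() == b"started\n"

        # Negating the pid makes kill(2) target the whole process group.
        os.kill(-pgid if kill_group else pgid, signal.SIGTERM)
        proc.wait()

        # Every process in the group inherited the pipe's write end, so the
        # read end reaches EOF only once all of them have exited.
        readable, _, _ = select.select([proc.stdout], [], [], timeout)
        return not readable
    finally:
        # Clean up any survivor so the sketch does not leak a sleep process.
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        proc.stdout.close()
```

Calling orphans_survive(kill_group=False) mirrors the behaviour reported here: the shell dies but the sleep lives on. Calling orphans_survive(kill_group=True) mirrors the requested behaviour, where the negated pid takes the sleep down together with the shell.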
Thanks,
--
Colin Watson [<email address hidden>]
description: updated
Changed in shadow (Debian): importance: Undecided → Unknown
Changed in shadow (Debian): status: New → Fix Committed
Changed in shadow (Debian): status: Fix Committed → Fix Released
I can only reproduce this in raring and saucy. It looks like this was triggered by the change in 4.1.5 (Debian #628843) to avoid giving noninteractive children a controlling terminal.