reboot does not return under systemd
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| systemd (Ubuntu) | | Low | Martin Pitt | |
Bug Description
If you send a shutdown or reboot command over SSH to a Trusty or Utopic host, the command will consistently finish successfully prior to the SSH connection being closed, meaning your SSH client will exit with a return-code of zero:
For example:
$ ssh root@myhost shutdown -h now
$ echo $?
0
Or
$ ssh root@myhost reboot
$ echo $?
0
However, on Vivid now that the switch-over to systemd has happened, running the same consistently results in the abrupt closure of the SSH connection prior to the command finishing, meaning your SSH client will exit with a return-code of 255:
$ ssh root@my_vivid_host shutdown -h now
Connection to localhost closed by remote host.
$ echo $?
255
Although in retrospect it was a bit naive of me to rely on this (actually quite fragile) behavior, it had at least been consistent in Ubuntu for some time (back to at least Raring from my personal experience, but likely back even farther).
This isn't technically a systemd bug, but I still think it's something worth mentioning in the release notes as I bet I'm not the only person who built some "clever" hacks around the previous behavior :P
| tags: | added: systemd vivid |
| tags: | added: systemd-boot removed: systemd |
| Jason Gerard DeRose (jderose) wrote : | #2 |
Also, just to clarify, this is definitely a change (or in my mind regression) introduced by systemd. Yesterday, the System76 image master tool worked fine and dandy with an up-to-date Vivid VM, as it has throughout the rest of the previous Vivid dev cycle.
Today things broke.
| Launchpad Janitor (janitor) wrote : | #3 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in systemd (Ubuntu): | |
| status: | New → Confirmed |
| Changed in systemd (Ubuntu): | |
| status: | Confirmed → Invalid |
| Changed in openssh (Ubuntu): | |
| status: | Confirmed → In Progress |
| assignee: | nobody → Martin Pitt (pitti) |
| Changed in openssh (Ubuntu): | |
| importance: | Undecided → Low |
| Changed in systemd (Ubuntu): | |
| importance: | Undecided → Low |
| no longer affects: | systemd (Ubuntu) |
| Changed in openssh (Ubuntu): | |
| status: | In Progress → Triaged |
| Martin Pitt (pitti) wrote : | #4 |
After either "systemctl stop ssh" or a complete "apt-get purge openssh-server", existing ssh connections continue to work, and the various "sshd: login [priv]" processes continue to run; just the "/usr/sbin/sshd -D" master process goes away, as expected.
sshd.service has KillMode=process which does that (kills the master process, but none of its children). So I cannot reproduce this.
Do you have some more information on how to reproduce this behaviour? Can you perhaps interrupt the image build process right before the purging, check "ps aux|grep ssh", then purge, check ps again, and see what happened? Do you have steps or something that I can run which reproduces this?
| Changed in openssh (Ubuntu): | |
| status: | Triaged → Incomplete |
| summary: | - systemd changes behavior of apt-get remove openssh-server |
| | + stopping ssh.service closes existing ssh connections |
So interestingly, this isn't happening when I just type these commands into an SSH session. But if you create a script like this in say /tmp/test.sh:
#!/bin/bash
apt-get -y purge openssh-server ssh-import-id
apt-get -y autoremove
shutdown -h now
And then execute this through an ssh call like this:
$ ssh root@whatever /tmp/test.sh
I get the disconnection problem.
| Jason Gerard DeRose (jderose) wrote : | #6 |
Hmm, now I'm thinking this has nothing to do with openssh-server.
I think the problem is actually that when I run this over SSH:
# shutdown -h now
My ssh client exits with status 255... whereas running the same thing prior to the flip-over to systemd would exit with status 0.
| Jason Gerard DeRose (jderose) wrote : | #7 |
Okay, here's a simple way to reproduce:
$ ssh root@whatever shutdown -h now
$ echo $?
On Vivid, the exit status from the ssh client will be 255. On Trusty and Utopic it will be 0.
| Jason Gerard DeRose (jderose) wrote : | #8 |
Also, on Vivid there will be this error: "Connection to localhost closed by remote host."
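For tooling that checks the ssh client's return code, a minimal sketch of one way to tell the two outcomes apart is shown here. It relies on the documented ssh behavior of exiting with the remote command's status, or with 255 if an error occurred (which includes the connection being dropped). The function name `classify_ssh_status` and its output labels are hypothetical, not part of any tool mentioned in this report.

```shell
#!/bin/bash
# Hypothetical helper: interpret the ssh client's exit status after a
# remote shutdown or reboot command.
# ssh exits with 255 when an error occurred (including a dropped
# connection); otherwise it exits with the remote command's own status.
classify_ssh_status() {
    case "$1" in
        0)   echo "completed" ;;        # remote command returned before the connection closed
        255) echo "connection-lost" ;;  # connection closed first, or another ssh-level error
        *)   echo "remote-error" ;;     # remote command itself reported a failure
    esac
}

# Usage (commented out; the host name is a placeholder):
# ssh root@myhost reboot
# classify_ssh_status "$?"
```

Note that 255 is ambiguous: it covers any ssh-level failure, not only a host that went down mid-command, so a caller would probably still want to confirm that the machine actually rebooted or powered off.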
| Jason Gerard DeRose (jderose) wrote : | #9 |
Same problem when running `reboot`, which I'd say is even more important for automation. Port 2204 is forwarding to a qemu VM running Utopic, port 2207 is running Vivid:
jderose@
jderose@
0
jderose@
Connection to localhost closed by remote host.
jderose@
255
jderose@
| Martin Pitt (pitti) wrote : | #10 |
Ah right, that makes more sense. By definition, "shutdown -h now" or "poweroff" etc. are racy -- sometimes it "survives" and comes back, sometimes the shutdown of the machine is too fast. It seems that with systemd it's just a bit faster. I suggest calling "shutdown -h +1", or if you don't want to wait for a minute, perhaps "(sleep 3; poweroff ) &"?
| summary: | - stopping ssh.service closes existing ssh connections |
| | + reboot does not return under systemd |
| affects: | openssh (Ubuntu) → systemd (Ubuntu) |
| Changed in systemd (Ubuntu): | |
| status: | Incomplete → Invalid |
| Jason Gerard DeRose (jderose) wrote : | #11 |
Martin,
Okay, much thanks!
| Egmont Koblinger (egmont-gmail) wrote : | #12 |
> perhaps "(sleep 3; poweroff ) &"?
I recently did something similar, and noticed that the ssh channel doesn't close as long as the background process's file descriptors use the ssh channel. So you'll probably have more luck with:
(sleep 3; poweroff) </dev/null &>/dev/null &
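The same idea can be wrapped in a small function, sketched here under the assumption that bash is the shell on the remote host. The name `run_detached_after` is hypothetical. It runs a command after a delay in a background subshell whose stdin, stdout and stderr are all pointed at /dev/null, so nothing in the background job holds the ssh channel open.

```shell
#!/bin/bash
# Hypothetical helper: run a command after a delay, in the background,
# with stdin, stdout and stderr detached from the calling session.
run_detached_after() {
    local delay="$1"
    shift
    # The subshell sleeps, then runs the remaining arguments as a command.
    # All three standard file descriptors are redirected to /dev/null.
    ( sleep "$delay"; "$@" ) </dev/null >/dev/null 2>&1 &
}

# Usage on the remote host (commented out because it powers the machine off):
# run_detached_after 3 poweroff
```

Because the function returns as soon as the background job is started, the calling script can finish and the ssh session can close cleanly before the delayed command runs.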
| tags: | added: regression-release |
| description: | updated |
| Jason Gerard DeRose (jderose) wrote : | #13 |
I reworked the description as my original assessment was quite off.
But after more thought, I think this behavior change is something that really needs mentioning in the Vivid release notes.
After all, the perceived "correct" behavior of a system strongly tends toward what the actual behavior has been historically.
Yet one of these kids is clearly not like the other :D
| god (humper) wrote : | #14 |
I think mentioning this "bug" in the release notes would only confuse users: the bug here is in the person who relied on undocumented racy behavior. This has nothing to do with systemd or ubuntu or sshd.
| Jason Gerard DeRose (jderose) wrote : | #15 |
Except that previously this wasn't racy behavior in practice.
I have automation tooling that has executed tens of thousands of reboot and shutdown commands over SSH in this way with perfect consistency over the last two years. The moment the switchover to systemd happened in Vivid, this tooling broke.
I get that my assumptions weren't robust and that I have to change my tools to accommodate systemd (which I already did). But communicating this change is courteous and helpful, very Ubuntu if you will.
And this isn't something that needs to be prominent. It just affects people working with servers, VMs, etc., not everyday desktop users.
| Egmont Koblinger (egmont-gmail) wrote : | #16 |
IMHO this particular issue is way too specific, and is likely to affect very few people. And Google very easily finds this page.
What might be worth documenting is a few generic words about "reboot" and some other similar commands slightly changing behavior due to systemd. E.g. reboot just sends a command to systemd and immediately returns -- this gives a great clue to what's wrong in your use case, yet is way more generic and probably useful for more people. Or mention that locally logged in users can now execute "reboot" without sudo (I don't think it was like this previously). Or just simply something along the lines of "due to systemd, some system tools such as reboot now might behave slightly differently"...

