reboot does not return under systemd

Bug #1429938 reported by Jason Gerard DeRose
This bug affects 3 people
Affects           Status   Importance  Assigned to  Milestone
systemd (Ubuntu)  Invalid  Low         Martin Pitt

Bug Description

If you send a shutdown or reboot command over SSH to a Trusty or Utopic host, the command will consistently finish successfully prior to the SSH connection being closed, meaning your SSH client will exit with a return-code of zero:

For example:

$ ssh root@myhost shutdown -h now
$ echo $?
0

Or

$ ssh root@myhost reboot
$ echo $?
0

However, on Vivid now that the switch-over to systemd has happened, running the same consistently results in the abrupt closure of the SSH connection prior to the command finishing, meaning your SSH client will exit with a return-code of 255:

$ ssh root@my_vivid_host shutdown -h now
Connection to localhost closed by remote host.
$ echo $?
255

Although in retrospect it was a bit naive of me to rely on this (actually quite fragile) behavior, it had at least been consistent in Ubuntu for some time (back to at least Raring in my personal experience, and likely even further).

This isn't technically a systemd bug, but I still think it's something worth mentioning in the release notes as I bet I'm not the only person who built some "clever" hacks around the previous behavior :P

tags: added: systemd vivid
tags: added: systemd-boot
removed: systemd
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in openssh (Ubuntu):
status: New → Confirmed
Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Also, just to clarify, this is definitely a change (or in my mind regression) introduced by systemd. Yesterday, the System76 image master tool worked fine and dandy with an up-to-date Vivid VM, as it has throughout the rest of the previous Vivid dev cycle.

Today things broke.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Martin Pitt (pitti)
Changed in systemd (Ubuntu):
status: Confirmed → Invalid
Changed in openssh (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Martin Pitt (pitti)
Changed in openssh (Ubuntu):
importance: Undecided → Low
Changed in systemd (Ubuntu):
importance: Undecided → Low
no longer affects: systemd (Ubuntu)
Martin Pitt (pitti)
Changed in openssh (Ubuntu):
status: In Progress → Triaged
Revision history for this message
Martin Pitt (pitti) wrote :

After either "systemctl stop ssh" or a complete "apt-get purge openssh-server", existing ssh connections continue to work and the various "sshd: login [priv]" processes continue to run; only the "/usr/sbin/sshd -D" master process goes away, as expected.

sshd.service has KillMode=process which does that (kills the master process, but none of its children). So I cannot reproduce this.

Do you have some more information on how to reproduce this behaviour? Can you perhaps interrupt the image build process right before the purging, check "ps aux|grep ssh", then purge, check ps again, and see what happened? Do you have steps or something that I can run which reproduces this?

Changed in openssh (Ubuntu):
status: Triaged → Incomplete
summary: - systemd changes behavior of apt-get remove openssh-server
+ stopping ssh.service closes existing ssh connections
Revision history for this message
Jason Gerard DeRose (jderose) wrote : Re: stopping ssh.service closes existing ssh connections

So interestingly, this isn't happening when I just type these commands into an SSH session. But if you create a script like this in say /tmp/test.sh:

#!/bin/bash
apt-get -y purge openssh-server ssh-import-id
apt-get -y autoremove
shutdown -h now

And then execute this through an ssh call like this:

$ ssh root@whatever /tmp/test.sh

I get the disconnection problem.

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Hmm, now I'm thinking this has nothing to do with openssh-server.

I think the problem is actually that when I run this over SSH:

# shutdown -h now

My ssh client exits with status 255... whereas running the same thing prior to the flip-over to systemd would exit with status 0.

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Okay, here's a simple way to reproduce:

$ ssh root@whatever shutdown -h now
$ echo $?

On Vivid, the exit status from the ssh client will be 255. On Trusty and Utopic it will be 0.

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Also, on Vivid there will be this error: "Connection to localhost closed by remote host."

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Same problem when running `reboot`, which I'd say is even more important for automation. Port 2204 forwards to a qemu VM running Utopic, and port 2207 forwards to one running Vivid:

jderose@jgd-kudp1:~$ ssh root@localhost -p 2204 reboot
jderose@jgd-kudp1:~$ echo $?
0
jderose@jgd-kudp1:~$ ssh root@localhost -p 2207 reboot
Connection to localhost closed by remote host.
jderose@jgd-kudp1:~$ echo $?
255
jderose@jgd-kudp1:~$
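
For anyone scripting around the two outcomes shown above, here is a minimal sketch of how a caller might interpret the ssh client's exit status. It relies only on the documented behaviour that ssh exits with the remote command's status, or with 255 if an error occurred. The function names `describe_ssh_status` and `reboot_probably_issued` are made up for illustration. Note that 255 also covers the case where ssh never connected at all, so treating it as "reboot issued" is an assumption, not proof.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative, not from any tool.

# Print a human-readable meaning for an ssh client exit status.
describe_ssh_status() {
    case "$1" in
        0)   echo "remote command exited 0" ;;
        255) echo "ssh error or connection closed" ;;
        *)   echo "remote command exited $1" ;;
    esac
}

# Succeed if the status is one of the two outcomes seen in this bug:
# 0 (command returned first) or 255 (connection dropped first).
# Caveat: 255 is also what ssh returns when it could not connect.
reboot_probably_issued() {
    [ "$1" -eq 0 ] || [ "$1" -eq 255 ]
}

# Example use (not executed here):
#   ssh root@localhost -p 2207 reboot
#   describe_ssh_status "$?"
```
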

Revision history for this message
Martin Pitt (pitti) wrote :

Ah right, that makes more sense. By definition, "shutdown -h now" or "poweroff" etc. are racy -- sometimes it "survives" and comes back, sometimes the shutdown of the machine is too fast. It seems that with systemd it's just a bit faster. I suggest calling "shutdown -h +1", or if you don't want to wait for a minute, perhaps "(sleep 3; poweroff ) &"?

summary: - stopping ssh.service closes existing ssh connections
+ reboot does not return under systemd
affects: openssh (Ubuntu) → systemd (Ubuntu)
Changed in systemd (Ubuntu):
status: Incomplete → Invalid
Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Martin,

Okay, much thanks!

Revision history for this message
Egmont Koblinger (egmont-gmail) wrote :

> perhaps "(sleep 3; poweroff ) &"?

I recently did something similar, and noticed that the ssh channel doesn't close as long as the background process's file descriptors use the ssh channel. So you'll probably have more luck with:

(sleep 3; poweroff) </dev/null &>/dev/null &
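
A minimal sketch of the detached approach above wrapped as a reusable helper, assuming bash on the client side. The names `remote_delayed` and `SSH_CMD` are made up for illustration. It spells the redirection as `>/dev/null 2>&1` instead of `&>/dev/null`, because `&>` is a bash extension and the remote login shell might not be bash.

```shell
#!/bin/bash
# Hypothetical helper: run ACTION on HOST after DELAY seconds, detached
# from the ssh channel so the ssh client can return before the shutdown.
# SSH_CMD is an assumed override (e.g. "ssh -p 2207"); defaults to ssh.
remote_delayed() {
    local host=$1 action=$2 delay=${3:-3}

    # Reject anything that is not a plain non-negative integer.
    case "$delay" in
        ''|*[!0-9]*) echo "delay must be a number" >&2; return 2 ;;
    esac

    # stdin/stdout/stderr are all pointed away from the ssh channel;
    # otherwise the channel stays open until the background job exits.
    # ACTION is pasted into the remote command line, so it must be trusted.
    # SSH_CMD is left unquoted on purpose so it can carry extra options.
    ${SSH_CMD:-ssh} "$host" "(sleep $delay; $action) </dev/null >/dev/null 2>&1 &"
}

# Example use (not executed here):
#   remote_delayed root@myhost poweroff 3
#   echo $?
```

With this, the ssh client should return as soon as the remote shell has started the background job, well before the machine goes down.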

Rolf Leggewie (r0lf)
tags: added: regression-release
description: updated
Revision history for this message
Jason Gerard DeRose (jderose) wrote :

I reworked the description as my original assessment was quite off.

But after more thought, I think this behavior change is something that really needs mentioning in the Vivid release notes.

After all, the perceived "correct" behavior of a system strongly tends toward what the actual behavior has been historically.

Yet one of these kids is clearly not like the other :D

Revision history for this message
god (humper) wrote :

I think mentioning this "bug" in release notes would only confuse users: the bug here is in the person who relied on undocumented racy behavior - this has nothing to do with systemd or ubuntu or sshd.

Revision history for this message
Jason Gerard DeRose (jderose) wrote :

Except that previously this wasn't racy behavior in practice.

I have automation tooling that has executed tens of thousands of reboot and shutdown commands over SSH in this way with perfect consistency over the last two years. The moment the switchover to systemd happened in Vivid, this tooling broke.

I get that my assumptions weren't robust and that I have to change my tools to accommodate systemd (which I already did). But communicating this change is courteous and helpful, very Ubuntu if you will.

And this isn't something that needs to be prominent. It just affects people working with servers, VMs, etc, not everyday desktop users.

Revision history for this message
Egmont Koblinger (egmont-gmail) wrote :

IMHO this particular issue is way too specific, and is likely to affect very few people only. And Google very easily finds this page.

What might be worth documenting is a few generic words about "reboot" and some other similar commands slightly changing behavior due to systemd. E.g. reboot just sends a command to systemd and immediately returns -- this gives a great clue to what's wrong in your use case, yet is way more generic and probably useful for more people. Or mention that locally logged in users can now execute "reboot" without sudo (I don't think it was like this previously). Or just simply something along the lines of "due to systemd, some system tools such as reboot now might behave slightly differently"...
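
Since "reboot" returning says nothing about whether the machine actually went down, automation may want to confirm the reboot some other way. One possible sketch, assuming a Linux guest: read `/proc/sys/kernel/random/boot_id` before the reboot and poll until it changes. The function names and the `SSH_CMD` override are made up for illustration, and the retry count and interval are arbitrary defaults.

```shell
#!/bin/bash
# Hypothetical helpers. SSH_CMD is an assumed override (defaults to ssh).

# Print the host's current boot ID, or fail if it cannot be reached.
get_boot_id() {
    ${SSH_CMD:-ssh} -o ConnectTimeout=5 "$1" \
        cat /proc/sys/kernel/random/boot_id 2>/dev/null
}

# Poll HOST until its boot ID differs from OLD_ID.
# Prints the new ID and returns 0, or returns 1 after TRIES attempts.
wait_for_new_boot() {
    local host=$1 old_id=$2 tries=${3:-60} interval=${4:-5} id
    while [ "$tries" -gt 0 ]; do
        id=$(get_boot_id "$host") || id=""
        if [ -n "$id" ] && [ "$id" != "$old_id" ]; then
            echo "$id"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Example use (not executed here):
#   old=$(get_boot_id root@myhost)
#   ssh root@myhost reboot   # may exit 0 or 255, see above
#   wait_for_new_boot root@myhost "$old"
```
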
