lxd cannot shutdown container

Bug #1668049 reported by Frans
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
systemd (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

I found this issue when working with lxd and submitted a ticket there but the lxc people said it was related to systemd, not lxc.

The issue is: I have a container running under lxc/lxd. lxc stop of the container does not work whereas if I execute "shutdown now" inside the container it does stop as expected.

The associated lxd ticket is here:
https://github.com/lxc/lxd/issues/2947

The output of journalctl -a is at the end of this message

The syslog of that same run can be found here: http://paste.ubuntu.com/24071244/

and this is the output of ps aux:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 37808 4576 ? Ss 11:19 0:00 /sbin/init
root 48 0.0 0.0 35276 3004 ? Ss 11:19 0:00 /lib/systemd/systemd-journald
root 49 0.0 0.0 41724 1912 ? Ss 11:19 0:00 /lib/systemd/systemd-udevd
message+ 514 0.0 0.0 42900 2460 ? Ss 11:19 0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 704 0.0 0.0 18252 2508 ? Ss 11:55 0:00 bash
root 720 0.0 0.0 34556 1992 ? R+ 11:59 0:00 ps aux

Note that this may be timing related. If I stop the container immediately after starting things work. If I stop after a minute or so, the stop will not get through. (even not on a 2nd or 3rd attempt). However if I leave the container for a while then a new lxc stop will terminate the container. (this happened after an hour or so).

The issue is reproducible but seems to be dependend on what is running inside the container.

container runs ubuntu 16.04 on a 16.04 host.

systemd --version says:
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN

lxc sends SIGRTMIN+3 to systemd to stop. This was apparently discussed here: https://github.com/lxc/lxc/issues/1085

Let me know if more info is needed.

Frans

-- Logs begin at Sun 2017-02-26 11:19:12 UTC, end at Sun 2017-02-26 11:23:05 UTC. --
Feb 26 11:19:12 D-wan-61-1 systemd-journald[48]: Runtime journal (/run/log/journal/) is 8.0M, max 196.7M, 188.7M free.
Feb 26 11:19:12 D-wan-61-1 systemd-journald[48]: Journal started
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '4 4 1 7' to 'kernel/printk', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '176' to 'kernel/sysrq', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'fs/protected_symlinks', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'net/ipv4/tcp_syncookies', ignoring: No such file or directory
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'kernel/kptr_restrict', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'fs/protected_hardlinks', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'kernel/yama/ptrace_scope', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '65536' to 'vm/mmap_min_addr', ignoring: Permission denied
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-sysctl.service: Main process exited, code=exited, status=1/FAILURE
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to start Apply Kernel Variables.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-sysctl.service: Unit entered failed state.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-sysctl.service: Failed with result 'exit-code'.
Feb 26 11:19:12 D-wan-61-1 systemd-journald[48]: Forwarding to syslog missed 1 messages.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started Uncomplicated firewall.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/ufw.service: Operation not permitted
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started udev Kernel Device Manager.
Feb 26 11:19:12 D-wan-61-1 mount[45]: mount: permission denied
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started Nameserver information manager.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: dev-hugepages.mount: Mount process exited, code=exited status=32
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to mount Huge Pages File System.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: dev-hugepages.mount: Unit entered failed state.
Feb 26 11:19:12 D-wan-61-1 systemd-remount-fs[44]: mount: can't find LABEL=cloudimg-rootfs
Feb 26 11:19:12 D-wan-61-1 systemd-remount-fs[44]: /bin/mount for / exited with exit status 1.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-remount-fs.service: Main process exited, code=exited, status=1/FAILURE
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to start Remount Root and Kernel File Systems.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-remount-fs.service: Unit entered failed state.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: systemd-remount-fs.service: Failed with result 'exit-code'.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/systemd-random-seed.service: Operation not perm
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting Load/Save Random Seed...
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/systemd-udev-trigger.service: Operation not per
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting udev Coldplug all Devices...
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/systemd-journal-flush.service: Operation not pe
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting Flush Journal to Persistent Storage...
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/cloud-init-local.service: Operation not permitt
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting Initial cloud-init job (pre-networking)...
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/resolvconf.service: Operation not permitted
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started Flush Journal to Persistent Storage.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started Load/Save Random Seed.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: lvm2-lvmetad.socket: Trigger limit hit, refusing further activation.
Feb 26 11:19:12 D-wan-61-1 lvm[43]: Daemon lvmetad returned error 104
Feb 26 11:19:12 D-wan-61-1 lvm[43]: WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: lvm2-lvmetad.socket: Unit entered failed state.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Reached target Local File Systems (Pre).
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Reached target Local File Systems.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/system-getty.slice: Operation not permitted
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Created slice system-getty.slice.
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/systemd-tmpfiles-setup.service: Operation not p
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting Create Volatile Files and Directories...
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Failed to reset devices.list on /system.slice/plymouth-read-write.service: Operation not perm
Feb 26 11:19:12 D-wan-61-1 systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...

Revision history for this message
Christian Brauner (cbrauner) wrote :

Note, that since a while LXC is sending SIGRTMIN+3 to systemd. So unless systemd has changed it's shutdown/halt signal again LXC should send the right signal.

Revision history for this message
Frans (fransmeulenbroeks) wrote :

I don't think systemd changed its shutdown signal as I have other 1604 based containers that stop as expected. To me it seems that systemd wants to stop a process and hangs waiting for it (whereas shutdown now does stop the system)

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

`shutdown now` executes poweroff command which in systemd signals manpage documented as:
SIGRTMIN+4 Powers off the machine, starts the poweroff.target unit. This is mostly equivalent to systemctl start poweroff.target.

The logs indicate that lxc is sending
SIGRTMIN+3 Halts the machine, starts the halt.target unit. This is mostly equivalent to systemctl start halt.target.

Note that in addition to above, systemd have more immediate variants for these:
SIGRTMIN+13 Immediately halts the machine.
SIGRTMIN+14 Immediately powers off the machine.

Could you please start the affected containter and try `shutdown --halt now` and check if that hangs as well?

Revision history for this message
Frans (fransmeulenbroeks) wrote : Re: [Bug 1668049] Re: lxd cannot shutdown container
Download full text (9.0 KiB)

Hi,
"shutdown --halt now" stops the container as does "shutdown now"
If I omit the now part and just do shutdown --halt
I get:

root@D-wan-61-1:~# shutdown --halt
Failed to set wall message, ignoring: Refusing activation, D-Bus is shutting down.
Failed to call ScheduleShutdown in logind, proceeding with immediate shutdown: Refusing activation, D-Bus is shutting down
and I get a prompt again

Just running
shutdowngives the same message.

      From: Dimitri John Ledkov <email address hidden>
 To: <email address hidden>
 Sent: Monday, 27 February 2017, 12:34
 Subject: [Bug 1668049] Re: lxd cannot shutdown container

`shutdown now` executes poweroff command which in systemd signals manpage documented as:
SIGRTMIN+4 Powers off the machine, starts the poweroff.target unit. This is mostly equivalent to systemctl start poweroff.target.

The logs indicate that lxc is sending
SIGRTMIN+3 Halts the machine, starts the halt.target unit. This is mostly equivalent to systemctl start halt.target.

Note that in addition to above, systemd have more immediate variants for these:
SIGRTMIN+13 Immediately halts the machine.
SIGRTMIN+14 Immediately powers off the machine.

Could you please start the affected containter and try `shutdown --halt
now` and check if that hangs as well?

--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1668049

Title:
  lxd cannot shutdown container

Status in systemd package in Ubuntu:
  New

Bug description:
  I found this issue when working with lxd and submitted a ticket there
  but the lxc people said it was related to systemd, not lxc.

  The issue is: I have a container running under lxc/lxd. lxc stop of
  the container does not work whereas if I execute "shutdown now" inside
  the container it does stop as expected.

  The associated lxd ticket is here:
  https://github.com/lxc/lxd/issues/2947

  The output of journalctl -a is at the end of this message

  The syslog of that same run can be found here:
  http://paste.ubuntu.com/24071244/

  and this is the output of ps aux:

  USER        PID %CPU %MEM    VSZ  RSS TTY      STAT START  TIME COMMAND
  root          1  0.0  0.1  37808  4576 ?        Ss  11:19  0:00 /sbin/init
  root        48  0.0  0.0  35276  3004 ?        Ss  11:19  0:00 /lib/systemd/systemd-journald
  root        49  0.0  0.0  41724  1912 ?        Ss  11:19  0:00 /lib/systemd/systemd-udevd
  message+    514  0.0  0.0  42900  2460 ?        Ss  11:19  0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
  root        704  0.0  0.0  18252  2508 ?        Ss  11:55  0:00 bash
  root        720  0.0  0.0  34556  1992 ?        R+  11:59  0:00 ps aux

  Note that this may be timing related. If I stop the container
  immediately after starting things work. If I stop after a minute or
  so, the stop will not get through. (even not on a 2nd or 3rd attempt).
  However if I leave the container for a while then a new lxc stop will
  terminate the container. (this happened after an hour or so).

  The issue is reproducible but seems to be dependend on what is running
  inside the container.

  container ...

Read more...

Revision history for this message
Frans (fransmeulenbroeks) wrote :

Sorry for the long message, I did reply from email and forgot to delete the old message (actually assumed that only new text would be uploaded to launchpad).

Frans

Revision history for this message
Stéphane Graber (stgraber) wrote :

Is this still an issue?

Revision history for this message
Frans (fransmeulenbroeks) wrote :
Download full text (8.1 KiB)

 I have no idea.
I changed jobs and have no more access to the infrastructure where this issue showed up.

Closing or rejecting this ticket is fine with me.

     Op maandag 1 juni 2020 18:31:17 CEST schreef Stéphane Graber <email address hidden>:

 Is this still an issue?

--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1668049

Title:
  lxd cannot shutdown container

Status in systemd package in Ubuntu:
  New

Bug description:
  I found this issue when working with lxd and submitted a ticket there
  but the lxc people said it was related to systemd, not lxc.

  The issue is: I have a container running under lxc/lxd. lxc stop of
  the container does not work whereas if I execute "shutdown now" inside
  the container it does stop as expected.

  The associated lxd ticket is here:
  https://github.com/lxc/lxd/issues/2947

  The output of journalctl -a is at the end of this message

  The syslog of that same run can be found here:
  http://paste.ubuntu.com/24071244/

  and this is the output of ps aux:

  USER        PID %CPU %MEM    VSZ  RSS TTY      STAT START  TIME COMMAND
  root          1  0.0  0.1  37808  4576 ?        Ss  11:19  0:00 /sbin/init
  root        48  0.0  0.0  35276  3004 ?        Ss  11:19  0:00 /lib/systemd/systemd-journald
  root        49  0.0  0.0  41724  1912 ?        Ss  11:19  0:00 /lib/systemd/systemd-udevd
  message+    514  0.0  0.0  42900  2460 ?        Ss  11:19  0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
  root        704  0.0  0.0  18252  2508 ?        Ss  11:55  0:00 bash
  root        720  0.0  0.0  34556  1992 ?        R+  11:59  0:00 ps aux

  Note that this may be timing related. If I stop the container
  immediately after starting things work. If I stop after a minute or
  so, the stop will not get through. (even not on a 2nd or 3rd attempt).
  However if I leave the container for a while then a new lxc stop will
  terminate the container. (this happened after an hour or so).

  The issue is reproducible but seems to be dependend on what is running
  inside the container.

  container runs ubuntu 16.04 on a 16.04 host.

  systemd --version says:
  systemd 229
  +PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN

  lxc sends SIGRTMIN+3 to systemd to stop. This was apparently discussed
  here: https://github.com/lxc/lxc/issues/1085

  Let me know if more info is needed.

  Frans

  -- Logs begin at Sun 2017-02-26 11:19:12 UTC, end at Sun 2017-02-26 11:23:05 UTC. --
  Feb 26 11:19:12 D-wan-61-1 systemd-journald[48]: Runtime journal (/run/log/journal/) is 8.0M, max 196.7M, 188.7M free.
  Feb 26 11:19:12 D-wan-61-1 systemd-journald[48]: Journal started
  Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '4 4 1 7' to 'kernel/printk', ignoring: Permission denied
  Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '176' to 'kernel/sysrq', ignoring: Permission denied
  Feb 26 11:19:12 D-wan-61-1 systemd-sysctl[42]: Couldn't write '1' to 'fs/protected_symlinks', ignoring: P...

Read more...

Changed in systemd (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.