Activity log for bug #1958284

Date Who What changed Old value New value Message
2022-01-18 18:19:28 Jean Raby bug added bug
2022-01-18 18:37:57 Jean Raby description

Old value:

With systemd v245 as shipped with 20.04, the shutdown sequence does not use the value of `DefaultTimeoutStopSec` to wait for remaining processes, it instead uses the compiled in default of 90s.

This is most visible with services that use `KillMode=process` (docker, k8s, k3s, etc...), especially if the remaining processes do not handle `SIGTERM` or choose to ignore it.

For example:

```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower value ---
```

The bug has been fixed upstream here: https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22

Marc was kind enough to package the patch for 20.04 so I could test it (https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra) and with that package, I can confirm that it indeed fixes the issue.

Here's a few github issues I stumbled upon while trying to debug this, along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- http://localhost:8000/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html

Of course, it would be much better if all the processes would properly handle `SIGTERM`, but having a way to enforce a maximum wait time at shutdown is a decent workaround.

Given that the patch is relatively simple, would it be possible to add it the package for 20.04?

Thanks

New value:

With systemd v245 as shipped with 20.04, the shutdown sequence does not use the value of `DefaultTimeoutStopSec` to wait for remaining processes, it instead uses the compiled in default of 90s.

This is most visible with services that use `KillMode=process` (docker, k8s, k3s, etc...), especially if the remaining processes do not handle `SIGTERM` or choose to ignore it.

For example:

```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower value ---
```

The bug has been fixed upstream here: https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22

Marc was kind enough to package the patch for 20.04 so I could test it (https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra) and with that package, I can confirm that it indeed fixes the issue.

Here's a few github issues I stumbled upon while trying to debug this, along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html

Of course, it would be much better if all the processes would properly handle `SIGTERM`, but having a way to enforce a maximum wait time at shutdown is a decent workaround.

Given that the patch is relatively simple, would it be possible to add it the package for 20.04?

Thanks
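The description in the entry above turns on a "send SIGTERM, wait up to a timeout, then send SIGKILL" sequence. The following is a minimal bash sketch of that general pattern, written only to illustrate why the length of the timeout matters when a process ignores `SIGTERM`. It is not systemd-shutdown's code; `STOP_TIMEOUT` is a hypothetical stand-in for `DefaultTimeoutStopSec`, and the 2-second value is chosen only to keep the demonstration short.

```shell
#!/bin/bash
# Illustrative sketch only, not systemd-shutdown's actual implementation.
# STOP_TIMEOUT is a hypothetical stand-in for DefaultTimeoutStopSec (seconds).
STOP_TIMEOUT=2

# Start a background subshell that ignores SIGTERM. Ignored signals stay
# ignored in subshells, so ignoring before the fork avoids a startup race.
trap '' TERM
( while true; do sleep 1; done ) &
pid=$!
trap - TERM   # restore the default SIGTERM handling in this shell

# Ask politely first, then poll until the process exits or the deadline passes.
kill -TERM "$pid"
deadline=$((SECONDS + STOP_TIMEOUT))
while kill -0 "$pid" 2>/dev/null && [ "$SECONDS" -lt "$deadline" ]; do
    sleep 1
done

# Escalate to SIGKILL if the process is still alive after the deadline.
if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid"
    outcome=killed
else
    outcome=terminated
fi
wait "$pid" 2>/dev/null
echo "process $pid: $outcome (waited at most ${STOP_TIMEOUT}s)"
```

With a process that ignores `SIGTERM`, the loop always runs for the full timeout, which is why a 90s value that cannot be lowered translates directly into a 90s delay at shutdown.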
2022-01-18 18:52:33 Launchpad Janitor systemd (Ubuntu): status New Confirmed
2022-01-18 19:24:34 Marc Deslauriers bug added subscriber Marc Deslauriers
2022-01-20 13:51:47 Lukas Märdian tags rls-ff-incoming
2022-01-20 13:51:55 Lukas Märdian nominated for series Ubuntu Focal
2022-01-20 13:51:55 Lukas Märdian bug task added systemd (Ubuntu Focal)
2022-01-20 16:57:45 Brian Murray tags rls-ff-incoming fr-1987 rls-ff-incoming
2022-01-20 16:59:01 Brian Murray tags fr-1987 rls-ff-incoming
2022-01-20 17:00:52 Brian Murray systemd (Ubuntu Focal): status New Confirmed
2022-01-20 17:02:44 Lukas Märdian systemd (Ubuntu Focal): importance Undecided Medium
2022-03-21 23:20:12 Ivan Kurnosov bug added subscriber Ivan Kurnosov
2022-03-22 20:36:00 Nick Rosbrook description

Old value:

With systemd v245 as shipped with 20.04, the shutdown sequence does not use the value of `DefaultTimeoutStopSec` to wait for remaining processes, it instead uses the compiled in default of 90s.

This is most visible with services that use `KillMode=process` (docker, k8s, k3s, etc...), especially if the remaining processes do not handle `SIGTERM` or choose to ignore it.

For example:

```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower value ---
```

The bug has been fixed upstream here: https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22

Marc was kind enough to package the patch for 20.04 so I could test it (https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra) and with that package, I can confirm that it indeed fixes the issue.

Here's a few github issues I stumbled upon while trying to debug this, along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html

Of course, it would be much better if all the processes would properly handle `SIGTERM`, but having a way to enforce a maximum wait time at shutdown is a decent workaround.

Given that the patch is relatively simple, would it be possible to add it the package for 20.04?

Thanks

New value:

[Impact]

The systemd shutdown sequence does not honor systemd-system.conf settings when waiting for remaining processes. This means that, for example, if a systemd service specifies KillMode=process and a process remaining from that service does not properly handle SIGTERM, then the remaining process will not be killed until after the compiled-in default value of DefaultTimeoutStopSec (90s), even if the user has changed the setting of DefaultTimeoutStopSec. In such cases, this impacts users by significantly increasing the time required for shutdown/reboot.

[Test Plan]

* Create a new script, /usr/local/bin/loop-ignore-sigterm:

```
#!/bin/bash

loop_forever() {
    while true; do sleep 1; done
}

(
    trap 'echo Ignoring SIGTERM...' SIGTERM
    loop_forever
)

loop_forever
```

This script will spawn a subshell which will loop forever and ignore SIGTERM. This will force systemd to wait for the subprocess at reboot/shutdown, and eventually send SIGKILL after TimeoutStopSec (DefaultTimeoutStopSec in this case).

* Make the script executable:

$ chmod +x /usr/local/bin/loop-ignore-sigterm

* Create a systemd service for this script. Add the following to /etc/systemd/system/loop-ignore-sigterm.service:

```
[Service]
KillMode=process
ExecStart=/usr/local/bin/loop-ignore-sigterm
```

* Start the service:

$ systemctl start loop-ignore-sigterm.service

* Edit /etc/systemd/system.conf, and uncomment the 'DefaultTimeoutStopSec=90s' line. Modify 90s to something much shorter, e.g. 20s.

* Re-exec the daemon so this new default takes effect:

$ systemctl daemon-reexec

* Reboot, and monitor the logs. Observe that systemd-shutdown will wait for the loop-ignore-sigterm process for 90s, instead of the 20s configured earlier.

[Where problems could occur]

The patch moves the reset_arguments() call to the end of main, which means reset_arguments() is no longer called before daemon re-execution (if that branch is taken). If anything in that code path relied on reset_arguments() being called before re-executing, those assumptions could be broken. Any such problems would potentially be seen during daemon re-execution, e.g. when calling systemctl daemon-reexec.

[ Original Description ]

With systemd v245 as shipped with 20.04, the shutdown sequence does not use the value of `DefaultTimeoutStopSec` to wait for remaining processes, it instead uses the compiled in default of 90s.

This is most visible with services that use `KillMode=process` (docker, k8s, k3s, etc...), especially if the remaining processes do not handle `SIGTERM` or choose to ignore it.

For example:

```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower value ---
```

The bug has been fixed upstream here: https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22

Marc was kind enough to package the patch for 20.04 so I could test it (https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra) and with that package, I can confirm that it indeed fixes the issue.

Here's a few github issues I stumbled upon while trying to debug this, along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html

Of course, it would be much better if all the processes would properly handle `SIGTERM`, but having a way to enforce a maximum wait time at shutdown is a decent workaround.

Given that the patch is relatively simple, would it be possible to add it the package for 20.04?

Thanks
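The test plan in the entry above depends on the `DefaultTimeoutStopSec` line in /etc/systemd/system.conf actually being uncommented and changed before the reboot. As a hedged aid for that step, the sketch below prints the last uncommented `DefaultTimeoutStopSec=` value found in a system.conf-style file. The function name `configured_stop_timeout` is hypothetical, and the sketch deliberately simplifies: it ignores drop-in directories and does not check which section the setting appears in, so it is a sanity check on the edited file rather than a statement of what systemd will use.

```shell
#!/bin/bash
# Hypothetical helper, not part of systemd: print the last uncommented
# DefaultTimeoutStopSec= value in a system.conf-style file.
# Simplifications (assumptions): drop-in directories are ignored and the
# section the setting appears in is not checked.
configured_stop_timeout() {
    local conf=${1:-/etc/systemd/system.conf}
    # Lines starting with '#' do not match, so commented defaults are skipped.
    sed -n 's/^[[:space:]]*DefaultTimeoutStopSec[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}

# Demonstration on a temporary file rather than the real configuration.
demo_conf=$(mktemp)
printf '[Manager]\n#DefaultTimeoutStopSec=90s\nDefaultTimeoutStopSec=20s\n' > "$demo_conf"
echo "configured value: $(configured_stop_timeout "$demo_conf")"
rm -f "$demo_conf"
```

Empty output means the setting is still commented out, in which case the compiled-in default applies and the test plan would not distinguish a fixed package from an unfixed one.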
2022-03-23 15:25:12 Launchpad Janitor merge proposal linked https://code.launchpad.net/~enr0n/ubuntu/+source/systemd/+git/systemd/+merge/417577
2022-03-24 09:24:11 Lukas Märdian systemd (Ubuntu Focal): status Confirmed In Progress
2022-03-24 09:24:54 Lukas Märdian systemd (Ubuntu): status Confirmed Fix Released
2022-03-24 20:57:31 Nick Rosbrook bug added subscriber Nick Rosbrook
2022-03-28 18:24:01 Brian Murray systemd (Ubuntu Focal): status In Progress Fix Committed
2022-03-28 18:24:02 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2022-03-28 18:24:04 Brian Murray bug added subscriber SRU Verification
2022-03-28 18:24:08 Brian Murray tags verification-needed verification-needed-focal
2022-03-29 15:31:14 Nick Rosbrook tags verification-needed verification-needed-focal verification-done-focal verification-needed
2022-04-05 18:19:38 Nick Rosbrook tags verification-done-focal verification-needed verification-done verification-done-focal
2022-04-13 22:54:41 Launchpad Janitor systemd (Ubuntu Focal): status Fix Committed Fix Released
2022-04-13 22:54:55 Brian Murray removed subscriber Ubuntu Stable Release Updates Team