snapd fails to shut down, gets killed after 1m30s during reboot

Bug #1901847 reported by Marius Gedminas on 2020-10-28
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
Medium
Unassigned
snapd (Ubuntu)
Undecided
Unassigned

Bug Description

I installed some updated and was prompted to reboot. The shutdown part of the reboot took a long time (over 1 minute 30 seconds), and I saw that systemd was waiting for snapd to shut down cleanly before killing it with SIGKILL 90 seconds later.

journalctl -b -1 -e shows this:

spal. 28 11:03:02 blynas systemd[1]: NetworkManager.service: Succeeded.
spal. 28 11:03:02 blynas systemd[1]: Stopped Network Manager.
spal. 28 11:03:02 blynas systemd[1]: Stopping D-Bus System Message Bus...
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Succeeded.
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Unit process 989 (dbus-daemon) remains running after unit stopped.
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Unit process 322309 (python3) remains running after unit stopped.
spal. 28 11:03:02 blynas systemd[1]: Stopped D-Bus System Message Bus.
spal. 28 11:03:02 blynas thermald[1025]: terminating on user request ..
spal. 28 11:03:03 blynas systemd[1]: thermald.service: Succeeded.
spal. 28 11:03:03 blynas systemd[1]: Stopped Thermal Daemon Service.
spal. 28 11:03:03 blynas clamd[1036]: Wed Oct 28 11:03:03 2020 -> --- Stopped at Wed Oct 28 11:03:03 2020
spal. 28 11:03:03 blynas clamd[1036]: Wed Oct 28 11:03:03 2020 -> Socket file removed.
spal. 28 11:03:03 blynas systemd[1]: clamav-daemon.service: Succeeded.
spal. 28 11:03:03 blynas systemd[1]: Stopped Clam AntiVirus userspace daemon.
spal. 28 11:03:26 blynas snapd[1013]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s
spal. 28 11:04:31 blynas systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Killing process 1013 (snapd) with signal SIGKILL.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Failed with result 'timeout'.
spal. 28 11:04:31 blynas systemd[1]: Stopped Snap Daemon.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Triggering OnFailure= dependencies.
spal. 28 11:04:31 blynas systemd[1]: Requested transaction contradicts existing jobs: Transaction for snapd.failure.service/start is destructive (systemd-remount-fs.service has 'stop' job queued, but 'start' is included in transaction).
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Failed to enqueue OnFailure= job, ignoring: Transaction for snapd.failure.service/start is destructive (systemd-remount-fs.service has 'stop' job queued, but 'start' is included in transaction).
spal. 28 11:04:31 blynas systemd[1]: Stopped target Basic System.

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: snapd 2.47.1+20.10.1
ProcVersionSignature: Ubuntu 5.8.0-26.27-generic 5.8.14
Uname: Linux 5.8.0-26-generic x86_64
ApportVersion: 2.20.11-0ubuntu50
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Wed Oct 28 12:29:28 2020
EcryptfsInUse: Yes
InstallationDate: Installed on 2019-06-12 (503 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
SourcePackage: snapd
UpgradeStatus: Upgraded to groovy on 2020-10-23 (5 days ago)
modified.conffile..etc.sudoers.d.99-snapd.conf: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/99-snapd.conf']

Marius Gedminas (mgedmin) wrote :
Zygmunt Krynicki (zyga) wrote :

I wonder what is causing that activity:

 spal. 28 11:03:26 blynas snapd[1013]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s

This is why we are timing out on shutdown.

Marius Gedminas (mgedmin) wrote :

It would be nice if snapd would log something about what it's doing to the journal.

I did run a `snap refresh --list` before rebooting and I saw that a chromium update was being delayed since chromium was running. I closed chromium a few seconds before initiating the reboot. Perhaps snapd thought it was a good opportunity to start the upgrade?

After the reboot I did a manual `sudo snap refresh` to be sure chromium would be upgraded before I launched it, and I saw the progress bar go from 0 to 100%, rather slowly. I wasn't paying much attention, but I think it said it was downloading the new version of chromium.

Marius Gedminas (mgedmin) wrote :

Today this happened again. I installed a bunch of updates with update-manager, was prompted to reboot, checked with 'snap refresh --list' to see if there were any snap updates pending (there weren't), and let update-manager do the reboot.

snapd took 1m30s to shut down.

lapkr. 11 09:08:29 blynas systemd[1]: Stopped D-Bus System Message Bus.
lapkr. 11 09:08:31 blynas dbus-daemon[992]: [system] Activating systemd to hand-off: service name='org.freedesktop.PackageKit' unit='packagekit.service' requested by ':1.2851' (uid=0 pid=3021119 comm="/usr/bin/gdbus call --system --dest org.freedeskto" label="unconfined")
lapkr. 11 09:08:38 blynas AptDaemon.Worker[2993491]: INFO: Finished transaction /org/debian/apt/transaction/dff292185c0a4a19af71998f9f7e813c
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 49 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 50 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 51 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: 09:08:38 AptDaemon.Worker [INFO]: Finished transaction /org/debian/apt/transaction/dff292185c0a4a19af71998f9f7e813c
lapkr. 11 09:08:53 blynas snapd[1016]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s
lapkr. 11 09:08:56 blynas dbus-daemon[992]: [system] Failed to activate service 'org.freedesktop.PackageKit': timed out (service_start_timeout=25000ms)
lapkr. 11 09:08:56 blynas dbus-daemon[992]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Killing process 1016 (snapd) with signal SIGKILL.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Failed with result 'timeout'.
lapkr. 11 09:09:58 blynas systemd[1]: Stopped Snap Daemon.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Triggering OnFailure= dependencies.
lapkr. 11 09:09:58 blynas systemd[1]: Requested transaction contradicts existing jobs: Transaction for snapd.failure.service/start is destructive (swapfile.swap has 'stop' job queued, but 'start' is included in transaction).
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Failed to enqueue OnFailure= job, ignoring: Transaction for snapd.failure.service/start is destructive (swapfile.swap has 'stop' job queued, but 'start' is included in transaction).
lapkr. 11 09:09:58 blynas systemd[1]: Stopped target Basic System.

I can confirm that I've seen this on systems other than Ubuntu. Probably needs further investigation.

Changed in snapd:
status: New → Triaged
importance: Undecided → Medium
Changed in snapd (Ubuntu):
status: New → Incomplete
status: Incomplete → Confirmed
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers