snapd fails to shut down, gets killed after 1m30s during reboot

Bug #1901847 reported by Marius Gedminas
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
snapd
Triaged
Medium
Unassigned
snapd (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

I installed some updated and was prompted to reboot. The shutdown part of the reboot took a long time (over 1 minute 30 seconds), and I saw that systemd was waiting for snapd to shut down cleanly before killing it with SIGKILL 90 seconds later.

journalctl -b -1 -e shows this:

spal. 28 11:03:02 blynas systemd[1]: NetworkManager.service: Succeeded.
spal. 28 11:03:02 blynas systemd[1]: Stopped Network Manager.
spal. 28 11:03:02 blynas systemd[1]: Stopping D-Bus System Message Bus...
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Succeeded.
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Unit process 989 (dbus-daemon) remains running after unit stopped.
spal. 28 11:03:02 blynas systemd[1]: dbus.service: Unit process 322309 (python3) remains running after unit stopped.
spal. 28 11:03:02 blynas systemd[1]: Stopped D-Bus System Message Bus.
spal. 28 11:03:02 blynas thermald[1025]: terminating on user request ..
spal. 28 11:03:03 blynas systemd[1]: thermald.service: Succeeded.
spal. 28 11:03:03 blynas systemd[1]: Stopped Thermal Daemon Service.
spal. 28 11:03:03 blynas clamd[1036]: Wed Oct 28 11:03:03 2020 -> --- Stopped at Wed Oct 28 11:03:03 2020
spal. 28 11:03:03 blynas clamd[1036]: Wed Oct 28 11:03:03 2020 -> Socket file removed.
spal. 28 11:03:03 blynas systemd[1]: clamav-daemon.service: Succeeded.
spal. 28 11:03:03 blynas systemd[1]: Stopped Clam AntiVirus userspace daemon.
spal. 28 11:03:26 blynas snapd[1013]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s
spal. 28 11:04:31 blynas systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Killing process 1013 (snapd) with signal SIGKILL.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Failed with result 'timeout'.
spal. 28 11:04:31 blynas systemd[1]: Stopped Snap Daemon.
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Triggering OnFailure= dependencies.
spal. 28 11:04:31 blynas systemd[1]: Requested transaction contradicts existing jobs: Transaction for snapd.failure.service/start is destructive (systemd-remount-fs.service has 'stop' job queued, but 'start' is included in transaction).
spal. 28 11:04:31 blynas systemd[1]: snapd.service: Failed to enqueue OnFailure= job, ignoring: Transaction for snapd.failure.service/start is destructive (systemd-remount-fs.service has 'stop' job queued, but 'start' is included in transaction).
spal. 28 11:04:31 blynas systemd[1]: Stopped target Basic System.

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: snapd 2.47.1+20.10.1
ProcVersionSignature: Ubuntu 5.8.0-26.27-generic 5.8.14
Uname: Linux 5.8.0-26-generic x86_64
ApportVersion: 2.20.11-0ubuntu50
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Wed Oct 28 12:29:28 2020
EcryptfsInUse: Yes
InstallationDate: Installed on 2019-06-12 (503 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
SourcePackage: snapd
UpgradeStatus: Upgraded to groovy on 2020-10-23 (5 days ago)
modified.conffile..etc.sudoers.d.99-snapd.conf: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/99-snapd.conf']

Revision history for this message
Marius Gedminas (mgedmin) wrote :
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I wonder what is causing that activity:

 spal. 28 11:03:26 blynas snapd[1013]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s

This is why we are timing out on shutdown.

Revision history for this message
Marius Gedminas (mgedmin) wrote :

It would be nice if snapd would log something about what it's doing to the journal.

I did run a `snap refresh --list` before rebooting and I saw that a chromium update was being delayed since chromium was running. I closed chromium a few seconds before initiating the reboot. Perhaps snapd thought it was a good opportunity to start the upgrade?

After the reboot I did a manual `sudo snap refresh` to be sure chromium would be upgraded before I launched it, and I saw the progress bar go from 0 to 100%, rather slowly. I wasn't paying much attention, but I think it said it was downloading the new version of chromium.

Revision history for this message
Marius Gedminas (mgedmin) wrote :

Today this happened again. I installed a bunch of updates with update-manager, was prompted to reboot, checked with 'snap refresh --list' to see if there were any snap updates pending (there weren't), and let update-manager do the reboot.

snapd took 1m30s to shut down.

lapkr. 11 09:08:29 blynas systemd[1]: Stopped D-Bus System Message Bus.
lapkr. 11 09:08:31 blynas dbus-daemon[992]: [system] Activating systemd to hand-off: service name='org.freedesktop.PackageKit' unit='packagekit.service' requested by ':1.2851' (uid=0 pid=3021119 comm="/usr/bin/gdbus call --system --dest org.freedeskto" label="unconfined")
lapkr. 11 09:08:38 blynas AptDaemon.Worker[2993491]: INFO: Finished transaction /org/debian/apt/transaction/dff292185c0a4a19af71998f9f7e813c
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 49 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 50 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: /usr/lib/python3/dist-packages/aptdaemon/progress.py:490: Warning: Source ID 51 was not found when attempting to remove it
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: GLib.source_remove(id)
lapkr. 11 09:08:38 blynas org.debian.apt[2993491]: 09:08:38 AptDaemon.Worker [INFO]: Finished transaction /org/debian/apt/transaction/dff292185c0a4a19af71998f9f7e813c
lapkr. 11 09:08:53 blynas snapd[1016]: daemon.go:586: WARNING: cannot gracefully shut down in-flight snapd API activity within: 25s
lapkr. 11 09:08:56 blynas dbus-daemon[992]: [system] Failed to activate service 'org.freedesktop.PackageKit': timed out (service_start_timeout=25000ms)
lapkr. 11 09:08:56 blynas dbus-daemon[992]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: State 'stop-sigterm' timed out. Killing.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Killing process 1016 (snapd) with signal SIGKILL.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Main process exited, code=killed, status=9/KILL
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Failed with result 'timeout'.
lapkr. 11 09:09:58 blynas systemd[1]: Stopped Snap Daemon.
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Triggering OnFailure= dependencies.
lapkr. 11 09:09:58 blynas systemd[1]: Requested transaction contradicts existing jobs: Transaction for snapd.failure.service/start is destructive (swapfile.swap has 'stop' job queued, but 'start' is included in transaction).
lapkr. 11 09:09:58 blynas systemd[1]: snapd.service: Failed to enqueue OnFailure= job, ignoring: Transaction for snapd.failure.service/start is destructive (swapfile.swap has 'stop' job queued, but 'start' is included in transaction).
lapkr. 11 09:09:58 blynas systemd[1]: Stopped target Basic System.

Revision history for this message
Maciej Borzecki (maciek-borzecki) wrote :

I can confirm that I've seen this on systems other than Ubuntu. Probably needs further investigation.

Changed in snapd:
status: New → Triaged
importance: Undecided → Medium
Changed in snapd (Ubuntu):
status: New → Incomplete
status: Incomplete → Confirmed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.