snapd.seeded.service never completes preventing full boot to default target

Bug #1806070 reported by Simon Déziel
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
snapd
High
Unassigned
snapd (Ubuntu)
Undecided
Unassigned

Bug Description

systemd-analyze always says "Bootup is not yet finished. Please try again later." no matter how long I wait (hours, days, ...).

The graphical.target (default one) is indeed never reached due to snapd.seeded.service still running:

$ systemctl list-jobs
JOB UNIT TYPE STATE
  8 graphical.target start waiting
120 snapd.seeded.service start running
  9 multi-user.target start waiting
174 ureadahead-stop.timer start waiting
117 systemd-update-utmp-runlevel.service start waiting

5 jobs listed.

To let the boot complete, I need to manually force a service stop with:

$ sudo systemctl stop snapd.seeded.service

Then, all the waiting jobs are fired and complete ~instantly thus completing the boot process and fixing systemd-analyse:

$ systemctl list-jobs
No jobs running.

$ systemd-analyze
Startup finished in 3.360s (firmware) + 15.447s (loader) + 11.693s (kernel) + 2h 15min 59.902s (userspace) = 2h 16min 30.404s
graphical.target reached after 2h 15min 59.892s in userspace

So clearly something's wrong with snapd.seeded.service:

$ systemd-analyze blame | head
2h 15min 55.944s snapd.seeded.service
          6.602s NetworkManager-wait-online.service
          4.115s plymouth-quit-wait.service
          3.624s <email address hidden>
          2.458s systemd-udev-settle.service
          2.403s systemd-cryptsetup@nvme0n1p4_crypt.service
          1.265s dev-mapper-nvme0n1p3_crypt.device
           839ms unbound.service
           800ms fwupd.service
           709ms dev-loop2.device

snapd's journal entries:

$ journalctl -b0 -u snapd | cat -
-- Logs begin at Tue 2018-10-23 09:45:59 EDT, end at Fri 2018-11-30 11:00:02 EST. --
Nov 30 08:32:54 simon-lemur systemd[1]: Starting Snappy daemon...
Nov 30 08:32:55 simon-lemur snapd[1952]: AppArmor status: apparmor is enabled and all features are available
Nov 30 08:32:55 simon-lemur snapd[1952]: 2018/11/30 08:32:55.130505 helpers.go:521: cannot retrieve info for snap "gnome-characters": cannot find installed snap "gnome-characters" at revision 103: missing file /snap/gnome-characters/103/meta/snap.yaml
Nov 30 08:32:55 simon-lemur snapd[1952]: 2018/11/30 08:32:55.140720 daemon.go:343: started snapd/2.34.2+18.04 (series 16; classic) ubuntu/18.04 (amd64) linux/4.15.0-39-generic.
Nov 30 08:32:55 simon-lemur systemd[1]: Started Snappy daemon.
Nov 30 08:32:55 simon-lemur snapd[1952]: 2018/11/30 08:32:55.143355 stateengine.go:101: state ensure error: Get https://api.snapcraft.io/api/v1/snaps/sections: dial tcp: lookup api.snapcraft.io on 127.0.0.1:53: read udp 127.0.0.1:37197->127.0.0.1:53: read: connection refused

Additional information:

$ lsb_release -rd
Description: Ubuntu 18.04.1 LTS
Release: 18.04

$ apt-cache policy snapd
snapd:
  Installed: 2.34.2+18.04
  Candidate: 2.34.2+18.04
  Version table:
 *** 2.34.2+18.04 500
        500 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     2.32.5+18.04 500
        500 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages

$ snap list
Name Version Rev Tracking Publisher Notes
canonical-livepatch 8.0.2 41 stable canonical -
core 16-2.33.1 4917 stable canonical core
git-ubuntu 0.7.4+git16.0a79cbc 391 stable canonical classic
gnome-3-26-1604 3.26.0 70 stable/… canonical -
gnome-calculator 3.28.2 180 stable/… canonical -
lxd 3.2 7792 stable canonical -
vlc 3.0.4 555 stable videolan -

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: snapd 2.34.2+18.04
ProcVersionSignature: Ubuntu 4.15.0-39.42-generic 4.15.18
Uname: Linux 4.15.0-39-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Fri Nov 30 10:51:28 2018
InstallationDate: Installed on 2018-07-15 (138 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180714)
ProcEnviron:
 LANG=en_CA.UTF-8
 TERM=xterm-256color
 SHELL=/bin/bash
 XDG_RUNTIME_DIR=<set>
 PATH=(custom, no user)
SourcePackage: snapd
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Simon Déziel (sdeziel) wrote :
Revision history for this message
Simon Déziel (sdeziel) wrote :

I just realized now that this issue is preventing snap from refreshing all my snaps:

$ sudo snap refresh lxd
error: cannot refresh "lxd": snap "lxd" has "seed" change in progress

An example of this is that lxd is stuck at version 3.2 while it should be at 3.7:

$ snap info lxd | sed -n '/^channels:/,$ p'
channels:
  stable: 3.7 (9664) 53MB -
  candidate: 3.8 (9786) 54MB -
  beta: ↑
  edge: git-ae0a670 (9788) 54MB -
  3.0/stable: 3.0.3 (9663) 53MB -
  3.0/candidate: 3.0.3 (9663) 53MB -
  3.0/beta: ↑
  3.0/edge: git-18c9b88 (9673) 53MB -
  2.0/stable: 2.0.11 (8023) 28MB -
  2.0/candidate: 2.0.11 (8023) 28MB -
  2.0/beta: ↑
  2.0/edge: git-c7c4cc8 (9257) 26MB -
installed: 3.2 (7792) 57MB -

Revision history for this message
Samuele Pedroni (pedronis) wrote :

what does "snap changes" says on such a system?

Zygmunt Krynicki (zyga)
Changed in snapd:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Simon Déziel (sdeziel) wrote :
Revision history for this message
Simon Déziel (sdeziel) wrote :

@pedronis:

$ snap changes
ID Status Spawn Ready Summary
31 Doing 2 days ago, at 09:31 EST - Initialize system state

Revision history for this message
Samuele Pedroni (pedronis) wrote :

mmh, given that, what does "snap tasks 31" shows?

Revision history for this message
Simon Déziel (sdeziel) wrote :
Revision history for this message
Samuele Pedroni (pedronis) wrote :

from the output of "snap tasks 31" it looks like that change requested snapd to restart but that never happened which is unexpected.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

We have a suspicion this is a bug that was fixed and released via core snap but not released via SRU. Possible PRs https://github.com/snapcore/snapd/pull/5827 and https://github.com/snapcore/snapd/pull/6027

Upon aborting and re-trying seeding we see: https://paste.ubuntu.com/p/pM2bBdqH5w/ with one thing standing out:

Doing today at 11:40 EST - Automatically connect eligible plugs and slots of snap "gnome-calculator"

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Dear reporter, can you please try to install the snapd package from the -proposed pocket. If this resolves the issue we will conclude it was the deadlock problem described above. We need ensure that the SRU is released so that everyone affected can update snapd using the classic package.

Revision history for this message
Simon Déziel (sdeziel) wrote :

@zyga, I'm happy to report the -proposed version fixed this bug and also allowed all my snap to be refreshed. Thanks for all your help, I really appreciate!

$ sudo apt-get install snapd
...
The following packages will be upgraded:
   snapd (2.34.2+18.04 => 2.35.5+18.04)

$ snap version
snap 2.36.2
snapd 2.36.2
series 16
ubuntu 18.04
kernel 4.15.0-42-generic

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

That's great to know, thank you!

Changed in snapd:
status: Triaged → Confirmed
status: Confirmed → Fix Released
John Lenton (chipaca)
Changed in snapd (Ubuntu):
status: New → Fix Released
Revision history for this message
Chad Smith (chad.smith) wrote :
Download full text (3.2 KiB)

I'm wondering if we are seeing a regression here. On a focal host system, I am unable to launch disco or eoan containers as snapd.seeded service doesn't complete:

root@mytest:~# systemd-analyze
Bootup is not yet finished (org.freedesktop.systemd1.Manager.FinishTimestampMonotonic=0).
Please try again later.
Hint: Use 'systemctl list-jobs' to see active jobs

root@mytest:~# snap version
snap 2.42.1+19.04
snapd 2.42.1+19.04
series 16
ubuntu 19.04
kernel 5.3.0-19-generic
root@mytest:~# systemctl list-jobs
JOB UNIT TYPE STATE
117 snapd.seeded.service start running
 90 systemd-update-utmp-runlevel.service start waiting
123 cloud-config.service start waiting
118 cloud-init.target start waiting
100 snapd.autoimport.service start waiting
  1 graphical.target start waiting
  2 multi-user.target start waiting
122 cloud-final.service start waiting

8 jobs listed.
root@mytest:~# grep CODE /etc/os-release
VERSION_CODENAME=disco
UBUNTU_CODENAME=disco
root@mytest:~#

My host system:
csmith@downtown:~/src/ubuntu-sru $ dpkg -l | grep snap
ii chromium-codecs-ffmpeg-extra 78.0.3904.108-0ubuntu1 amd64 Transitional package - chromium-codecs-ffmpeg-extra -> chromium-ffmpeg snap
ii ebtables 2.0.10.4+snapshot20181205-1ubuntu1 amd64 Ethernet bridge frame table administration
ii gir1.2-snapd-1:amd64 1.54-0ubuntu1 amd64 Typelib file for libsnapd-glib1
ii gnome-software-plugin-snap 3.30.6-2ubuntu11 amd64 Snap support for GNOME Software
ii libsnapd-glib1:amd64 1.54-0ubuntu1 amd64 GLib snapd library
ii libsnappy1v5:amd64 1.1.7-2 amd64 fast compression/decompression library
ii lxd 1:0.7 all Transitional package - lxd -> snap (lxd)
ii snapd 2.42.1+20.04 amd64 Daemon and tooling that enable snap packages
ii snapd-login-service 1.23-0ubuntu1 amd64 Daemon to allow non-root access to snapd

csmith@downtown:~/src/ubuntu-sru $ snap list
Name Version Rev Tracking Publisher Notes
aws-cli 1.16.266 151 stable aws✓ classic
core 16-2.42.4 8213 stable canonical✓ core
core18 20191126 1279 stable canonical✓ base
git-ubuntu 0.7.4+git16.0a79cbc 391 stable canonical✓ classic
go 1.13.5 4901 stable mwhudson classic
hub v2.13.0 26 stable felicianotech classic
juju 2.7.0 9876 stable canonical✓ classic
lxd 3.0.4 1134...

Read more...

Revision history for this message
Chad Smith (chad.smith) wrote :

From within the container:

$ snap changes
1 Error today at 20:00 UTC today at 20:00 UTC Initialize system state
2 Error today at 20:00 UTC today at 20:00 UTC Initialize device

Revision history for this message
Chad Smith (chad.smith) wrote :

it seems we are running into something with network not being properly setup in a lxc snapshot. The issue/feature for snapd.seed seems to be that broken network configuration will result in snapd.seeded will wait indefinitely on network connectivity. I'll file a separate bug if need be with more details after triage.

Revision history for this message
Johan Ehnberg (johan-ehnberg) wrote :

Chad, were you able to find the config that caused the effect? There is a similar issue open here:
https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1878225

I am trying to find out if it is a regression or something in my config.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers