snapd crashed and can't start: "Re-refresh task has 1 tasks waiting for it"

Bug #1860324 reported by Colin Watson
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
Confirmed
Undecided
Unassigned

Bug Description

I noticed that I had a snap that was stuck on an old revision for some reason and attempted to force it to refresh to stable:

$ snap info gitlptools
name: gitlptools
[...]
snap-id: LyyEad7cnNXM2rmBu9ySp7MJDWTKfAsF
tracking: latest/stable
refresh-date: 2 days ago, at 16:24 GMT
channels:
  stable: b9b9618 2019-02-20 (36) 14MB classic
  candidate: ↑
  beta: ↑
  edge: 0+git.cfe2bdc-dirty 2017-12-19 (3) 14MB classic
installed: 0+git.a2477ff (9) 14MB classic
$ snap refresh --stable gitlptools
error: cannot communicate with server: Get http://localhost/v2/changes/1277: dial unix /run/snapd.socket: connect: connection refused

Now snapd is stopped and can't start back up at all. syslog says:

Jan 20 09:54:37 niejwein systemd[1]: Starting Snappy daemon...
Jan 20 09:54:37 niejwein snapd[9861]: AppArmor status: apparmor is enabled and all features are available
Jan 20 09:54:37 niejwein snapd[9861]: AppArmor status: apparmor is enabled and all features are available
Jan 20 09:54:37 niejwein snapd[9861]: daemon.go:346: started snapd/2.43.1 (series 16; classic) ubuntu/18.04 (amd64) linux/4.15.0-72-generic.
Jan 20 09:54:37 niejwein snapd[9861]: daemon.go:439: adjusting startup timeout by 2m35s (pessimistic estimate of 30s plus 5s per snap)
Jan 20 09:54:38 niejwein systemd[1]: Started Snappy daemon.
Jan 20 09:54:38 niejwein snapd[9861]: handlers.go:2589: PANIC Re-refresh task has 1 tasks waiting for it.
Jan 20 09:54:38 niejwein snapd[9861]: panic: Re-refresh task has 1 tasks waiting for it.
Jan 20 09:54:38 niejwein snapd[9861]: goroutine 45 [running]:
Jan 20 09:54:38 niejwein snapd[9861]: github.com/snapcore/snapd/logger.Panicf(0x555e924913b6, 0x2c, 0xc420455e68, 0x1, 0x1)
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/logger/logger.go:67 +0x171
Jan 20 09:54:38 niejwein snapd[9861]: github.com/snapcore/snapd/overlord/snapstate.(*SnapManager).doCheckReRefresh(0xc4200a6690, 0xc4203cd9e0, 0xc4202a2d70, 0x0, 0x0)
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/overlord/snapstate/handlers.go:2589 +0x567
Jan 20 09:54:38 niejwein snapd[9861]: github.com/snapcore/snapd/overlord/snapstate.(*SnapManager).(github.com/snapcore/snapd/overlord/snapstate.doCheckReRefresh)-fm(0xc4203cd9e0, 0xc4202a2d70, 0x555e92e30320, 0x0)
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/overlord/snapstate/snapmgr.go:429 +0x40
Jan 20 09:54:38 niejwein snapd[9861]: github.com/snapcore/snapd/overlord/state.(*TaskRunner).run.func1(0x0, 0x0)
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/overlord/state/taskrunner.go:195 +0xbb
Jan 20 09:54:38 niejwein snapd[9861]: github.com/snapcore/snapd/vendor/gopkg.in/tomb%2ev2.(*Tomb).run(0xc4202a2d70, 0xc4204384b0)
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/vendor/gopkg.in/tomb.v2/tomb.go:163 +0x2d
Jan 20 09:54:38 niejwein snapd[9861]: created by github.com/snapcore/snapd/vendor/gopkg.in/tomb%2ev2.(*Tomb).Go
Jan 20 09:54:38 niejwein snapd[9861]: #011/build/snapd-IqyPU7/snapd-2.43.1/_build/src/github.com/snapcore/snapd/vendor/gopkg.in/tomb.v2/tomb.go:159 +0xbb
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Failed with result 'exit-code'.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Triggering OnFailure= dependencies.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Service hold-off time over, scheduling restart.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Scheduled restart job, restart counter is at 5.
Jan 20 09:54:38 niejwein systemd[1]: Stopped Snappy daemon.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Start request repeated too quickly.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Failed with result 'exit-code'.
Jan 20 09:54:38 niejwein systemd[1]: Failed to start Snappy daemon.
Jan 20 09:54:38 niejwein systemd[1]: snapd.socket: Failed with result 'service-start-limit-hit'.
Jan 20 09:54:38 niejwein systemd[1]: snapd.service: Triggering OnFailure= dependencies.

Help?

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: snapd 2.42.1+18.04
ProcVersionSignature: Ubuntu 4.15.0-72.81-generic 4.15.18
Uname: Linux 4.15.0-72-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
CurrentDesktop: GNOME
Date: Mon Jan 20 09:56:18 2020
EcryptfsInUse: Yes
InstallationDate: Installed on 2013-03-01 (2515 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Alpha amd64 (20130223.1)
SourcePackage: snapd
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Colin Watson (cjwatson) wrote :
Revision history for this message
Maciej Borzecki (maciek-borzecki) wrote :

Can you provide the output of `snap changes` and `snap change --last=refresh`?

If snapd is dead, make sure to stop it via `systemctl stop snapd{.service,socket}` and then run:

snap debug state /var/lib/snapd/state.json
snap debug state /var/lib/snapd/state.json --change=<id>

to obtain the same information as snap change/changes commands.

Revision history for this message
Colin Watson (cjwatson) wrote :

snapd is indeed dead and refuses to start, but here's the "snap debug state" output:

$ sudo snap debug state /var/lib/snapd/state.json
ID Status Spawn Ready Label Summary
1271 Done 3 days ago, at 22:37 GMT 3 days ago, at 22:40 GMT auto-refresh Auto-refresh snaps "core", "juju"
1272 Done 2 days ago, at 15:59 GMT 2 days ago, at 15:59 GMT configure-snap Change configuration of "core" snap
1273 Done 2 days ago, at 16:24 GMT 2 days ago, at 16:24 GMT refresh-snap Refresh snap "gitlptools"
1274 Done 2 days ago, at 16:25 GMT 2 days ago, at 16:29 GMT refresh-snap Refresh "lxd" snap from "candidate" channel
1275 Done yesterday at 22:16 GMT yesterday at 22:16 GMT auto-refresh Auto-refresh snap "gitlptools"
1276 Done today at 09:51 GMT today at 09:51 GMT refresh-snap Refresh snap "gitlptools"
1277 Doing today at 09:52 GMT 0000-12-31 refresh-snap Refresh "gitlptools" snap
$ sudo snap debug state /var/lib/snapd/state.json --change=1277
Lanes ID Status Spawn Ready Kind Summary
0 8234 Done today at 09:52 GMT today at 09:52 GMT refresh-aliases Refresh aliases for snap "gitlptools"
0 8235 Doing today at 09:52 GMT 0000-12-31 check-rerefresh Consider re-refresh of "gitlptools"
0 8236 Do today at 09:52 GMT 0000-12-31 switch-snap-channel Switch snap "gitlptools" from channel "latest/stable" to "stable"

Revision history for this message
Colin Watson (cjwatson) wrote :

$ sudo cat /var/lib/snapd/state.json | jq '.data.snaps.gitlptools'
{
  "type": "app",
  "sequence": [
    {
      "name": "gitlptools",
      "snap-id": "LyyEad7cnNXM2rmBu9ySp7MJDWTKfAsF",
      "revision": "36",
      "channel": "stable",
      "title": "Launchpad tooling for Git",
      "summary": "Tools to make git work well with Launchpad",
      "description": "A collection of plugins to make working with git and Launchpad a nicer experience.\n\nCurrently ships:\n\n`git-lp-open` for opening the current branch in your browser\n\n`git-lp-propose` for creating a merge proposal for the current branch\n\n`git-lp-approve` to top-approve a merge proposal for the current branch."
    },
    {
      "name": "gitlptools",
      "snap-id": "LyyEad7cnNXM2rmBu9ySp7MJDWTKfAsF",
      "revision": "9",
      "channel": "latest/stable",
      "title": "Launchpad tooling for Git",
      "summary": "Tools to make git work well with Launchpad",
      "description": "A collection of plugins to make working with git and Launchpad a nicer experience.\n\nCurrently ships:\n\n`git-lp-open` for opening the current branch in your browser\n\n`git-lp-propose` for creating a merge proposal for the current branch\n\n`git-lp-approve` to top-approve a merge proposal for the current branch."
    }
  ],
  "active": true,
  "current": "9",
  "channel": "latest/stable",
  "classic": true,
  "aliases": {
    "git-lp-open": {
      "auto": "git-lp-open"
    },
    "git-lp-propose": {
      "auto": "git-lp-propose"
    }
  },
  "user-id": 3
}

Revision history for this message
Paweł Stołowski (stolowski) wrote :

I had a quick look and it seems that UpdateWithDeviceContext can append "snap-switch-channel" to taskset from doUpdate(), which means a task gets scheduled after "check-rerefresh" (which panics if it finds any tasks waiting for it). AFAIR "check-rerefresh" was meant to be the last in the change.

Revision history for this message
John Lenton (chipaca) wrote :

If you're feeling brave and need to get unstuck, edit the state.json to remove the check-rerefresh task and you should be able to start snapd again.

Revision history for this message
John Lenton (chipaca) wrote :

Easier would be to set it to Done instead of removing it though

Revision history for this message
John Lenton (chipaca) wrote :

(to be clear that means set status to 4 in the json)

Revision history for this message
Maciej Borzecki (maciek-borzecki) wrote :

After banging my head against the unit tests, I got the PR up that reproduces this scenario: https://github.com/snapcore/snapd/pull/8024 The fix will end up in the same PR at some point.

John Lenton (chipaca)
affects: snapd (Ubuntu) → snapd
Changed in snapd:
status: New → In Progress
assignee: nobody → John Lenton (chipaca)
Zygmunt Krynicki (zyga)
Changed in snapd:
assignee: John Lenton (chipaca) → nobody
status: In Progress → Confirmed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.