snapd startup crash after install of standalone microk8s

Bug #1939218 reported by Eugene
This bug affects 2 people
Affects: snapd
Status: Fix Released
Importance: High
Assigned to: Miguel Pires

Bug Description

Ubuntu 18.04.5
snapd/bionic-updates,now 2.49.2+18.04 amd64 [installed,automatic]

I've experienced several instances of snap auto-updates breaking a production microk8s cluster and so tried to install microk8s in "standalone" mode so that auto-updates are disabled and updates can be performed manually.

Taking inspiration from
https://github.com/ubuntu/microk8s/pull/1658/files

I ran

snap download microk8s --channel=1.19/stable
snap install microk8s_2339.snap

snapd crashed and now crashes after a restart.

I realise I didn't follow the instructions precisely, to see what would happen (I didn't run snap ack and didn't add --dangerous or --classic), but a crash like this is still unexpected.

Eugene (euug)
description: updated

Eugene (euug) wrote :

After restoring the VM to a snapshot and running

microk8s.stop
snap ack microk8s_2339.assert
snap install microk8s_2339.snap --classic --dangerous
microk8s.start

everything seems to work

no longer affects: snappy
Changed in snapd:
status: New → Triaged
importance: Undecided → High
Luca (lnardelli) wrote :

Happened to me as well (https://bugs.launchpad.net/snapd/+bug/1946996). Unfortunately I purged snap to work around the issue, but if needed I can try to reproduce it to get the state.json file that was mentioned in the duplicate issue.

Miguel Pires (miguelpires1) wrote :

Hi. I haven't been able to reproduce this. If you can still reproduce it, I'd appreciate it if you can run:

sudo cat /var/lib/snapd/state.json | jq '{microk8s: .data.snaps.microk8s, changes: .changes, tasks: .tasks}'

And upload the output here. Thanks

Luca (lnardelli) wrote :

Hmm, I just tried to run `sudo snap install microk8s_2546.snap` (without any dangerous or classic flag) while having the same revision manually installed and I got this

```
sudo snap install microk8s_2546.snap
error: cannot perform the following tasks:
- Run configure hook of "microk8s" snap if present (run hook "configure": signal: segmentation fault)
```

This didn't happen when I broke my snapd. I assume this segfault is another issue?

Luca (lnardelli) wrote :

So, I think I've managed to reproduce it. Here's what I did:

Removed the snap from the system.
Installed 1.20/stable from the snap store, then tried to manually upgrade to 1.21/stable (rev 2546)

```
sudo snap install microk8s --channel=1.20/stable --classic
microk8s (1.20/stable) v1.20.11 from Canonical✓ installed
sudo snap install microk8s_2546.snap
error: cannot perform the following tasks:
- Mount snap "microk8s" (2546) (snap "microk8s" requires classic confinement)
```
Looks like in this case it's asking me to add the classic flag, so I did not proceed with that. Instead I updated to 1.21/stable using `snap refresh` (everything went smoothly), then tried installing the manual snap with `sudo snap install microk8s_2546.snap` (again, no dangerous or classic flag here!). Snap accepted the command, and then crashed while in "waiting for services to restart" with the following error: `error: cannot communicate with server: Get http://localhost/v2/changes/29: dial unix /run/snapd.socket: connect: connection refused`

Now, if I get the logs from snapd (journalctl -u snapd.service) I see the slice out of bounds error.

```
Oct 25 20:48:39 pk01 snapd[2424200]: taskrunner.go:271: [change 27 "Mount snap \"microk8s\" (2546)" task] failed: snap "microk8s" requires classic confinement
Oct 25 20:48:39 pk01 systemd[1]: snapd.service: Got notification message from PID 2438844, but reception only permitted for main PID 2424200
Oct 25 20:48:40 pk01 snapd[2424200]: handlers.go:512: Reported install problem for "microk8s" as eed1092a-35d4-11ec-b6c8-fa163e6cac46 OOPSID
Oct 25 20:49:43 pk01 snapd[2424200]: api_snaps.go:303: Installing snap "microk8s" revision unset
Oct 25 20:53:11 pk01 snapd[2424200]: storehelpers.go:557: cannot refresh snap "microk8s": snap has no updates available
Oct 25 20:54:05 pk01 snapd[2424200]: taskrunner.go:271: [change 29 "Run configure hook of \"microk8s\" snap if present" task] failed: run hook "configure": signal: segmentation fault
Oct 25 20:54:09 pk01 snapd[2424200]: panic: runtime error: slice bounds out of range
Oct 25 20:54:09 pk01 snapd[2424200]: goroutine 756084 [running]:
Oct 25 20:54:09 pk01 snapd[2424200]: github.com/snapcore/snapd/overlord/snapstate.(*SnapManager).undoLinkSnap(0xc4200c0100, 0xc42076de60, 0xc4205092c0, 0x0, 0x0)
Oct 25 20:54:09 pk01 snapd[2424200]: /build/snapd/parts/snapd-deb/build/_build/src/github.com/snapcore/snapd/overlord/snapstate/handlers.go:1648 +0x1cce
Oct 25 20:54:09 pk01 snapd[2424200]: github.com/snapcore/snapd/overlord/snapstate.(*SnapManager).(github.com/snapcore/snapd/overlord/snapstate.undoLinkSnap)-fm(0xc42076de60, 0xc4205092c0, 0x56440e703ba0, 0x1)
Oct 25 20:54:09 pk01 snapd[2424200]: /build/snapd/parts/snapd-deb/build/_build/src/github.com/snapcore/snapd/overlord/snapstate/snapmgr.go:435 +0x40
Oct 25 20:54:09 pk01 snapd[2424200]: github.com/snapcore/snapd/overlord/state.(*TaskRunner).run.func1(0x0, 0x0)
Oct 25 20:54:09 pk01 snapd[2424200]: /build/snapd/parts/snapd-deb/build/_build/src/github.com/snapcore/snapd/overlord/state/taskrunner.go:203 +0xbe
Oct 25 20:54:09 pk01 snapd[2424200]: github.com/snapcore/snapd/vendor/gopkg.in/tomb%2ev2.(*Tomb).run(0xc4205092c...
```

Miguel Pires (miguelpires1) wrote :

Thanks for the reproduction steps and additional info; those were very helpful.

https://github.com/snapcore/snapd/pull/10987 addresses one issue related to this (which should prevent triggering the panic), but there are other sides to this which will be addressed in follow-up PRs. https://github.com/snapcore/snapd/pull/10991 might be one of those follow-ups.

Changed in snapd:
assignee: nobody → Miguel Pires (miguelpires1)
status: Triaged → In Progress
Changed in snapd:
status: In Progress → Fix Committed
Changed in snapd:
status: Fix Committed → Fix Released