console-conf when run with auto-refresh of core20 will crash and become non-responsive

Bug #1880156 reported by Ian Johnson on 2020-05-22
This bug affects 1 person
Affects     Status   Importance   Assigned to   Milestone
snapd       New      High         Unassigned
subiquity   New      Undecided    Unassigned

Bug Description

When booting a VM image on amd64, I tried to configure the device with console-conf, but it seems that an auto-refresh of the core20 snap was in progress and subsequently snapd attempted to reboot the device immediately, causing what I think are the following bugs:

1. console-conf hung after I entered my email and hit "Done". It showed no progress indication and stayed hung until the VM rebooted itself a while later, maybe a minute or slightly less.
2. Upon rebooting I could no longer use console-conf at all. The message "Press enter to configure" was displayed, but when I hit enter some other message appeared on the screen immediately and then all text went away, leaving me with an empty screen and no way to configure the device with console-conf. I waited for a minute or two before giving up and rebooting the VM, as console-conf normally runs much more quickly in this VM, but perhaps it would

Upon a manual reboot after 2, I was then able to use console-conf and log in successfully.

To be clear, during 2 I still could not log in to the device over SSH with my configured credentials.

I think the sequencing is basically this:

1. snapd starts to auto-refresh core20
2. I hit enter and console-conf starts to run
3. snapd finishes the auto-refresh
4. I finish entering in my email
5. console-conf is stuck trying to run `snap create-user` which for some reason just hangs
6. eventually snapd's scheduled reboot for the refresh of the core20 snap hard-reboots the system

This is probably a snapd bug: when snapd has requested a scheduled reboot for the base snap, etc., it should fail requests quickly with some message that it can't respond now, rather than hang.

What I think console-conf could do better is to show some sort of spinning animation while waiting for `snap create-user` to return, because there could be other situations where `snap create-user` takes longer than expected, e.g. high system load from background service snaps, internet connectivity issues talking to the store, or even some kind of store issue.
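A minimal sketch of that idea, not console-conf's actual code: run the command in a child process and redraw a one-character spinner until it exits or a deadline passes. The function name `run_with_spinner`, the timeout values, and writing the spinner to stderr are all assumptions made for illustration; console-conf's real UI would draw this differently.

```python
import itertools
import subprocess
import sys
import time


def _erase(out):
    # Overwrite the single spinner character and return to column 0.
    out.write("\r \r")
    out.flush()


def run_with_spinner(argv, timeout=60.0, interval=0.1, out=sys.stderr):
    """Run argv, drawing a spinner until it exits or the timeout expires.

    Returns the exit code, or None if the command was killed on timeout.
    (Hypothetical helper, not part of console-conf or snapd.)
    """
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout
    for frame in itertools.cycle("|/-\\"):
        try:
            # Wait briefly; TimeoutExpired means the command is still running.
            code = proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            if time.monotonic() >= deadline:
                proc.kill()
                proc.wait()
                _erase(out)
                return None
            out.write("\r" + frame)
            out.flush()
            continue
        _erase(out)
        return code
```

A caller could then pass something like `["snap", "create-user", email]` (with `email` being whatever the user typed) and show an error screen when the result is None instead of hanging silently.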

Ian Johnson (anonymouse67) wrote :

I can reproduce issue 1 with the same image again, but I have not been able to reproduce issue 2.

Ian Johnson (anonymouse67) wrote :

Another thing console-conf could do is something like what the subiquity snap does: have the console-conf UI ask whether it should check for updates before continuing. In this case it would check for updates to the core20 snap and, if one is available, make clear to the user that the device needs to reboot before the new console-conf can run.
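As a rough sketch of such a check, the helper below looks for core20 in text shaped like the table printed by `snap refresh --list`. It assumes the first column is the snap name and that the header row starts with `Name`; the function names and the sample text are made up for illustration and were not captured from a real device.

```python
# Made-up sample in the general shape of `snap refresh --list` output.
SAMPLE = (
    "Name    Version  Rev  Publisher   Notes\n"
    "core20  20       634  canonical*  base\n"
)


def pending_refreshes(listing):
    """Return the snap names found in the first column of the listing."""
    names = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the column header row.
        if not fields or fields[0] == "Name":
            continue
        names.append(fields[0])
    return names


def needs_reboot_notice(listing, base="core20"):
    """True if the base snap has a pending refresh in the listing."""
    return base in pending_refreshes(listing)
```

The UI could show a "an update to core20 is pending and will reboot the device" notice whenever `needs_reboot_notice` is true, before asking for the email address.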

Changed in snapd:
importance: Undecided → Medium

Dimitri John Ledkov (xnox) wrote :

I'm confused how a refresh of core20 can do anything, given that core20 is the rootfs and we don't change the rootfs unless we reboot.

Can snapd refresh snapd.snap without a reboot? Because that may cause a hang too.

Do you have logs from that system, i.e. all the snapd changes?

On the subiquity live server we disable snap refreshes by default. Can we inhibit / pause refreshes and reboots if somebody has launched console-conf? (With a timeout, i.e. kill console-conf if it has not completed in 10 min.)

Dimitri John Ledkov (xnox) wrote :

Which images / channels were you using?

Changed in subiquity:
status: New → Incomplete
Changed in snapd:
status: New → Incomplete

Michael Vogt (mvo) on 2020-06-09
Changed in snapd:
status: Incomplete → Triaged
importance: Medium → High

Ian Johnson (anonymouse67) wrote :

@xnox I was using the beta channels, probably with a custom-built image.

> I'm confused how a refresh of core20 can do anything.

A refresh of core20 will cause snapd to stop responding on its REST API until the device has rebooted, which causes console-conf to hang trying to run `snap create-user`.
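One way a client could avoid hanging on this is to probe the REST API with a short timeout before starting a long operation. The sketch below assumes the usual socket path `/run/snapd.socket` and the `/v2/system-info` endpoint; the class and function names are hypothetical, and this is not how console-conf currently talks to snapd.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects over a Unix domain socket."""

    def __init__(self, socket_path, timeout=2.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def snapd_responds(socket_path="/run/snapd.socket", timeout=2.0):
    """Return True if the API answers within the timeout, else False."""
    conn = UnixHTTPConnection(socket_path, timeout=timeout)
    try:
        conn.request("GET", "/v2/system-info")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Missing socket, refused connection, or timeout: treat as "busy".
        return False
    finally:
        conn.close()
```

If the probe returns False, the caller can show a "snapd is busy, please wait" message instead of blocking indefinitely.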

> Can snapd refresh snapd.snap without a reboot?

Yes, the snapd snap can be refreshed without a reboot.

> Do you have logs from that system, i.e. all the snapd changes?

Not anymore, but this is pretty easily reproduced if you just build an image with the beta core20 snap at a revision older than what's currently on the beta channel.

> Can we inhibit / pause refreshes and reboots if somebody has launched console-conf?

Sort of, see https://snapcraft.io/docs/system-options#heading--refresh, specifically the refresh.hold key. Note that setting this key is slightly buggy though, as per https://bugs.launchpad.net/snapd/+bug/1856063
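For illustration, a helper like the one below could build the `snap set system refresh.hold=...` command for a hold lasting the 10 minutes suggested earlier. The function name and the choice of a UTC timestamp are assumptions; whether this also defers a reboot that snapd has already scheduled is not established in this thread.

```python
from datetime import datetime, timedelta, timezone


def refresh_hold_command(minutes=10, now=None):
    """Build argv that holds refreshes until `minutes` from now.

    Hypothetical helper; it only builds the command, it does not run it.
    """
    now = now or datetime.now(timezone.utc)
    # Drop microseconds so the timestamp is a plain RFC 3339 date-time.
    until = (now + timedelta(minutes=minutes)).replace(microsecond=0)
    return ["snap", "set", "system", "refresh.hold=" + until.isoformat()]
```

console-conf could run this command when it starts and unset the key when it exits, so the hold never outlives the configuration session.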

Changed in subiquity:
status: Incomplete → New
Changed in snapd:
status: Triaged → New