console-conf's usage of snap managed could be better

Bug #1901769 reported by Ian Johnson
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
Triaged
Medium
Unassigned
subiquity
New
Undecided
Unassigned

Bug Description

Currently, console-conf will call `snap managed` to determine if the first-time setup UI should be shown or not. The issue with this is that it is done in such a way that if things go wrong or have unfortunate timing, the user experience is not great. There are two such problems that should be handled:

1) snapd is refreshing snaps and as such may not respond to it's REST API, waiting to be restarted in the case of a snapd snap refresh, or waiting for the whole system to reboot in the case of a kernel/base refresh
2) in the case that snapd seeding becomes broken or otherwise is undone, the "snap" command may disappear and so running `snap managed` will exit with status code 1.

I would propose that instead of just blindly calling snap managed like this, console-conf implement a more intelligent waiting on the result of `snap managed`, possibly with a timeout where if that command doesn't end up succeeding after like 5 minutes, console-conf gives up and goes into some sort of debugging information mode where it shows the kind of information mentioned in https://bugs.launchpad.net/subiquity/+bug/1901767 about broken snapd seeds.

A simple version without the broken seed detection could be the following:

diff --git a/bin/console-conf-wrapper b/bin/console-conf-wrapper
index a6f1a8db..ccfcec2f 100755
--- a/bin/console-conf-wrapper
+++ b/bin/console-conf-wrapper
@@ -29,7 +29,16 @@ if grep -q 'snapd_recovery_mode=install' /proc/cmdline ; then
     sleep infinity
 fi

-if [ "$(snap managed)" = "true" ]; then
+# sometimes this might be running while snapd is not available due to a refresh,
+# and so it will make this whole script exit with status code 1 and print a
+# confusing message about "connection refused", so instead we should loop
+# waiting for `snap managed` to return exit status 0, then check if the result
+# was true
+# TODO: should we eventually timeout here if for example snapd is broken and
+# this will never complete?
+until IS_DEV_MANAGED=$(snap managed); do sleep 0.1; done
+
+if [ "$IS_DEV_MANAGED" = "true" ]; then
     # check if we have extrausers that have no password set
     if grep -qE '^[-a-z0-9+.-_]+:x:' /var/lib/extrausers/passwd && ! grep -qE '^[-a-z0-9+.-_]+:\$[0-9]+\$.*:' /var/lib/extrausers/shadow; then
         tty=$(tty)

The above patch is against the subiquity repo, the issue with it is that it effectively means that _nothing_ is shown by console-conf if the system is broken forever, whereas at least now you will eventually see some sort of "snap command not found" or some sort of other python traceback when console-conf tries to talk to snapd, etc. which while insufficient debugging information is more useful than nothing and so I don't want to lose what little information we currently have :-)

Revision history for this message
Ian Johnson (anonymouse67) wrote :

Adding snapd task in case we want to change the behavior of `snap managed` to block or wait until snapd is back from maintenance for snap refreshes, etc. to make this easier for console-conf to implement.

Revision history for this message
Samuele Pedroni (pedronis) wrote :

Ian, did we finish improving this?

Changed in snapd:
assignee: nobody → Ian Johnson (anonymouse67)
Revision history for this message
Ian Johnson (anonymouse67) wrote :

@pedronis no, this was a follow-up task to the other console-conf work.

In snapd we could add an option to snap managed or something like this to implement waiting for a real response, etc. but I think this needs some design decisions on how much work we want to do in snapd vs work in console-conf-wrapper to call `snap managed` in a more robust way.

This could be done almost entirely in snapd by adding a special option to `snap managed`, and then all console-conf would need to do is use that option and add a little bit of handling around what if the snap command doesn't exist, etc. and just rely on the snapd work.

Or this could be done entirely in console-conf using a loop like I pasted in the description with a bit more error messages, etc. to make it more user-friendly.

Changed in snapd:
assignee: Ian Johnson (anonymouse67) → nobody
assignee: nobody → Samuele Pedroni (pedronis)
Changed in snapd:
status: New → Triaged
importance: Undecided → Medium
assignee: Samuele Pedroni (pedronis) → nobody
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.