(late commands) Hangs every now and then

Bug #1939176 reported by Thomas
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
subiquity
Fix Committed
Undecided
Michael Hudson-Doyle

Bug Description

Running the autoinstaller several times in a row, one out of two tries the installation stops/hangs and does not continue. There is no error reported on the progress screen.

Subiquity version is the latest from here here https://cdimage.ubuntu.com/ubuntu-server/focal/daily/current/

I downloaded the installer log files and compared them. It looks to me as the order of commands is not deterministic?

Here are the major differences:

subiquity-client-debug.log (fail):
2021-08-05 16:26:51,634 DEBUG subiquity.client.controllers.identity:49 IdentityController.done next_screen user_spec=IdentityData(realname='SCOE Administrator', username='scoe-adm', hostname='plato2')
2021-08-05 16:26:51,640 DEBUG subiquity.client.client:449 showing InstallConfirmation over <IdentityView selectable box/flow widget>
2021-08-05 16:26:53,514 INFO subiquity/Identity:105 finish: completed SUCCESS
2021-08-05 16:26:53,514 INFO subiquity/SSH:100 start: starting UI

subiquity-client-debug.log (ok):
2021-08-05 13:46:04,384 DEBUG subiquity.client.controllers.identity:49 IdentityController.done next_screen user_spec=IdentityData(realname='SCOE Administrator', username='scoe-adm', hostname='plato2')
2021-08-05 13:46:04,392 INFO subiquity/Identity:105 finish: completed SUCCESS
2021-08-05 13:46:04,393 INFO subiquity/SSH:100 start: starting UI
2

subiquity-server-debug.log:
* The fail log runs the probert.dasd commands before the subiquitycore.utils command *
2021-08-05 13:45:49,725 DEBUG root:39 start: subiquity/Filesystem/_probe/probe_once: restricted=False
2021-08-05 13:45:49,729 DEBUG probert.dasd:131 Probing DASD devies
2021-08-05 13:45:49,729 DEBUG probert.dasd:134 DASD devices only present on s390x, arch=x86_64
2021-08-05 13:45:50,076 DEBUG subiquitycore.utils:83 arun_command ['setupcon', '--save', '--force', '--keyboard-only'] exited with code 0

subiquity-server-info.log (fail):
2021-08-05 16:26:13,015 INFO root:39 finish: subiquity/Network/apply_config: SUCCESS: silent=False
2021-08-05 16:26:23,023 ERROR root:39 finish: subiquity/Mirror/apply_autoinstall_config/waiting: FAIL:
2021-08-05 16:26:23,030 INFO root:39 finish: subiquity/apply_autoinstall_config: SUCCESS:

subiquity-server-info.log (ok):
2021-08-05 13:46:00,220 INFO root:39 finish: subiquity/Network/apply_config: SUCCESS: silent=False
2021-08-05 13:46:04,392 INFO aiohttp.access:206 [05/Aug/2021:13:46:04 +0000] "POST /identity HTTP/1.1" 200 195 "-" "Python/3.6 aiohttp/3.7.4.post0"
2021-08-05 13:46:04,395 INFO aiohttp.access:206 [05/Aug/2021:13:46:04 +0000] "GET /ssh HTTP/1.1" 200 186 "-" "Python/3.6 aiohttp/3.7.4.post0"

Tags: fr-1600 impish
Revision history for this message
Thomas (teclab-at) wrote :
Revision history for this message
Dan Bungert (dbungert) wrote :

Are you able to share your autoinstall.yaml?

Revision history for this message
Dan Bungert (dbungert) wrote :

I ran thru autoinstall 5 times so far and haven't been able to see it yet based on a modified version of autoinstall-user-data from installer.ok in your zip. If you're able to share your autoinstall.yaml or something like it maybe I'll have more luck seeing the issue.

Changed in subiquity:
status: New → Incomplete
Revision history for this message
Thomas (teclab-at) wrote :

Attached the installer config

Revision history for this message
Thomas (teclab-at) wrote :

Anything I should check when it hangs again?

Dan Bungert (dbungert)
Changed in subiquity:
status: Incomplete → New
Revision history for this message
Dan Bungert (dbungert) wrote :

Thanks Thomas. I can reproduce this issue using your autoinstall.cfg file, slightly modified to make the storage config match a VM. If I think of more info needed I'll let you know.

Changed in subiquity:
status: New → Confirmed
Revision history for this message
Thomas (teclab-at) wrote :

Wow that's great!
Thank you very much!

Dan Bungert (dbungert)
tags: added: fr-1600
tags: added: impish
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Argh this is a grotty logic bug.

The subiquity server has this mode where when a request is made for a "postinstall" controller (like identity) and the server is in the "wait for confirmation" mode, the server doesn't actually process the request but instead returns a response indicating that the client should ask for confirmation instead. The client does this and then advances to the next screen. This is intended to happen for the first GET request for a screen, where all this behaviour makes things work.

Unfortunately what happens here is that the server goes into "wait for confirmation" mode while the identity screen is showing and so the request that is supposed to configure the identity data doesn't get processed.

Probably the fix is to only do the x-override-status: confirm thing on requests that are the first for a given screen, which means finding some way to identify such requests, something we don't have currently.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

This would be much easier to fix with https://www.python.org/dev/peps/pep-0555/, how is that conversion to core20 going I wonder.

Changed in subiquity:
status: Confirmed → In Progress
assignee: nobody → Michael Hudson-Doyle (mwhudson)
Dan Bungert (dbungert)
summary: - Hangs every now and then
+ (late commands) Hangs every now and then
Dan Bungert (dbungert)
Changed in subiquity:
status: In Progress → Fix Committed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.