Backport casper changes from groovy to support interactive network boot on focal, and improved interactive boot UX on server

Bug #1884933 reported by Dimitri John Ledkov
This bug affects 1 person
Affects                   Status         Importance   Assigned to   Milestone
Ubuntu on IBM z Systems   Fix Released   Undecided    Unassigned
casper (Ubuntu)           Fix Released   Undecided    Unassigned
casper (Ubuntu Focal)     Fix Released   High         Unassigned    ubuntu-20.04.1

Bug Description

[Impact]

 * With the legacy-server ISO in focal, it is possible to network boot and interactively set up the network configuration.
 * This feature is missing from the initrd used by subiquity, and thus is not available on the live-server ISO.
 * This feature is now implemented in groovy and has been tested on multiple architectures, including s390x-specific hardware (zdev code path).
 * Backport this feature to focal for 20.04.1.

[Test Case - interactive boot]

 * Build live-server iso with proposed pocket enabled

 * Download just the kernel & initrd artefacts

 * Boot the kernel & initrd with the ignore_uuid kernel cmdline parameter, but without any ip= or url= parameter and _without_ any subiquity ISO attached.

 * Wait for the boot to offer a network boot, configure networking, accept the default URL, and wait to be booted to the subiquity welcome screen.
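
The command-line condition in the test case above can be checked mechanically before booting. This is only a minimal sketch: `is_interactive_netboot_cmdline` is a hypothetical helper, not part of casper, and it assumes the kernel command line is available as a plain space-separated string.

```python
def is_interactive_netboot_cmdline(cmdline: str) -> bool:
    """Return True if cmdline matches the test case preconditions.

    Hypothetical helper: requires ignore_uuid to be present and
    no ip= or url= parameter to be set.
    """
    params = cmdline.split()
    has_ignore_uuid = "ignore_uuid" in params
    has_net_params = any(p.startswith(("ip=", "url=")) for p in params)
    return has_ignore_uuid and not has_net_params


print(is_interactive_netboot_cmdline("ignore_uuid quiet"))
print(is_interactive_netboot_cmdline("ignore_uuid ip=dhcp"))
```

The first call satisfies the preconditions; the second does not, because ip= would bypass the interactive prompt that this test case is meant to exercise.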

[Test case - improved UX]

 * During live server boot there is no plymouth running, so some messages displayed during boot are ugly but harmless warnings that should either be removed or improved:

  - Newline added between progress '.....' and the check result message
  - Warning "Connection to plymouth" is dropped, as server normally boots without plymouth, thus this warning should not be printed.
  - Failure to mount /cow -> the code to expose /cow in the root never worked and always printed an error; stop doing that.

 * Ideally, booting kernel+initrd with "quiet" should result in a crisp experience without a single warning message printed, until one is informed that the local installation media was not found and a remote network boot is offered.

 * Test case is to check that the fsck progress dots look nice, that no '....' is mixed inline with text, and that there is no "Connection to plymouth" error message or error message about /cow.
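
The three checks above could be run against a captured console log. The following is a rough sketch under stated assumptions: `find_ux_regressions` is a hypothetical name, the exact wording of the /cow error is not given in this report (so any line mentioning /cow is flagged), and "dots mixed with text" is approximated by three or more dots followed by a letter on the same line.

```python
import re


def find_ux_regressions(console_text: str) -> list:
    """Return a list of lines that violate the UX test case (heuristic)."""
    problems = []
    for line in console_text.splitlines():
        # The plymouth warning should no longer be printed at all.
        if "Connection to plymouth" in line:
            problems.append("plymouth warning: " + line)
        # Assumption: any mention of /cow counts as the old mount error.
        if "/cow" in line:
            problems.append("/cow message: " + line)
        # Progress dots should end with a newline, not run into text.
        if re.search(r"\.{3,}\s*[A-Za-z]", line):
            problems.append("dots mixed with text: " + line)
    return problems


sample = ".....\nCheck finished: errors found in 0 files"
print(find_ux_regressions(sample))
```

An empty list means none of the three unwanted patterns were seen; the heuristics are deliberately loose and may need tightening against real boot output.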

[Regression Potential]

 * Previously, when booting kernel/initrd without an ISO attached to the machine, the boot would fail and drop to an emergency shell. Now interactive network setup is offered first, and only if that fails does the boot drop to an emergency shell. So whilst there is added interactivity, the eventual fallback to an emergency shell is still there.
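
The behaviour change described above can be summarised as a small decision function. This is an illustrative model only, not casper's actual code; the names `Outcome` and `boot_outcome` and the `new_behaviour` flag are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    BOOT_LOCAL = "boot from local medium"
    BOOT_NETWORK = "boot from network"
    EMERGENCY_SHELL = "emergency shell"


def boot_outcome(medium_found: bool, netboot_ok: bool,
                 new_behaviour: bool = True) -> Outcome:
    """Model the boot result before and after this update.

    netboot_ok means the user accepted the offer and the network
    boot succeeded; it is ignored when new_behaviour is False.
    """
    if medium_found:
        return Outcome.BOOT_LOCAL
    if new_behaviour and netboot_ok:
        return Outcome.BOOT_NETWORK
    # Both the old and the new code path end here on failure.
    return Outcome.EMERGENCY_SHELL


print(boot_outcome(medium_found=False, netboot_ok=False))
```

In this model the only case whose result differs between old and new behaviour is "no medium, netboot succeeds", which is the basis for the low regression risk argued above.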

[Other Info]

 * There are many other bug reports requesting this feature, all of which will be closed once this update lands on the daily isos.

description: updated
summary: - Backport casper from groovy to support interactive network boot on focal
+ Backport casper changes from groovy to support interactive network boot
+ on focal, and improved interractive boot UX on server
Changed in casper (Ubuntu):
status: New → Fix Released
Changed in casper (Ubuntu Focal):
status: New → In Progress
importance: Undecided → High
milestone: none → ubuntu-20.04.1
tags: added: id-5ef35d33cea0e54b0224e329
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Dimitri, or anyone else affected,

Accepted casper into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/casper/1.445.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To properly test it you will need to obtain and boot a daily build of a Live CD for focal. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in casper (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Michael Hudson-Doyle (mwhudson) wrote :

I've made initrds (and matching kernels though I guess they're not actually different) and uploaded them here: https://people.canonical.com/~mwh/lp1884933/

Frank Heimes (fheimes) wrote :

I tested this now using your kernel and initrd, Michael, on a z/VM guest.

The interactive boot UX seems to work fine, the installer boot completed, and I could login to the initial subiquity UI.

So far so good - regarding this ticket.

I proceeded with the installation, and first of all selected NOT to update the installer to the latest level, and ran into the following two things:

In the zdev setup screen I noticed that some standard devices are now marked as failed:

======================================================================
  Zdev setup [ Help ]
======================================================================
  ID ONLINE NAMES

  generic-ccw
  0.0.0009 >
  0.0.000c failed >
  0.0.000d failed >
  0.0.000e failed >

  dasd-eckd
  0.0.0190 >
  0.0.0191 >
  0.0.019d >
  0.0.019e >
  0.0.0200 >
  0.0.0300 >
  0.0.0400 >
  0.0.0592 >

I enabled only DASD 0200, and noticed that there are no FCP/multipath devices in the system.

But later on the installer failed due to a curtin issue with the multipath detection.

Please see the details in the attached log; a tgz with the crash report and logs is attached as well.

(It is probably worth opening a separate ticket, but I will also retry after updating subiquity to the latest level.)

Frank Heimes (fheimes) wrote :
Frank Heimes (fheimes) wrote :

Same situation with "subiquity 20.06.1 - 1937".

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Zdev setup [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  ID ONLINE NAMES

  generic-ccw
  0.0.0009 ▸
  0.0.000c failed ▸
  0.0.000d failed ▸
  0.0.000e failed ▸

  dasd-eckd
...
  0.0.0200 online dasda ▸
...
                       [ Continue ]
                       [ Back ]

 Detected multipath support, reload maps
 Running command ['multipath', '-r'] with allowed return codes [0] (capture=False)
 Jul 07 07:45:23 | DM multipath kernel driver not loaded
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta/clear-holders: FAIL: removing previous storage devices
 TIMED BLOCK_META: 0.814
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: curtin command block-meta
 Traceback (most recent call last):
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/commands/main.py", line 202, in main
     ret = args.func(args)
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/log.py", line 97, in wrapper
     return log_time("TIMED %s: " % msg, func, *args, **kwargs)
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/log.py", line 79, in log_time
     return func(*args, **kwargs)
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/commands/block_meta.py", line 102, in block_meta
     meta_clear(devices, state.get('report_stack_prefix', ''))
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/commands/block_meta.py", line 1864, in meta_clear
     clear_holders.start_clear_holders_deps()
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/block/clear_holders.py", line 686, in start_clear_holders_deps
     multipath.reload()
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/block/multipath.py", line 239, in reload
     util.subp(['multipath', '-r'])
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/util.py", line 275, in subp
     return _subp(*args, **kwargs)
   File "/snap/subiquity/1937/lib/python3.6/site-packages/curtin/util.py", line 141, in _subp
     cmd=args)
 curtin.util.ProcessExecutionError: Unexpected error while running command.
 Command: ['multipath', '-r']
 Exit code: 1
 Reason: -
 Stdout: ''
 Stderr: ''
 Unexpected error while running command.
 curtin: Installation faile...


Changed in ubuntu-z-systems:
status: New → In Progress
Frank Heimes (fheimes) wrote :

Btw. I totally forgot to mention a little typo in the interactive boot UX:

...
Initramfs unpacking failed: Decoding failed
Unable to find a medium container a live file system
Attempt interactive netboot from a URL?
...

container --> containing (or so)

Michael Hudson-Doyle (mwhudson) wrote :

Hm so multipath -r fails in this system? Can you test in a way that doesn't use the interactive boot set up this ticket is about? Because I don't think it's relevant but it would be nice to be sure.
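
For readers following the traceback quoted earlier in this thread: the log shows the command being run "with allowed return codes [0]" and then exiting with code 1, which is what turns into an exception. The pattern can be sketched roughly as below. This is not curtin's implementation; `run_checked` and `CommandFailed` are hypothetical names, and a small Python one-liner stands in for `multipath -r` so the sketch runs anywhere.

```python
import subprocess
import sys


class CommandFailed(Exception):
    """Raised when a command exits with a return code that is not allowed."""


def run_checked(cmd, allowed_rcs=(0,)):
    """Run cmd and return its exit code, raising if the code is not allowed."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.returncode not in allowed_rcs:
        raise CommandFailed(
            "Command: %r\nExit code: %d" % (cmd, proc.returncode))
    return proc.returncode


# Stand-in for a tool that exits 1, as `multipath -r` did in the log.
failing_cmd = [sys.executable, "-c", "raise SystemExit(1)"]

try:
    run_checked(failing_cmd)
except CommandFailed as exc:
    print("install step would fail here:", exc)
```

With only 0 allowed, any non-zero exit aborts the surrounding step, which matches the observation that the failure follows from the command's exit status rather than from the interactive boot itself.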

Michael Hudson-Doyle (mwhudson) wrote :

I fixed the typo Frank mentioned in the git repo, it doesn't seem worth a new upload.

I made myself an amd64 ISO with an initrd built with casper from proposed and tested a few boot scenarios. They all worked and I observed the desired UX improvements. I also tested the interactive netboot and that worked, although as expected there is a significant delay on amd64 and lots of messages about failing to mount /dev/sr0. We can worry about those in the future!

Together with Frank's testing, I'm going to mark this verification done.

tags: added: verification-done-focal
removed: verification-needed verification-needed-focal
Dimitri John Ledkov (xnox) wrote :

I have now also made an s390x build with proposed enabled at https://people.canonical.com/~xnox/casper-sru/

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package casper - 1.445.1

---------------
casper (1.445.1) focal; urgency=medium

  * casper SRU for point release LP: #1884933:
    - casper: Add interactive network configuration
    - casper-md5check: drop error about connecting to plymouth
    - casper-md5check: separate messages in text output mode
    - casper: Drop exposing /cow tree, never worked, and just prints an error.

 -- Dimitri John Ledkov <email address hidden> Wed, 24 Jun 2020 12:56:38 +0100

Changed in casper (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for casper has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Released