Crash and failure installing focal
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
subiquity | | Undecided | Unassigned |
curtin (Ubuntu) | | High | Ryan Harper |
curtin (Ubuntu Eoan) | | Undecided | Unassigned |
curtin (Ubuntu Focal) | | High | Ryan Harper |
util-linux (Debian) | Fix Released | Unknown | |
util-linux (Ubuntu) | | Medium | Mauricio Faria de Oliveira |
util-linux (Ubuntu Eoan) | | Medium | Mauricio Faria de Oliveira |
util-linux (Ubuntu Focal) | | Medium | Mauricio Faria de Oliveira |
Bug Description
[Impact]
* lsblk no longer prints a partition's parent
kernel device name (the wholedisk).
(i.e., 'lsblk -no PKNAME /dev/partition')
* Another impact is that the 'removable media' check
always returns zero for partitions.
(i.e., 'lsblk -no RM /dev/partition')
* The regression was introduced in v2.34; only
Eoan (v2.34) and later are affected.
Disco (v2.33) and earlier are not affected.
* The regression is fixed in v2.35, in commit
e3bb9bfb76c1 ("lsblk: force to print PKNAME
for partition"); fixes RM for partition too.
[Test Case]
* $ lsblk -no PKNAME /dev/vda1 # partition
* Expected output: vda # wholedisk
* Current output: (nothing)
* $ lsblk -no RM /dev/sdb1 # partition in removable disk
* Expected output: 1 # removable media
* Current output: 0 # not removable media
[Regression Potential]
* Columns that depend on a partition device's
parent device (i.e., seen as 'wholedisk')
could in theory show incorrect values if
another bug is present in v2.34 for that.
* Other usages of 'parent' pointer in the
function have been examined and reported
(e.g. issue w/ removable media column),
and others found to not have issues
(e.g. --merge option, to group multiple
parents of a device, as in RAID.)
[Other Info]
* The impact on the curtin source package
has been addressed another way: curtin no
longer requires this util-linux feature (see comment #14).
* util-linux github issue:
https:/
[Original Bug Description]
During an install of the daily live image for 20.04 Ubuntu Server, the installer first crashed and restarted itself, then failed to install the system.
Attached are the logs left on the install USB key.
Related branches
- Ryan Harper: Approve on 2020-02-14
- Server Team CI bot: Approve (continuous-integration) on 2020-02-14
- Lee Trager (community): Approve on 2020-02-13
- Dan Watkins: Approve on 2020-02-13
Diff: 369 lines (+234/-8), 5 files modified
curtin/commands/curthooks.py (+1/-0)
examples/tests/uefi_reuse_esp.yaml (+105/-0)
helpers/common (+60/-2)
tests/vmtests/__init__.py (+16/-6)
tests/vmtests/test_reuse_uefi_esp.py (+52/-0)
Ryan Harper (raharper) wrote : | #2 |
Ryan Harper (raharper) wrote : | #3 |
@Lee
The efi_dev parsing code from the centos8 branch isn't happy:
Command: ['sh', '-c', 'exec "$0" "$@" 2>&1', 'install-grub', '--uefi', '--update-nvram', '--os-family=
Exit code: 1
Reason: -
Stdout: carryover command line params ''
setting GRUB_CMDLINE_
updated /target/
curtin uefi: installing grub-efi-amd64 to: /boot/efi
+ echo before grub-install efiboot settings
before grub-install efiboot settings
+ efibootmgr -v
Timeout: 1 seconds
BootOrder: 0005,0006,0007
Boot0005* UEFI: IP4 Realtek PCIe GBE Family Controller PciRoot(
Boot0006* UEFI: IP6 Realtek PCIe GBE Family Controller PciRoot(
Boot0007* UEFI: SanDisk U3 Cruzer Micro 8.02 HD(1,MBR,
+ bootid=ubuntu
+ efi_disk=/dev/
+ efi_part_num=1
+ grubpost=
+ grubcmd=
+ dpkg-reconfigure grub-efi-amd64
Ryan Harper (raharper) wrote : | #4 |
We only see this failure on shim/secure-boot enabled setups.
Changed in curtin (Ubuntu): | |
importance: | Undecided → High |
status: | New → Triaged |
Ryan Harper (raharper) wrote : | #5 |
OK, I think I've found the bug; there are two issues:
if [ "${#grubdevs_
# Currently UEFI can only be pointed to one system partition. If
# for some reason multiple install locations are given only use the
# first.
efi_
elif [ "${#grubdevs_
error "Only one grub device supported on UEFI!"
exit 1
else
# If no storage configuration was given try to determine the system
# partition.
efi_dev=$(awk -v "MP=${mp}/boot/efi" '$2 == MP { print $1 }' /proc/mounts)
fi
The [ -f "${grubdevs_
This fails as they are block devices; the check should be [ -b "${grubdevs_
Because this fails, we fall into the else clause, which is able to figure out from
/proc/mounts that the efi_dev is /dev/sda1.
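The difference between the two tests can be seen without a spare disk. The following is a minimal sketch, assuming a POSIX shell. `describe_node` is a hypothetical helper name, not part of curtin's install-grub, and /dev/null (a character device) stands in for a device node:

```shell
#!/bin/sh
# Hypothetical helper (not from curtin): report how test(1) classifies a path.
describe_node() {
    path="$1"
    if [ -b "$path" ]; then
        echo "block"        # block device node, e.g. /dev/sda1
    elif [ -c "$path" ]; then
        echo "char"         # character device node, e.g. /dev/null
    elif [ -f "$path" ]; then
        echo "regular"      # regular file: the only case -f accepts
    elif [ -e "$path" ]; then
        echo "other"        # exists, but is none of the above (directory, fifo, ...)
    else
        echo "missing"
    fi
}
```

Running `describe_node /dev/sda1` on a machine with that partition should print `block`, so a `-f` test against it is false even though the device exists.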
Now, further down when we convert the efi_dev into the disk and partition we run this code
# The partition number of block device name need to be determined here
# so both getting the UEFI device from Curtin config and discovering it
# work.
lsblk -no pkname $efi_dev
returns an empty string; that's because 'pkname' is an unknown column in lsblk,
rather the value should be 'kname'. This error results in efi_disk being set to "/dev/"
This isn't found on non-shim based installs as efi_disk variable is not used unless we are creating our own efibootmgr entry.
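How an empty lsblk result turns into "/dev/" can be sketched as follows. This is an illustration, not the helper's actual code: `efi_disk_from_pkname` is a hypothetical name, and the sketch assumes the disk path is built by prefixing "/dev/" to the lsblk output, as the resulting efi_disk value suggests.

```shell
#!/bin/sh
# Hypothetical guard (not curtin's code): build the disk path from a parent
# kernel name, refusing an empty name instead of silently yielding "/dev/".
efi_disk_from_pkname() {
    pkname="$1"
    if [ -z "$pkname" ]; then
        echo "parent disk name is empty; refusing to use /dev/" >&2
        return 1
    fi
    printf '/dev/%s\n' "$pkname"
}
```

A caller could then write `efi_disk=$(efi_disk_from_pkname "$(lsblk -no pkname "$efi_dev")") || exit 1`, so the failure shows up at the lookup rather than later in efibootmgr.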
Lee Trager (ltrager) wrote : | #6 |
I agree a -b or -e should be used instead of -f. However pkname is a valid column in lsblk. From lsblk --help:
KNAME internal kernel device name
PKNAME internal parent kernel device name
pkname not working is a regression which was introduced in util-linux-2.34[1]. Upstream has fixed this[2].
[1] https:/
[2] https:/
Lee Trager (ltrager) wrote : | #7 |
Upstream util-linux has fixed this in 2.34.1+
Alberto Donato (ack) wrote : | #8 |
@Ryan FWIW the reason the install doesn't mark anything as modified is that I was trying to keep the btrfs root partition (/dev/sda3) and just install there.
I thought the installer would work similarly to what the desktop installer does with btrfs, where it creates subvolumes for / and /home in the root partition (as @ and @home).
On my desktop, when I upgrade I just move @ out of the way (by renaming it) and the installer creates a new one.
This is kinda nice as you can easily keep/revert to the old rootfs by just changing which subvol is mounted at boot.
Is there any way to do that with subiquity?
tags: | added: rls-ff-incoming |
Changed in curtin (Ubuntu Eoan): | |
status: | New → Invalid |
Ryan Harper (raharper) wrote : | #9 |
@Lee Thanks for tracking down the util-linux bug.
Since this is broken in 2.34 (eoan/focal), I'm thinking we should use sysfs to find the parent by walking the device path.
Given the kname (nvme0n1p1) of the target partition:
# look up sysfs path from kname
% realpath /sys/class/
/sys/devices/
# check if it's a partition
% ls -al /sys/devices/
-r--r--r-- 1 root root 4096 Feb 12 08:08
/sys/devices/
# extract parent device path
% dirname /sys/devices/
/sys/devices/
# extract parent device major/minor
% cat /sys/devices/
259:0
# udev symlinks/
% ls -al /dev/block/259:0
lrwxrwxrwx 1 root root 10 Jan 18 00:16 /dev/block/259:0 -> ../nvme0n1
% realpath /dev/block/259:0
/dev/nvme0n1
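The walk above can be sketched as a small shell function. This is a sketch under assumptions, not the code that landed in curtin: `parent_kname` and its optional second argument (an alternate sysfs root, handy for trying it against a fake tree) are hypothetical, `readlink -f` is assumed to be available, and the function takes the parent's kernel name from the directory name rather than resolving major:minor through /dev/block as in the last two steps.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel name of a partition's parent disk
# by walking sysfs. $1 = partition kname, $2 = sysfs root (default /sys).
parent_kname() {
    kname="$1"
    sysroot="${2:-/sys}"

    # Resolve the class/block symlink to the real device directory.
    devpath=$(readlink -f "$sysroot/class/block/$kname") || return 1
    [ -d "$devpath" ] || return 1

    # Only partitions carry a "partition" attribute file.
    if [ ! -e "$devpath/partition" ]; then
        echo "$kname is not a partition" >&2
        return 1
    fi

    # A partition's directory sits inside its parent disk's directory.
    basename "$(dirname "$devpath")"
}
```

On the system shown above, `parent_kname nvme0n1p1` would be expected to print `nvme0n1`; prefixing "/dev/" gives the disk path without relying on lsblk.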
Ryan Harper (raharper) wrote : | #10 |
@Alberto
> I thought the installer would work similarly to what the desktop installer
> does with btrfs, where it creates subvolumes for / and /home in the root
> partition (as @ and @home).
>
> On my desktop, when I upgrade I just move @ out of the way (by renaming it)
> and the installer creates a new one.
> This is kinda nice as you can easily keep/revert to the old rootfs by
> just changing which subvol is mounted at boot.
Subiquity/curtin does not have support for btrfs subvolumes.
Changed in util-linux (Debian): | |
status: | Unknown → New |
Changed in curtin (Ubuntu Focal): | |
assignee: | nobody → Ryan Harper (raharper) |
status: | Triaged → In Progress |
tags: | removed: rls-ff-incoming |
This bug is fixed with commit 82f23e3d to curtin on branch master.
To view that commit see the following URL:
https:/
Launchpad Janitor (janitor) wrote : | #12 |
This bug was fixed in the package curtin - 19.3-26-
---------------
curtin (19.3-26-
* New upstream snapshot.
- install-grub: refactor uefi partition/disk searching (LP: #1862846)
- doc: update Canonical contributors URL [Paul Tobias]
- block-discover: detect additional "extended" partition types in MBR
(LP: #1861251)
- vmtests: skip focal bcache tests due to kernel bug
- net/deps.py: detect openvswitch cfg and install openvswitch packages
- vmtest: collection of vmtest related fixes to make things triple green
- clear-holders: umap the parent mpath to wipe the underlying partitions
- vmtests: bump fixby date out and fix false positive when date passes
(LP: #1855148)
- vmtests: drop disco tests using a tool to automate the process
-- Ryan Harper <email address hidden> Thu, 13 Feb 2020 21:08:59 -0600
Changed in curtin (Ubuntu Focal): | |
status: | In Progress → Fix Released |
Mauricio Faria de Oliveira (mfo) wrote : | #13 |
@ltrager
I'd be happy to handle the patch for util-linux on Ubuntu E/F if that helps; just let me know.
Lee Trager (ltrager) wrote : | #14 |
Ryan has updated Curtin to no longer require that util-linux feature. I do think it would be good to carry that patch in Ubuntu, as other users will be affected by that regression.
Mauricio Faria de Oliveira (mfo) wrote : | #15 |
@ltrager, Indeed, I see the refactor in curtin. :)
Absolutely agree w/ you, it's a regression in util-linux,
and after confirming with you on IRC yesterday that it
is OK for me to submit the fix to Ubuntu, I worked on it.
I've tested the patch today, and it's currently building
on all architectures in a test PPA. If all goes well, it
should move forward to Focal and Eoan in the coming days.
Providing test steps in the next comment.
Attaching the debdiffs for reference.
cheers,
Mauricio
Mauricio Faria de Oliveira (mfo) wrote : | #16 |
util-linux / test steps
===
Bionic
---
No regression: parent kernel device name
$ dpkg -s util-linux | grep ^Version:
Version: 2.31.1-0.4ubuntu3.5
$ lsblk -no pkname /dev/vda1
vda
Eoan
---
Before: empty string
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.1ubuntu2.2
$ lsblk -no pkname /dev/vda1
$
After: parent kernel device name
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.
$ lsblk -no pkname /dev/vda1
vda
Focal
---
Before: empty string
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.1ubuntu6
$ lsblk -no pkname /dev/vda1
$
After: parent kernel device name
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.
$ lsblk -no pkname /dev/vda1
vda
Mauricio Faria de Oliveira (mfo) wrote : | #17 |
Mauricio Faria de Oliveira (mfo) wrote : | #18 |
Changed in util-linux (Ubuntu Eoan): | |
status: | New → In Progress |
importance: | Undecided → Medium |
assignee: | nobody → Mauricio Faria de Oliveira (mfo) |
Changed in util-linux (Ubuntu Focal): | |
status: | New → In Progress |
importance: | Undecided → Medium |
assignee: | nobody → Mauricio Faria de Oliveira (mfo) |
tags: | added: sts-sponsor-mfo |
tags: | added: patch |
tags: | removed: champagne |
Eric Desrochers (slashd) wrote : | #19 |
util-linux uploaded in focal.
Thanks Mauricio !
Mauricio Faria de Oliveira (mfo) wrote : | #20 |
@slashd, thanks for uploading util-linux to Focal.
I've added the SRU template and uploaded to Eoan.
description: | updated |
Launchpad Janitor (janitor) wrote : | #21 |
This bug was fixed in the package util-linux - 2.34-0.1ubuntu7
---------------
util-linux (2.34-0.1ubuntu7) focal; urgency=medium
* d/p/lsblk-
that lsblk doesn't print PKNAME column for partitions (LP: #1862846)
-- Mauricio Faria de Oliveira <email address hidden> Thu, 20 Feb 2020 11:09:29 -0300
Changed in util-linux (Ubuntu Focal): | |
status: | In Progress → Fix Released |
Łukasz Zemczak (sil2100) wrote : | #22 |
Hey! The package looks good so I'll accept it. In the impact you have mentioned that currently lsblk -no RM also doesn't work correctly, so maybe we should add it to the test case?
Changed in util-linux (Ubuntu Eoan): | |
status: | In Progress → Fix Committed |
Hello Alberto, or anyone else affected,
Accepted util-linux into eoan-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Mauricio Faria de Oliveira (mfo) wrote : | #24 |
Hi Lukasz, thanks! Sure thing, I'll add it.
Mauricio Faria de Oliveira (mfo) wrote : | #25 |
$ lsb_release -cs
eoan
Test device: partition (sdb1) in removable / USB flash disk (sdb)
$ ls -1d /sys/block/sdb/sdb1
/sys/block/sdb/sdb1
$ cat /sys/block/
1
eoan-proposed:
---
- PKNAME shows partition's parent/wholedisk.
- RM shows 1 for removable disk's partition.
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.1ubuntu2.3
$ lsblk -no PKNAME /dev/sdb1
sdb
$ lsblk -no RM /dev/sdb1
1
eoan-updates:
---
- PKNAME shows nothing.
- RM shows 0 even though it's actually a removable disk's partition.
$ dpkg -s util-linux | grep ^Version:
Version: 2.34-0.1ubuntu2.2
$ lsblk -no PKNAME /dev/sdb1
$
$ lsblk -no RM /dev/sdb1
0
description: | updated |
tags: | added: verification-done-eoan |
Launchpad Janitor (janitor) wrote : | #26 |
This bug was fixed in the package util-linux - 2.34-0.1ubuntu2.3
---------------
util-linux (2.34-0.1ubuntu2.3) eoan; urgency=medium
* d/p/lsblk-
that lsblk doesn't print PKNAME column for partitions (LP: #1862846)
-- Mauricio Faria de Oliveira <email address hidden> Thu, 20 Feb 2020 11:13:53 -0300
Changed in util-linux (Ubuntu Eoan): | |
status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for util-linux has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Changed in util-linux (Debian): | |
status: | New → Confirmed |
Changed in util-linux (Debian): | |
status: | Confirmed → Fix Released |
Changed in subiquity: | |
status: | New → Fix Released |
tags: | removed: sts-sponsor-mfo |
Hrm, this is a strange install.
The storage config has some strange settings. First, nothing is modified at all: all disks and partitions are marked preserve = true, as are all filesystems. There is also this strange mount:
{
"device": "format-0",
"id": "mount-0",
"path": "",
"type": "mount"
},
Here's the rootfs
{
"device": "format-partition-sda3",
"id": "mount-2",
"path": "/",
"type": "mount"
},
And EFI
{
"device": "format-partition-sda1",
"id": "mount-1",
"path": "/boot/efi",
"type": "mount"
}
The failure appears here:
+ [ -f /boot/efi/EFI/ubuntu/grubx64.efi ]
+ [ -z ]
+ [ -f /boot/efi/EFI/ubuntu/shimx64.efi ]
+ break
+ echo /EFI/ubuntu/shimx64.efi
+ sed s|/|\\|g
+ loader=\EFI\ubuntu\shimx64.efi
+ efibootmgr --create --write-signature --label ubuntu --disk /dev/ --part 1 --loader \EFI\ubuntu\shimx64.efi
efibootmgr: ** Warning ** : Boot0000 has same label ubuntu
Could not prepare Boot variable: Success
failed to install grub!
There's a bug in the install-grub helper in how it determines the disk device;