[UBUNTU 20.04.1] Failure to install Ubuntu 20.04.1 as KVM guest on DASD

Bug #1893775 reported by bugproxy
Affects                  Status        Importance  Assigned to            Milestone
Ubuntu on IBM z Systems  Fix Released  Critical    Skipper Bug Screeners
subiquity                Fix Released  Undecided   Michael Hudson-Doyle

Bug Description

---Problem Description---
Failure to install Ubuntu 20.04.1 as KVM guest on DASD

---uname output---
Linux version 5.4.0-42-generic (buildd@bos02-s390x-003) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #46-Ubuntu SMP Fri Jul 10 00:21:32 UTC 2020 (Ubuntu 5.4.0-42.46-generic 5.4.44)

Machine Type = 3096-703

---boot type---
CDROM / ISO image

---Install repository type---
Internet repository

---Install repository Location---
ports.ubuntu.com

---Point of failure---
Other failure during installation (stage 1)

I tried to install an Ubuntu 20.04.1 guest from ISO which failed.
Steps to reproduce.
1. Boot from ISO to start installer, in my case I used virt-install, but a manually defined libvirt domain has the same issue:
$ virt-install --name focal.1 --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.a03f --cdrom ubuntu-20.04.1-live-server-s390x.iso

2. On the installer screen, stay in simple mode and accept all defaults

3. Shortly afterwards, an error pop-up appears, saying the installation has failed. Opening the log I see:
...
 storage:
   config:
   - {ptable: gpt, path: /dev/vda, wipe: superblock-recursive, preserve: false, name: '',
     grub_device: false, type: disk, id: disk-vda}
   - {device: disk-vda, size: 22153265152, wipe: superblock, flag: '', number: 1, preserve: false,
     type: partition, id: partition-0}
   - {fstype: ext4, volume: partition-0, preserve: false, type: format, id: format-0}
   - {device: format-0, path: /, type: mount, id: mount-0}
   version: 1
...
An error occured handling 'format-0': OSError - could not get path to dev from kname: vda1
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: configuring format: format-0
 TIMED BLOCK_META: 1.942
 finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: curtin command block-meta

It seems that the new installation procedure isn't correctly detecting virtio-attached DASDs and tries to handle them like SCSI disks (gpt label, ...). This is a regression compared to the debian-installer and prevents the installation of Ubuntu 20.04 KVM guests on DASD.

I cross-checked by running the installation on a QCOW2 image, which succeeded without problems.

bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-187975 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
affects: linux (Ubuntu) → subiquity (Ubuntu)
bugproxy (bugproxy)
tags: added: targetmilestone-inin20041
removed: targetmilestone-inin---
Frank Heimes (fheimes) wrote :

I could recreate and can confirm the same for groovy.
If the installation was started with:
sudo virt-install --name groovy --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.1601 --cdrom ./groovy-live-server-s390x.iso
the installer uses /dev/vda, which is usually an indicator for a virtio device rather than a DASD disk (the size of 6.877G is the correct net size of the DASD Mod9 disk used):

▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄
  Guided storage configuration [ Help ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
  Configure a guided storage layout, or create a custom one:

  (X) Use an entire disk

       [ /dev/vda local disk 6.877G ▾ ]

       [X] Set up this disk as an LVM group

            [ ] Encrypt the LVM group with LUKS

                         Passphrase:

                 Confirm passphrase:

  ( ) Custom storage layout

                                 [ Done ]
                                 [ Back ]

I've added /var/logs and /var/crash.

Changed in subiquity (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Canonical Foundations Team (canonical-foundations)
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
importance: Undecided → Critical
status: New → Triaged
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-09-02 08:24 EDT-------
To add to Frank's comment: /dev/vda is the correct node for DASDs attached using virtio-blk. It is necessary to inspect the disk label, e.g. using 'parted /dev/vda print', which yields output like

Model: Virtio Block Device (virtblk)
Disk /dev/vda: 22.2GB
Sector size (logical/physical): 4096B/4096B
Partition Table: dasd
Disk Flags:

Number  Start   End     Size    File system     Flags
 1      3146kB  318MB   315MB   ext2
 2      318MB   20.4GB  20.0GB  btrfs
 3      20.4GB  22.2GB  1790MB  linux-swap(v1)

It seems that the new installer always tries to write a GPT disk label, which is wrong and renders the disk unusable.
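
As an illustration of the check described above, here is a minimal Python sketch that reads the 'Partition Table:' line from 'parted ... print' output. It is not code from subiquity, curtin, or probert; the function names (partition_table_type, looks_like_dasd, read_parted_output) are hypothetical, and the sample text is the parted output quoted above.

```python
import re
import subprocess
from typing import Optional


def partition_table_type(parted_output: str) -> Optional[str]:
    """Return the value of the 'Partition Table:' line, or None if absent."""
    match = re.search(
        r"^Partition Table:[ \t]*(\S+)[ \t]*$", parted_output, re.MULTILINE
    )
    return match.group(1) if match else None


def looks_like_dasd(parted_output: str) -> bool:
    """True when parted reported a 'dasd' disk label."""
    return partition_table_type(parted_output) == "dasd"


def read_parted_output(device: str) -> str:
    """Hypothetical helper: run 'parted --script <device> print'.

    Usually needs root and a real block device, so it is not exercised here.
    """
    result = subprocess.run(
        ["parted", "--script", device, "print"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Header of the parted output quoted in the comment above.
SAMPLE = """\
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 22.2GB
Sector size (logical/physical): 4096B/4096B
Partition Table: dasd
Disk Flags:
"""

if __name__ == "__main__":
    print(partition_table_type(SAMPLE))
```

On a real guest one would presumably feed read_parted_output("/dev/vda") into looks_like_dasd; whether an installer should shell out to parted or use fdasd instead is left open here.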

Michael Hudson-Doyle (mwhudson) wrote :

I think if you use an already formatted dasd in the KVM this should work. If you use a completely blank one, I don't know that there's a way for the guest to tell that it's backed by a dasd vs some other block device?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-03 04:40 EDT-------
Right, for KVM guests to use a DASD via virtio, it must already be formatted for the fdasd and parted heuristics to work. This has always been the case, as low-level formatting can only be done using special CCWs not available through the virtio layer.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-03 05:00 EDT-------
(In reply to comment #10)
> Right, for KVM guests to use a DASD via virtio, it must already be formatted
> for the fdasd and parted heuristics to work. This has always been the case,
> as low-level formatting can only be done using special CCWs not available
> through the virtio layer.

To make it clear: yes, the disk is formatted, and it DID work with the old installer and it DOES NOT work with the new one.

Frank Heimes (fheimes)
tags: added: installer
Frank Heimes (fheimes) wrote :

I just retried (just to be sure) after doing an explicit fdasd and dasdfmt upfront (my DASD was already prepared before, but it may have had some LVM leftovers), but I don't see a difference; same situation.

Michael Hudson-Doyle (mwhudson) wrote :

Here's the udev data for /dev/vda:

 P: /devices/css0/0.0.0002/0.0.0000/virtio2/block/vda
 N: vda
 L: 0
 S: disk/by-path/ccw-0.0.0000
 E: DEVPATH=/devices/css0/0.0.0002/0.0.0000/virtio2/block/vda
 E: SUBSYSTEM=block
 E: DEVNAME=/dev/vda
 E: DEVTYPE=disk
 E: MAJOR=252
 E: MINOR=0
 E: USEC_INITIALIZED=545942
 E: ID_PATH=ccw-0.0.0000
 E: ID_PATH_TAG=ccw-0_0_0000
 E: DEVLINKS=/dev/disk/by-path/ccw-0.0.0000
 E: TAGS=:systemd:

How can we tell from this that it's a dasd? I'm surprised that there is not an ID_PART_TYPE if parted identifies the drive as dasd....

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-07 08:35 EDT-------
The udev event won't help, as it just reports a virtio block device. Whether the underlying format is a SCSI-style disk or ECKD DASD can only be found out by looking at the disk itself. Tools like fdasd or parted have the necessary heuristics to figure out the type. I'd recommend using 'parted print', as I mentioned in the comment above.

Frank Heimes (fheimes) wrote :

'parted' with either '-l' or 'print', for example, points to the fact that a dasd is used (here a pristine DASD w/o partitions after using 'virt-install --name groovy --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.1601 --cdrom ./groovy-live-server-s390x.iso'):
"
$ parted /dev/vda print
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 7385MB
Sector size (logical/physical): 4096B/4096B
Partition Table: dasd
Disk Flags:

Number  Start  End  Size  File system  Flags
"

[
In contrast, direct dasd usage (on z/VM) of course points to dasd too, but with a different ('normal') block device name and partitions:
"
$ sudo parted /dev/dasda print
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 7385MB
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags:

Number  Start   End     Size    File system  Flags
 1      98.3kB  7385MB  7384MB  ext4
"
]

Dimitri John Ledkov (xnox) wrote :

Always forcing an ms-dos partition table in KVM was requested before.

However, surely it is virtualization/layering violation, when a physical host device which is not virtio-scsi compatible is passed through to qemu which emulates it as scsi, despite the underlying device on the host not able to create GPT, extended partitions, etc.

Can we please prohibit passing through /dev/dasda to qemu as virtio-scsi provider on qemu and/or virtio layer?

If one passes /dev/dasda to qemu, it must show up as /dev/dasd* inside the guest too. As there is no way for the guest to know or figure out that the passthrough device is actually masquerading a dasd drive.

Can you please explain again, why is it correct and desired to provide /dev/vda in the guest with /dev/dasda on the host, when the two types are incompatible with each other?

Changed in subiquity (Ubuntu Groovy):
status: New → Incomplete
Changed in subiquity (Ubuntu Focal):
status: New → Incomplete
Dimitri John Ledkov (xnox) wrote :

I thought that one must use vfio-ccw to passthrough dasd drives, not virtio-scsi.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-09-07 11:05 EDT-------
> Always forcing an ms-dos partition table in KVM was requested before.

Can you explain the context? The KVM team certainly did not request an ms-dos partition table.

> However, surely it is virtualization/layering violation, when a physical
> host device which is not virtio-scsi compatible is passed through to qemu
> which emulates it as scsi, despite the underlying device on the host not
> able to create GPT, extended partitions, etc.
>
> Can we please prohibit passing through /dev/dasda to qemu as virtio-scsi
> provider on qemu and/or virtio layer?
>
> If one passes /dev/dasda to qemu, it must show up as /dev/dasd* inside the
> guest too. As there is no way for the guest to know or figure out that the
> passthrough device is actually masquerading a dasd drive.
>
> Can you please explain again, why is it correct and desired to provide
> /dev/vda in the guest with /dev/dasda on the host, when the two types are
> incompatible with each other?

This is virtio-blk, and virtio-blk in QEMU has special code to detect DASDs and will then also pass through geometry and block size.
This then allows the partition detection code and parted to detect this as a DASD. This did work in the past for all previous Ubuntu versions including 20.04, and it does work with Red Hat, Fedora and SUSE. parted has this detection and zipl has this detection. So it is far from being incompatible.

It is an important use case, because the majority of customers that also have z/OS will usually use DASDs for Linux as well. This allows them to set up HA/failover solutions where the DASD is accessible from multiple locations and the failover is orchestrated by z/OS. This also avoids the need for a shared filesystem.

Again: the old installer was able to handle this, the new one is not. It was certainly expected that we get regressions with a new installer. The right fix is not to discuss away useful features, instead the right solution is certainly to fix things that are regressions. no?

Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1893775] Re: [UBUNTU 20.04.1] Failure to install Ubuntu 20.04.1 as KVM guest on DASD

On Mon, Sep 7, 2020 at 4:50 PM Dimitri John Ledkov
<email address hidden> wrote:
>
> I thought that one must use vfio-ccw to passthrough dasd drives, not
> virtio-scsi.

Yes - You can provide any block device (even image files) as
virtio-blk/virtio-scsi the same as on any other platform.
But indeed to "pass through" a dasd vfio-ccw would be the right and
only way that comes to mind (AFAIK).

/me shakes his fist at the underlying truth that the storage server
has normal modern disks and emulates ECKD on top of them to later
cause these issues when it no more behaves like the kind of disks it
originally is composed of :-/

Dimitri John Ledkov (xnox) wrote :

So the history of this msdos/dasd partitioning table stuff in the old installer was as follows:

LTC-135429 https://bugs.launchpad.net/ubuntu/+source/partman-partitioning/+bug/1534629
LTC-136054 https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/1527328
https://bugs.launchpad.net/ubuntu/+source/partman-partitioning/+bug/1537942
LTC-137464 https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/1548411
LTC-149975 https://bugs.launchpad.net/ubuntu/+source/partman-partitioning/+bug/1650300
https://bugs.launchpad.net/ubuntu/+source/partman-base/+bug/1595495
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815916

partman-partitioning (110ubuntu3) xenial; urgency=medium

  * Set s390x partitioning tables (parted label type) to dasd. LP:
    #1534629, LP: #1527328, LP: #1537942.

 -- Dimitri John Ledkov <email address hidden> Tue, 09 Feb 2016 19:32:43 +0000

partman-partitioning (110ubuntu4) xenial; urgency=medium

  [ Viktor Mihajlovski ]
  * Revert to using msdos partitioning table on s390x as a fallback,
    however use dasd partitioning table if advised to do so by
    partman. LP: #1548411

 -- Dimitri John Ledkov <email address hidden> Mon, 29 Feb 2016 14:27:51 +0000

partman-partitioning (112ubuntu1) yakkety; urgency=medium

  * Resynchronise with Debian. Remaining changes:
    ....
    - On s390x use msdos partition table by default, unless partman
      advises dasd.

 -- Dimitri John Ledkov <email address hidden> Fri, 12 Aug 2016 09:52:51 +0100

But also this:

partman-base (190) unstable; urgency=medium

  [ Viktor Mihajlovski ]
  * Add disk label type to device directory, such that
    e.g. partman-partitioning can elect dasd partitioning table for dasd
    drives.

  [ Dimitri John Ledkov ]
  * On s390[x], prevent using extended partitions on DASD drives that can
    only hold 3 partitions. Parted doesn't have special knowledge about
    that and claims that msdos partition table on DASD drives can hold
    the usual maximum number of partitions types.

 -- Christian Perrier <email address hidden> Sun, 13 Nov 2016 07:45:31 +0100

The code there is:

s390|s390x)
    if [ -e ./label ]; then
        disklabel=$(cat label)
    fi
    # FBA devices have parted label dasd, but should not use dasd
    # partition table. Maybe FBA|ECKD type should be exposed by
    # partman-base and/or parted. LP: #1650300
    device=$(sed 's|.*/||' ./device)
    if grep -q "(FBA ).*$device" /proc/dasd/devices; then
        disklabel=msdos
    fi
    if [ "$disklabel" != dasd ]; then
        disklabel=msdos
    fi
    echo $disklabel;;
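
The shell fragment above can be restated as a small decision function. This is a sketch in Python for readability only: the function name choose_disklabel and the sample /proc/dasd/devices lines are illustrative assumptions, not taken from partman, and a missing label file is modelled as None.

```python
import re
from typing import Optional


def choose_disklabel(detected_label: Optional[str],
                     device_path: str,
                     proc_dasd_devices: str) -> str:
    """Mirror the s390/s390x branch: use 'dasd' only when advised, else 'msdos'."""
    # Equivalent of: sed 's|.*/||' ./device  (strip everything up to the last '/')
    device = device_path.rsplit("/", 1)[-1]
    disklabel = detected_label or ""

    # Equivalent of: grep -q "(FBA ).*$device" /proc/dasd/devices
    # (the device name is escaped here; the shell version does not escape it)
    fba = re.compile(r"\(FBA \).*" + re.escape(device))
    if any(fba.search(line) for line in proc_dasd_devices.splitlines()):
        disklabel = "msdos"

    # Anything other than an explicit 'dasd' falls back to msdos.
    if disklabel != "dasd":
        disklabel = "msdos"
    return disklabel


# Made-up lines modelled on the /proc/dasd/devices layout, for illustration only.
SAMPLE_PROC_DASD = (
    "0.0.0200(ECKD) at ( 94:     0) is dasda       : active at blocksize: 4096\n"
    "0.0.0300(FBA ) at ( 94:     4) is dasdb       : active at blocksize: 512\n"
)

if __name__ == "__main__":
    print(choose_disklabel("dasd", "/dev/dasda", SAMPLE_PROC_DASD))
    print(choose_disklabel("dasd", "/dev/dasdb", SAMPLE_PROC_DASD))
```

Under this logic, a virtio-attached disk such as /dev/vda keeps the dasd table only if a 'dasd' label was recorded for it beforehand; every other case ends up as msdos.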

In partman-base we also limit msdos to primary partitions only on dasd
+#ifdef __s390__
+ /* DASD drives can only do 3 partitions */
+ && strcmp(disk->dev->model, "IBM S390 DASD drive")
+#endif

and disk label is saved to a file.

It is extremely limiting if installer cannot create/recreate/reformat the drive, and has such hardcoded bad guesses.

In the past, qemu did not have vfio-ccw, but now it does. Surely, the ne...


bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-09-09 04:38 EDT-------
(In reply to comment #20)
[...]
trimming for readability
>
> It is extremely limiting if installer cannot create/recreate/reformat the
> drive, and has such hardcoded bad guesses.
>
> In the past, qemu did not have vfio-ccw, but now it does. Surely, the new
> libvirt/qemu in focal, when given a --disk
> path=/dev/disk/by-path/ccw-0.0.1601 it must be emulating vfio-ccw if at all
> possible as then on the guest it would appear as a /dev/dasda, and be
> correctly processed.
First, vfio-ccw hasn't yet reached a maturity level comparable to that of virtio, which has been in production for years. Forcing customers to use it as a full replacement for virtio is not appropriate at this point in time.
Second, it's about commonality. Customers do not want to treat KVM guests differently on Z than on other platforms. Virtio-blk was always the first choice to pass in host block devices, and DASDs are for sure block devices.
Third, the idea of inferring the characteristics of a device from its device node is not a concept that can be applied universally. ATA devices mutated from /dev/hdx to /dev/sdx without becoming 'real' SCSI devices. You can attach a DVD device or ISO as virtio-blk, and you still cannot assume it can hold a partition table of any kind, etc.
>
> So on s390x, the old installer
> - default to use ms-dos partition table, with primary partitions only, with
> default layout having less than 4 partitions.
> - if previously detected DASD partition table, use DASD, unless operating on
> FBA in that case use msdos again
>
> The above I think was ultimately driven by "lowlevel formatted FBA or ECKD
> drives" without a dasd partition table. In such cases, it was reported in
> qemu/kvm as "unknown" and thus "msdos with 3 partitions" was used which
> seemed to work, but is extremely fragile.
>
> In the new installer, I do not wish to support partition tables that are
> limited by a very small upper bound of 2TiB. My laptop has that, and it's
> trivial to over-provision/thinly provision such sizes.
>
> Is it reasonable to say that:
> - nvme, zfcp, virtio-scsi will use GPT
This is still postulating an identity of transport and device characteristics. At least for virtio-scsi, it is perfectly valid to attach any kind of block device as a virtio-scsi LUN. It is likely safe to assume that NVME and FCP attached devices have fixed block and not ECKD characteristics. Conceptually it is not clean, and assumptions like these are still hampering the proliferation of true 4K sector devices.
> - dasd ECKD devices must use vfio-ccw if passed directly to KVM, and uses
> dasd partition table
NO, see above, it is not acceptable to take away the possibility from customers to use DASD in their environment
> - If vfio-ccw is not available, one can partition it on the host, create LVM
> or ext4 storage pool, and passthrough LVM volume or qcow2 file via
> virtio-scsi
Apart from the regression itself, this is impractical in many cases. Customers with DASD are in most cases also running z/OS. z/OS is able to handle the 'partitions' (for z/OS they are datasets) of a CDL formatted ECKD DASD.
> - passing through dasd ...


Michael Hudson-Doyle (mwhudson) wrote :

It seems from a quick look that parted just embeds an (ancient) copy of fdasd to do the identification, so I don't think there's any real benefit to using parted over fdasd directly.

It would really be more in keeping with how we expect things to work to have a udev rule that runs fdasd on block devices and records in udev somehow whether they are dasds. Maybe that's something we should work towards in 21.04, but maybe we should also think of a way to make this work in focal (starting by making changes to probert, I guess).

tags: added: fr-662
Frank Heimes (fheimes) wrote :

Since there are some discussions about this ticket on other communication channels, I want to clarify and sum up the situation here:

An installation on a DASD disk that is virtio-attached, like:
virt-install --name focal --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.1234 --cdrom ./install_image.iso
works with d-i, since there is a 'non-standard way' (or one may call it a 'hack') to identify that a certain disk passed over via virtio is not (as usual) a SCSI disk, but a DASD (based on the partition table).
We think that the clean way is to pass/attach DASD disks as ccw devices (what they actually are),
but we will look into this, since it worked with d-i.
Since this is not a straightforward way to identify disk geometry, something like this would require 'extra coding' (or again some hacking).

However, nobody should be blocked by this, since there is still the focal / 20.04 legacy d-i image available, that allows such virt-install installation, like (in short):
virt-install --name focal --memory 2048 --disk path=/dev/disk/by-path/ccw-0.0.261e --cdrom ./ubuntu-20.04-legacy-server-s390x.iso
(for details see the attachment)

Even if the legacy d-i image is based on 20.04 GA, an installation (with access to the archives) will end up as a proper 20.04.1 installation (as of today), since the installer updates by default to the latest package levels.

Frank Heimes (fheimes) wrote :
Changed in subiquity:
status: New → In Progress
assignee: nobody → Michael Hudson-Doyle (mwhudson)
Frank Heimes (fheimes)
no longer affects: subiquity (Ubuntu)
no longer affects: subiquity (Ubuntu Focal)
no longer affects: subiquity (Ubuntu Groovy)
Changed in ubuntu-z-systems:
status: Triaged → In Progress
Frank Heimes (fheimes) wrote :

This bug is fixed with
https://github.com/CanonicalLtd/subiquity/releases/tag/21.01.1
and the pass-over of DASDs via virtio-blk works if the correct kernel patch is also in place in the install system (LP 1903341).
This is the case with the focal/20.04 daily images, as well as with the upcoming hirsute/21.04 image, which includes kernel 5.10 and which we expect soon.
(Hence this will also be part of the upcoming 20.04.2.)

Changed in subiquity:
status: In Progress → Fix Released
Changed in ubuntu-z-systems:
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2021-01-21 06:04 EDT-------
IBM Bugzilla status->closed. Fix Released by Canonical
