subiquity fails to probe for block devices / discovering disks

Bug #1868817 reported by Frank Heimes
This bug affects 1 person
Affects                   Status         Importance   Assigned to                  Milestone
Ubuntu on IBM z Systems   Fix Released   High         Canonical Foundations Team
subiquity                 Fix Released   Undecided    Unassigned

Bug Description

Using subiquity 20.03.2 I'm facing the problem that block device probing / disk discovery (on zFCP multipath devices) doesn't work anymore.

================================================================================
  Guided storage configuration [ Help ]
================================================================================
  Block probing did not discover any disks. Unfortunately this means that
  installation will not be possible.

root@ubuntu-server:/# lszdev | grep yes
zfcp-host 0.0.e000 yes yes
zfcp-host 0.0.e100 yes yes
zfcp-lun 0.0.e000:0x50050763060b16b6:0x4026400300000000 yes no sdb sg1
zfcp-lun 0.0.e000:0x50050763061b16b6:0x4026400300000000 yes no sda sg0
zfcp-lun 0.0.e100:0x50050763060b16b6:0x4026400300000000 yes no sdd sg3
zfcp-lun 0.0.e100:0x50050763061b16b6:0x4026400300000000 yes no sdc sg2
qeth 0.0.c000:0.0.c001:0.0.c002 yes no encc000.2653
root@ubuntu-server:/#

root@ubuntu-server:/# multipath -ll
mpatha (36005076306ffd6b60000000000002603) dm-0 IBM,2107900
size=64G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 0:0:0:1073954854 sda 8:0 active ready running
  |- 0:0:1:1073954854 sdb 8:16 active ready running
  |- 1:0:0:1073954854 sdc 8:32 active ready running
  `- 1:0:1:1073954854 sdd 8:48 active ready running
root@ubuntu-server:/#

root@ubuntu-server:/# ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 236 Mar 27 12:16 control
lrwxrwxrwx 1 root root 7 Mar 27 12:17 mpatha -> ../dm-0
lrwxrwxrwx 1 root root 7 Mar 27 12:17 mpatha-part1 -> ../dm-1
lrwxrwxrwx 1 root root 7 Mar 27 12:17 mpatha-part2 -> ../dm-2
lrwxrwxrwx 1 root root 7 Mar 27 12:17 mpatha-part5 -> ../dm-3
root@ubuntu-server:/#

root@ubuntu-server:/# time probert --storage

real 0m1.176s
user 0m0.341s
sys 0m0.166s
root@ubuntu-server:/# less out.txt
{
    "storage": {
        "bcache": {
            "backing": {},
            "caching": {}
        },
        "blockdev": {
            "/dev/dm-0": {
                "DEVLINKS": "/dev/mapper/mpatha /dev/disk/by-id/dm-name-mpatha /dev/disk/by-id/wwn-0x6005076306ffd6b60000000000002603 /dev/disk/by-id/dm-uuid-mpath-36005076306ffd6b60000000000002603",
                "DEVNAME": "/dev/dm-0",
                "DEVPATH": "/devices/virtual/block/dm-0",
                "DEVTYPE": "disk",
                "DM_ACTIVATION": "0",
                "DM_NAME": "mpatha",
                "DM_STATE": "ACTIVE",
                "DM_SUBSYSTEM_UDEV_FLAG0": "1",
                "DM_SUSPENDED": "0",
                "DM_TABLE_STATE": "LIVE",
                "DM_TYPE": "scsi",
                "DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG": "1",
                "DM_UDEV_PRIMARY_SOURCE_FLAG": "1",
...
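
For anyone checking a similar dump, here is a minimal sketch of how the probert JSON could be inspected to see which block devices were reported. It assumes the output has the "storage" / "blockdev" layout shown above; the helper name list_blockdevs and the sample data (including the /dev/sda entry) are made up for illustration and are not part of probert.

```python
import json


def list_blockdevs(probe_json):
    """Return (device, DEVTYPE, DM_NAME) tuples from probert-style JSON."""
    data = json.loads(probe_json)
    blockdev = data.get("storage", {}).get("blockdev", {})
    rows = []
    for name, props in sorted(blockdev.items()):
        # DM_NAME is only present for device-mapper devices such as mpatha.
        rows.append((name, props.get("DEVTYPE"), props.get("DM_NAME")))
    return rows


# Illustrative sample only; a real dump carries many more keys per device.
sample = json.dumps({
    "storage": {
        "blockdev": {
            "/dev/dm-0": {"DEVTYPE": "disk", "DM_NAME": "mpatha"},
            "/dev/sda": {"DEVTYPE": "disk"},
        }
    }
})

for row in list_blockdevs(sample):
    print(row)
```

An empty result from such a check would point at the probe itself, while a populated one (as in the dump above) suggests the data is lost later in the installer.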

In a second attempt I wiped the multipath device manually using wipefs in the subiquity shell, restarted the installation and retried, but the result was the same.

I've attached the entire /var/log and /var/crash.

Frank Heimes (fheimes) wrote :
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Frank Heimes (fheimes)
summary: - subiquity fails to probe for bock devices / discovering disks
+ subiquity fails to probe for block devices / discovering disks
Ryan Harper (raharper) wrote :

ERROR curtin:1297 Validation error: None is not of type 'string' in
 {
  "device": "disk-sda",
  "flag": "extended",
  "id": "partition-sda2",
  "multipath": null,
  "number": 2,
  "offset": 501218304,
  "size": 68218258432,
  "type": "partition"
 }
 NoneType: None

This was recently fixed in curtin,

https://code.launchpad.net/~raharper/curtin/+git/curtin/+merge/379904
https://git.launchpad.net/curtin/commit/?id=1f9c53aecd0a514103fcb571ced54a10e3b12a0e
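
To illustrate the failure mode: the partition entry carries "multipath": null where a string is expected, so validation rejects it. The following is only a sketch of one way to avoid emitting such keys; drop_none and string_fields_valid are hypothetical helpers and do not reproduce the linked curtin commit.

```python
def drop_none(entry):
    """Return a copy of a storage config entry without None-valued keys."""
    return {key: value for key, value in entry.items() if value is not None}


def string_fields_valid(entry, string_keys=("device", "flag", "id",
                                            "multipath", "type")):
    """Check that every string-typed key that is present holds a str."""
    return all(isinstance(entry[key], str)
               for key in string_keys if key in entry)


# The entry from the validation error above.
partition = {
    "device": "disk-sda",
    "flag": "extended",
    "id": "partition-sda2",
    "multipath": None,
    "number": 2,
    "offset": 501218304,
    "size": 68218258432,
    "type": "partition",
}

print(string_fields_valid(partition))
print(string_fields_valid(drop_none(partition)))
```

Omitting the key entirely lets an optional string field pass, whereas an explicit null does not.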

Ryan Harper (raharper) wrote :

Looking at the crash, it does discover some config some of the time, but sometimes the probe is returning no data:

 2020-03-27 12:17:30,533 DEBUG curtin:1265 Extracting storage config from probe data
 2020-03-27 12:17:30,533 WARNING curtin:393 probe_data missing bcache data
 2020-03-27 12:17:30,672 WARNING curtin:393 probe_data missing dasd data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing dmcrypt data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing filesystem data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing lvm data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing raid data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing mount data
 2020-03-27 12:17:30,673 WARNING curtin:393 probe_data missing zfs data
 2020-03-27 12:17:30,673 DEBUG curtin:1272 Sorting extracted configurations
 2020-03-27 12:17:30,673 INFO curtin:1291 Validating extracted storage config components

2020-03-27 12:17:30,698 DEBUG curtin:1325 Merged storage config:
 storage:
     config:
     -   id: disk-sda
         path: /dev/sda
         ptable: dos
         serial: 36005076306ffd6b60000000000002603
         type: disk
         wwn: '0x6005076306ffd6b60000000000002603'
     -   device: disk-sda
         flag: linux
         id: partition-sda1
         number: 1
         offset: 1048576
         size: 499122176
         type: partition
     version: 1

2020-03-27 12:17:30,700 ERROR root:39 finish: subiquity/Filesystem/_probe/probe_once: FAIL: cancelled
 2020-03-27 12:17:30,704 ERROR block-discover:148 block probing failed restricted=False
 Traceback (most recent call last):
   File "/snap/subiquity/1570/lib/python3.6/site-packages/subiquity/controllers/filesystem.py", line 145, in _probe
     self._probe_once_task.task, 15.0)
   File "/snap/subiquity/1570/usr/lib/python3.6/asyncio/tasks.py", line 351, in wait_for
     yield from waiter
 concurrent.futures._base.CancelledError
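
The traceback shows the probe task being awaited through asyncio.wait_for with a 15.0 second limit. As a general illustration of that pattern (not a reconstruction of subiquity's controller, and not an explanation of why the task was cancelled here), the sketch below wraps a stand-in probe in wait_for and shows that the inner task is cancelled when the limit expires. slow_probe, probe_once and the short delays are assumptions for the example.

```python
import asyncio


async def slow_probe(delay):
    # Stand-in for a block probe that takes `delay` seconds (hypothetical).
    await asyncio.sleep(delay)
    return {"blockdev": {"/dev/dm-0": {}}}


async def probe_once(delay, timeout):
    task = asyncio.ensure_future(slow_probe(delay))
    try:
        result = await asyncio.wait_for(task, timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the wrapped task when the timeout expires.
        result = None
    return result, task.cancelled()


def run(delay, timeout):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(probe_once(delay, timeout))
    finally:
        loop.close()


print(run(0.01, 1.0))   # probe finishes within the limit
print(run(1.0, 0.01))   # probe exceeds the limit and is cancelled
```

A caller that treats the cancelled case the same as "no disks found" would show exactly the kind of message seen in the installer, even though the devices exist.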

Michael Hudson-Doyle (mwhudson) wrote :
Frank Heimes (fheimes) wrote :

Wow, this indeed looks quite tricky (after reading 656). Glad that you found the root cause.

Changed in ubuntu-z-systems:
status: New → Triaged
Frank Heimes (fheimes) wrote :

I can confirm that this issue doesn't happen with subiquity edge.
I just retried on z/VM with zFCP multipath disks.
Hence updating bug status to Fix Committed.

Changed in ubuntu-z-systems:
status: Triaged → Fix Committed
Frank Heimes (fheimes) wrote :

Retried today with the beta image using installer 20.03.3+git85.f929f565 (so I didn't even upgrade to edge) and did not face this problem anymore.
So I guess it can be considered fixed (Fix Released).

Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
Changed in subiquity:
status: New → Fix Released