Ubuntu 18.04 installer does not detect any IPR based HDD/RAID array [S822L] [ipr]

Bug #1751813 reported by bugproxy on 2018-02-26
This bug affects 1 person
Affects | Importance | Assigned to
The Ubuntu-power-systems project | High | Canonical Foundations Team
debian-installer (Ubuntu) | Low | Canonical Foundations Team
linux (Ubuntu) | Undecided | Canonical Kernel Team
systemd (Ubuntu) | Critical | Dimitri John Ledkov

Bug Description

---Problem Description---
Ubuntu 18.04 ppc64el installer does not detect IPR (IBM Power RAID) based HDD/RAID arrays.

Tried with current bionic and current bionic-proposed installers
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz
kexec -l vmlinux -i initrd.gz
kexec -e

wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic-proposed/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic-proposed/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz
kexec -l vmlinux -i initrd.gz
kexec -e

---uname output---
uname -a
Linux ltciofvtr-s822l2-lp1 4.13.0-32-generic #35-Ubuntu SMP Thu Jan 25 09:05:20 UTC 2018 ppc64le GNU/Linux

Machine Type = 8247-22L (S822L)

---boot type---
Network boot

---bootloader---
petitboot shell

---Kernel cmdline used to launch install---
#cat /proc/cmdline

---Bootloader protocol---
http

---Install repository type---
Internet repository

---Install repository Location---
US

---Point of failure---
Problem during post-install (stage 2) configuration or other problem seen after reboot

=======console error log=======

  ┌────────────────────────┤ [!!] Partition disks ├─────────────────────────┐
  │                                                                         │
  │ Note that all data on the disk you select will be erased, but not       │
  │ before you have confirmed that you really want to make the changes.     │
  │                                                                         │
  │ Select disk to partition:                                               │
  │                                                                         │
  │ SCSI1 (0,0,0) (sda) - 2.0 GB SMART USB-IBM                              │
  │ SCSI2 (0,0,0) (sdb) - 1.0 TB IBM T RDX-USB3                             │
  │                                                                         │
  │ <Go Back>                                                               │
  │                                                                         │
  └─────────────────────────────────────────────────────────────────────────┘

<Tab> moves; <Space> selects; <Enter> activates buttons

~ # lsmod | grep -i ipr
ipr 148677 0
~ # modinfo ipr
filename: /lib/modules/4.13.0-32-generic/kernel/drivers/scsi/ipr.ko
version: 2.6.4
license: GPL
description: IBM Power RAID SCSI Adapter Driver
author: Brian King <email address hidden>
srcversion: 05BB872A11F65B3A73AB352
alias: pci:v00001014d000004DAsv00001014sd000004FBbc*sc*i*
alias: pci:v00001014d000004DAsv00001014sd000004FCbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000004C9bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000004C8bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000004C7bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000049Cbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000049Bbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000049Abc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd00000499bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd00000475bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd00000474bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000004CAbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000046Dbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000003FEbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000003FFbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000003FCbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd000003FBbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000035Ebc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000035Dbc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd00000357bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd00000355bc*sc*i*
alias: pci:v00001014d0000034Asv00001014sd0000033Bbc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd00000354bc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd00000353bc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd00000352bc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd0000035Fbc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd00000356bc*sc*i*
alias: pci:v00001014d0000033Dsv00001014sd0000033Cbc*sc*i*
alias: pci:v00009005d00000503sv00001014sd000002C3bc*sc*i*
alias: pci:v00009005d00000503sv00001014sd000002D5bc*sc*i*
alias: pci:v00009005d00000503sv00001014sd000002BFbc*sc*i*
alias: pci:v00001014d00000180sv00001014sd00000264bc*sc*i*
alias: pci:v00001014d00000339sv00001014sd00000360bc*sc*i*
alias: pci:v00001014d00000339sv00001014sd0000035Cbc*sc*i*
alias: pci:v00001014d00000339sv00001014sd0000033Abc*sc*i*
alias: pci:v00001014d00000339sv00001014sd0000030Abc*sc*i*
alias: pci:v00001014d000002BDsv00001014sd00000338bc*sc*i*
alias: pci:v00001014d000002BDsv00001014sd000002C2bc*sc*i*
alias: pci:v00001014d000002BDsv00001014sd000002C1bc*sc*i*
alias: pci:v00009005d00000500sv00001014sd00000338bc*sc*i*
alias: pci:v00009005d00000500sv00001014sd000002C2bc*sc*i*
alias: pci:v00009005d00000500sv00001014sd000002C1bc*sc*i*
alias: pci:v00001014d0000028Csv00001014sd0000030Dbc*sc*i*
alias: pci:v00001014d0000028Csv00001014sd000002C0bc*sc*i*
alias: pci:v00001014d0000028Csv00001014sd0000028Dbc*sc*i*
alias: pci:v00001014d0000028Csv00001014sd000002BEbc*sc*i*
alias: pci:v00001069d0000B166sv00001014sd000002D3bc*sc*i*
alias: pci:v00001069d0000B166sv00001014sd000002D4bc*sc*i*
alias: pci:v00001069d0000B166sv00001014sd00000278bc*sc*i*
alias: pci:v00001069d0000B166sv00001014sd00000266bc*sc*i*
depends:
intree: Y
name: ipr
vermagic: 4.13.0-32-generic SMP mod_unload mprofile-kernel
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: max_speed:Maximum bus speed (0-2). Default: 1=U160. Speeds: 0=80 MB/s, 1=U160, 2=U320 (uint)
parm: log_level:Set to 0 - 4 for increasing verbosity of device driver (uint)
parm: testmode:DANGEROUS!!! Allows unsupported configurations (int)
parm: fastfail:Reduce timeouts and retries (int)
parm: transop_timeout:Time in seconds to wait for adapter to come operational (default: 300) (int)
parm: debug:Enable device driver debugging logging. Set to 1 to enable. (default: 0) (int)
parm: dual_ioa_raid:Enable dual adapter RAID support. Set to 1 to enable. (default: 1) (int)
parm: max_devs:Specify the maximum number of physical devices. [Default=1024] (int)
parm: number_of_msix:Specify the number of MSIX interrupts to use on capable adapters (1 - 16). (default:16) (int)
parm: fast_reboot:Skip adapter shutdown during reboot. Set to 1 to enable. (default: 0) (int)
~ #

~ # lsscsi
[4:3:0:0] no dev IBM 57B4001SISIOA 0150 <-- it is detecting one IPR card.
~ #

========================

== Comment: #1 - NAVEED A. UPPINANGADY SALIH <email address hidden> - 2018-02-21 06:01:18 ==
I was able to make the installer detect the IPR-based drives.

The workaround is to select <Go Back> and then:
Detect disks --> Guided partitioning --> Guided - use entire disk

========console logs ===========

  ┌────────────────────────┤ [!!] Partition disks ├─────────────────────────┐
  │                                                                         │
  │ Note that all data on the disk you select will be erased, but not       │
  │ before you have confirmed that you really want to make the changes.     │
  │                                                                         │
  │ Select disk to partition:                                               │
  │                                                                         │
  │ SCSI1 (0,0,0) (sda) - 2.0 GB SMART USB-IBM                              │
  │ SCSI2 (0,0,0) (sdb) - 1.0 TB IBM T RDX-USB3                             │
  │ SCSI3 (2,0,0) (sdc) - 571.3 GB IBM IPR-0 6A98C700                       │
  │ SCSI3 (2,1,0) (sdd) - 1.1 TB IBM IPR-0 6A98C700                         │
  │ SCSI4 (2,0,0) (sde) - 571.3 GB IBM IPR-0 6A985000                       │
  │ SCSI4 (2,1,0) (sdf) - 571.3 GB IBM IPR-0 6A985000                       │
  │ SCSI4 (2,2,0) (sdg) - 571.3 GB IBM IPR-0 6A985000                       │
  │ SCSI4 (2,3,0) (sdh) - 571.3 GB IBM IPR-0 6A985000                       │
  │ SCSI5 (2,0,0) (sdi) - 283.8 GB IBM IPR-0 58E4D100                       │
  │ SCSI5 (2,1,0) (sdj) - 283.8 GB IBM IPR-0 58E4D100                       │
  │                                                                         │
  │ <Go Back>                                                               │
  │                                                                         │
  └─────────────────────────────────────────────────────────────────────────┘

<Tab> moves; <Space> selects; <Enter> activates buttons

============================================

== Comment: #5 - Breno Leitao <email address hidden> - 2018-02-26 08:48:31 ==
It looks like the ipr driver is loaded in the installer. I am wondering where the problem is.

== Comment: #6 - Vaishnavi Bhat <email address hidden> - 2018-02-26 09:04:33 ==
Since the IPR-based drives are not detected in the normal flow but only with the workaround, this might be an issue with the installer, or the driver might be obsolete. Otherwise the ipr module loads fine during the installation.
Mirroring this bug to the distro for their awareness; it is being investigated on the IBM side as well.

Thank you.

bugproxy (bugproxy) on 2018-02-26
tags: added: architecture-ppc64le bugnameltc-164932 severity-critical targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → debian-installer (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → Critical
assignee: nobody → Canonical Foundations Team (canonical-foundations)
tags: added: triage-g
Steve Langasek (vorlon) wrote :

According to the bug report this is currently mirrored "for awareness", so unassigning from Foundations for now.

Changed in ubuntu-power-systems:
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Manoj Iyer (manjo) on 2018-02-26
Changed in ubuntu-power-systems:
status: New → Triaged

------- Comment From <email address hidden> 2018-02-27 18:41 EDT-------
This is a P8 system. Why is it a shipping issue?

Andrew Cloke (andrew-cloke) wrote :

Following up on comment #1, if this bug has been raised "for awareness", lowering priority from critical to low.

If this is incorrect, please respond on this bug and it can be adjusted.

Changed in ubuntu-power-systems:
importance: Critical → Low
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-28 13:24 EDT-------
I got on a P8 system and saw a similar issue. But in the shell, I can see all ipr disks.

# lspci|grep IPR
0001:08:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 01)
---> solstics adapter

# ls -l /dev/sd*
brw------- 1 root root 8, 0 Feb 28 16:13 /dev/sda
brw------- 1 root root 8, 1 Feb 28 16:13 /dev/sda1
brw------- 1 root root 8, 2 Feb 28 16:13 /dev/sda2
brw------- 1 root root 8, 3 Feb 28 16:13 /dev/sda3
brw------- 1 root root 8, 4 Feb 28 16:13 /dev/sda4
brw------- 1 root root 8, 5 Feb 28 16:13 /dev/sda5
brw------- 1 root root 8, 6 Feb 28 16:13 /dev/sda6
brw------- 1 root root 8, 16 Feb 28 16:13 /dev/sdb
brw------- 1 root root 8, 17 Feb 28 16:13 /dev/sdb1
brw------- 1 root root 8, 18 Feb 28 16:13 /dev/sdb2
brw------- 1 root root 8, 19 Feb 28 16:13 /dev/sdb3
brw------- 1 root root 8, 32 Feb 28 16:13 /dev/sdc
brw------- 1 root root 8, 33 Feb 28 16:13 /dev/sdc1
brw------- 1 root root 8, 34 Feb 28 16:13 /dev/sdc2
brw------- 1 root root 8, 35 Feb 28 16:13 /dev/sdc3
brw------- 1 root root 8, 48 Feb 28 16:13 /dev/sdd
brw------- 1 root root 8, 49 Feb 28 16:13 /dev/sdd1
brw------- 1 root root 8, 50 Feb 28 16:13 /dev/sdd2
brw------- 1 root root 8, 64 Feb 28 16:13 /dev/sde
brw------- 1 root root 8, 65 Feb 28 16:13 /dev/sde1
brw------- 1 root root 8, 66 Feb 28 16:13 /dev/sde2
brw------- 1 root root 8, 67 Feb 28 16:13 /dev/sde3
brw------- 1 root root 8, 80 Feb 28 16:13 /dev/sdf
brw------- 1 root root 8, 96 Feb 28 16:13 /dev/sdg
brw------- 1 root root 8, 112 Feb 28 16:13 /dev/sdh
brw------- 1 root root 8, 128 Feb 28 16:13 /dev/sdi
brw------- 1 root root 8, 144 Feb 28 16:13 /dev/sdj
brw------- 1 root root 8, 160 Feb 28 16:13 /dev/sdk
brw------- 1 root root 8, 176 Feb 28 16:13 /dev/sdl
brw------- 1 root root 8, 192 Feb 28 16:13 /dev/sdm
brw------- 1 root root 8, 208 Feb 28 16:13 /dev/sdn
brw------- 1 root root 8, 224 Feb 28 16:13 /dev/sdo
brw------- 1 root root 8, 240 Feb 28 16:13 /dev/sdp
brw------- 1 root root 65, 0 Feb 28 16:13 /dev/sdq
brw------- 1 root root 65, 16 Feb 28 16:13 /dev/sdr
brw------- 1 root root 65, 32 Feb 28 16:13 /dev/sds
brw------- 1 root root 65, 48 Feb 28 16:13 /dev/sdt
brw------- 1 root root 65, 64 Feb 28 16:13 /dev/sdu

I don't see any issue with the ipr device driver here. Reassigning the bug back.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-28 21:12 EDT-------
Hi,
> Following up on comment #1, if this bug has been raised "for awareness",
> lowering priority from critical to low.
>
> If this is incorrect, please respond on this bug and it can be adjusted.

IBM is able to reproduce this issue, and we are able to see all the IPR disks in the shell. The IPR module is loaded during the installation as well, but the disks are not detected by the installer. This is an installer issue. Please raise the priority from low to high.

Thank you.

------- Comment (attachment only) From <email address hidden> 2018-03-01 01:51 EDT-------

bugproxy (bugproxy) wrote : partman

------- Comment (attachment only) From <email address hidden> 2018-03-01 01:52 EDT-------

------- Comment (attachment only) From <email address hidden> 2018-03-01 01:54 EDT-------

Frank Heimes (frank-heimes) wrote :

It looks like some parties can reproduce this situation, but others cannot.
Could you also investigate any potential differences between these two environments?
Are both using the same or different:
- Power hardware
- adapter hardware revision
- firmware on the adapter
- 18.04 installer version
etc.
Thx

Changed in ubuntu-power-systems:
importance: Low → High

------- Comment From <email address hidden> 2018-03-01 03:39 EDT-------
Hi,
> It looks like some parties can reproduce this situation, but other cannot.
> Please can you also investigate into any potential differences in these
> obviously two environments?

We are able to reproduce this issue every time.
- Upgraded the IPR RAID firmware to the latest VR18
- Booted with the current 4.15.0-10-generic kernel; we still see this issue.
- Current firwmare version :
P side : FW840.00 (SV840_056)
T side : FW860.20 (SV860_072)
Boot side : FW860.20 (SV860_072)

Can you please clarify which parties were able to install successfully? If they are other parties on the IBM side, we can cross-check the failures with them.

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-28 04:30 EDT-------
(In reply to comment #15)
> Please update to the latest firmware on ipr adapters. Let me know when you
> have system available for me to debug.

Upgraded the IPR RAID firmware to the latest VR18 and booted with the current 4.15.0-10-generic kernel; we still see this issue. The only thing pending is the system firmware, which I believe should not be a problem.
/ # update_flash -d
Current firwmare version :

Does anyone else have a bare-metal Tuleta to try this out?

┌────────────────────────┤ [!!] Partition disks ├─────────────────────────┐
│                                                                         │
│ Note that all data on the disk you select will be erased, but not       │
│ before you have confirmed that you really want to make the changes.     │
│                                                                         │
│ Select disk to partition:                                               │
│                                                                         │
│ SCSI1 (0,0,0) (sda) - 1.0 TB IBM T RDX-USB3                             │
│                                                                         │
│ <Go Back>                                                               │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

<Tab> moves; <Space> selects; <Enter> activates buttons

root@ltciofvtr-s822l2-lp1:/usr/lib/microcode# iprconfig -c show-details sg9

Manufacturer . . . . . . . . . . . . . . : IBM
Machine Type and Model . . . . . . . . . : 57D7001SISIOA
Firmware Version . . . . . . . . . . . . : 18518200
Serial Number. . . . . . . . . . . . . . : 0051R116
Part Number. . . . . . . . . . . . . . . : 0000000MH962
Plant of Manufacturer. . . . . . . . . . : 00UE
Write Cache Size . . . . . . . . . . . . : 256 MB
DRAM Size. . . . . . . . . . . . . . . . : 512 MB
Resource Name. . . . . . . . . . . . . . : /dev/sg9

Physical location
PCI Address. . . . . . . . . . . . . . . : 0005:04:00.0
Resource Path. . . . . . . . . . . . . . : FE
SCSI Host Number . . . . . . . . . . . . : 1
Platform Location. . . . . . . . . . . . : U78CB.001.WZS07EH-P1-C15
Rebuild Verification . . . . . . . . . . : Disabled
Cache Protection . . . . . . . . . . . . : Synchronize Cache

root@ltciofvtr-s822l2-lp1:/usr/lib/microcode# iprconfig -c show-details sg0

Manufacturer . . . . . . . . . . . . . . : IBM
Machine Type and Model . . . . . . . . . : 57D7001SISIOA
Firmware Version . . . . . . . . . . . . : 18518200
Serial Number. . . . . . . . . . . . . . : 0051R044
Part Number. . . . . . . . . . . . . . . : 0000000MH962
Plant of Manufacturer. . . . . . . . . . : 00UE
Write Cache Size . . . . . . . . . . . . : 256 MB
DRAM Size. . . . . . . . . . . . . . . . : 512 MB
Resource Name. . . . . . . . . . . . . . : /dev/sg0

Physical location
PCI Address. . . . . . . . . . . . . . . : 0001:04:00.0
Resource Path. . . . . . . . . . . . . . : FE
SCSI Host Number . . . . . . . . . . . . : 0
Platform Location. . . . . . . . . . . . : U78CB.001.WZS07EH-P1-C14
Rebuild Verification . . . ....

Frank Heimes (frank-heimes) wrote :

I just figured out that the thread view in my inbox wasn't displayed correctly and that the negative confirmation of the reproducibility of this issue came from a different ticket.

But thanks for confirmation and the additional details.

Since this is no longer for awareness only, I'll reassign it (importance was already raised).

Changed in ubuntu-power-systems:
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Steve Langasek (vorlon) wrote :

> Work around is <Go_Back>
> Detect disks-->Guided partitioning--> Guided - use entire disk

If this is a reliable workaround, then it sounds like a clear ordering bug in the installer.

Changed in debian-installer (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Foundations Team (canonical-foundations)
importance: Undecided → High
status: New → Triaged
tags: added: id-5a985f2774c5a615846b06b6
Adam Conrad (adconrad) wrote :

That order is exactly what happens automatically, so it's not an ordering bug. I suspect it's that the driver is taking more time to detect disks than it used to, and the "workaround" is really just the equivalent of adding a "sleep 10" between modprobe and enumeration. I wonder if something on the kernel side switched from being sync to async and we need to paper over that in userspace.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-19 18:02 EDT-------
Async scanning for SCSI has been around for a while. The ipr driver fully supports async scanning. This behavior can be controlled via the scan scsi_mod module parameter. Valid values are sync, async, manual, or none. One thing to try would be to pass the following on the kernel command line when booting the installer and see if that affects the end result:

scsi_mod.scan=sync
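
To confirm the parameter actually reached the kernel, something like the following sketch could be run from the installer shell. It assumes a POSIX `sh`; `scan_mode_from_cmdline` is a hypothetical helper written for this example, and the sysfs file is only read if it exists on the running system.

```shell
#!/bin/sh
# Sketch only: report which SCSI scan mode a kernel command line requests.
# scan_mode_from_cmdline is a hypothetical helper, not part of d-i.
scan_mode_from_cmdline() {
    # $1: the text of a kernel command line (e.g. from /proc/cmdline)
    mode=unset
    for arg in $1; do
        case "$arg" in
            # Keep the last occurrence if the option is given more than once.
            scsi_mod.scan=*) mode=${arg#scsi_mod.scan=} ;;
        esac
    done
    echo "$mode"
}

# On a live system, compare the requested value with the active one.
if [ -r /proc/cmdline ]; then
    echo "cmdline: $(scan_mode_from_cmdline "$(cat /proc/cmdline)")"
fi
if [ -r /sys/module/scsi_mod/parameters/scan ]; then
    echo "active:  $(cat /sys/module/scsi_mod/parameters/scan)"
fi
```

If the command line does not carry the option, the helper prints `unset`, in which case the active value comes from the kernel's built-in default.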

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-19 18:25 EDT-------
FYI, this issue also happens with the mpt3sas driver on the same system.

Is scsi_mod built into the kernel in Ubuntu?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-20 13:40 EDT-------
I got these in config file:

CONFIG_SCSI_SCAN_ASYNC=y
..
..
# SCSI device support
#
CONFIG_SCSI_MOD=y
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y

Dimitri John Ledkov (xnox) wrote :

Can you extract and attach the full installer log?
the /var/log/syslog from the running d-i, after doing all possible steps to get it working again
(e.g. the mentioned going back & rescanning disks).

For the installer, kernel modules are split into .udeb packages, and it might be that some of the needed kernel modules are loaded after disk scanning was performed; rescanning would then pick up new devices because more kernel drivers have been loaded by that point.

Output of lsmod would be useful as well, to correlate and check which udebs are downloaded when; and which drivers are modprobed when.

E.g. it is possible that scsi-modules udeb is lacking the necessary modules in v4.15 packaging splits.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-29 02:20 EDT-------
Tried with current bionic and current bionic-proposed installers
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz
kexec -l vmlinux -i initrd.gz
kexec -e

wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic-proposed/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/vmlinux
wget http://ports.ubuntu.com/ubuntu-ports/dists/bionic-proposed/main/installer-ppc64el/current/images/netboot/ubuntu-installer/ppc64el/initrd.gz
kexec -l vmlinux -i initrd.gz
kexec -e

We hit this issue
┌────────────────┤ [!!] Download installer components ├─────────────────┐
│                                                                       │
│ No kernel modules were found. This probably is due to a mismatch      │
│ between the kernel used by this version of the installer and the      │
│ kernel version available in the archive.                              │
│                                                                       │
│ If you're installing from a mirror, you can work around this problem  │
│ by choosing to install a different version of Ubuntu. The install     │
│ will probably fail to work if you continue without kernel modules.    │
│                                                                       │
│ Continue the install without loading kernel modules?                  │
│                                                                       │
│ <Go Back> <Yes> <No>                                                  │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘

Can you provide us with the right levels of netboot images?

Dimitri John Ledkov (xnox) wrote :

When booting proposed images which have not yet migrated to the release pocket, one must also specify 'apt-setup/proposed=true' and the mirror you use must have up to date bionic-proposed.

It is best to not download d-i from ports.ubuntu.com, but instead please use your local mirror - to ensure the d-i published there, matches the available kernel udebs on your mirror. You should be mirroring main/installer-* and main/debian-installer/*.

Also please specify which d-i serial number is in use. (check /var/lib/dpkg/info for the debian-installer package) or instead of the symlink url /current/ use the one with the build serial and let me know which one that is.

Also note, that 20101020ubuntu535 build serial has migrated to bionic-release, thus if you are using 20101020ubuntu535 or later, with an up to date mirror, 'apt-setup/proposed=true' should not be needed and you should have access to all the latest kernel modules.

If you do not, please check that your mirror is up to date. It was fully published on 2018-03-27 03:04:31 BST, together with the 4.15.0-13.14 kernel. Is 20101020ubuntu535 d-i synced to your internal mirror yet?

I am marking this bug report incomplete until you provide the d-i serial number in use.

(Please note, this is not the first time that d-i errors are reported _without_ specifying the d-i serial number in use. Without the d-i serial number, it is very hard to speculate and assist with your queries.)

Changed in debian-installer (Ubuntu):
status: Triaged → Incomplete
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
Changed in debian-installer (Ubuntu):
importance: High → Low
Changed in ubuntu-power-systems:
importance: High → Low
Changed in debian-installer (Ubuntu):
milestone: none → later
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-03-29 06:32 EDT-------
Thank you for the detailed instructions.

We are able to proceed with the following netboot version, using the US public mirror.
#wget http://ports.ubuntu.com/dists/bionic/main/installer-ppc64el/20101020ubuntu535/images/netboot/ubuntu-installer/ppc64el/vmlinux
#wget http://ports.ubuntu.com/dists/bionic/main/installer-ppc64el/20101020ubuntu535/images/netboot/ubuntu-installer/ppc64el/initrd.gz
#kexec -l vmlinux -i initrd.gz
#kexec -e

We will provide the logs pertaining to this bug shortly.

------- Comment From <email address hidden> 2018-03-29 06:36 EDT-------
Attaching the logs with the 20101020ubuntu535 image at:
http://ports.ubuntu.com/dists/bionic/main/installer-ppc64el/20101020ubuntu535/images/netboot/ubuntu-installer/ppc64el/vmlinux
http://ports.ubuntu.com/dists/bionic/main/installer-ppc64el/20101020ubuntu535/images/netboot/ubuntu-installer/ppc64el/initrd.gz

------- Comment (attachment only) From <email address hidden> 2018-03-29 06:37 EDT-------

bugproxy (bugproxy) wrote : partman

------- Comment (attachment only) From <email address hidden> 2018-03-29 06:37 EDT-------

bugproxy (bugproxy) wrote : syslog

------- Comment (attachment only) From <email address hidden> 2018-03-29 06:38 EDT-------

Dimitri John Ledkov (xnox) wrote :

Awesome! Let me check them all.

------- Comment From <email address hidden> 2018-04-03 09:09 EDT-------
Do we have any updates here? Do you need any more information from us?

Steve Langasek (vorlon) on 2018-04-03
Changed in debian-installer (Ubuntu):
status: Incomplete → Triaged
Manoj Iyer (manjo) on 2018-04-05
Changed in ubuntu-power-systems:
status: Incomplete → Triaged
Changed in ubuntu-power-systems:
importance: Low → High
Patricia Gaughen (gaughen) wrote :

We're currently unsure of the root cause, still investigating. Is it possible to get remote access to an IPR system?

Andrew Cloke (andrew-cloke) wrote :

From discussions with Breno, all Tuleta hardware has the required IPR RAID controller.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-16 07:48 EDT-------
I think 4/19 is the final freeze for 18.04. It would be great if we could get a fix before that.

Thanks!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-16 13:35 EDT-------
I was able to restart ipr with the highest log level and with debug enabled, and this is the debug log I can see:

------- Comment (attachment only) From <email address hidden> 2018-04-16 13:36 EDT-------

------- Comment From <email address hidden> 2018-04-16 16:06 EDT-------
Looking further at this problem, it seems that ipr.ko is not part of the basic set of device drivers in the installer package.

If I exit to the shell at the beginning of d-i, I do not find ipr.ko:

BusyBox v1.27.2 (Ubuntu 1:1.27.2-2ubuntu3) built-in shell (ash)
Enter 'help' for a list of built-in commands.

~ # cd /
~ # find . -name ipr.ko
~ #

It seems it will only show up later, maybe if we grab the -extra package? If that is the case, and I would like to confirm soon, we will need to move ipr back to the main kernel package.

Manoj Iyer (manjo) wrote :

Looks like ipr module is present in kernel d-i.

bionic$ grep ipr debian.master/d-i/modules/scsi-modules*
debian.master/d-i/modules/scsi-modules:ipr ?
debian.master/d-i/modules/scsi-modules.powerpc:ipr ?

scsi-modules are not in d-i's pkg-lists/netboot .cfg files, so ipr.ko is not in the initrd. Maybe ipr could be moved to storage-core-modules instead?

On Tue, Apr 17, 2018 at 03:15:07AM -0000, Manoj Iyer wrote:
> Looks like ipr module is present in kernel d-i.

> bionic$ grep ipr debian.master/d-i/modules/scsi-modules*
> debian.master/d-i/modules/scsi-modules:ipr ?
> debian.master/d-i/modules/scsi-modules.powerpc:ipr ?

> scsi-modules are not in d-i's pkg-lists/netboot .cfg files. So ipr.ko is
> not in initrd. May be ipr could be moved to storage-core-modules instead
> ?

It could be, but the race would still be there, just less likely to be hit.

------- Comment From <email address hidden> 2018-04-17 10:19 EDT-------
As suggested by Brian, I was finally able to use the scsi_mod.scan=sync kernel parameter, and with it the problem is not reproducible, although the hardware detection phase is a bit slower.

The problem is that ipr is taking more than a minute to be loaded now:

~ # date ; modprobe ipr ; date
Tue Apr 17 14:15:53 UTC 2018
Tue Apr 17 14:17:15 UTC 2018
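
The same measurement can be wrapped so it prints the elapsed seconds directly. This is a minimal sketch, assuming the shell's `date` accepts `+%s`; `time_cmd` is a hypothetical helper name introduced for this example.

```shell
#!/bin/sh
# Sketch only: time a command in whole seconds using `date +%s`.
# time_cmd is a hypothetical helper name.
time_cmd() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s (exit status $status)"
    return "$status"
}

# Example (needs root, and the ipr module must not be loaded yet):
#   time_cmd modprobe ipr
```

With the two timestamps shown above (14:15:53 and 14:17:15), the difference is 82 seconds.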

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-17 10:43 EDT-------
(In reply to comment #63)
> On Tue, Apr 17, 2018 at 03:15:07AM -0000, Manoj Iyer wrote:
> > Looks like ipr module is present in kernel d-i.
>
> > bionic$ grep ipr debian.master/d-i/modules/scsi-modules*
> > debian.master/d-i/modules/scsi-modules:ipr ?
> > debian.master/d-i/modules/scsi-modules.powerpc:ipr ?
>
> > scsi-modules are not in d-i's pkg-lists/netboot .cfg files. So ipr.ko is
> > not in initrd. May be ipr could be moved to storage-core-modules instead
> > ?
>
> It could be, but the race would still be there, just less likely to be hit.

I do not think there is a race here; the disk probing is simply slow(er?), which needs to be investigated, but definitely not before the 18.04 release.

I understand we have two fixes for the near term (18.04 GA):

1) Load the driver earlier by putting ipr.ko into storage-core-modules, and hope the disks are initialized asynchronously before the disk partitioner is invoked.

2) Force the disk scan to use sync mode during install.

PS: Since IPR is by far the most used disk controller on ppc64el, I think it wouldn't hurt to make it part of the original netboot initrd image. If we do not want to increase the initrd size, we can replace the wifi drivers, some net drivers such as arcnet, or the hid drivers with the ipr driver.

Steve Langasek (vorlon) wrote :

It sounds like the root cause of this bug, then, is that the ipr driver is taking longer than normal / longer than reasonable to initialize+scan. If this is a regression vs. previous Ubuntu releases (as opposed to, say, an instance of failing hardware on a particular test machine), then we need to be looking at a kernel bug; opening a task on the linux package for this.

Separately, based on the investigation and findings up to this point, I think d-i should also be fixed to scan disks synchronously, because there is certainly a race here; and given the design of d-i, which does not automatically update the UI with details of late-detected disks, I think it's more important to be reliable than fast.

Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Seth Forshee (sforshee) wrote :

Marking the linux task fix committed for the d-i changes, no investigation yet as to the slowdown.

Changed in linux (Ubuntu):
status: New → Fix Committed
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Dimitri John Ledkov (xnox) wrote :

Two things.

SCSI_SCAN_ASYNC=y is set from xenial to bionic.

If we believe this is what is causing the race here, we can try booting d-i with a kernel parameter 'scsi_mod.scan=sync' specified. (or i can rebuild d-i with such kernel cmdline built-in if required). And check if that helps to resolve this case.

I do wonder, if d-i should be taking the control of the scan itself, meaning, booting with scsi_mod.scan=none and doing `echo "- - -" > /sys/class/scsi_host/*/scan` at the appropriate stage of disk-detect.

Or if we do async scan, trigger a rescan at disk-detect stage. `echo 1 > /sys/class/scsi_device/device/rescan`

Will be checking the disk-detect / scsi code, to see if it does trigger scans, and if it correctly blocks on said scans to complete.
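
Because a single redirection cannot fan out to several files, triggering a scan on every host needs a loop. The sketch below shows one way to do that from a POSIX `sh`; `rescan_scsi_hosts` is a hypothetical helper, and the sysfs directory is passed as a parameter only so the loop can be tried against a scratch directory. It is not the code d-i's disk-detect uses.

```shell
#!/bin/sh
# Sketch only: ask every SCSI host to scan all channels, targets and LUNs.
# rescan_scsi_hosts is a hypothetical helper, not taken from disk-detect.
rescan_scsi_hosts() {
    root=${1:-/sys/class/scsi_host}
    count=0
    for host in "$root"/host*; do
        # Skip the unexpanded pattern when no hosts are present.
        [ -e "$host/scan" ] || continue
        # "- - -" is the wildcard for channel, target and LUN.
        echo "- - -" > "$host/scan" || return 1
        count=$((count + 1))
    done
    # Print how many hosts were asked to scan.
    echo "$count"
}

# Example (needs root on a real system):
#   rescan_scsi_hosts
```

Whether the write blocks until the scan has finished depends on the scan mode in effect, which is the open question in the comment above.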

Dimitri John Ledkov (xnox) wrote :

So my current feelings about this are as follows:

udev-udeb should ship /lib/modprobe.d/ directory

It should contain systemd.conf, just like the deb udev package, which sets bonding max_bonds=0 and dummy numdummies=0.

It should also contain scsi-scan-sync.conf that sets `options scsi_mod scan=sync`
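
As a rough sketch of what such a snippet would contain, the helper below writes the one-line file. The function name `write_scsi_scan_sync_conf` is hypothetical, and the actual udev-udeb packaging would presumably ship the file directly rather than generate it at run time; only the file name and the `options` line come from the proposal above.

```shell
#!/bin/sh
# Hypothetical helper: create a modprobe.d snippet that makes scsi_mod
# scan synchronously whenever the module is loaded via modprobe.
write_scsi_scan_sync_conf() {
    conf_dir="${1:-/lib/modprobe.d}"
    mkdir -p "$conf_dir"
    # A single "options" line sets the module parameter at load time.
    printf '%s\n' 'options scsi_mod scan=sync' \
        > "$conf_dir/scsi-scan-sync.conf"
}
```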

This way, when hw-detect runs `update-dev` and `update-dev --settle`, all the SCSI module loading will block at those sync points correctly, using the SCSI kernel built-in scan timeouts etc.

Potentially, the "no disks found" menu needs an option to "Rescan all SCSI drives again".

Dimitri John Ledkov (xnox) wrote :

Can somebody please test that booting d-i with `scsi_mod.scan=sync` on the kernel command line, on the previously affected system, makes IPR discovery work as expected, by the time one reaches the partitioning menu without any mitigations required from the user?

------- Comment From <email address hidden> 2018-04-19 14:41 EDT-------
------- Comment From xnox 2018-04-19 20:09:09 UTC-------
> Can somebody please test that booting d-i with `scsi_mod.scan=sync` on the
> kernel command line, on the previously affected system, makes IPR discovery
> work as expected, by the time one reaches the partitioning menu without any
> mitigations required from the user?

I tested using this parameter as a kernel parameter, and it works as expected. I am able to see the disks as soon as I get into the disk partition tool.

Dimitri John Ledkov (xnox) wrote :

In light of comment #51 above, I shall pursue uploading a udev-udeb with modprobe.d config snippet that forces scsi_mod.scan=sync.

Changed in systemd (Ubuntu):
status: New → In Progress
assignee: nobody → Dimitri John Ledkov (xnox)
importance: Undecided → Critical
milestone: none → ubuntu-18.04
Changed in debian-installer (Ubuntu):
milestone: later → ubuntu-18.04
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10

---------------
systemd (237-3ubuntu10) bionic; urgency=medium

  * Create tmpfiles for persistent journal in postinst only when running
    systemd (LP: #1748659)

 -- Balint Reczey <email address hidden> Fri, 20 Apr 2018 18:55:56 +0200

Changed in systemd (Ubuntu):
status: In Progress → Fix Released
Manoj Iyer (manjo) on 2018-04-23
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-19.20

---------------
linux (4.15.0-19.20) bionic; urgency=medium

  * linux: 4.15.0-19.20 -proposed tracker (LP: #1766021)

  * Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers (LP: #1765232)
    - Revert "blk-mq: simplify queue mapping & schedule with each possisble CPU"
    - Revert "genirq/affinity: assign vectors to all possible CPUs"

linux (4.15.0-18.19) bionic; urgency=medium

  * linux: 4.15.0-18.19 -proposed tracker (LP: #1765490)

  * [regression] Ubuntu 18.04:[4.15.0-17-generic #18] KVM Guest Kernel:
    meltdown: rfi/fallback displacement flush not enabled bydefault (kvm)
    (LP: #1765429)
    - powerpc/pseries: Fix clearing of security feature flags

  * signing: only install a signed kernel (LP: #1764794)
    - [Packaging] update to Debian like control scripts
    - [Packaging] switch to triggers for postinst.d postrm.d handling
    - [Packaging] signing -- switch to raw-signing tarballs
    - [Packaging] signing -- switch to linux-image as signed when available
    - [Config] signing -- enable Opal signing for ppc64el
    - [Packaging] printenv -- add signing options

  * [18.04 FEAT] Sign POWER host/NV kernels (LP: #1696154)
    - [Packaging] signing -- add support for signing Opal kernel binaries

  * Please cherrypick s390 unwind fix (LP: #1765083)
    - s390/compat: fix setup_frame32

  * Ubuntu 18.04 installer does not detect any IPR based HDD/RAID array [S822L]
    [ipr] (LP: #1751813)
    - d-i: move ipr to storage-core-modules on ppc64el

  * drivers/gpu/drm/bridge/adv7511/adv7511.ko missing (LP: #1764816)
    - SAUCE: (no-up) rename the adv7511 drm driver to adv7511_drm

  * Miscellaneous Ubuntu changes
    - [Packaging] Add linux-oem to rebuild test blacklist.

linux (4.15.0-17.18) bionic; urgency=medium

  * linux: 4.15.0-17.18 -proposed tracker (LP: #1764498)

  * Eventual OOM with profile reloads (LP: #1750594)
    - SAUCE: apparmor: fix memory leak when duplicate profile load

linux (4.15.0-16.17) bionic; urgency=medium

  * linux: 4.15.0-16.17 -proposed tracker (LP: #1763785)

  * [18.04] [bug] CFL-S(CNP)/CNL GPIO testing failed (LP: #1757346)
    - [Config]: Set CONFIG_PINCTRL_CANNONLAKE=y

  * [Ubuntu 18.04] USB Type-C test failed on GLK (LP: #1758797)
    - SAUCE: usb: typec: ucsi: Increase command completion timeout value

  * Fix trying to "push" an already active pool VP (LP: #1763386)
    - SAUCE: powerpc/xive: Fix trying to "push" an already active pool VP

  * hisi_sas: Revert and replace SAUCE patches w/ upstream (LP: #1762824)
    - Revert "UBUNTU: SAUCE: scsi: hisi_sas: export device table of v3 hw to
      userspace"
    - Revert "UBUNTU: SAUCE: scsi: hisi_sas: config for hip08 ES"
    - scsi: hisi_sas: modify some register config for hip08
    - scsi: hisi_sas: add v3 hw MODULE_DEVICE_TABLE()

  * Realtek card reader - RTS5243 [VEN_10EC&DEV_5260] (LP: #1737673)
    - misc: rtsx: Move Realtek Card Reader Driver to misc
    - updateconfigs for Realtek Card Reader Driver
    - misc: rtsx: Add support for RTS5260
    - misc: rtsx: Fix symbol clashes

  * Mellanox [mlx5] [bionic] UBSAN: Undefined behaviour in
    ./include/linux/net_dim.h (LP: #1...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released

------- Comment (attachment only) From <email address hidden> 2018-03-29 06:37 EDT-------

bugproxy (bugproxy) wrote : partman

------- Comment (attachment only) From <email address hidden> 2018-03-29 06:37 EDT-------

bugproxy (bugproxy) wrote : syslog

------- Comment From <email address hidden> 2018-04-24 03:44 EDT-------
(In reply to comment #76)
> Is the Fix available in Daily builds to try? Please advise

As per Canonical's latest updates, the fix for this issue should be available in the 4/24 daily build (which should have kernel 4.15.0-19.20 and systemd 237-3ubuntu10).

Please give it a try.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-24 10:52 EDT-------
This bug looks to be fixed in the d-i serial 20101020ubuntu540 build.

Should we close this bug?

┌───────────────┤ Detecting disks and all other hardware ├────────────────┐
│                                                                         │
│                                   0%                                    │
│                                                                         │
│ Detecting hardware, please wait...                                      │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────┤ Starting up the partitioner ├─────────────────────┐
│                                                                         │
│                                   34%                                   │
│                                                                         │
│ Please wait...                                                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

┌────────────────────────┤ [!!] Partition disks ├─────────────────────────┐
│                                                                         │
│ The installer can guide you through partitioning a disk (using          │
│ different standard schemes) or, if you prefer, you can do it            │
│ manually. With guided partitioning you will still have a chance later   │
│ to review and customise the results.                                    │
│                                                                         │
│ If you choose guided partitioning for an entire disk, you will next     │
│ be asked which disk should be used.                                     │
│                                                                         │
│ Partitioning method:                                                    │
│                                                                         │
│   Guided - resize RAID1 device #0 and use freed space                   │
│   Guided - resize SCSI3 (2,1,0) (sdc) and use freed space               │
│   Guided - resize SCSI4 (2,0,0), partition #2 (sdd) and use freed s     │
│   Guided - reuse partition, RAID1 device #0                             │
│   Guided - reuse partition, SCSI4 (2,0,0), partition #2 (sdd)           │
│                                                                         │
│     <Go Back>                                                           │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

<Tab> moves; <Space> selects; <Enter> activates buttons

┌────────────────────────┤ [!!] Partition disks ├─────────────────────────┐
│                                                                         │
│ Note that all data on the disk you select will be erased, but not       │
│ before you have confirmed that you really want to make the changes.     │
│ ...

Steve Langasek (vorlon) on 2018-04-24
Changed in debian-installer (Ubuntu):
status: Triaged → Invalid

Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released

------- Comment From <email address hidden> 2018-04-25 08:25 EDT-------
(In reply to comment #81)
> This bug looks to be fixed in the d-i serial 20101020ubuntu540 build.
>
> Should we close this bug?

Yes, closing it at our side.

By the way, did you measure if the disk scan is considerably slower than previously?

------- Comment From <email address hidden> 2018-05-02 02:05 EDT-------
(In reply to comment #82)
> (In reply to comment #81)
> > This bug looks to be fixed in the d-i serial 20101020ubuntu540 build.
> >
> > Should we close this bug?
>
> Yes, closing it at our side.
>
> By the way, did you measure if the disk scan is considerably slower than
> previously?

As users, we could not make out any difference in the time taken for the disk scan; maybe we simply did not notice it.

Dimitri John Ledkov (xnox) wrote :

The scan is forced to be done synchronously; however, the ipr driver is available and is loaded much earlier in the install process, so it should not significantly affect the overall install time.

One may be able to notice the difference a bit more with a fully-automated preseed install.

Changed in debian-installer (Ubuntu):
status: Invalid → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-05-04 01:38 EDT-------
(In reply to comment #84)
> The scan is forced to be done synchronously, however the ipr driver is
> available and is loaded much earlier in the install process, thus it should
> not significantly affect the overall install time.
>
> One may be able to notice the difference a bit more with a fully-automated
> preseed install.

Noted,
Thank you.
