debian-installer fails to detect lsi 2208-based raid card in precise, quantal, raring

Bug #1091465 reported by joro on 2012-12-17
This bug affects 2 people
Affects: debian-installer (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have been trying to install Ubuntu Server for the last 4 days on an Intel P4208CP4 server with an Intel RMS25PB080 RAID controller.
Although the RAID card (LSI 2208 chipset, 1 GB RAM, two 8087 ports) is recognized correctly by the kernel as megaraid_sas, it is never used or even offered as an installation target.
dmesg reports "FW in fault state", and nothing further happens. lsmod shows the relevant modules loaded.
fdisk and gparted never show any devices on this controller.
Versions I tried:
- 12.04 server 64bit, release
- 12.04.1 server 64bit, release
- 12.04.2 daily build server 64bit as of 12/12/2012, kernel 3.5 quantal-lts
- 12.04.2 daily build desktop 64bit as of 12/12/2012, kernel 3.5 quantal-lts
- 12.10 desktop and server, daily build from 12/13/2012, kernel 3.5
- 13.04 alpha, daily build from 12/14/2012 64bit, desktop only, kernel 3.7-rc
Just to make sure, I moved the RAID card around between the PCIe slots.
I also tried EFI and legacy boot with all these versions, with no change.
There are 3 more 1 TB SATA disks in the machine that are not attached to the RAID card; every Ubuntu version above installs and runs on them without any problems.
Machine Specs:
- intel server chassis 4208cp
- MB intel S2600CP4
- 2xE5-2630 xeon
- 8x8GB Kingston ecc reg
- intel raid adapter rms25pb080 lsi 2208 chipset, 1gb ram, 8x Samsung SSD 840PRO /256gb wrapped into a single raid6 volume
- battery backup module for raid controller intel AXXRMFBU2
- 3x 1TB Hitachi HTE54101, 2.5" SATA on onboard SAS ports (only for backup, not connected for installation; I wanted everything system-related to go to the RAID set)
no cd/dvd

BTW: CentOS 6.3 64bit works out of the box, much to my chagrin.
Disk throughput is 2.9 GB/s read avg. and 2.4 GB/s write avg. according to the CentOS GNOME disk utility, so the RAID card seems to be OK with these SSDs attached to it.

joro (joromindlab) wrote :

dmesg from working centos 6.3, please compare with attached raring dmesg

[root@kvmhost log]# dmesg
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-279.14.1.el6.centos.plus.x86_64 (<email address hidden>) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Nov 7 00:40:45 UTC 2012
Command line: ro root=/dev/mapper/vg_system-RootLV rd_NO_LUKS rd_LVM_LV=vg_system/RootLV rd_LVM_LV=vg_system/SwapLV KEYBOARDTYPE=pc KEYTABLE=de-latin1-nodeadkeys rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto LANG=de_DE.UTF-8 rd_NO_DM rhgb quiet
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000008f800 (usable)
 BIOS-e820: 000000000008f800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000b8e10000 (usable)
 BIOS-e820: 00000000b8e10000 - 00000000b8e8a000 (ACPI data)
 BIOS-e820: 00000000b8e8a000 - 00000000baa82000 (usable)
 BIOS-e820: 00000000baa82000 - 00000000baa93000 (reserved)
 BIOS-e820: 00000000baa93000 - 00000000baaca000 (usable)
 BIOS-e820: 00000000baaca000 - 00000000baad3000 (reserved)
 BIOS-e820: 00000000baad3000 - 00000000baaf6000 (usable)
 BIOS-e820: 00000000baaf6000 - 00000000baaf9000 (reserved)
 BIOS-e820: 00000000baaf9000 - 00000000bacc8000 (usable)
 BIOS-e820: 00000000bacc8000 - 00000000baccb000 (reserved)
 BIOS-e820: 00000000baccb000 - 00000000bacd1000 (usable)
 BIOS-e820: 00000000bacd1000 - 00000000bacd3000 (reserved)
 BIOS-e820: 00000000bacd3000 - 00000000bad16000 (usable)
 BIOS-e820: 00000000bad16000 - 00000000bad1a000 (reserved)
 BIOS-e820: 00000000bad1a000 - 00000000bad1b000 (usable)
 BIOS-e820: 00000000bad1b000 - 00000000bad20000 (reserved)
 BIOS-e820: 00000000bad20000 - 00000000bad35000 (usable)
 BIOS-e820: 00000000bad35000 - 00000000bad39000 (reserved)
 BIOS-e820: 00000000bad39000 - 00000000bade7000 (usable)
 BIOS-e820: 00000000bade7000 - 00000000badee000 (reserved)
 BIOS-e820: 00000000badee000 - 00000000bae6f000 (usable)
 BIOS-e820: 00000000bae6f000 - 00000000bae71000 (reserved)
 BIOS-e820: 00000000bae71000 - 00000000bae85000 (usable)
 BIOS-e820: 00000000bae85000 - 00000000bae88000 (reserved)
 BIOS-e820: 00000000bae88000 - 00000000bae97000 (usable)
 BIOS-e820: 00000000bae97000 - 00000000baea6000 (reserved)
 BIOS-e820: 00000000baea6000 - 00000000baeb7000 (usable)
 BIOS-e820: 00000000baeb7000 - 00000000baec6000 (reserved)
 BIOS-e820: 00000000baec6000 - 00000000baee0000 (usable)
 BIOS-e820: 00000000baee0000 - 00000000baee4000 (reserved)
 BIOS-e820: 00000000baee4000 - 00000000baf2d000 (usable)
 BIOS-e820: 00000000baf2d000 - 00000000bb268000 (reserved)
 BIOS-e820: 00000000bb268000 - 00000000bd9fa000 (usable)
 BIOS-e820: 00000000bd9fa000 - 00000000bdbfa000 (reserved)
 BIOS-e820: 00000000bdbfa000 - 00000000bdcdd000 (usable)
 BIOS-e820: 00000000bdcdd000 - 00000000bdde7000 (reserved)
 BIOS-e820: 00000000bdde7000 - 00000000bdf08000 (ACPI NVS)
 BIOS-e820: 00000000bdf08000 - 00000000bdf09000 (ACPI data)
 BIOS-e820: 00000000bdf09000 - 00000000bdf0b000 (ACPI NVS)
 BIOS-e820: 00000000bdf0b000 - 00000000bdf0...

joro (joromindlab) wrote :

The RAID volume is "partitioned" into a system volume (100 GB) and a data volume (1.4 TB),
but this does not make any difference: d-i also fails to see a single volume on any of the versions I tried.

[root@kvmhost ~]# export LANGUAGE=C; fdisk -lc

Disk /dev/sda: 104.9 GB, 104856551424 bytes
255 heads, 63 sectors/track, 12748 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00044fbb

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
/dev/sda2              64       12749   101885952   8e  Linux LVM

Disk /dev/sdb: 1425.1 GB, 1425137467392 bytes
255 heads, 63 sectors/track, 173263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in debian-installer (Ubuntu):
status: New → Confirmed

Ron (ron-neversleep) wrote :

I also posted this solution to bug #1091263, "Cannot install Ubuntu Server on Intel Raid System".

I have FINALLY discovered a workaround for this situation (after spending 3 days wrongly debugging the megaraid_sas driver source code).

Setting the Linux kernel boot parameter pci=conf1 allows my Intel RAID controller RMS25PB080 (LSI 2208/Fusion based) to be detected and its firmware to transition to the Ready state. Without this setting, the card firmware responds only with 0xF0000000, Fault (masked).
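
For an already installed system, one way to make pci=conf1 persistent is to add it to the kernel command line that GRUB generates. The following is a minimal sketch, not a tested procedure for this hardware: the helper name add_kernel_param is hypothetical, it assumes the stock Ubuntu layout where GRUB_CMDLINE_LINUX_DEFAULT lives in /etc/default/grub as a double-quoted value, and it relies on GNU sed's -i option.

```shell
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given grub defaults file, unless it is already on that line.
# Assumes the value is double-quoted and the parameter contains no "/".
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already present on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote of the value.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}
```

Typical use would be to run add_kernel_param /etc/default/grub pci=conf1 as root and then run update-grub so the change reaches the boot menu. During installation there is no grub defaults file to edit yet, so the parameter has to be typed onto the installer's boot command line instead.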

Kernel series 3.0.0 (as used in Oneiric / 11.10) properly discovered and used the RAID card. Things broke from kernel series 3.2.0 (as used in Precise / 12.04) onward, so this also affects Quantal (12.10) and current Raring (13.04) kernel builds (v3.7+). On my Intel S2600 system with the Intel/LSI card, I have confirmed broken PCI discovery in kernels 3.2 through the current 3.7.1.

Details:
Card PCI ID: 1000:005b
Card PCI description: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 03) aka Fusion
Card Firmware Version: v23.9.0-0018 (most current to-date)
Card OEM: Intel RAID Module (Card) RMS25PB080 SAS
System Board: s2600CP - E5-2600 Xeon, LGA2011 Socket
System Firmware BIOS level: 01.06.0002 2012/11/15 (most current to-date)
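
The 0xF0000000 value mentioned above can be read as a firmware state code: megaraid_sas keeps the state in the top four bits of the status register. The sketch below decodes that field. The three state values (0xB0000000 READY, 0xC0000000 OPERATIONAL, 0xF0000000 FAULT) are my recollection of the MFI_STATE_* constants in the driver's megaraid_sas.h and should be checked against your kernel tree; the function name fw_state_name is hypothetical.

```shell
# Hypothetical helper: name the firmware state held in bits 31..28 of a
# megaraid_sas status register value (state values assumed from megaraid_sas.h).
fw_state_name() {
    status=$(( $1 ))                      # accepts decimal or 0x-prefixed hex
    state=$(( (status >> 28) & 0xF ))     # keep only the top nibble
    case "$state" in
        11) echo "READY" ;;               # 0xB0000000
        12) echo "OPERATIONAL" ;;         # 0xC0000000
        15) echo "FAULT" ;;               # 0xF0000000
        *)  printf 'OTHER (0x%X0000000)\n' "$state" ;;
    esac
}
```

For example, fw_state_name 0xF0000000 reports the fault state that the card stays in without pci=conf1, while a value starting with 0xB would correspond to the Ready state it reaches with the parameter set.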

Bjorn Helgaas (bjorn-helgaas) wrote :

This might be related to https://bugzilla.kernel.org/show_bug.cgi?id=63661

That issue was bisected to 3c076351c402 ("PCI: Rework ASPM disable code"), which appeared upstream in v3.3-rc1. That commit was backported to Ubuntu-3.2.0-22.35: http://kernel.ubuntu.com/git/ubuntu/ubuntu-precise.git/commit/?id=2dcce0a318fc. I'm told that 3c076351c402 has *not* been backported to RHEL6.3, and I expect the same is true for CentOS6.3, which might explain why the problem doesn't occur on CentOS6.3.

For this Ubuntu issue, "pci=conf1" is a workaround, but for 63661, "pci=conf1" didn't help (http://www.spinics.net/lists/linux-scsi/msg76194.html), so it may be that this issue on Ubuntu is different.

But 63661 is the same hardware (Intel S2600 motherboard, MegaRAID SAS 2208), so it seems likely these issues are the same. One reporter there found that turning off "Fast Boot" in the BIOS settings is a workaround.

My theory is that megaraid_sas depends on something done by the option ROM, and that enabling "Fast Boot" in the BIOS (which is apparently the default) causes the BIOS to skip executing the option ROM. Further, I speculate that prior to 3c076351c402 ("PCI: Rework ASPM disable code"), the Linux PCI core did something that made things work.

If anybody still cares about this issue, please look at https://bugzilla.kernel.org/show_bug.cgi?id=63661 and we can try to debug it further.
