hpsa driver causing kernel panics on HP ProLiant DL385p Gen8

Bug #1550481 reported by Rod Smith
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned

Bug Description

Doing regression testing on Ubuntu 14.04.4, we've found a big one: An HP ProLiant DL385p Gen8 kernel panics on startup with the default kernel for that version (4.2.0-27-generic). The server's display (see screen shot taken via its BMC's remote KVM) shows a kernel panic that seems to implicate the hpsa driver. This server boots and installs every Trusty release through 14.04.3 (with a 3.19.0-25 kernel) just fine.

I'm attaching log files from a 14.04.1 installation so that you can have some hardware information, and a screen shot of the kernel panic in 14.04.4.

We (the server certification team) have access to the server for a limited period. Normally it's in OIL.

Revision history for this message
Rod Smith (rodsmith) wrote :
Revision history for this message
Rod Smith (rodsmith) wrote :
Revision history for this message
Rod Smith (rodsmith) wrote :
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: trusty
Rod Smith (rodsmith)
tags: added: blocks-hwcert-server
Revision history for this message
Rod Smith (rodsmith) wrote :

Here's some version information, taken from the machine's iLO:

Product Name: ProLiant DL385p Gen8
UUID: 37303137-3332-584D-5134-303830335450
Server Serial Number: MXQ40803TP
Product ID: 710723-001
System ROM: A28 02/06/2014
Backup System ROM: 11/02/2013
Integrated Remote Console: .NET, Java
License Type: iLO 4 Advanced
iLO Firmware Version: 1.51 Jun 16 2014

Actually, I just saw that the iLO is giving me a "failed" status on the HP Smart Array P420i Controller (firmware version 5.42), although all the detailed diagnostics below it on the page show green. (I'm attaching a screen shot.) I don't see any more diagnostic details in the web UI. If there's something else I can check to help track this down, please tell me.

Revision history for this message
Rod Smith (rodsmith) wrote :

An update: After re-deploying with Ubuntu 14.04.3, the error logged in the iLO and shown in the screen shot disappears; the hardware now registers as OK. Thus, I think this error is simply another symptom of the kernel panic.

Revision history for this message
Ganesh Lakshminarayana (ganesh-lakshminarayana) wrote :

Please try disabling IOMMU from RBSU (System options->Processors->IOMMU disable). When IOMMU is disabled, the OS boots as expected. When IOMMU is enabled, the issue is seen.

Revision history for this message
Rod Smith (rodsmith) wrote :

Disabling IOMMU did indeed enable us to boot.
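
To confirm from a booted system whether the firmware setting took effect, one option is to look at the IOMMU groups the kernel exposes in sysfs. The following is a minimal sketch, not something taken from this report: it assumes the kernel was built with the IOMMU API so that /sys/kernel/iommu_groups is the relevant directory, and the function names are made up for illustration.

```python
import os


def iommu_groups(sysfs_root="/sys"):
    """Return the sorted IOMMU group names under sysfs, or [] if none exist.

    Assumption: the kernel publishes groups in <sysfs_root>/kernel/iommu_groups.
    An empty or missing directory is treated as "no IOMMU in use".
    """
    path = os.path.join(sysfs_root, "kernel", "iommu_groups")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []


def iommu_active(sysfs_root="/sys"):
    """True if at least one IOMMU group is present."""
    return len(iommu_groups(sysfs_root)) > 0


if __name__ == "__main__":
    # Report the running kernel alongside the IOMMU state.
    print("kernel:", os.uname().release)
    print("IOMMU active:", iommu_active())
```

On a machine where IOMMU has been disabled in RBSU, this would be expected to report no groups; treat the result as a hint rather than proof, since a kernel without IOMMU support gives the same answer.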

Revision history for this message
Rod Smith (rodsmith) wrote :

Note that the same problem occurs with the 4.2 kernel in 16.04 (a pre-release from a week or two ago). I haven't yet tested the firmware workaround with 16.04.

penalvch (penalvch)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Jeff Lane  (bladernr)
tags: removed: blocks-hwcert-server
Revision history for this message
fierabras66 (ralonsoal) wrote :

Hello. I have four of these DL385p Gen8 servers from Hewlett Packard Enterprise.

When we bought them, everything was fine using Ubuntu 14.04.3 LTS, but once we upgraded to 16.04 LTS, everything went wrong.

I'm following this bug and I'm very interested in any solution. Is there one? Where can I follow progress on a fix for this bug? Thanks.

Revision history for this message
Rod Smith (rodsmith) wrote :

fierabras66, please see comment #7 in this bug report for a workaround.

Revision history for this message
Narinder Gupta (narindergupta) wrote :

There is an IOMMU bug that HP is debugging. For now, you can disable IOMMU, and the system should boot and work fine. Once HP identifies a fix and releases it, we will integrate it into Ubuntu 16.04.

Revision history for this message
brien Fournerat (brien-fournerat) wrote :

The #7 workaround is not a viable solution for virtualized environments.
We are currently under pressure to provide an update on the exact status and details of the work HPE is doing to resolve this with the 4.2.x kernel.
Comment #12 states that HP is debugging. Can you provide a status?

Changed in linux (Ubuntu):
importance: Medium → High
tags: added: kernel-key
Revision history for this message
John Sopko (sopko) wrote :

We recently inherited several ProLiant HP-SL270s Gen8 servers with the P420 RAID controller. After loading 16.04 with kernel 4.4.0-31, I cannot see the logical disk. I disabled Intel VT-d (the same as IOMMU), and this did not fix the problem. There was a similar bug, now expired, in which disabling IOMMU allowed the disks to be seen on a ProLiant BL465c Gen8:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1590072

The Ubuntu 14.04 installer can see the logical disk, but we would really like to load 16.04. FYI, Red Hat 7 can also see the disks.
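
When the logical disk is missing, it can help to record whether the hpsa module is loaded at all and which block devices the kernel sees, so the two cases (driver absent versus driver present but no disk) can be told apart. This is a rough sketch under stated assumptions: it reads /proc/modules and /sys/block, the helper names are hypothetical, and a driver built into the kernel rather than loaded as a module would not appear in /proc/modules.

```python
import os


def module_loaded(name, proc_modules="/proc/modules"):
    """True if a module called `name` is listed in /proc/modules.

    Each line of /proc/modules starts with the module name.
    A built-in driver is not listed there, so False does not prove absence.
    """
    with open(proc_modules) as f:
        return any(line.split()[0] == name for line in f if line.strip())


def block_devices(sysfs_root="/sys"):
    """Return the sorted block device names under <sysfs_root>/block."""
    path = os.path.join(sysfs_root, "block")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    print("hpsa loaded:", module_loaded("hpsa"))
    print("block devices:", block_devices())
```

Attaching this output from both a working release and a failing one would make the difference between them easier to see.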

tags: added: kernel-da-key
removed: kernel-key