Random SATA drives on PMPs on sata_sil24 cards not being detected at boot

Bug #987353 reported by Daniel Smedegaard Buus
This bug affects 1 person
Affects          Status       Importance   Assigned to   Milestone
Linux            Confirmed    High
linux (Ubuntu)   Incomplete   Medium       Unassigned

Bug Description

I just updated my box from Oneiric to Precise (linux-generic-3.2.0.23.25).

The box has 22 SATA drives:
6 on ICH10R
15 on three sata_sil24 PCIe 1-port cards using three 1:5 PMPs
1 on a sata_sil PCI32 4-port card

After upgrading, two things are apparent:
1) GRUB hangs for quite a long time upon booting, but eventually continues. dmesg (attached) mentions a whole lot of hard resetting of my sata_sil24 devices until finally continuing.
2) Random SATA drives attached to the sata_sil24 will be missing.

dmesg attached, other output to follow.
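
To quantify the resets mentioned in point 1, the saved dmesg output can be tallied per ATA link. The following is only a sketch: it assumes the resets show up as libata's "hard resetting link" lines, and the function name `count_hard_resets` is made up for illustration.

```python
import re
from collections import Counter

# Matches libata lines such as "ata7.02: hard resetting link". The optional
# ".NN" suffix identifies a link behind a port multiplier.
RESET_RE = re.compile(r"\b(ata\d+(?:\.\d+)?): hard resetting link")


def count_hard_resets(dmesg_text):
    """Return a Counter mapping each ATA link name to its number of hard resets."""
    return Counter(match.group(1) for match in RESET_RE.finditer(dmesg_text))
```

Calling `count_hard_resets(open("dmesg.txt").read()).most_common()` on a saved log (the file name is an assumption) lists the links with the most resets first, which makes it easier to compare boots on different kernels.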

WORKAROUND: Hot-unplugging and re-plugging the affected drives properly presents them to the system, and I can then use them.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-23-generic 3.2.0-23.36
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic 3.2.14
Uname: Linux 3.2.0-23-generic x86_64
NonfreeKernelModules: nvidia zfs zcommon znvpair zavl zunicode
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: daniel 2627 F.... pulseaudio
 /dev/snd/controlC1: daniel 2627 F.... pulseaudio
 /dev/snd/controlC0: daniel 2627 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfaef8000 irq 22'
   Mixer name : 'Realtek ALC888'
   Components : 'HDA:10ec0888,18490888,00100001'
   Controls : 37
   Simple ctrls : 17
Card1.Amixer.info:
 Card hw:1 'NVidia'/'HDA NVidia at 0xfebfc000 irq 17'
   Mixer name : 'Nvidia GPU 1c HDMI/DP'
   Components : 'HDA:10de001c,104383a0,00100100'
   Controls : 12
   Simple ctrls : 2
Card2.Amixer.info:
 Card hw:2 'NVidia_1'/'HDA NVidia at 0xfcbfc000 irq 23'
   Mixer name : 'Nvidia GPU 1c HDMI/DP'
   Components : 'HDA:10de001c,19da1228,00100100'
   Controls : 12
   Simple ctrls : 2
Date: Mon Apr 23 17:00:48 2012
HibernationDevice: RESUME=UUID=ad3d1ee7-2108-47b0-b764-a1ec42898ba2
InstallationMedia: Kubuntu 12.04 LTS "Precise Pangolin" - Beta amd64 (20120421.1)
MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-3.2.0-23-generic root=UUID=aa16e554-8ba2-4537-8bc6-a2c37f160369 ro rootflags=subvol=@ quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-23-generic N/A
 linux-backports-modules-3.2.0-23-generic N/A
 linux-firmware 1.79
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/01/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: P3.10
dmi.board.name: P43Twins1600
dmi.board.vendor: ASRock
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP3.10:bd12/01/2009:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnP43Twins1600:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To Be Filled By O.E.M.
dmi.product.version: To Be Filled By O.E.M.
dmi.sys.vendor: To Be Filled By O.E.M.

Daniel Smedegaard Buus (danielbuus) wrote :

FYI: Hot-unplugging and re-plugging the affected drives properly presents them to the system, and I can then use them.
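
A software rescan might be worth trying before physically re-plugging drives. This is an untested assumption for this particular bug: only the physical hot-replug is confirmed to work here. The sketch below writes the wildcard "- - -" to each SCSI host's sysfs `scan` file, which asks the host to rescan all channels, targets and LUNs. It must run as root, and the function names are illustrative.

```python
import glob
import os


def scan_files(sysfs_root="/sys"):
    """Return the sorted 'scan' control files of all SCSI hosts under sysfs_root."""
    pattern = os.path.join(sysfs_root, "class", "scsi_host", "host*", "scan")
    return sorted(glob.glob(pattern))


def rescan_all(sysfs_root="/sys"):
    """Ask every SCSI host to rescan; return the list of files written."""
    rescanned = []
    for path in scan_files(sysfs_root):
        with open(path, "w") as scan_file:
            # "- - -" is the wildcard for channel, target and LUN.
            scan_file.write("- - -\n")
        rescanned.append(path)
    return rescanned
```

The `sysfs_root` parameter exists only so the function can be exercised against a scratch directory; on a real system `rescan_all()` with the default is the intended call.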

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Does this issue still happen if you boot back into an Oneiric kernel?

Also, would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking the yellow pencil icon next to the tag, located at the bottom of the bug description, and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example because it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc4-precise/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Daniel Smedegaard Buus (danielbuus) wrote :

I will try different kernels promptly.

Btw, I wasn't perfectly clear in my phrasing (sorry): by "updated my box from Oneiric to Precise" I didn't mean a dist upgrade, I meant a wipe of Oneiric and a fresh install of a Precise daily build. Just to clear that up :)

Will report back with results on other kernels.

Daniel Smedegaard Buus (danielbuus) wrote :

Just tried 3.4.0-030400rc4-generic from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc4-precise/ and got the same problem.

I'll try an Oneiric one now.

Daniel Smedegaard Buus (danielbuus) wrote :

Installed 3.0.0-17-generic for Oneiric. GRUB now goes happily and immediately from its selection screen to the Kubuntu loading logo, and all drives are present at boot (which is a LOT faster, now that the link resets aren't triggered anymore).
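
To compare kernels without counting drives by hand, the number of detected disks can be checked against the 22 drives listed in the description. This is a rough sketch, assuming every SATA drive appears as an `sd*` entry under `/sys/block` and that no other `sd*` devices (for example USB storage) are attached. The names `list_sd_disks` and `missing_count` are illustrative.

```python
import os
import re

# 6 on ICH10R + 15 behind the sata_sil24 PMPs + 1 on sata_sil, per the description.
EXPECTED_DRIVES = 6 + 15 + 1


def list_sd_disks(sys_block="/sys/block"):
    """Return the sorted names of whole-disk sd* devices (sda, sdb, ..., sdaa)."""
    return sorted(
        name for name in os.listdir(sys_block) if re.fullmatch(r"sd[a-z]+", name)
    )


def missing_count(sys_block="/sys/block", expected=EXPECTED_DRIVES):
    """Return how many of the expected drives were not detected."""
    return expected - len(list_sd_disks(sys_block))
```

A result of 0 from `missing_count()` right after boot would match what is seen on 3.0.0-17, while a positive number would indicate the missing drives seen on 3.2 and 3.4-rc4.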

Is there anything I should attach or report about in this configuration?

Also, is it "safe" to use this build for the time being until the issue with 3.2+ is resolved? For example, I had to force the install because I couldn't immediately satisfy a wireless-crda package dependency (and I don't use wireless on this box).

Thanks!

Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since it also occurs with the latest upstream kernel you tested. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report the upstream bug number back in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Incomplete → Triaged
Daniel Smedegaard Buus (danielbuus) wrote :

Thanks for your time and help, I've created a kernel bug report (#43153) here:

https://bugzilla.kernel.org/show_bug.cgi?id=43153

Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
penalvch (penalvch) wrote :

Daniel Smedegaard Buus, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.12

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.
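
The tagging rules above can be summarized in a few lines. This is only an illustration of the convention described in this comment, not a Launchpad API; the function name `tags_after_upstream_test` is hypothetical.

```python
def tags_after_upstream_test(fixed, version):
    """Return the tags to add and remove after testing a mainline kernel.

    fixed:   True if the mainline kernel no longer shows the bug.
    version: the tested kernel version, e.g. "v3.12".
    """
    prefix = "kernel-fixed-upstream" if fixed else "kernel-bug-exists-upstream"
    return {
        "add": [prefix, f"{prefix}-{version}"],
        "remove": ["needs-upstream-testing"],
    }
```

For example, `tags_after_upstream_test(True, "v3.12")` yields the two `kernel-fixed-upstream` tags from the example above and marks `needs-upstream-testing` for removal.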

tags: added: latest-bios-3.10 regression-release
removed: kernel pmp sata sata-sil24
description: updated
Changed in linux (Ubuntu):
status: Triaged → Incomplete
tags: added: kernel-fixed-upstream kernel-fixed-upstream-v3.11
removed: needs-upstream-testing
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch) wrote :

Daniel Smedegaard Buus, could you please test for this in Trusty via http://cdimage.ubuntu.com/daily-live/current/ and advise whether this is reproducible?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete