"ata4: SRST failed (errno=-16)" freezes Ubuntu 12.04.2 LTS Precise Pangolin on resume since vmlinuz-3.2.0-49-generic-pae

Bug #1203446 reported by TEN
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned
Milestone:

Bug Description

Since linux-image-3.2.0-49-generic-pae:i386, my Hewlett-Packard DC7800CMT freezes on most resumes (wake-up attempts) from stand-by with a full-screen text console message of "ata4: SRST failed (errno=-16)", requiring a hard power cycle to reboot.

The machine has Nvidia GT 520 graphics, with snd_hda_codec_hdmi usually blacklisted in /etc/modprobe.d/blacklist-oss.conf due to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1169984 and its dozens of duplicates, which are so far still unfixed for Ubuntu 12.04.

It is remedied neither by a pci=nomsi parameter nor by changes to the SATA wiring.
Reverting to linux-image-3.2.0-48-generic-pae:i386 makes the problem disappear.
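
For anyone wanting to repeat the pci=nomsi test: on a stock Ubuntu install the parameter is normally appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by "sudo update-grub" and a reboot. Below is a minimal sketch of that edit. The helper name add_kernel_param is made up for illustration, and it assumes the variable sits on a single line in double quotes.

```shell
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# in the given file (normally /etc/default/grub, which needs root to modify).
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already present on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}
```

After running it against /etc/default/grub as root and then running update-grub, /proc/cmdline shows on the next boot whether the parameter actually took effect.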

---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu17.3
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: user 2400 F.... pulseaudio
 /dev/snd/controlC1: user 2400 F.... pulseaudio
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf3120000 irq 45'
   Mixer name : 'Analog Devices AD1884'
   Components : 'HDA:11d41884,103c2819,00100100'
   Controls : 31
   Simple ctrls : 19
Card1.Amixer.info:
 Card hw:1 'SAA7134'/'saa7134[0] at 0xf3200000 irq 22'
   Mixer name : 'SAA7134 Mixer'
   Components : ''
   Controls : 6
   Simple ctrls : 3
Card2.Amixer.info:
 Card hw:2 'NVidia'/'HDA NVidia at 0xf3000000 irq 17'
   Mixer name : 'Nvidia ID 1c'
   Components : 'HDA:10de001c,14628097,00100100'
   Controls : 0
   Simple ctrls : 0
Card2.Amixer.values:

DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=b779581c-49c5-4cca-8cac-3d2b5243b6ee
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release i386 (20120423)
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.
MachineType: Hewlett-Packard HP Compaq dc7800p Convertible Minitower
MarkForUpload: True
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-48-generic-pae root=UUID=967f80d1-798e-4c04-bb4c-935f35d7397b ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-48.74-generic-pae 3.2.46
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-48-generic-pae N/A
 linux-backports-modules-3.2.0-48-generic-pae N/A
 linux-firmware 1.79.4
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
StagingDrivers: mei
Tags: precise staging
Uname: Linux 3.2.0-48-generic-pae i686
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 07/18/2007
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 786F1 v01.04
dmi.board.asset.tag: CZC8277VRK
dmi.board.name: 0AACh
dmi.board.vendor: Hewlett-Packard
dmi.chassis.asset.tag: CZC8277VRK
dmi.chassis.type: 6
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr786F1v01.04:bd07/18/2007:svnHewlett-Packard:pnHPCompaqdc7800pConvertibleMinitower:pvr:rvnHewlett-Packard:rn0AACh:rvr:cvnHewlett-Packard:ct6:cvr:
dmi.product.name: HP Compaq dc7800p Convertible Minitower
dmi.sys.vendor: Hewlett-Packard

TEN (launchpad-20-ten)
summary: - "ata4: SRST failt (errno=-16)" freezes Ubuntu 12.04.2 LTS Precise
+ "ata4: SRST failed (errno=-16)" freezes Ubuntu 12.04.2 LTS Precise
Pangolin on resume since vmlinuz-3.2.0-49-generic-pae
description: updated
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1203446

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
TEN (launchpad-20-ten)
description: updated
Revision history for this message
TEN (launchpad-20-ten) wrote : apport information

Attached files: .etc.asound.conf.txt, AcpiTables.txt, AlsaDevices.txt, AplayDevices.txt, ArecordDevices.txt

tags: added: apport-collected staging
description: updated
description: updated
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
TEN (launchpad-20-ten) wrote :

Not sure why apport apparently only attaches ALSA-related files before running into a "504 Gateway Time-out: The server didn't respond in time."

Revision history for this message
penalvch (penalvch) wrote :

TEN, could you please attempt the apport-collect once again? It looks like something choked and all the expected files were not gathered.

tags: added: needs-upstream-testing regression-update
Revision history for this message
TEN (launchpad-20-ten) wrote : apport information

Attached files: AcpiTables.txt, AlsaDevices.txt, AplayDevices.txt, ArecordDevices.txt, BootDmesg.txt, Card0.Amixer.values.txt, Card0.Codecs.codec.0.txt, Card1.Amixer.values.txt, Card2.Codecs.codec.0.txt, CurrentDmesg.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, PulseList.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

description: updated

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
TEN (launchpad-20-ten) wrote :

Also affects kernel 3.2.0-51-generic-pae (distributed to Ubuntu 12.04 LTS today); with this kernel, however, the "last words" (unlike with 3.2.0-49) are no longer consistently "ATA*: SRST failed", but sometimes refer to mei or USB read/64 failures.

"blacklist mei" through /etc/modprobe.d does not prevent the freeze either: I still need to revert to 3.2.0-48 through grub.

Curiously I have seen the "NVRM: Your system is not currently configured to drive a VGA console" warning (cf. http://www.nvnews.net/vbulletin/showthread.php?t=184614, https://devtalk.nvidia.com/default/topic/534599/linux/-304-84-when-switching-from-x-session-to-tty-1-6-display-gets-no-signal) show up in /var/log/syslog with kernel 3.2.0-51-generic-pae.
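
Regarding the mei blacklist attempt mentioned above: it amounts to a one-line file under /etc/modprobe.d. A minimal sketch follows. The function name and the blacklist-<module>.conf file name are illustrative choices only, since modprobe reads any *.conf file in that directory.

```shell
# Hypothetical helper: write a modprobe blacklist entry for one module
# into the given directory (normally /etc/modprobe.d, which needs root).
blacklist_module() {
    dir=$1
    module=$2
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}
```

Note that a blacklist entry only stops alias-based automatic loading; an explicit modprobe or a dependency of another module can still pull the module in, so "lsmod | grep mei" after a reboot is a useful check.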

Revision history for this message
TEN (launchpad-20-ten) wrote :

Problem persists under kernel 3.2.0-52, but so far could not be reproduced with kernel 3.8.0-29 after
sudo apt-get install linux-generic-lts-raring
(cf. https://wiki.ubuntu.com/Kernel/LTSEnablementStack#Proposed_12.04.3_.2B-_13.04_Hardware_Enablement_Stack_Policies_and_Procedures)
which, however, needs further changes in order not to break the virtualbox-dkms build:

Examining /etc/kernel/header_postinst.d.
run-parts: executing /etc/kernel/header_postinst.d/dkms
3.8.0-29-generic /boot/vmlinuz-3.8.0-29-generic
Error! The dkms.conf for this module includes a BUILD_EXCLUSIVE
directive which does not match this kernel/arch. This indicates that
it should not be built.
Error! Bad return status for module build on kernel: 3.8.0-29-generic
(i686)
Consult /var/lib/dkms/virtualbox/4.1.12/build/make.log for more
information.
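
The DKMS error above points at a BUILD_EXCLUSIVE directive. DKMS defines BUILD_EXCLUSIVE_KERNEL and BUILD_EXCLUSIVE_ARCH, each a regular expression that the target kernel or architecture has to match. A small sketch to print whichever of them a dkms.conf sets; the helper name is hypothetical.

```shell
# Hypothetical helper: print every BUILD_EXCLUSIVE_* assignment
# found in the dkms.conf file given as the first argument.
show_build_exclusive() {
    grep '^BUILD_EXCLUSIVE' "$1"
}
```

A likely invocation here would be "show_build_exclusive /var/lib/dkms/virtualbox/4.1.12/source/dkms.conf"; that path is an assumption derived from the module name and version in the log, not something verified on this machine.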

Revision history for this message
penalvch (penalvch) wrote :

TEN, as per http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdHome/?sp4ts.oid=3459243&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3DswEnvOID%253D4059%257CswLang%253D%257Caction%253DlistDriver&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken#BIOS an update is available for your BIOS (1.32 Rev. A). If you update to this, does it change anything?

If not, could you please both specify what happened, and provide the output of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date

Please note your current BIOS is already in the Bug Description, so posting this on the old BIOS would not be helpful.

Thank you for your understanding.

description: updated
tags: added: bios-outdated-bios-outdated-1.32a needs-suspend-log resume suspend
description: updated
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
TEN (launchpad-20-ten) wrote :

Christopher M. Penalver (penalvch) wrote on 2013-09-05:
> BIOS (1.32 Rev. A). If you update to this, does it change anything?

As this "test" can never be undone (cf. attachment), it might not have been the best option: ;-(

Now kernels 3.2.0-49+ crash beeping frantically on resume without even re-activating the monitor (LED stays amber).

Curiously, kernel 3.2.0-49 itself specifically no longer uses the entire screen area but shows wide black borders (with the Nvidia driver) on the LXDE desktop in spite of FullHD 1920*1080 resolution.

Kernel 3.2.0-48 (which used to wake up before the BIOS update) now crashes to a synchronized but black display with the fans spinning up ever faster on resume.

Fortunately kernel 3.8.0-30 (unlike 3.2.0-48 now) still survives a suspend/resume cycle even with the new BIOS 1.32, but now pops up the crash reporter for a System Program Problem twice on first boot, referring to an issue with vbetool (interesting as a mode-switching problem might indeed be one to freeze the machine), but closing without an opportunity to actually comment on the issue/report in the browser.

The VirtualBox bug for kernels 3.8 per #30 (standing in the way of a wholesale switch to linux-generic-lts-raring so far) can be fixed according to https://bugs.launchpad.net/debian/+source/virtualbox/+bug/1076603 but still needs to be integrated into the default repositories.

Revision history for this message
TEN (launchpad-20-ten) wrote :

s/are/area on #32 (sorry for the typo; no edit allowed by Launchpad?)

For the record:
$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
786F1 v01.32
07/21/2011

Revision history for this message
penalvch (penalvch) wrote :

TEN, could you please confirm this issue exists with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ . If the issue remains, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Please do not test the daily folder, but the one all the way at the bottom. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.11

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

tags: added: latest-bios-1.32a
removed: bios-outdated-bios-outdated-1.32a
Revision history for this message
TEN (launchpad-20-ten) wrote :

With no known method of USB-booting in a way that includes the proprietary Nvidia driver and hibernation, as a mere mortal user with one single machine in this place I do not yet see a feasible approach to more elaborately test this (beyond what you call "Incomplete"), short of a re-install to something higher than the LTS that I need.

Revision history for this message
penalvch (penalvch) wrote :

TEN, could you please comment on whether the Saucy live environment (which would have nouveau) boots up and suspends successfully?

The mainline kernel test could be done in Precise. Just be advised that it would be testing nouveau (which is the point), not nvidia.

Revision history for this message
TEN (launchpad-20-ten) wrote :

http://fridge.ubuntu.com/2013/09/27/ubuntu-13-10-saucy-salamander-final-beta-released/ booted from USB does not resume back to the GUI from suspend,
but a Ctrl-Alt-F1 console is still usable,
showing a syslog full of nouveau PFIFO warnings (mostly INTR)
and (unlogged) errors to the console, some referring to SUBFIFO 0: ch0 [DRM].
No shortage of mei_me entries either, and a stack dump which I will try to extract for you.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
TEN (launchpad-20-ten) wrote :

http://fridge.ubuntu.com/2013/09/27/ubuntu-13-10-saucy-salamander-final-beta-released/ booted from USB after
apt-get install nvidia-319
resumes from suspend only to a black screen without access to a Ctrl-Alt-F1 etc. text console or even working keyboard LEDs.

Interestingly, it detects what the earlier kernels listed above seem to have had trouble handling on resume:

Sep 29 11:21:52 ubuntu kernel: [ 962.316025] ata4: link is slow to respond, please be patient (ready=0)
Sep 29 11:21:52 ubuntu kernel: [ 964.780056] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep 29 11:21:52 ubuntu kernel: [ 964.804299] ata4.00: configured for UDMA/133

All other SATA links are 1.5 Gbps only - could there be a timing issue at this higher speed?
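
One way to compare the negotiated speed of every port is to pull the "SATA link up" lines out of the kernel log. A rough sketch, assuming the message format shown above; the function name is made up.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<port> <Gbps>" once for each distinct port/speed pair.
sata_link_speeds() {
    sed -n 's/.*\(ata[0-9][0-9]*\): SATA link up \([0-9.]*\) Gbps.*/\1 \2/p' | sort -u
}
```

Typical use would be "dmesg | sata_link_speeds". If the higher speed is suspected, the libata.force kernel parameter (for example libata.force=4:1.5Gbps to cap ata4) could be used for a test boot; whether that changes anything for this bug is untested.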

Under nouveau (as per #36), /var/log/syslog was getting flooded with these (7674 rows per SECOND - no wonder this has a performance impact):
Sep 29 11:21:52 ubuntu kernel: [ 981.447176] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
Sep 29 11:21:52 ubuntu kernel: [ 981.447296] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
Sep 29 11:21:52 ubuntu kernel: [ 981.447427] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
...or at higher resolution as per dmesg:
[ 1218.967045] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967174] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967300] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967431] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967561] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967694] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967827] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
[ 1218.967957] nouveau W[ PFIFO][0000:01:00.0] INTR 0x01000000: 0x00000005
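
The rows-per-second figure can be reproduced by grouping the PFIFO warnings by their syslog timestamp. A sketch, assuming the default syslog line layout shown above (month, day and time as the first three fields); the function name is hypothetical.

```shell
# Hypothetical helper: read syslog text on stdin and print
# "<count> <timestamp>" per second, busiest second first.
pfifo_rate() {
    awk '/nouveau .*PFIFO/ { n[$1 " " $2 " " $3]++ }
         END { for (t in n) print n[t], t }' | sort -rn
}
```

For example, "pfifo_rate < /var/log/syslog | head" lists the seconds with the most warnings.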

Using nvidia-319 (rather than nouveau), there seems to be "room" for flooding the syslog with mei_me errors instead now (alternating unexpected resets, disconnects/reconnects) as in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1196155 - which had previously made 13.04 unbootable (from USB at least).

Revision history for this message
penalvch (penalvch) wrote :

TEN, could you please test the latest mainline kernel via http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-rc3-saucy/ and advise on the results?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired