ATA errors when link_power_management_policy is min_power

Reported by VS on 2012-05-02
This bug affects 1 person
Affects: Linux | Status: Confirmed | Importance: High
Affects: linux (Ubuntu) | Importance: Medium | Assigned to: Unassigned

Bug Description

Disk I/O is dead-slow and dmesg is filled with error messages such as the example below when /sys/class/scsi_host/host*/link_power_management_policy is set to min_power.

WORKAROUND: Change it back to max_performance.

~$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA Model: HITACHI HTS54161 Rev: SBDI
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: HL-DT-ST Model: DVDRAM GSA-U10N Rev: 1.05
  Type: CD-ROM ANSI SCSI revision: 05

dmesg errors:
[ 607.081784] ata1.00: exception Emask 0x10 SAct 0xfe SErr 0x48c0002 action 0xe frozen
[ 607.081796] ata1.00: irq_stat 0x08000040, interface fatal error, connection status changed
[ 607.081807] ata1: SError: { RecovComm CommWake 10B8B LinkSeq DevExch }
[ 607.081816] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081833] ata1.00: cmd 61/10:08:df:06:a5/00:00:08:00:00/40 tag 1 ncq 8192 out
[ 607.081837] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.081845] ata1.00: status: { DRDY }
[ 607.081852] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081867] ata1.00: cmd 61/10:10:2f:07:a5/00:00:08:00:00/40 tag 2 ncq 8192 out
[ 607.081871] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.081879] ata1.00: status: { DRDY }
[ 607.081886] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081901] ata1.00: cmd 61/08:18:6f:07:a5/00:00:08:00:00/40 tag 3 ncq 4096 out
[ 607.081904] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.081912] ata1.00: status: { DRDY }
[ 607.081919] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081933] ata1.00: cmd 61/08:20:07:22:a6/00:00:08:00:00/40 tag 4 ncq 4096 out
[ 607.081937] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.081945] ata1.00: status: { DRDY }
[ 607.081952] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081966] ata1.00: cmd 61/08:28:df:06:a9/00:00:08:00:00/40 tag 5 ncq 4096 out
[ 607.081970] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.081978] ata1.00: status: { DRDY }
[ 607.081984] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.081999] ata1.00: cmd 61/08:30:df:06:d1/00:00:00:00:00/40 tag 6 ncq 4096 out
[ 607.082002] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.082010] ata1.00: status: { DRDY }
[ 607.082017] ata1.00: failed command: WRITE FPDMA QUEUED
[ 607.082032] ata1.00: cmd 61/08:38:ff:06:d1/00:00:00:00:00/40 tag 7 ncq 4096 out
[ 607.082035] res 50/00:08:ff:06:d1/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
[ 607.082043] ata1.00: status: { DRDY }
[ 607.082055] ata1: hard resetting link
[ 608.084073] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 608.085166] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 608.085171] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 608.085382] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 608.085387] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 608.087770] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 608.087777] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 608.087965] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 608.087973] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 608.089106] ata1.00: configured for UDMA/100
[ 608.089330] ata1: EH complete
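
For anyone reading these logs: the SErr value on the exception line is a bitmask, and the names inside SError: { ... } are its set bits. Below is a minimal sketch in POSIX shell that decodes such a value, assuming the SATA SError bit layout that libata uses for these names. decode_serr is a hypothetical helper name, not an existing tool.

```shell
# Decode a SATA SError register value (hex or decimal) into bit names.
# The body runs in a subshell "( ... )" so its variables stay local.
decode_serr() (
    value=$(( $1 ))
    out=""
    # Each entry is <bit position>:<name as printed in the kernel log>.
    for entry in 0:RecovData 1:RecovComm 8:UnrecovData 9:Persist \
                 10:Proto 11:HostInt 16:PHYRdyChg 17:PHYInt \
                 18:CommWake 19:10B8B 20:Dispar 21:BadCRC \
                 22:Handshk 23:LinkSeq 24:TrStaTrns 25:UnrecFIS \
                 26:DevExch; do
        bit=${entry%%:*}
        name=${entry#*:}
        # Append the name when this bit is set in the value.
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
)
```

For example, `decode_serr 0x48c0002` prints `RecovComm CommWake 10B8B LinkSeq DevExch`, which matches the ata1 SError line above.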

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-24-generic-pae 3.2.0-24.37
ProcVersionSignature: Ubuntu 3.2.0-24.37-generic-pae 3.2.14
Uname: Linux 3.2.0-24-generic-pae i686
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu7
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: vegar 2270 F.... pulseaudio
                      vegar 2382 F.... xfce4-volumed
 /dev/snd/controlC29: vegar 2382 F.... xfce4-volumed
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfe220000 irq 49'
   Mixer name : 'Analog Devices AD1984'
   Components : 'HDA:11d41984,17aa20d7,00100400'
   Controls : 32
   Simple ctrls : 20
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 7KHT24WW-1.08'
   Mixer name : 'ThinkPad EC 7KHT24WW-1.08'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Wed May 2 21:50:07 2012
HibernationDevice: RESUME=UUID=d4c0af33-0f9f-42e8-99ff-ad8b66f28723
IwConfig: Error: [Errno 2] No such file or directory
MachineType: LENOVO 766418G
ProcEnviron:
 LANGUAGE=nb_NO:nb:no_NO:no:nn_NO:nn:en_GB:en
 TERM=xterm
 PATH=(custom, user)
 LANG=nb_NO.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic-pae root=UUID=0048d440-8ee4-4871-bc67-0ae31d4e3f70 ro quiet splash pcie_aspm=force vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic-pae N/A
 linux-backports-modules-3.2.0-24-generic-pae N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UdevDb: Error: [Errno 2] No such file or directory
UpgradeStatus: Upgraded to precise on 2012-05-02 (0 days ago)
WpaSupplicantLog:

dmi.bios.date: 04/08/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETC7WW (2.27 )
dmi.board.name: 766418G
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETC7WW(2.27):bd04/08/2010:svnLENOVO:pn766418G:pvrThinkPadT61:rvnLENOVO:rn766418G:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 766418G
dmi.product.version: ThinkPad T61
dmi.sys.vendor: LENOVO

VS (storvann) wrote :
Brad Figg (brad-figg) on 2012-05-02
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Do you see different results if you change /sys/class/scsi_host/host*/link_power_management_policy ?

Also, did this issue start happening after an update/upgrade? Was there a kernel version where you were not having this particular problem? This will help determine if the problem you are seeing is the result of the introduction of a regression, and when this regression was introduced.

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
VS (storvann) wrote :

Sorry. Yes, setting /sys/class/scsi_host/host*/link_power_management_policy to min_power causes the mentioned errors to appear whenever there is significant disk I/O, such as installing or uninstalling packages with apt.

Changing it back to max_performance seems to make the problem go away.

The problem started after upgrading to 12.04. I had not seen it since 2010, when a similar issue caused problems (see e.g. #539467).

Note that I am changing the link power management policy manually with a custom bash script, but I assume the same errors will occur when the power management scripts change the policy. I believe this happens when the AC adapter is unplugged from the laptop.
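
A script of the kind described above could look roughly like this. It is a minimal sketch, not the reporter's actual script: set_lpm_policy is a hypothetical function name, and the optional second argument (the sysfs root) exists only so the function can be tried against a scratch directory instead of the real /sys/class/scsi_host.

```shell
# Write POLICY to link_power_management_policy on every host under ROOT.
# Usage: set_lpm_policy POLICY [ROOT]   (ROOT defaults to /sys/class/scsi_host)
# The body runs in a subshell "( ... )", so "exit" only sets the function's status.
set_lpm_policy() (
    policy=$1
    root=${2:-/sys/class/scsi_host}
    status=0
    for f in "$root"/host*/link_power_management_policy; do
        # Skip when the glob matched nothing or a host lacks the attribute.
        [ -e "$f" ] || continue
        if ! printf '%s\n' "$policy" > "$f"; then
            echo "could not write $f (are you root?)" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Run as root, `set_lpm_policy min_power` would trigger the problem described in this report, and `set_lpm_policy max_performance` would apply the workaround.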

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-rc5-precise/

tags: added: needs-upstream-testing
VS (storvann) on 2012-05-03
tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
VS (storvann) wrote :

Tested linux-image-3.4.0-030400rc5-generic-pae_3.4.0-030400rc5.201205011817_i386.deb

The same error (below) occurs after changing the S-ATA link power management policy to min_power. This time it was triggered by running apt-get update.

[ 315.000173] ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x40000 action 0x6 frozen
[ 315.000179] ata2: SError: { CommWake }
[ 315.000184] ata2.00: failed command: WRITE FPDMA QUEUED
[ 315.000191] ata2.00: cmd 61/08:00:37:48:0d/00:00:02:00:00/40 tag 0 ncq 4096 out
[ 315.000193] res 50/00:70:0f:63:d1/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 315.000196] ata2.00: status: { DRDY }
[ 315.000202] ata2: hard resetting link
[ 315.488099] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 315.489211] ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 315.489216] ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 315.489439] ata2.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 315.489444] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 315.491856] ata2.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 315.491864] ata2.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 315.492070] ata2.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 315.492078] ata2.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 315.493247] ata2.00: configured for UDMA/100
[ 315.493476] ata2: EH complete

Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since it also occurs with the latest upstream kernel you tested. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report the upstream bug number back in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
VS (storvann) wrote :

Upstream bug opened with id 43200:
https://bugzilla.kernel.org/show_bug.cgi?id=43200

Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
tags: added: bios-outdated-2.30
Changed in linux (Ubuntu):
status: Triaged → Incomplete
VS (storvann) wrote :

Upgraded to the latest BIOS, bug still present.

Here's what I did:
1. echo min_power |sudo tee /sys/class/scsi_host/host*/link_power_management_policy
2. Provoked disk activity by starting Firefox. Everything hangs for a while, messages in dmesg pasted below.
3. Returned to normal operation with: echo max_performance |sudo tee /sys/class/scsi_host/host*/link_power_management_policy

Requested output:
$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
7LETD0WW (2.30 )
02/27/2012

dmesg output:
[ 340.064083] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x40000 action 0x6 frozen
[ 340.064088] ata3: SError: { CommWake }
[ 340.064092] ata3.00: failed command: READ FPDMA QUEUED
[ 340.064099] ata3.00: cmd 60/18:00:df:85:9f/00:00:04:00:00/40 tag 0 ncq 12288 in
[ 340.064099] res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
[ 340.064102] ata3.00: status: { DRDY }
[ 340.064108] ata3: hard resetting link
[ 340.552105] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 340.553225] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 340.553230] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 340.553449] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 340.553454] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 340.555876] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 340.555882] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 340.556077] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 340.556083] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 340.557176] ata3.00: configured for UDMA/100
[ 340.557381] ata3.00: device reported invalid CHS sector 0
[ 340.557395] ata3: EH complete

VS, could you please test the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.12

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

tags: added: latest-bios-2.30
removed: bios-outdated-2.30
tags: added: regression-release
description: updated
VS (storvann) wrote :

I tested using the kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-trusty/

I'm still getting the same errors in dmesg, triggered by running apt-get upgrade:

[ 329.070415] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x40000 action 0x6 frozen
[ 329.070423] ata3: SError: { CommWake }
[ 329.070427] ata3.00: failed command: WRITE FPDMA QUEUED
[ 329.070434] ata3.00: cmd 61/08:00:f7:9b:32/00:00:0a:00:00/40 tag 0 ncq 4096 out
[ 329.070434] res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
[ 329.070438] ata3.00: status: { DRDY }
[ 329.070443] ata3: hard resetting link
[ 329.612086] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 329.613207] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 329.613212] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 329.613432] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 329.613437] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 329.615920] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 329.615927] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 329.616115] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 329.616122] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 329.617303] ata3.00: configured for UDMA/100
[ 329.617542] ata3.00: device reported invalid CHS sector 0
[ 329.617551] ata3: EH complete
[ 428.128137] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x50000 action 0x6 frozen
[ 428.128149] ata3: SError: { PHYRdyChg CommWake }
[ 428.128152] ata3.00: failed command: FLUSH CACHE
[ 428.128158] ata3.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[ 428.128158] res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
[ 428.128161] ata3.00: status: { DRDY }
[ 428.128166] ata3: hard resetting link
[ 428.448135] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 428.449256] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 428.449261] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 428.449482] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 428.449486] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 428.451921] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 428.451927] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 428.452129] ata3.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded
[ 428.452135] ata3.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[ 428.453325] ata3.00: configured for UDMA/100
[ 428.453329] ata3.00: retrying FLUSH 0xe7 Emask 0x4
[ 428.453613] ata3.00: device reported invalid CHS sector 0
[ 428.453621] ata3: EH complete
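
To compare runs (for example min_power against max_performance, or one kernel against another), it may help to count the link resets per port instead of reading the whole log. A minimal sketch in POSIX shell, assuming the "ataN: hard resetting link" message format shown in the logs above; count_link_resets is a hypothetical helper name.

```shell
# Read kernel log text on stdin and print "<port> <count>" for every
# port that logged at least one "hard resetting link" message.
count_link_resets() {
    sed -n 's/.*\(ata[0-9][0-9]*\): hard resetting link.*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

Typical use would be `dmesg | count_link_resets`. Fed the log above, it would print `ata3 2`.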

tags: added: kernel-bug-exists-upstream-v3.12
Changed in linux (Ubuntu):
status: Incomplete → Confirmed

VS, the next step is a full commit bisect between the last release before Precise in which this problem did not occur and Precise, in order to identify the offending commit.

Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection ?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete