Bug #285892 “ata1.00: exception Emask 0x0 SAct 0x807f SErr 0x0 a...”

Revision history for this message

Marcel (marcel-vd-berg) wrote on 2008-10-19:

#1

system.log (39.6 KiB, text/plain)

Marcos (deflagmator) wrote on 2008-10-25:

#2

This seems to be similar to my problem.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/279693

kernel-janitor (kernel-janitor) wrote on 2009-08-25:

#3

Hi Marcel,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/karmic .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 285892

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags:	added: needs-kernel-logs
tags:	added: needs-upstream-testing
tags:	added: kj-triage
Changed in linux (Ubuntu):
status:	New → Incomplete

Mikael Bergqvist (mikaelb) wrote on 2009-10-05:

#4

I just experienced this after an upgrade from Jaunty to Karmic Beta with kernel: 2.6.31-11-generic #38-Ubuntu SMP Fri Oct 2 11:55:55 UTC 2009 i686 GNU/Linux

[ 221.816249] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 221.816279] ata1.00: cmd c8/00:08:87:95:81/00:00:00:00:00/e4 tag 0 dma 4096 in
[ 221.816285] res 40/00:fe:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 221.816296] ata1.00: status: { DRDY }
[ 226.856074] ata1: link is slow to respond, please be patient (ready=0)
[ 231.840063] ata1: device not ready (errno=-16), forcing hardreset
[ 231.840080] ata1: soft resetting link
[ 232.022185] ata1.00: configured for UDMA/100
[ 232.022199] ata1.00: device reported invalid CHS sector 0
[ 232.022218] ata1: EH complete
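
For anyone trying to read the `cmd` line above, here is a minimal Python sketch of how its fields can be decoded. It assumes the field order that libata appears to print (command/feature:count:LBA low:LBA mid:LBA high, then five HOB bytes, then the device register) and a 28-bit LBA. The function name `decode_cmd` and the small opcode table are illustrative only and cover just the opcodes that show up in this report.

```python
import re

# Opcode names from the ATA command set; only the ones seen in this thread.
ATA_OPCODES = {
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0xA0: "PACKET",
    0x61: "WRITE FPDMA QUEUED",
}

# Assumed layout: cmd CC/FF:NN:LL:MM:HH/<five HOB bytes>/DD
CMD_RE = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/(?P<feat>[0-9a-f]{2}):(?P<nsect>[0-9a-f]{2}):"
    r"(?P<lbal>[0-9a-f]{2}):(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
    r"(?:[0-9a-f]{2}:){4}[0-9a-f]{2}/(?P<dev>[0-9a-f]{2})"
)


def decode_cmd(line):
    """Decode the taskfile on a libata 'cmd' log line, or return None."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    f = {name: int(value, 16) for name, value in m.groupdict().items()}
    # Assumption: LBA28 addressing, with the top four LBA bits held in the
    # low nibble of the device register. Not valid for PACKET or NCQ commands.
    lba = ((f["dev"] & 0x0F) << 24) | (f["lbah"] << 16) | (f["lbam"] << 8) | f["lbal"]
    return {
        "opcode": ATA_OPCODES.get(f["cmd"], hex(f["cmd"])),
        "sectors": f["nsect"],
        "bytes": f["nsect"] * 512,  # assumes 512-byte sectors
        "lba28": lba,
    }


line = "[ 221.816279] ata1.00: cmd c8/00:08:87:95:81/00:00:00:00:00/e4 tag 0 dma 4096 in"
print(decode_cmd(line))
```

For the line in this comment the sketch reports READ DMA for 8 sectors, i.e. 4096 bytes, which agrees with the `dma 4096 in` at the end of the same log line.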

nahtgesicht (nahtgesicht) wrote on 2009-10-16:

#5

I also have this issue with Jaunty (2.6.28-15-generic #52-Ubuntu SMP Wed Sep 9 10:49:34 UTC 2009 i686 GNU/Linux) on an IBM Thinkpad X31 with a 160GB Samsung disk:

from dmesg startup:

[ 4.087964] ata_piix 0000:00:1f.1: PCI INT A -> Link[LNKC] -> GSI 11 (level, low) -> IRQ 11
[ 4.088073] ata_piix 0000:00:1f.1: setting latency timer to 64
[ 4.088255] scsi0 : ata_piix
[ 4.088697] scsi1 : ata_piix
[ 4.091191] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x1860 irq 14
[ 4.091200] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x1868 irq 15
[ 4.254198] ata1.00: ATA-8: SAMSUNG HM160HC, LQ100-10, max UDMA/100
[ 4.254207] ata1.00: 312581808 sectors, multi 16: LBA48
[ 4.270188] ata1.00: configured for UDMA/100
[ 4.424334] scsi 0:0:0:0: Direct-Access ATA SAMSUNG HM160HC LQ10 PQ: 0 ANSI: 5
[ 4.424600] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors: (160 GB/149 GiB)
[ 4.424647] sd 0:0:0:0: [sda] Write Protect is off
[ 4.424655] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 4.424727] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.424887] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors: (160 GB/149 GiB)
[ 4.424929] sd 0:0:0:0: [sda] Write Protect is off
[ 4.424936] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 4.425006] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 4.425017] sda: sda1 sda2 sda3 < sda5 sda6 >
[ 4.502612] sd 0:0:0:0: [sda] Attached SCSI disk

and now the problem:

[ 54.816078] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 54.816090] ata1.00: cmd c8/00:38:cf:88:16/00:00:00:00:00/e1 tag 0 dma 28672 in
[ 54.816092] res 40/00:80:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 54.816096] ata1.00: status: { DRDY }
[ 59.856041] ata1: link is slow to respond, please be patient (ready=0)
[ 64.840127] ata1: device not ready (errno=-16), forcing hardreset
[ 64.840137] ata1: soft resetting link
[ 65.022284] ata1.00: configured for UDMA/100
[ 65.022301] ata1: EH complete
[ 65.031178] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors: (160 GB/149 GiB)
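
To compare reports like the one above across machines, the `exception` header line can be split into its fields. The following is a rough Python sketch, not anything taken from the kernel sources. The name `parse_exception` is made up, and treating `SAct` as a bitmask of outstanding NCQ tags is my assumption about what that register means.

```python
import re

EXC_RE = re.compile(
    r"(?P<dev>ata\d+(?:\.\d+)?): exception Emask (?P<emask>0x[0-9a-f]+) "
    r"SAct (?P<sact>0x[0-9a-f]+) SErr (?P<serr>0x[0-9a-f]+) "
    r"action (?P<action>0x[0-9a-f]+)(?P<frozen> frozen)?"
)


def parse_exception(line):
    """Return the fields of a libata exception header line, or None."""
    m = EXC_RE.search(line)
    if m is None:
        return None
    sact = int(m.group("sact"), 16)
    return {
        "device": m.group("dev"),
        "emask": int(m.group("emask"), 16),
        "sact": sact,
        # Assumption: each set bit in SAct is one queued (NCQ) command.
        "active_tags": bin(sact).count("1"),
        "serr": int(m.group("serr"), 16),
        "action": int(m.group("action"), 16),
        "frozen": m.group("frozen") is not None,
    }


line = "[ 54.816078] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen"
print(parse_exception(line))
```

Most logs posted in this report show the same header values (Emask 0x0, SAct 0x0, SErr 0x0, action 0x6, frozen), so a parser like this mainly helps to spot the reports that differ.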

gab0r (gab0r) wrote on 2009-11-02:

#6

I have the same problem. I've just upgraded to Karmic (Linux asus-lapi 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009 i686 GNU/Linux) on my Asus M6VA laptop, also with a 160GB Samsung HDD. I see this problem only when I put the laptop into standby or resume from it, but not every time.
---
[48113.000528] ata1: drained 151 bytes to clear DRQ.
[48113.000546] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[48113.000570] ata1.00: cmd c8/00:20:a0:a9:33/00:00:00:00:00/e3 tag 0 dma 16384 in
[48113.000574] res 40/00:fe:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[48113.000582] ata1.00: status: { DRDY }
[48113.056080] ata1: soft resetting link
[48113.246158] ata1.00: configured for UDMA/100
[48113.260560] ata1.01: configured for UDMA/33
[48113.260903] ata1.00: device reported invalid CHS sector 0
[48113.260919] ata1: EH complete

Henning Mersch (ubuntu-hmersch) wrote on 2009-11-15:

#7

Same here on a Samsung N140, running latest Karmic kernel 2.6.31-14-generic

Anton (avelo) wrote on 2009-11-15:

#8

Same on a MacBook2,1 running a recently installed Karmic x86_64 on ext4, kernel 2.6.31-15-generic.

xamul (luigi-zanderighi) wrote on 2009-11-16:

#9

Thanks gab0r,
I didn't notice that the issue happens after standby. I use standby very often and hadn't connected the two.
Now I always shut down and the freeze doesn't happen any more. The system is usable now, but without standby :(((((

Graham (graham-g-lambert) wrote on 2009-12-18:

#10

I have had the same problem on Ubuntu Karmic and SUSE 11.1. Both installations went well (apart from the fact that GRUB overwrites the disk area used by the RAID on my system in both cases; this is solved by removing stage1_5 from the GRUB installation directory, i.e. renaming, moving or deleting the file).

After a successful install both systems started without error and fairly fast. However, after downloading the 'recent updates' (could be irrelevant - see later) that are applied after installation, the system(s) started with the above error "device reported invalid CHS sector 0". Initially this is attempted at UDMA/100 and then the bus is gradually degraded through UDMA/66 and UDMA/33, until finally the disk connection is run at the slowest speed. This takes just under 10 minutes to complete on my system, and I guess would explain the slow startup behaviour experienced by users of other systems as described above. After this the system runs very raggedly - not as smooth as I am used to with various Linux installations. I assume that the bus connection is kept at the lowest speed and the swap partition does not allow fast paging.

One might think that this is a hardware fault, but so many people are reporting the same error, here and on other forums, that something tells me this is a software fault. As it happens on more than one Linux release, it is not system specific, but is likely to be linked to the GRUB bootloader itself.

The 'standby' issue raised above, rather than the updates, might give a clue. From what I can tell, GRUB attempts to 'resume' the system from the data stored on the swap partition when the system shuts down. If the swap partition cannot be read as expected during startup, then I expect that we would see an error. Removing or commenting out the option 'resume=/dev/swap' from the GRUB configuration file /boot/grub/menu.lst should solve this.

I am not in a position to try this immediately but would be interested in any comments. I intend to check this myself in a couple of days.
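
To check whether a given boot actually carries the 'resume=' option discussed above, a small Python sketch follows. It assumes the running kernel's command line can be read from /proc/cmdline. The function names and the example string are made up for illustration and are not taken from anyone's system in this report.

```python
def find_resume_param(cmdline):
    """Return the value of resume= on a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("resume="):
            return token[len("resume="):]
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical command line, only to show the two possible outcomes.
example = "root=/dev/sda1 ro quiet splash resume=/dev/swap"
print(find_resume_param(example))            # value of the resume= option
print(find_resume_param("root=/dev/sda1 ro"))  # None when the option is absent
```

On a live system, `find_resume_param(read_cmdline())` would show whether editing menu.lst had any effect after the next reboot.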

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: apport-collect data

#11

Architecture: i386
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC0: gio 1944 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xfebfc000 irq 30'
   Mixer name : 'Analog Devices AD1986A'
   Components : 'HDA:11d41986,10431153,00100500 HDA:10573055,104310c6,00100700'
   Controls : 22
   Simple ctrls : 13
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=abf538c5-4738-48db-a72f-11d6e25cbe88
MachineType: ASUSTeK Computer Inc. A8J
NonfreeKernelModules: nvidia
Package: linux (not installed)
ProcCmdLine: root=UUID=80ae286d-b392-4ba0-b57b-6ded4fd0d5e8 ro quiet splash irqpoll
ProcEnviron:
SHELL=/bin/bash
PATH=(custom, no user)
LANG=en_US.UTF-8
ProcVersionSignature: Ubuntu 2.6.31-17.54-generic
RelatedPackageVersions: linux-firmware 1.26
RfKill:
0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Uname: Linux 2.6.31-17-generic i686
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
WpaSupplicantLog:

dmi.bios.date: 03/27/2006
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: A8JAS.207
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: A8J
dmi.board.vendor: ASUSTeK Computer Inc.
dmi.board.version: 1.0
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK Computer Inc.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrA8JAS.207:bd03/27/2006:svnASUSTeKComputerInc.:pnA8J:pvr1.0:rvnASUSTeKComputerInc.:rnA8J:rvr1.0:cvnASUSTeKComputerInc.:ct10:cvr:
dmi.product.name: A8J
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK Computer Inc.

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: AlsaDevices.txt

#12

AlsaDevices.txt (644 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: AplayDevices.txt

#13

AplayDevices.txt (385 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: ArecordDevices.txt

#14

ArecordDevices.txt (268 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: BootDmesg.txt

#15

BootDmesg.txt (45.5 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: Card0.Amixer.values.txt

#16

Card0.Amixer.values.txt (2.5 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: Card0.Codecs.codec.0.txt

#17

Card0.Codecs.codec.0.txt (8.3 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: Card0.Codecs.codec.1.txt

#18

Card0.Codecs.codec.1.txt (145 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: CurrentDmesg.txt

#19

CurrentDmesg.txt (10.6 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: IwConfig.txt

#20

IwConfig.txt (617 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: Lspci.txt

#21

Lspci.txt (12.5 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: Lsusb.txt

#22

Lsusb.txt (407 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: PciMultimedia.txt

#23

PciMultimedia.txt (616 bytes, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: ProcCpuinfo.txt

#24

ProcCpuinfo.txt (1.2 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: ProcInterrupts.txt

#25

ProcInterrupts.txt (1.6 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: ProcModules.txt

#26

ProcModules.txt (2.9 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: UdevDb.txt

#27

UdevDb.txt (92.2 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: UdevLog.txt

#28

UdevLog.txt (201.1 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: WifiSyslog.txt

#29

WifiSyslog.txt (1003.0 KiB, text/plain)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21: XsessionErrors.txt

#30

XsessionErrors.txt (903 bytes, text/plain)

Changed in linux (Ubuntu):
status:	Incomplete → New
tags:	added: apport-collected

giorgio_fornara (giorgio-fornara) wrote on 2009-12-21:

#31

I don't have this option in my GRUB configuration:
...
Removing or commenting out the option 'resume=/dev/swap' from the grub installation file in /boot/grub/menu.lst should solve this.
...

Anyway, the bug is still there; see the reports above and the syslog below.

Dec 21 19:48:57 my2912071352 kernel: [ 3102.000243] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 21 19:48:57 my2912071352 kernel: [ 3102.000266] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
Dec 21 19:48:57 my2912071352 kernel: [ 3102.000268] cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 21 19:48:57 my2912071352 kernel: [ 3102.000271] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Dec 21 19:48:57 my2912071352 kernel: [ 3102.000279] ata1.01: status: { DRDY }
Dec 21 19:49:02 my2912071352 kernel: [ 3107.040123] ata1: link is slow to respond, please be patient (ready=0)
Dec 21 19:49:07 my2912071352 kernel: [ 3112.024124] ata1: device not ready (errno=-16), forcing hardreset
Dec 21 19:49:07 my2912071352 kernel: [ 3112.024139] ata1: soft resetting link
Dec 21 19:49:08 my2912071352 kernel: [ 3112.228653] ata1.00: configured for UDMA/100
Dec 21 19:49:08 my2912071352 kernel: [ 3112.268527] ata1.01: configured for PIO0
Dec 21 19:49:08 my2912071352 kernel: [ 3112.276533] ata1: EH complete

Marcel (marcel-vd-berg) wrote on 2009-12-21:

#32

In reply to kernel-janitor on 2009-08-25

I did not notice your request before, but this bug is not an issue for me anymore.
On 2008-11-03, I reinstalled 8.10 without AHCI enabled in the BIOS and I did not encounter any freezes.

At this moment I'm running Karmic with 2.6.31-16-generic, still with AHCI disabled in the BIOS.
(The only possibly related problem since Karmic is the inability to auto-mount all partitions after a hard reset.)

giorgio_fornara (giorgio-fornara) wrote on 2009-12-23:

#33

As supplementary information: the Asus A8J BIOS doesn't have an AHCI option.
Disabling AHCI could be a solution for some people, but it is not yet a fix for the bug.
Has anyone found a different solution?

javi (javuchi) wrote on 2009-12-25:

#34

Same problem on a Samsung N140.
Windows and other Linux distributions do not suffer from this issue.

javi (javuchi) wrote on 2009-12-25:

#35

# tail /var/log/kern.log
Dec 25 16:55:18 baddha-laptop kernel: [ 250.816351] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 25 16:55:18 baddha-laptop kernel: [ 250.816395] ata1.00: cmd ca/00:08:da:4c:e6/00:00:00:00:00/ea tag 0 dma 4096 out
Dec 25 16:55:18 baddha-laptop kernel: [ 250.816402] res 40/00:0c:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Dec 25 16:55:18 baddha-laptop kernel: [ 250.816417] ata1.00: status: { DRDY }
Dec 25 16:55:23 baddha-laptop kernel: [ 255.856306] ata1: link is slow to respond, please be patient (ready=0)
Dec 25 16:55:28 baddha-laptop kernel: [ 260.840305] ata1: device not ready (errno=-16), forcing hardreset
Dec 25 16:55:28 baddha-laptop kernel: [ 260.840330] ata1: soft resetting link
Dec 25 16:55:28 baddha-laptop kernel: [ 261.020666] ata1.00: configured for UDMA/100
Dec 25 16:55:28 baddha-laptop kernel: [ 261.020687] ata1.00: device reported invalid CHS sector 0
Dec 25 16:55:28 baddha-laptop kernel: [ 261.020720] ata1: EH complete

giorgio_fornara (giorgio-fornara) wrote on 2009-12-25:

#36

Please have a look at bugs #297058, #397096 and #279693, which link together different hypotheses about hdparm, linux-rt, AHCI, hardware failures and so on, without a real way out of the bug.
I am currently running Karmic -rt on an Asus A8J and the same issue remains.
Seriously, after two years of Ubuntu, with this issue plaguing this laptop in different ways since the beginning, I'm thinking of going back to Windows.
Like many users I use this PC for many different home and personal tasks, and bugs like this are unacceptable, given how much time and how many releases have passed without the issue ever being definitively resolved!
PS: I'm sure the hardware of this machine is in very good condition.

javi (javuchi) wrote on 2009-12-27:

#37

Attention: a non-trivial workaround has been found for some of us (especially users with Samsung hard disks or BIOSes); see this link:

http://wiki.archlinux.org/index.php/Samsung_N140

Please, kernel developers of Ubuntu, insert these workarounds in the next kernel version:

Here I copy the relevant parts:

---
Possible BIOS problem causes a SATA hardreset shortly after boot. This is unresolved up to Samsung N140 BIOS 04CU, and Samsung N130 BIOS 05CM, although a kernel patch is being investigated. See http://bugzilla.kernel.org/show_bug.cgi?id=14314, http://bugzilla.kernel.org/show_bug.cgi?id=13416, http://lkml.indiana.edu/hypermail/linux/kernel/0908.2/02809.html and http://lkml.indiana.edu/hypermail/linux/kernel/0911.3/01604.html .
A summary of the status as currently understood:
About 5 minutes after boot or resume, the BIOS switches on some power saving features which were not enabled at boot. It enables additional (sleepier) processor C-states, and sends power management instructions to the HDD. It does these behind the operating system's back -- not using ACPI, which would be handled correctly by Linux. Instead the sudden change results in a SATA exception at the first disk access following the switch. At that point the SATA driver resets the disk to resolve the problem. The result: the user sees a complete system freeze for about 30 seconds, after which operation of the machine continues normally. This can occur during the periodic fsck at boot if it is running at switch time. Either Samsung needs to be convinced to fix the BIOS, or the Linux kernel needs to be modified to behave more gracefully (Windows doesn't freeze noticeably if at all).
It has been reported that some OpenSUSE kernels [1] do not freeze and testing is in progress in the Arch Forums. The patch libata-ata_piix-clear-spurious-IRQ has been reported to resolve the freezing problem. (Hint: to look at the rpm use rpmextract, and then untar config.tar.bz2 and patches.*.tar.bz2).
There is a kernel patch available which changes the backlight brightness using SMI instead of poking PCI config space. It provides a kernel module called "samsung-laptop". Interestingly a special (as yet unreleased?) BIOS for the N130 can be informed that the OS is Linux by a version of this patch which is included in OpenSUSE 11.1. The effect of this hasn't been published.
The N140 and N130 BIOSes have Phoenix FailSafe (you have been warned). It's not clear if the SATA problem has any relation to this.
Version 01CM of the N130 BIOS has been reported to not cause freezes, unlike all later ones which do.
This problem is hazardous for your filesystem so take precautions. For example use ext3 (not ext4) with option data=journal and install backup software.
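
To put numbers on the "about 5 minutes after boot" and "about 30 seconds" figures quoted above, the kernel timestamps in a log can be compared directly. This is a rough Python sketch, assuming the bracketed number on each kernel log line is seconds since boot. The name `eh_episodes` is hypothetical, and the sample lines are copied from comment #35.

```python
import re

# Matches the "[  250.816351] message" part, with or without a syslog prefix.
TS_RE = re.compile(r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)")


def eh_episodes(lines):
    """Return (start, end) timestamps from each exception to 'EH complete'."""
    episodes = []
    start = None
    for line in lines:
        m = TS_RE.search(line)
        if m is None:
            continue
        ts = float(m.group("ts"))
        msg = m.group("msg")
        if start is None and "exception Emask" in msg:
            start = ts
        elif start is not None and "EH complete" in msg:
            episodes.append((start, ts))
            start = None
    return episodes


SAMPLE = [
    "Dec 25 16:55:18 baddha-laptop kernel: [ 250.816351] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
    "Dec 25 16:55:28 baddha-laptop kernel: [ 260.840330] ata1: soft resetting link",
    "Dec 25 16:55:28 baddha-laptop kernel: [ 261.020720] ata1: EH complete",
]

for start, end in eh_episodes(SAMPLE):
    print(f"exception {start:.1f} s after boot, recovery took {end - start:.1f} s")
```

For the log in comment #35 this gives an exception roughly 251 seconds after boot and about 10.2 seconds until "EH complete". The freeze a user perceives may be longer, since the command has already timed out by the time the exception is logged.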

giorgio_fornara (giorgio-fornara) wrote on 2009-12-29:

#38

The HD in the A8J is an ST9100824A Ultra ATA/100 100GB (Seagate).

Sandeep Wadhwa (wadhwa100) wrote on 2010-01-11:

#39

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen.

I am facing the same problem on the Samsung N-128 netbook with Ubuntu 9.10 UNR. The freeze lasts about 20 seconds and happens about 5 minutes after switching on. After that it doesn't repeat as long as the netbook stays on. On the next start the same problem occurs again. Output from my dmesg:

Monitor-Mwait will be used to enter C-2 state
[ 245.665234] Marking TSC unstable due to TSC halts in idle
[ 284.816184] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 284.816225] ata1.00: cmd ca/00:08:f6:1e:85/00:00:00:00:00/ed tag 0 dma 4096 out
[ 284.816232] res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 284.816246] ata1.00: status: { DRDY }
[ 289.856117] ata1: link is slow to respond, please be patient (ready=0)
[ 294.840109] ata1: device not ready (errno=-16), forcing hardreset
[ 294.840133] ata1: soft resetting link
[ 295.022466] ata1.00: configured for UDMA/133
[ 295.022487] ata1.00: device reported invalid CHS sector 0
[ 295.022516] ata1: EH complete

giorgio_fornara (giorgio-fornara) wrote on 2010-01-11:

#40

With today's latest update the laptop is no longer experiencing HD freezes;
maybe this is a temporarily good combination of kernel and other modules.
Anyway, NO kernel update has occurred since the last reports, only updates to other modules.
I hope it stays stable...
linux 2.6.31-9-rt
using ext3 filesystem

Jeremy Foshee (jeremyfoshee) on 2010-06-08

tags:	added: kernel-core kernel-needs-review removed: needs-upstream-testing
Changed in linux (Ubuntu):
status:	New → Triaged
importance:	Undecided → Medium

Chase Douglas (chasedouglas) on 2010-06-14

tags:

added: kernel-candidate kernel-reviewed
removed: kernel-needs-review

Jeremy Foshee (jeremyfoshee) on 2010-06-21

tags:

removed: kernel-candidate

giorgio_fornara (giorgio-fornara) on 2010-12-21

description:

updated

Raj B (bigwoof) wrote on 2011-01-03:

#97

fwiw, I am having the same problems with the latest Natty packages.

The kernel is
Linux mythtv 2.6.37-11-generic #25-Ubuntu SMP Tue Dec 21 23:42:56 UTC 2010 x86_64 GNU/Linux

My setup is
Asus P7P55D-E LX Motherboard with the Marvell 88SE9125 SATA 3 controller
Intel i5 760 CPU
PC1333 8G RAM (4GB x 2 Corsair)
5 Sata 2 2 TB Disks

1 Sata 3 Seagate 2 TB Disk (used as the boot disk and the disk on which everything below is being written). plugged in as SATA 3 using AHCI

1 PVR 150 Video Capture card
Asus ENGT430 Graphics Card

I'm getting a ton of
[ 5884.881538] ata9.00: failed command: WRITE FPDMA QUEUED
[ 5884.909807] ata9.00: cmd 61/08:20:f0:71:c5/00:00:73:00:00/40 tag 4 ncq 4096 out
[ 5884.909810] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 5885.022592] ata9.00: status: { DRDY }
[ 5885.051123] ata9: hard resetting link
[ 5885.591937] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 370)
[ 5885.593771] ata9.00: configured for UDMA/133
[ 5885.593996] ata9.00: device reported invalid CHS sector 0
[ 5885.594002] ata9.00: device reported invalid CHS sector 0
[ 5885.594013] ata9: EH complete

errors in the logs. I noticed this because ivtv was outputting a ton of "Unable to Save MPG stream" errors. I thought it was a bug in ivtv but now I realize that it was a SATA 3 error and that the drive had become read-only.

This has happened a few times now and the system locked up hard each time. It responds to ping but nothing is running. I had ssh access during one of these events and nothing worked (reboot failed, all processes were in zombie state, etc.), which makes sense as the root drive was now read-only and inoperable.

I've lost data because of this as well. my entire /var/lib/mysql directory was blown away and recovered into lost+found. other directories are there as well.

I'm going to a) switch the sata 3 drive to the sata 2 controller, and b) reinstall ubuntu (as I'm not sure what went missing with the latest crash). I'm a little surprised that this bug has remained through multiple kernel revisions.

Raj
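
Since the symptom described above only became visible once the root drive had gone read-only, a periodic check for that state could give earlier warning. The following Python sketch assumes the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options). The function name and the sample text are hypothetical.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mountpoint, options = fields[1], fields[3]
        # Compare whole options so that 'errors=remount-ro' does not match.
        if "ro" in options.split(","):
            result.append(mountpoint)
    return result


# Made-up sample in /proc/mounts format, for illustration only.
sample = (
    "/dev/sda1 / ext4 ro,relatime,errors=remount-ro 0 0\n"
    "proc /proc proc rw,nosuid,nodev,noexec 0 0\n"
)
print(read_only_mounts(sample))
```

On a live system the input would come from reading /proc/mounts, and a non-empty result for "/" would indicate that the kernel has remounted the root filesystem read-only.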

DFOXpro (dfoxpro) wrote on 2011-01-18:

#98

[Translated from Spanish]

I get this error, and afterwards it seems as if the disk gets locked (or damaged), because the error then becomes persistent in Windows until it throws the blue screen:

Jan 18 14:08:03 familia-K7S41GX kernel: [14769.056043] ata2: lost interrupt (Status 0x50)
Jan 18 14:08:03 familia-K7S41GX kernel: [14769.056157] ata2: soft resetting link
Jan 18 14:08:03 familia-K7S41GX kernel: [14769.233285] ata2.00: configured for UDMA/33
Jan 18 14:08:03 familia-K7S41GX kernel: [14769.233302] ata2.00: device reported invalid CHS sector 0
Jan 18 14:08:03 familia-K7S41GX kernel: [14769.233327] ata2: EH complete
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.008081] ata2: lost interrupt (Status 0x50)
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.008121] ata2.00: limiting speed to UDMA/25:PIO4
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.008203] ata2: soft resetting link
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.184811] ata2.00: configured for UDMA/25
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.184828] ata2.00: device reported invalid CHS sector 0
Jan 18 14:09:17 familia-K7S41GX kernel: [14843.184852] ata2: EH complete
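
As a reading aid for the "Status 0x50" value in the log above, here is a small Python sketch that splits an ATA status byte into its flag bits. The bit names follow the classic ATA status register layout as I remember it; bits 1 and 2 have been renamed or made obsolete across ATA revisions, so treat those two names as assumptions.

```python
# Classic ATA status register bits, most significant first.
# Names for bits 2 and 1 (CORR, IDX) are historical and assumed here.
STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DRDY"),
    (0x20, "DF"),
    (0x10, "DSC"),
    (0x08, "DRQ"),
    (0x04, "CORR"),
    (0x02, "IDX"),
    (0x01, "ERR"),
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


print(decode_status(0x50))
print(decode_status(0x40))
```

Under this layout 0x50 is DRDY plus DSC with neither BSY nor ERR set, i.e. the drive looks idle and ready while the kernel is still waiting for an interrupt that never arrived.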

Bug Watch Updater (bug-watch-updater) on 2011-01-24

Changed in linux:
status:	Unknown → Fix Released

PsYcHoK9 (psychok9) wrote on 2011-02-06:

#99

I've worked around this temporarily by adding this parameter to the kernel command line:
libata.force=noncq

When will the fix be released?
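
For readers who want to try the same workaround: on an Ubuntu system booting through GRUB 2 (an assumption about your setup), kernel parameters usually live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub, and update-grub has to be run afterwards. Below is a minimal Python sketch of the edit itself; `add_kernel_param` is a hypothetical helper that only rewrites a string and does not touch any file.

```python
import re


def add_kernel_param(grub_line, param):
    """Append param to a GRUB_CMDLINE_LINUX_DEFAULT="..." line if it is missing."""
    m = re.match(r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"(.*)"\s*$', grub_line)
    if m is None:
        raise ValueError("not a GRUB_CMDLINE_LINUX_DEFAULT line")
    params = m.group(2).split()
    if param not in params:
        params.append(param)
    return '{}"{}"'.format(m.group(1), " ".join(params))


line = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"'
print(add_kernel_param(line, "libata.force=noncq"))
```

The helper is idempotent, so running it twice does not duplicate the parameter. It does not handle quoted values that contain spaces.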

In Red Hat Bugzilla #684599, rick (rick-redhat-bugs) wrote on 2011-03-13:

#113

Description of problem:
Boot pauses for a while, after which the following error is shown:
[ 71.776103] ata2.00: failed command: IDENTIFY PACKET DEVICE

From dmesg:
[code]
[ 71.776063] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 71.776103] ata2.00: failed command: IDENTIFY PACKET DEVICE
[ 71.776132] ata2.00: cmd a1/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
[ 71.776133] res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[ 71.776201] ata2.00: status: { DRDY }
[ 71.776222] ata2: hard resetting link
[ 72.236064] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 73.038369] ata2.00: configured for PIO4
[ 73.838986] ata2: EH complete
[/code]

Version-Release number of selected component (if applicable):
$ uname -a
Linux rickPC 2.6.38-0.rc8.git0.1.fc15.x86_64 #1 SMP Tue Mar 8 08:22:15 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

$ udevd --version
166

How reproducible:
Always

Steps to Reproduce:
1. Boot
2. dmesg

Actual results:
Boot pauses for a while, with error

Expected results:
Boot continues normally

Additional info:
At the archlinux forums they have a topic about this (and a workaround).
https://bbs.archlinux.org/viewtopic.php?pid=894147

In Red Hat Bugzilla #684599, Harald (harald-redhat-bugs) wrote on 2011-03-14:

#114

David, any idea about this?

Felix Joussein (felix-joussein) wrote on 2011-03-15:

#100

please read my comment here, maybe this is related?
https://bugs.launchpad.net/ubuntu/+bug/550559/comments/41

In Red Hat Bugzilla #684599, David (david-redhat-bugs) wrote on 2011-03-15:

#115

(In reply to comment #1)
> David, any idea about this?

Yes, it's most probably caused by this commit

http://git.kernel.org/?p=linux/hotplug/udev.git;a=commitdiff;h=560de575148b7efda3b34a7f7073abd483c5f08e

Looks to me like a hardware problem, and I'm not sure how best to work around it. Let's start by investigating and getting more details from the reporter.

Reporter: what kind of hardware do you see this problem with?

In Red Hat Bugzilla #684599, rick (rick-redhat-bugs) wrote on 2011-03-15:

#116

Created attachment 485532
Hardware information

In Red Hat Bugzilla #684599, David (david-redhat-bugs) wrote on 2011-03-15:

#117

(In reply to comment #3)
> Created attachment 485532 [details]
> Hardware information

OK, so it's a TSSTcorp CDDVDW SH-S223C . Looks like a run of the mill

Btw, please stick to textual information in the future - it's much easier to deal with that way!

Please try running

/lib/udev/ata_id /dev/sr0

as root and paste the result here. Thanks!

In Red Hat Bugzilla #684599, David (david-redhat-bugs) wrote on 2011-03-15:

#118

(In reply to comment #4)
> (In reply to comment #3)
> > Created attachment 485532 [details]
> > Hardware information
>
> OK, so it's a TSSTcorp CDDVDW SH-S223C . Looks like a run of the mill
>
> Btw, please stick to textual information in the future - it's much easier to
> deal with that way!
>
> Please try running
>
> /lib/udev/ata_id /dev/sr0
>
> as root and paste the result here. Thanks!

Sorry, I forgot the --export option. Please run

/lib/udev/ata_id --export /dev/sr0

as root and paste the result. Thanks!

In Red Hat Bugzilla #684599, rick (rick-redhat-bugs) wrote on 2011-03-15:

#119

[root@rickPC ~]# time /lib/udev/ata_id --export /dev/sr0
ID_ATA=1
ID_TYPE=cd
ID_BUS=ata
ID_MODEL=TSSTcorp_CDDVDW_SH
ID_MODEL_ENC=TSSTcorp\x20CDDVDW\x20SH
ID_REVISION=SB05
ID_SERIAL=TSSTcorp_CDDVDW_SH_R4136GHZC20180
ID_SERIAL_SHORT=R4136GHZC20180

real 0m32.848s
user 0m0.000s
sys 0m0.003s

In Red Hat Bugzilla #684599, David (david-redhat-bugs) wrote on 2011-03-15:

#120

(In reply to comment #6)
> [root@rickPC ~]# time /lib/udev/ata_id --export /dev/sr0
> ID_ATA=1
> ID_TYPE=cd
> ID_BUS=ata
> ID_MODEL=TSSTcorp_CDDVDW_SH
> ID_MODEL_ENC=TSSTcorp\x20CDDVDW\x20SH
> ID_REVISION=SB05
> ID_SERIAL=TSSTcorp_CDDVDW_SH_R4136GHZC20180
> ID_SERIAL_SHORT=R4136GHZC20180
>
> real 0m32.848s
> user 0m0.000s
> sys 0m0.003s

Interesting. So we basically get the information we request, it just takes a lot of time. What is logged to dmesg while doing this?

(Hmm, we really need better debugging infrastructure in ata_id - I'll try to write a patch for that.)
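
Until ata_id has better debugging, the slow probe can at least be timed and captured from a script. This is a minimal Python sketch using only the standard library; `time_command` is a hypothetical helper, and the ata_id invocation in the comment is copied from the comments above rather than run here.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return (elapsed seconds, exit code, captured stdout)."""
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    elapsed = time.monotonic() - start
    return elapsed, proc.returncode, proc.stdout


# Example (as root, on a machine that has the drive):
# elapsed, rc, out = time_command(["/lib/udev/ata_id", "--export", "/dev/sr0"])
# print("took %.1f s, exit code %d" % (elapsed, rc))
```

Running it right before and after collecting dmesg makes it easier to line up the 30-odd second stall with the kernel's timeout message.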

In Red Hat Bugzilla #684599, rick (rick-redhat-bugs) wrote on 2011-03-15:

#121

From /var/log/messages, same info in dmesg but without timestamps.

Mar 15 20:32:00 rickPC kernel: [12878.752119] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 15 20:32:00 rickPC kernel: [12878.752133] ata2.00: failed command: IDENTIFY PACKET DEVICE
Mar 15 20:32:00 rickPC kernel: [12878.752150] ata2.00: cmd a1/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
Mar 15 20:32:00 rickPC kernel: [12878.752154] res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Mar 15 20:32:00 rickPC kernel: [12878.752163] ata2.00: status: { DRDY }
Mar 15 20:32:00 rickPC kernel: [12878.752176] ata2: hard resetting link
Mar 15 20:32:00 rickPC kernel: [12879.212095] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Mar 15 20:32:01 rickPC kernel: [12880.015050] ata2.00: configured for PIO4
Mar 15 20:32:02 rickPC kernel: [12880.815639] ata2: EH complete

In Red Hat Bugzilla #684599, David (david-redhat-bugs) wrote on 2011-03-15:

#122

Jeff (or other kernel hackers with ATA experience): any idea what is going on here? Thanks

Mesa (astewart-deactivatedaccount-deactivatedaccount-deactivatedaccount) wrote on 2011-03-16:

#101

Since upgrading to the Natty alpha about a month ago, the issue has more or less disappeared for me (see my earlier comment about the problem in Lucid and Maverick). Previously I got it on every single boot (i.e. daily), plus intermittently on top of that. Since upgrading I have only had the issue once.

Note that upgrading to an alpha release is a bad idea for most - wait for proper release unless you can live with the breakages.

Would be good to know if it's also now fixed for Raj B as he was running an earlier version of Natty than I.

In Red Hat Bugzilla #684599, Chuck (chuck-redhat-bugs) wrote on 2011-03-17:

#123

I found this thread:
https://bbs.archlinux.org/viewtopic.php?pid=895404

which says that some devices may not support 16-byte ATA commands. For our kernel, you would disable them by adding "libata.atapi_passthru16=0" to the kernel boot options, though this is just a workaround.
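
After rebooting with a workaround option like this, it is worth confirming that the kernel actually saw it. A minimal Python sketch, assuming the usual Linux /proc/cmdline file; `parse_cmdline` and `read_cmdline` are hypothetical helper names, and quoted values containing spaces are not handled.

```python
def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Parse the command line of the running kernel."""
    with open(path) as f:
        return parse_cmdline(f.read())


sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet libata.atapi_passthru16=0"
print(parse_cmdline(sample).get("libata.atapi_passthru16"))
```

If the option is missing from the parsed result, the bootloader configuration was most likely not regenerated after the edit.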

Jarek T. (ulvhedin) wrote on 2011-03-27:

#102

Hi, it looks like this problem still exists in the 2.6.38 kernel.
Is anyone working on this?

Regards
Jarek

Jarek T. (ulvhedin) wrote on 2011-03-27:

#103

dmesg (91.2 KiB, text/plain)

meWho (mewho) wrote on 2011-04-01:

#104

dmesg output (2.7 KiB, text/plain)

Hi, I can also confirm the problem: I am experiencing it with Ubuntu 10.10 (2.6.35-28-generic) on a Dell Latitude E6500. Dell's Diagnostic Tool reports no errors (I have checked several times).

Travis Ogdon (togdon) wrote on 2011-04-19:

#105

I'm still seeing the problem in the most recent build of Natty as well:

2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
ext4 / partition for the whole drive (minus a bit of swap)

This is on a new (to me) box that was happily running Windows 7 prior to me installing Natty Beta2.

Hopefully relevant hardware information:

SAMSUNG Spinpoint F1 HD753LJ 750GB hard drive.
Intel Core i7-920 processor
JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller (rev 03)
nVidia Corporation G92 [GeForce 9800 GT] (rev a2) (with the proprietary driver installed)

It happens if I set the SATA controller either in IDE or AHCI mode in the BIOS.

I've also tried the vanilla mainline kernel (2.6.38-02063803-generic #201104150912 SMP Fri Apr 15 09:15:15 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux from http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.38.3-natty/) and had the same failures.

NCQ seems to be supported everywhere that matters:

[ 3.181549] ata1.00: ATA-7: SAMSUNG HD753LJ, 1AA01114, max UDMA7
[ 3.181551] ata1.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32)
...
[ 5.427976] ahci 0000:02:00.0: version 3.0
[ 5.427984] ahci 0000:02:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
[ 5.455655] ahci 0000:02:00.0: AHCI 0001.0000 32 slots 2 ports 3 Gbps 0x3 impl SATA mode
[ 5.455661] ahci 0000:02:00.0: flags: 64bit ncq pm led clo pmp pio slum part
[ 5.455668] ahci 0000:02:00.0: setting latency timer to 64
[ 5.456043] scsi6 : ahci
[ 5.456267] scsi7 : ahci
[ 5.456348] ata7: SATA max UDMA/133 abar m8192@0xf7dfe000 port 0xf7dfe100 irq 19
[ 5.456353] ata8: SATA max UDMA/133 abar m8192@0xf7dfe000 port 0xf7dfe180 irq 19

Here's the error that I see:

[ 190.934060] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 190.934065] ata1.00: failed command: FLUSH CACHE EXT
[ 190.934073] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[ 190.934074] res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 190.934077] ata1.00: status: { DRDY }
[ 190.934087] ata1.00: hard resetting link
[ 191.283287] ata1.01: hard resetting link
[ 196.822167] ata1.00: link is slow to respond, please be patient (ready=0)
[ 200.953882] ata1.00: SRST failed (errno=-16)
[ 200.953889] ata1.00: hard resetting link
[ 201.303204] ata1.01: hard resetting link
[ 206.842098] ata1.00: link is slow to respond, please be patient (ready=0)

What's odd is that the box seemed ok until I started copying my data over to it, now it's essentially unusable.

I can consistently lock up the drive by running a SMART short self-test on it (smartctl --test=short /dev/sda), so if someone needs a reliable way to reproduce the error, there you go.

Sebastián Salazar Molina. (sebasalazar) wrote on 2011-05-01:

#106

Same issue here.

OS: Ubuntu Server lucid
Kernel: 2.6.38-7-generic (kernel-ppa)
HD: WDC WD10EARS-00Y

Error:

[ 222.848056] ata2: lost interrupt (Status 0x50)
[ 222.848094] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 222.848166] ata2.00: failed command: READ DMA EXT
[ 222.848234] ata2.00: cmd 25/00:60:e0:a6:a7/00:00:4d:00:00/e0 tag 0 dma 49152 in
[ 222.848238] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 222.848353] ata2.00: status: { DRDY }
[ 222.848414] ata2: soft resetting link
[ 223.056419] ata2.00: configured for UDMA/133
[ 223.056436] ata2.00: device reported invalid CHS sector 0
[ 223.056465] ata2: EH complete
[ 973.008042] ata2: lost interrupt (Status 0x50)
[ 973.008080] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 973.008111] ata2.00: failed command: READ DMA EXT
[ 973.008140] ata2.00: cmd 25/00:08:78:c3:b1/00:00:6a:00:00/e0 tag 0 dma 4096 in
[ 973.008144] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 973.008179] ata2.00: status: { DRDY }
[ 973.008201] ata2: soft resetting link
[ 973.697072] ata2.00: configured for UDMA/133
[ 973.697090] ata2.00: device reported invalid CHS sector 0
[ 973.697116] ata2: EH complete
[ 3694.048044] ata2: lost interrupt (Status 0x50)
[ 3694.048081] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 3694.048113] ata2.00: failed command: WRITE DMA EXT
[ 3694.048142] ata2.00: cmd 35/00:20:e8:2b:a3/00:00:64:00:00/e0 tag 0 dma 16384 out
[ 3694.048146] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 3694.048182] ata2.00: status: { DRDY }
[ 3694.048211] ata2: soft resetting link
[ 3694.244372] ata2.00: configured for UDMA/133
[ 3694.244389] ata2.00: device reported invalid CHS sector 0
[ 3694.244415] ata2: EH complete
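
With long logs like this one it helps to tally which commands fail on which port. A minimal Python sketch, assuming the "failed command:" wording shown in the logs in this thread; `count_failed_commands` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "ata2.00: failed command: READ DMA EXT"
FAILED_RE = re.compile(r"(ata\d+(?:\.\d+)?): failed command: (.+?)\s*$")


def count_failed_commands(lines):
    """Count (port, command) pairs over an iterable of kernel log lines."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


log = [
    "[ 222.848166] ata2.00: failed command: READ DMA EXT",
    "[ 973.008111] ata2.00: failed command: READ DMA EXT",
    "[ 3694.048113] ata2.00: failed command: WRITE DMA EXT",
    "[ 3694.244415] ata2: EH complete",
]
for (port, command), n in count_failed_commands(log).most_common():
    print(port, command, n)
```

The same function can be pointed at a saved dmesg file by passing the open file object, since it only iterates over lines.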

Liunx (liunx163) wrote on 2011-06-25:

#107

I have had the same problem recently.
Ubuntu 11.04 Natty
Linux enet 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Name: WDC WD3200AAJS-00L7A0
Size: 320 GB
Speed: 7200 RPM
Cache: 8 MB
Interface: SATA Rev 2.5
Transfer rate: 300 MB/s
Features: S.M.A.R.T., 48-bit LBA, NCQ

[ 5216.002643] ata1: soft resetting link
[ 5216.180056] ata1.01: NODEV after polling detection
[ 5216.180062] ata1.01: revalidation failed (errno=-2)
[ 5221.160032] ata1: soft resetting link
[ 5221.380820] ata1.00: configured for UDMA/133
[ 5221.420145] ata1.01: configured for UDMA/100
[ 5221.420940] ata1: EH complete
[ 5276.002197] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 5276.002205] ata1.01: ST_FIRST: !(DRQ|ERR|DF)
[ 5276.002212] sr 0:0:1:0: CDB: Get event status notification: 4a 01 00 00 10 00 00 00 08 00
[ 5276.002233] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
[ 5276.002236] res 00/00:00:00:08:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)
[ 5276.002249] ata1: soft resetting link
[ 5276.170183] ata1.01: NODEV after polling detection
[ 5276.170189] ata1.01: revalidation failed (errno=-2)
[ 5281.170031] ata1: soft resetting link
[ 5281.410467] ata1.00: configured for UDMA/133
[ 5281.450172] ata1.01: configured for UDMA/100
[ 5281.450991] ata1: EH complete
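
The Emask value on the "res" line is a bit mask, which is why the reports in this thread show 0x4 with "(timeout)" and 0x2 with "(HSM violation)". Here is a minimal Python sketch of a decoder. The AC_ERR_* names and bit positions are my recollection of the kernel's libata header and should be treated as assumptions; only the low seven bits are named, and anything else is reported as raw hex.

```python
import re

# Assumed mapping of the low Emask bits (names recalled from libata).
EMASK_BITS = [
    (1 << 0, "AC_ERR_DEV"),
    (1 << 1, "AC_ERR_HSM"),
    (1 << 2, "AC_ERR_TIMEOUT"),
    (1 << 3, "AC_ERR_MEDIA"),
    (1 << 4, "AC_ERR_ATA_BUS"),
    (1 << 5, "AC_ERR_HOST_BUS"),
    (1 << 6, "AC_ERR_SYSTEM"),
]

EMASK_RE = re.compile(r"Emask (0x[0-9a-f]+)")


def decode_emask(value):
    """Return names for the known bits, plus raw hex for any leftover bits."""
    names = [name for mask, name in EMASK_BITS if value & mask]
    known = sum(mask for mask, _ in EMASK_BITS)
    rest = value & ~known
    if rest:
        names.append(hex(rest))
    return names


def emask_from_line(line):
    """Extract and decode the first Emask value on a log line, if any."""
    m = EMASK_RE.search(line)
    return decode_emask(int(m.group(1), 16)) if m else None


print(emask_from_line("res 00/00:00:00:08:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)"))
```

When several bits are set, the kernel prints only one description in parentheses, so decoding the full mask can show more than the log text does.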

Bug Watch Updater (bug-watch-updater) on 2011-08-11

Changed in udev (Debian):
status:	Unknown → Confirmed

John Doe (b2109455) wrote on 2011-09-21:

#108

hwinfo.txt (3.3 KiB, text/plain)

I have the same problem on an Asus E35M1-I DELUXE with two Samsung Spinpoint F1s and one F4 EcoGreen, running Ubuntu 11.04 with Linux 2.6.38-8-server:

[68623.060362] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[68623.060521] ata2.00: failed command: WRITE DMA EXT
[68623.060626] ata2.00: cmd 35/00:00:9f:a2:05/00:04:40:00:00/e0 tag 0 dma 524288 out
[68623.060630] res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[68623.060896] ata2.00: status: { DRDY }
[68623.060976] ata2: hard resetting link
[68633.090302] ata2: softreset failed (device not ready)
[68633.090423] ata2: hard resetting link
[68643.120265] ata2: softreset failed (device not ready)
[68643.120387] ata2: hard resetting link
[68653.750324] ata2: link is slow to respond, please be patient (ready=0)
[68678.170307] ata2: softreset failed (device not ready)
[68678.170429] ata2: limiting SATA link speed to 1.5 Gbps
[68678.170439] ata2: hard resetting link
[68683.380272] ata2: softreset failed (device not ready)
[68683.380391] ata2: reset failed, giving up
[68683.380471] ata2.00: disabled
[68683.380491] ata2.00: device reported invalid CHS sector 0
[68683.380524] ata2: EH complete

Turning off NCQ didn't help, and smartctl and fsck didn't reveal any problems.
Pretty annoying bug, and it has been lingering for a long time.

In Red Hat Bugzilla #684599, Fedora (fedora-redhat-bugs) wrote on 2011-10-20:

#124

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

Swâmi Petaramesh (swami-petaramesh) wrote on 2011-12-10:

#109

This bug might be the same as bug #640525 that I encounter on a Dell Inspiron 9300 with a Samsung HM160HC HD.

It's still present in Oneiric with all kernels up to and including 3.0.0-14-generic.

Symptom : System very often boots either while starting a KDE or Gnome session (right after bootup), or when waking up from resume. The lock corresponds to a steady lit HD LED and entries looking very much like previous comment's.

Swâmi Petaramesh (swami-petaramesh) wrote on 2011-12-11:

#110

Sorry, I wrote "System very often boots" when I meant "System very often HANGS", this correction may make my comment above more understandable ;-)

Sergei Andreev (seajey) wrote on 2011-12-17:

#111

Same error here:

Linux Bellerophon-117 3.0.0-15-generic #24-Ubuntu SMP Mon Dec 12 15:23:55 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Model Family: Seagate Barracuda 7200.11
Device Model: ST31500341AS
Serial Number: 9VS0931W
LU WWN Device Id: 5 000c50 01051e2ab
Firmware Version: SD17
User Capacity: 1 500 301 910 016 bytes [1,50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sun Dec 18 00:34:09 2011 MSK

17.12.11 23:29:42 Bellerophon-117 kernel [33659.872047] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
17.12.11 23:29:42 Bellerophon-117 kernel [33659.872055] ata3.00: failed command: FLUSH CACHE EXT
17.12.11 23:29:42 Bellerophon-117 kernel [33659.872065] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
17.12.11 23:29:42 Bellerophon-117 kernel [33659.872067] res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
17.12.11 23:29:42 Bellerophon-117 kernel [33659.872072] ata3.00: status: { DRDY }
17.12.11 23:29:42 Bellerophon-117 kernel [33659.872079] ata3: hard resetting link
17.12.11 23:29:42 Bellerophon-117 kernel [33660.364022] ata3: softreset failed (device not ready)
17.12.11 23:29:42 Bellerophon-117 kernel [33660.364029] ata3: applying SB600 PMP SRST workaround and retrying
17.12.11 23:29:42 Bellerophon-117 kernel [33660.536032] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
17.12.11 23:29:42 Bellerophon-117 kernel [33660.539209] ata3.00: configured for UDMA/133
17.12.11 23:29:42 Bellerophon-117 kernel [33660.539213] ata3.00: retrying FLUSH 0xea Emask 0x4
17.12.11 23:29:42 Bellerophon-117 kernel [33660.552015] ata3.00: device reported invalid CHS sector 0
17.12.11 23:29:42 Bellerophon-117 kernel [33660.552026] ata3: EH complete

In Red Hat Bugzilla #684599, Jeff (jeff-redhat-bugs) wrote on 2012-07-19:

#128

Reassigning to kernel pkg, as this appears to be a kernel problem.

TSSTcorp devices have special workarounds in the kernel. In ata_device_blacklist[] we see

/* Maybe we should just blacklist TSSTcorp... */
{ "TSSTcorp CDDVDW SH-S202[HJN]", "SB0[01]", ATA_HORKAGE_IVB, },

It is worth attempting to emulate what the above workaround attempts to accomplish with ATA_HORKAGE_IVB, which is to force the cable type.

Kernel module "libata" accepts the "force" module parameter.

Try setting libata.force=40c to force a 40c cable type.
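
To see why the existing blacklist entry does not already cover the drive in this report, here is a minimal Python sketch that matches model and firmware strings against the one entry quoted above. Python's fnmatch is used as a stand-in for the kernel's own pattern matching (an assumption; the kernel does not use Python), and `is_blacklisted` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# The single (model pattern, revision pattern) pair quoted from
# ata_device_blacklist[] in the comment above.
BLACKLIST = [
    ("TSSTcorp CDDVDW SH-S202[HJN]", "SB0[01]"),
]


def is_blacklisted(model, revision, table=BLACKLIST):
    """True if both the model and the firmware revision match one entry."""
    return any(
        fnmatchcase(model, model_pat) and fnmatchcase(revision, rev_pat)
        for model_pat, rev_pat in table
    )


print(is_blacklisted("TSSTcorp CDDVDW SH-S202J", "SB01"))
print(is_blacklisted("TSSTcorp CDDVDW SH-S223C", "SB05"))
```

The second call uses the model and revision reported earlier in this bug (SH-S223C with firmware SB05), which fall outside both patterns, so forcing the cable type by hand is the only way to get the equivalent behaviour.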

In Red Hat Bugzilla #684599, Dave (dave-redhat-bugs) wrote on 2012-10-23:

#129

# Mass update to all open bugs.

Kernel 3.6.2-1.fc16 has just been pushed to updates.
This update is a significant rebase from the previous version.

Please retest with this kernel, and let us know if your problem has been fixed.

In the event that you have upgraded to a newer release and the bug you reported
is still present, please change the version field to the newest release you have
encountered the issue with. Before doing so, please ensure you are testing the
latest kernel update in that release and attach any new and relevant information
you may have gathered.

If you are not the original bug reporter and you still experience this bug,
please file a new report, as it is possible that you may be seeing a
different problem.
(Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).

In Red Hat Bugzilla #684599, rick (rick-redhat-bugs) wrote on 2012-10-23:

#130

Problem persists.

[root@xuplin rick]# uname -a
Linux xuplin 3.6.2-1-ARCH #1 SMP PREEMPT Fri Oct 12 23:58:58 CEST 2012 x86_64 GNU/Linux

[root@xuplin rick]# udevd --version
194

[root@xuplin rick]# time /lib/udev/ata_id --export /dev/sr0
ID_ATA=1
ID_TYPE=cd
ID_BUS=ata
ID_MODEL=TSSTcorp_CDDVDW_SH____________________qQ
ID_MODEL_ENC=TSSTcorp\x20CDDVDW\x20SH\x21\x08\x3e\x04\x27\x3e\x3e\x98\x21\x08\x3e\x84\x27\x3e\x3e\x98\x21\x08\x3e\x84qQ
ID_REVISION=SB05
ID_SERIAL=TSSTcorp_CDDVDW_SH____________________qQ_R4136GHZC20180
ID_SERIAL_SHORT=R4136GHZC20180

real 0m33.045s
user 0m0.000s
sys 0m0.000s
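
The ID_MODEL_ENC value above is easier to judge once the \xNN escapes are turned back into bytes. A minimal Python sketch, assuming the encoding is simply "\x" followed by two hex digits as seen in this output; `decode_udev_enc` is a hypothetical helper name.

```python
import re

ESCAPE_RE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_udev_enc(value):
    """Turn \\xNN escapes in an ID_*_ENC value back into raw bytes."""
    out = bytearray()
    pos = 0
    for m in ESCAPE_RE.finditer(value):
        out += value[pos:m.start()].encode("utf-8")
        out.append(int(m.group(1), 16))
        pos = m.end()
    out += value[pos:].encode("utf-8")
    return bytes(out)


def non_printable(data):
    """Return the bytes outside the printable ASCII range 0x20..0x7e."""
    return [b for b in data if b < 0x20 or b > 0x7E]


raw = decode_udev_enc(r"TSSTcorp\x20CDDVDW\x20SH\x21\x08\x3e\x04")
print(raw, non_printable(raw))
```

Any non-printable bytes after "SH" suggest that the tail of the model string returned by the drive is garbage rather than padding, which fits with the IDENTIFY PACKET DEVICE command timing out.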

In Red Hat Bugzilla #684599, Antoine (antoine-redhat-bugs) wrote on 2012-11-14:

#131

$ time /lib/udev/ata_id --export /dev/sr0
ID_ATA=1
ID_TYPE=cd
ID_BUS=ata
ID_MODEL=TSSTcorp_CDDVDW_SHA___4_
ID_MODEL_ENC=TSSTcorp\x20CDDVDW\x20SHA\x29\xfd\x604\x19
ID_REVISION=SB04
ID_SERIAL=TSSTcorp_CDDVDW_SHA___4__R78U6GEZ707218
ID_SERIAL_SHORT=R78U6GEZ707218
ID_ATA_WRITE_CACHE=1
ID_ATA_WRITE_CACHE_ENABLED=1
ID_ATA_FEATURE_SET_HPA=1
ID_ATA_FEATURE_SET_HPA_ENABLED=1
ID_ATA_FEATURE_SET_PM=1
ID_ATA_FEATURE_SET_PM_ENABLED=0
ID_ATA_FEATURE_SET_SMART=1
ID_ATA_FEATURE_SET_SMART_ENABLED=0
ID_ATA_SATA=1
0.00user 0.00system 0:32.78elapsed 0%CPU (0avgtext+0avgdata 692maxresident)k
56inputs+0outputs (1major+210minor)pagefaults 0swaps

$ uname -a
Linux desktop 3.6.6-1.fc17.x86_64 #1 SMP Mon Nov 5 21:59:35 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ /usr/lib/udev/udevd --version
182

In Red Hat Bugzilla #684599, Miro (miro-redhat-bugs) wrote on 2012-11-18:

#132

Hello,

I was able to solve this issue on my configuration, so I hope this will help somebody. I recently migrated my CentOS 6.3 server from an Intel-based motherboard (Gigabyte GA-P55-UD3L, rev. 1.0) to an AMD-based motherboard (MSI 790FX-GD70). The only changes to the system were the motherboard, CPU and memory; the rest of the configuration stayed the same. Immediately after the migration I started experiencing this exact issue.

During boot I was getting errors

Nov 17 23:41:12 storage kernel: ata6.00: qc timeout (cmd 0xec)
Nov 17 23:41:12 storage kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x5)
Nov 17 23:41:12 storage kernel: ata6.00: revalidation failed (errno=-5)
Nov 17 23:41:12 storage kernel: ata6: hard resetting link
Nov 17 23:41:13 storage kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov 17 23:41:13 storage kernel: ata6.00: configured for UDMA/133
Nov 17 23:41:13 storage kernel: sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov 17 23:41:13 storage kernel: sd 5:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
Nov 17 23:41:13 storage kernel: Descriptor sense data with sense descriptors (in hex):
Nov 17 23:41:13 storage kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Nov 17 23:41:13 storage kernel: 91 40 16 4e
Nov 17 23:41:13 storage kernel: sd 5:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 17 23:41:13 storage kernel: sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 91 40 1b 4f 00 00 c0 00
Nov 17 23:41:13 storage kernel: ata6: EH complete
N

After boot, the error changed to

Nov 17 21:05:26 storage kernel: ata6.00: qc timeout (cmd 0xec)
Nov 17 21:05:26 storage kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Nov 17 21:05:26 storage kernel: ata6.00: revalidation failed (errno=-5)
Nov 17 21:05:26 storage kernel: ata6: hard resetting link
Nov 17 21:05:27 storage kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 17 21:05:27 storage kernel: ata6.00: configured for UDMA/33
Nov 17 21:05:27 storage kernel: ata6: EH complete
Nov 17 21:06:26 storage kernel: ata6.00: exception Emask 0x50 SAct 0x1 SErr 0x280900 action 0x6 frozen
Nov 17 21:06:26 storage kernel: ata6.00: irq_stat 0x08000000, interface fatal error
Nov 17 21:06:26 storage kernel: ata6: SError: { UnrecovData HostInt 10B8B BadCRC }
Nov 17 21:06:26 storage kernel: ata6.00: failed command: READ FPDMA QUEUED
Nov 17 21:06:26 storage kernel: ata6.00: cmd 60/30:00:3f:9b:cc/00:00:8e:00:00/40 tag 0 ncq 24576 in
Nov 17 21:06:26 storage kernel: res 40/00:00:3f:9b:cc/00:00:8e:00:00/40 Emask 0x50 (ATA bus error)
Nov 17 21:06:26 storage kernel: ata6.00: status: { DRDY }
Nov 17 21:06:26 storage kernel: ata6: hard resetting link
Nov 17 21:06:31 storage kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
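
The link dropping from 3.0 Gbps to 1.5 Gbps is also visible in the SStatus and SControl values the kernel prints. Here is a minimal Python sketch that splits them into fields, assuming the standard SATA register layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that the kernel prints both values in hex without a 0x prefix; `decode_sata_reg` is a hypothetical helper name.

```python
# SPD field values; 0 means "no speed negotiated" in SStatus
# and "no speed limit" in SControl.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_sata_reg(hex_text):
    """Split an SStatus or SControl value (hex string) into DET, SPD and IPM."""
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


for label, text in [("SStatus", "123"), ("SControl", "300"),
                    ("SStatus", "113"), ("SControl", "310")]:
    fields = decode_sata_reg(text)
    print(label, text, fields, SPEEDS.get(fields["spd"], "none"))
```

Under that layout, "SControl 310" has SPD set to 1, meaning the host has capped the link at the first-generation speed after the earlier errors, while "SControl 300" imposes no speed limit.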

In case it matters, these are the disks used in the system
Nov 17 23:53:58 storage kernel: ata6.00: ATA-8: ST31500341AS, SD1B, max UDMA/133
Nov 17 23:53:58 storage kernel: ata5.00: ATA-8: WDC WD20EADS-00S2B0, 01.00A01, max UDMA/133
Nov 17 23:53:58 storage kernel: ata7.00: ATA-6: ST380011A, 3.16, max UDMA/100
Nov 17 23:53:58 storage kernel: ata7.01: ATA-6: ST38...

Changed in udev (Fedora):
importance:	Unknown → Medium
status:	Unknown → Won't Fix

no longer affects:	linux (Ubuntu)
affects:	udev (Fedora) → linux (Ubuntu)
Changed in linux (Ubuntu):
importance:	Medium → Undecided
status:	Won't Fix → New
no longer affects:	linux (Ubuntu)
affects:	udev (Debian) → linux (Ubuntu)
Changed in linux (Ubuntu):
importance:	Unknown → Undecided
status:	Confirmed → New
no longer affects:	linux (Ubuntu)
affects:	linux → linux (Ubuntu)
Changed in linux (Ubuntu):
importance:	Unknown → Undecided
status:	Fix Released → New
status:	New → Invalid
