Ubuntu

hdd problems, failed command: READ FPDMA QUEUED

Reported by Crashbit on 2010-03-28
This bug affects 90 people
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello!

I have a brand new computer with an SSD and a SATA hard drive, specifically a 2TB Seagate Barracuda XT (6Gb/s). The latter is connected to a Marvell 9123 controller, which I set to AHCI mode in the BIOS.

I have the OS installed on the SSD, but trying to read the 2TB disk produces several errors.

I tried moving the disk to another controller and got the same problem; I even removed the disk's partition table, with the same result.

I checked the disk for defects from Windows with HD Tune and the official verification tool, and neither reported any errors.

I have tested with kernel version 2.6.34-rc2 and it works properly with this disk.

The errors given are the following:

[ 9.115544] ata9: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x10 frozen
[ 9.115550] ata9.00: failed command: READ FPDMA QUEUED
[ 9.115556] ata9.00: cmd 60/04:00:d4:82:85/00:00:1f:00:00/40 tag 0 ncq 2048 in
[ 9.115557] res 40/00:18:d3:82:85/00:00:1f:00:00/40 Emask 0x4 (timeout)
[ 9.115560] ata9.00: status: { DRDY }
[ 9.115562] ata9.00: failed command: READ FPDMA QUEUED
[ 9.115568] ata9.00: cmd 60/01:08:d1:82:85/00:00:1f:00:00/40 tag 1 ncq 512 in
[ 9.115569] res 40/00:18:d3:82:85/00:00:1f:00:00/40 Emask 0x4 (timeout)
[ 9.115572] ata9.00: status: { DRDY }
[ 9.115574] ata9.00: failed command: READ FPDMA QUEUED
[ 9.115579] ata9.00: cmd 60/01:10:d2:82:85/00:00:1f:00:00/40 tag 2 ncq 512 in
[ 9.115581] res 40/00:18:d3:82:85/00:00:1f:00:00/40 Emask 0x4 (timeout)
[ 9.115583] ata9.00: status: { DRDY }
[ 9.115586] ata9.00: failed command: READ FPDMA QUEUED
[ 9.115591] ata9.00: cmd 60/01:18:d3:82:85/00:00:1f:00:00/40 tag 3 ncq 512 in
[ 9.115592] res 40/00:18:d3:82:85/00:00:1f:00:00/40 Emask 0x4 (timeout)
[ 9.115595] ata9.00: status: { DRDY }
[ 9.115609] sd 8:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 9.115612] sd 8:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[ 9.115616] Descriptor sense data with sense descriptors (in hex):
[ 9.115618] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 9.115626] 1f 85 82 d3
[ 9.115629] sd 8:0:0:0: [sdb] Add. Sense: No additional sense information
[ 9.115633] sd 8:0:0:0: [sdb] CDB: Read(10): 28 00 1f 85 82 d4 00 00 04 00
[ 9.115640] end_request: I/O error, dev sdb, sector 528843476
[ 9.115643] __ratelimit: 18 callbacks suppressed
[ 9.115646] Buffer I/O error on device sdb2, logical block 317299556
[ 9.115649] Buffer I/O error on device sdb2, logical block 317299557
[ 9.115652] Buffer I/O error on device sdb2, logical block 317299558
[ 9.115655] Buffer I/O error on device sdb2, logical block 317299559
[ 9.115671] sd 8:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 9.115674] sd 8:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[ 9.115678] Descriptor sense data with sense descriptors (in hex):
[ 9.115679] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 9.115687] 1f 85 82 d3
[ 9.115690] sd 8:0:0:0: [sdb] Add. Sense: No additional sense information
[ 9.115693] sd 8:0:0:0: [sdb] CDB: Read(10): 28 00 1f 85 82 d1 00 00 01 00
[ 9.115700] end_request: I/O error, dev sdb, sector 528843473
[ 9.115702] Buffer I/O error on device sdb2, logical block 317299553
[ 9.115707] sd 8:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 9.115710] sd 8:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[ 9.115714] Descriptor sense data with sense descriptors (in hex):
[ 9.115716] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 9.115723] 1f 85 82 d3
[ 9.115726] sd 8:0:0:0: [sdb] Add. Sense: No additional sense information
[ 9.115729] sd 8:0:0:0: [sdb] CDB: Read(10): 28 00 1f 85 82 d2 00 00 01 00
[ 9.115736] end_request: I/O error, dev sdb, sector 528843474
[ 9.115738] Buffer I/O error on device sdb2, logical block 317299554
[ 9.115743] sd 8:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 9.115746] sd 8:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[ 9.115749] Descriptor sense data with sense descriptors (in hex):
[ 9.115751] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 9.115759] 1f 85 82 d3
[ 9.115762] sd 8:0:0:0: [sdb] Add. Sense: No additional sense information
[ 9.115765] sd 8:0:0:0: [sdb] CDB: Read(10): 28 00 1f 85 82 d3 00 00 01 00
[ 9.115771] end_request: I/O error, dev sdb, sector 528843475
[ 9.115774] Buffer I/O error on device sdb2, logical block 317299555
[ 16.243531] sd 8:0:0:0: timing out command, waited 7s
[ 23.241557] sd 8:0:0:0: timing out command, waited 7s
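
For readers trying to interpret the log above, here is a minimal Python sketch that decodes the "cmd" line of a libata error report into an LBA, a sector count and an NCQ tag. The field order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) is an assumption about how libata prints taskfiles; it is not stated anywhere in this report, although it agrees with the numbers shown. The function name decode_taskfile_cmd and the 512-byte sector size are illustrative choices.

```python
import re

# Assumed libata layout of the "cmd" field:
#   CMD/FEATURE:NSECT:LBAL:LBAM:LBAH/HOB_FEATURE:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEVICE
_FIVE = r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
CMD_RE = re.compile(r"cmd ([0-9a-f]{2})/" + _FIVE + "/" + _FIVE + r"/([0-9a-f]{2})")

NCQ_OPCODES = {0x60, 0x61}  # READ / WRITE FPDMA QUEUED


def decode_taskfile_cmd(line, sector_size=512):
    """Decode one 'cmd ...' line from a libata error report (hypothetical helper)."""
    match = CMD_RE.search(line)
    if match is None:
        raise ValueError("no libata cmd field found in line")
    opcode = int(match.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in match.group(2).split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in match.group(3).split(":")
    )
    # 48-bit LBA: the "hob" (high order byte) registers carry bits 24..47.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    if opcode in NCQ_OPCODES:
        # NCQ commands carry the sector count in FEATURE and the tag in NSECT bits 7:3.
        sectors = hob_feature << 8 | feature
        tag = nsect >> 3
    else:
        # Non-queued 48-bit commands such as READ DMA EXT (0x25) use NSECT for the count.
        sectors = hob_nsect << 8 | nsect
        tag = None
    return {"opcode": opcode, "lba": lba, "sectors": sectors,
            "bytes": sectors * sector_size, "tag": tag}
```

Applied to the first failed command in the log, this yields LBA 528843476, 4 sectors (2048 bytes) and tag 0, which matches the "ncq 2048 in" text and the sector number in the first end_request line.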

lsb_release -rd
Description: Ubuntu lucid (development branch)
Release: 10.04
ignasi@ignasi-desktop:~$

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: yelp 2.29.5-0ubuntu3
ProcVersionSignature: Ubuntu 2.6.32-17.26-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-17-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
Date: Mon Mar 29 01:06:27 2010
ExecutablePath: /usr/bin/yelp
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta amd64 (20100318)
ProcEnviron:
 LANG=ca_ES.utf8
 SHELL=/bin/bash
SourcePackage: yelp

Crashbit (crashbit-gmail) wrote :

Sorry!

I'm adding my dmesg.

Crashbit (crashbit-gmail) wrote :

Hey!

I connected the Seagate Barracuda XT 6Gb/s to the JMicron (JMB361) controller instead of the Marvell 9123, and I get no errors under Linux.
I think the problem is the Marvell 9123 controller.

Crashbit (crashbit-gmail) wrote :

The problems are still here!

If I connect the Seagate disk to the JMicron controller it works fine, but if I copy the /home directory to this disk and modify fstab and the UUIDs to mount /home from it, Ubuntu doesn't start.
It seems the problem is similar.

Pho Dyssey (phodyssey) wrote :

Did you solve your problem? It looks like I have a similar issue with the same disk and same controller (Fedora 13beta, 2.6.33.3).

Pho

Tony T (tonytovar) wrote :

Also suffering this but only at boot-up and only with Lucid 10.04 'Final'. I have a new Dell Latitude E5500 laptop with a traditional SATA drive (not SSD). I initially installed the Lucid Beta-2 and have steadily updated from there. Not sure when these errors first appeared but now my bootup is delayed by 30s, then a screenful of these errors pops-up before the GUI finally loads.

I haven't tested any other kernels, e.g. 2.6.34, instead I'm just running the current 2.6.32-22.

Alex Watson (alexfromapex) wrote :

Same here except my GUI never loads. So basically this error has bricked my Ubuntu installation. I have gone into recovery mode and tried all of the options:

dpkg - fix broken packages
netroot - terminal with networking
grub - update grub
failsafeX - supposed to boot into a failsafe version of X but just goes to a blank screen....

I tried booting other kernels but the same story....

Lucas Hope (lucas-r-hope) wrote :

I am struggling with this problem. I have tried a few kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/ . Tested http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-lucid/ and 2.6.35-lucid-rc1

The same issue seems to appear here: http://ubuntuforums.org/showthread.php?t=1396465

and here:

http://vip.asus.com/forum/view.aspx?board_id=1&model=P6X58D%20Premium&id=20100702050055531&page=1&SLanguage=en-us

The second post in that link implies that switching to a non-Marvell hard drive port is a workaround which may fix the problem. That is what I am trying now.

Crashbit (crashbit-gmail) wrote :

Hey!

Greetings again!
I have the same problem with Maverick.
The kernel is 2.6.35-12. If I connect the disk to the JMicron controller, I get no errors.
lspci -k shows this for the Marvell 9123 controller:

05:00.0 SATA controller: Device 1b4b:9123 (rev 10)
 Kernel driver in use: ahci
 Kernel modules: ahci

tags: added: maverick
Lucas Hope (lucas-r-hope) wrote :

Crashbit's post reminded me: switching to the non-Marvell port fixed the problem for me, but it should be considered a WORKAROUND. You can't get SATA3 speeds from it. For me, the drive was SATA2 anyway. It might be that you have to set your bios to ahci or sata, too.

It is probably a good idea to re-install linux once you change the ports, too, as disk corruption caused by the original problem can cause ongoing crashes.

I changed the port, re-installed, and have had zero problems for the last two weeks.

moojix (moojix) wrote :

I had the same ugly ata errors with my Asus P7P55D-E Premium and a Crucial C300 SATA drive.
lspci: SATA controller: Device 1b4b:9123 (rev 10)

my workaround: disable NCQ and now I can use my SATA3 drive through the Marvell 9123 controller of my MoBo.
(see: http://ubuntuforums.org/showpost.php?p=9684933&postcount=12)
I tested this workaround with iozone3 without any errors.

before this workaround:
Aug 6 09:58:08 st-002 kernel: [ 3.249455] ata5.00: ATA-9: C300-CTFDDAC256MAG, 0002, max UDMA/100
Aug 6 09:58:08 st-002 kernel: [ 3.249461] ata5.00: 500118192 sectors, multi 1: LBA48 NCQ (depth 31/32), AA

after this workaround:
Aug 6 10:01:36 st-002 kernel: [ 3.369991] ata5.00: ATA-9: C300-CTFDDAC256MAG, 0002, max UDMA/100
Aug 6 10:01:36 st-002 kernel: [ 3.369996] ata5.00: 500118192 sectors, multi 1: LBA48 NCQ (not used)

I have not found out whether this bug has been patched in any Linux kernel yet (I'm using 2.6.32-24 64-bit).
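
To check from a saved log whether a workaround like this took effect, a small sketch such as the following could be used. It looks only at the "NCQ (...)" part of the device summary line, as seen in the before and after excerpts above. The helper name ncq_state is made up for illustration, and reading the first number in "depth 31/32" as the depth actually in use is an assumption, not something stated in this thread.

```python
import re

# Matches the NCQ part of the device summary line, e.g.
# "NCQ (depth 31/32)" or "NCQ (not used)".
NCQ_RE = re.compile(r"NCQ \((?:depth (\d+)(?:/(\d+))?|not used)\)")


def ncq_state(line):
    """Return NCQ info for one kernel log line, or None if the line has none."""
    match = NCQ_RE.search(line)
    if match is None:
        return None
    if match.group(1) is None:
        # The "not used" branch matched, so no depth was captured.
        return {"in_use": False, "depth": None}
    # Assumption: the first number is the queue depth the host will use.
    return {"in_use": True, "depth": int(match.group(1))}
```

Feeding it the two "500118192 sectors" lines above distinguishes the state before the workaround (NCQ in use) from the state after it (NCQ not used).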

moojix (moojix) wrote :

marvell 9123 sata ahci initialization errors: https://bugzilla.kernel.org/show_bug.cgi?id=15573

Colan Schwartz (colan) wrote :

Confirming this in Lucid.

Changed in ubuntu:
status: New → Confirmed
Colan Schwartz (colan) wrote :

Oct 3 02:40:55 tiger kernel: [447432.011325] ata5.00: exception Emask 0x0 SAct 0x3ffff SErr 0x0 action 0x6 frozen
Oct 3 02:40:55 tiger kernel: [447432.011334] ata5.00: failed command: READ FPDMA QUEUED
Oct 3 02:40:55 tiger kernel: [447432.011344] ata5.00: cmd 60/80:00:bf:1f:3c/00:00:27:00:00/40 tag 0 ncq 65536 in
Oct 3 02:40:55 tiger kernel: [447432.011346] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 3 02:40:55 tiger kernel: [447432.011350] ata5.00: status: { DRDY }
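
As a reading aid for the "SAct" value in exception lines like the one above: SAct is understood here to be the SATA SActive register, a bitmask with one bit per outstanding NCQ tag. That interpretation is an assumption, not something stated in this thread. Under it, the following sketch lists which tags were in flight when the port froze; the function names are illustrative only.

```python
import re

SACT_RE = re.compile(r"SAct (0x[0-9a-f]+)")


def active_tags(sact):
    """Return the NCQ tag numbers whose bits are set in an SAct bitmask."""
    return [tag for tag in range(32) if (sact >> tag) & 1]


def tags_from_exception_line(line):
    """Extract SAct from a libata 'exception' line and decode it."""
    match = SACT_RE.search(line)
    if match is None:
        raise ValueError("line has no SAct field")
    return active_tags(int(match.group(1), 16))
```

With this reading, SAct 0xf from the original report means tags 0 to 3 were outstanding, which matches the four failed commands listed there, and SAct 0x3ffff means 18 commands were queued at the time of the timeout.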

João Pinto (joaopinto) wrote :

I am having the same issue in Maverick, WD Black Caviar 1TB SATA III disk, AHCI mode.

Vangelis Tasoulas (cyberang3l) wrote :

I am affected by the same bug too :(

pepre (ea1256) wrote :

Same here :-(

After adding "libata.force=noncq" to kernel-bootparameters, the error changed to

exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
irq_stat 0x40000001
failed command: READ DMA EXT
cmd 25/00:e0:df:d7:f8/00:00:88:00:00/e0 tag 0 dma 114688 in
         res 51/40:00:f5:d7:f8/00:00:88:00:00/e0 Emask 0x9 (media error)
status: { DRDY ERR }
error: { UNC }
configured for UDMA/133
EH complete

when reading big files fast. I think the disaster began suddenly about two months ago.

Running lucid up to date. HDs: raid5/lvm.
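
For anyone wanting to confirm that a libata.force setting like the one above actually reached the kernel, here is a rough sketch that parses a kernel command line. It assumes the documented option form, a comma-separated list of "[ID:]VAL" items in which an item without an ID reuses the last ID given, or applies to all ports if none has been given yet. The function name and the sample command line are hypothetical.

```python
def libata_force_entries(cmdline):
    """Return (port_id, value) pairs from libata.force=...; port_id None means all ports."""
    entries = []
    last_id = None
    for token in cmdline.split():
        if not token.startswith("libata.force="):
            continue
        for item in token.split("=", 1)[1].split(","):
            port_id, _, value = item.rpartition(":")
            if port_id:
                last_id = port_id
            # An item without an ID inherits the most recent ID (None = every port).
            entries.append((last_id, value))
    return entries


def ncq_disabled_everywhere(cmdline):
    """True if 'noncq' is forced without restricting it to a particular port."""
    return (None, "noncq") in libata_force_entries(cmdline)
```

On a running system the input string could come from reading /proc/cmdline. A per-port form such as libata.force=1.00:noncq would limit the change to the device the kernel log calls ata1.00.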

Gerry Reno (greno-verizon) wrote :

I'm seeing this same error with Lucid x86_64 and kernel 2.6.32-21.

My drive is a Hitachi 500GB SATA.

Once this error starts I get hung task timeouts and the system starts freezing.

Gerry Reno (greno-verizon) wrote :

It also generates filesystem errors such that I have to run fsck on the next boot.

Gerry Reno (greno-verizon) wrote :

And I just checked and my south bridge is an AMD SB750 with 6 SATA. I'm using AHCI mode.

Gerry Reno (greno-verizon) wrote :

I just went and upgraded to kernel 2.6.37-12-server from kernel-ppa
and after rebooting into it I'm still seeing the same READ FPDMA QUEUED errors as before.
Both drives in the machine check out fine according to the drive tests.

I've noticed a little unexplained freezing up and releasing the past couple days and then today it started with these errors almost constantly. And I'm trying to remember what packages I might have updated recently that could have contributed to this issue.

I'm also going to check all the cabling in the server to make sure nothing has worked loose.

Gerry Reno (greno-verizon) wrote :

Cabling is fine.

I just tried to run 'sudo apt-get update' and it is telling me that I need to run 'dpkg --configure -a'. When I do that it tries to reconfigure the new kernel package again. So the read errors must have prevented the configure from completing.

But the read errors will not let this kernel package configure run to completion even now so I'm stuck. Cannot run apt-get till I figure out how to get the configure to finish without triggering all these errors.

Gerry Reno (greno-verizon) wrote :

Tried adding libata.force=noncq to 2.6.37-12-server kernel boot line and the errors are changed to READ DMA EXT.

This particular machine has a Gigabyte M/B.

Some details:
# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia (Intel HDA)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3300 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 02)
03:0e.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)

Adam Ziegler (mrbond) wrote :

I can confirm the same on Maverick x64 Server, kernel 2.6.35-24-server, all latest software updates. ASUS P6X58D Premium, latest BIOS.

I am seeing ata soft resets, hard resets, and then crashes on both devices connected to my Marvell 9123 ports (SATA-6Gbps), a SATA-3Gbps Corsair F120 SSD (w/latest firmware, as the OS/boot drive) and a SATA-1.5Gbps LG Blu-ray reader. Preceded by "READ FPDMA QUEUED" command errors in dmesg. All other SATA devices connected to the SATA-3Gbps controller are fine.

The system is hardlocked, except...I can still login via SSH from a networked machine, and I can browse the machine through SFTP. Trying to run basic system commands (like ls -l, fdisk, sudo, and the like) results in "Bus error" or "Input/output error". "shutdown" is impossible, so I have to hard-reset the machine. This happens randomly, without any apparent cause, and with no data loss or corruption (though it scares me a little to hard-reset at all).

I recall this happening with prior kernels, as well. I will try switching the drive to a SATA-3Gbps port and see what happens. I don't care so much about losing access to the Blu-ray drive periodically, but I'd rather not lose the OS drive.

Adam Ziegler (mrbond) wrote :

Also, I'd move that this bug be marked fairly important/high priority, as if you have your OS drive connected to the affected ports, the system hardcrashes, an absolute failure for systems that need stable/consistent uptime.

Nicolas Krzywinski (nsk7even) wrote :

For me it's even worse: the Crucial RealSSD C300 is only sometimes recognised, even at the GRUB stage. When I am lucky and it is found, I can rarely boot all the way; mostly it gets stuck at those "exception Emask ... frozen" messages.

Things got worse with the upgrade from Lucid. There, booting and working with the system worked most of the time, but the messages above were omnipresent in syslog. I hoped a dist-upgrade would solve this; it did not. The upgrade itself was hard work: I had to do it on a minimal system without a GUI to avoid getting stuck at some installation steps. And now I cannot use the system anymore.

Note that Windows 7 works without a problem, though I notice some kind of delay at boot before the first boot screen appears.

The system is an Asus P7P55D-E LX with the C300 connected to the 6 Gb/s Marvell controller.

Lucas Hope (lucas-r-hope) wrote :

For people having problems with this, I would like to re-iterate two things:

1. The problem was fixed when I went through the 3Gb/s channel.

2. I had to re-install the OS completely due to data corruption caused by the READ FPDMA QUEUED errors. Don't expect your system to work properly until you've re-installed.

The workaround of using the 3Gb/s SATA channel has worked perfectly for me for five months.

Good luck.

Gerry Reno (greno-verizon) wrote :

>>> 1. The problem was fixed when I went through the 3Gb/s channel.

That's probably because it was a different controller, or the hardware at least differed sufficiently so as not to exhibit the problem.

>>> 2. I had to re-install the OS completely due to data corruption caused by the READ FPDMA QUEUED errors. Don't expect your system to work properly until you've re-installed.

I've had no problem getting 'fsck.ext4' to repair the minimal amount of problems that were caused so far. My system remounts the filesystem read-only immediately upon detection of errors so nothing else can be written and maybe that helped reduce any corruption.

Nevertheless, this is certainly a very serious problem and this bug warrants a high priority.

Gerry Reno (greno-verizon) wrote :

Opened a kernel bug about this problem: https://bugzilla.kernel.org/show_bug.cgi?id=26702

.

Nicolas Krzywinski (nsk7even) wrote :

I am pretty sure that my system would work if I connected the C300 to the other SATA ports, controlled by the Intel ICH, but I selected this hardware combination _especially_ because the C300 and the Marvell controller can communicate beyond SATA 3G performance (benchmarks proved that they really use that bandwidth).
As long as the other operating system is able to work at that speed (though I admit I never measured it, because at work I have to work and cannot play around with stuff like that for long...), downgrading to an older interface specification is not an option for me.

pepre (ea1256) wrote :

It is not only caused by Marvell chips:

$ lspci
00:00.0 Host bridge: ATI Technologies Inc RX780/RX790 Chipset Host Bridge
00:02.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (external gfx0 port A)
00:04.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (PCI express gpp port A)
00:0a.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (PCI express gpp port F)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3c)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Miscellaneous Control

Edwin Chiu (edwin-chiu) wrote :

Try a different HD. I've had similar issues with my Seagate ST32000542AS: 3 out of 5 drives died within 4 weeks already. Prior to their "official" death, I saw errors similar to those above.

pepre (ea1256) wrote :

> Try a different HD

This doesn't help. My RAID5 with 4 HDs works perfectly with archlinux. SMART and various stress tests didn't show any HD errors.

It's a real bug, not a hardware failure.

I'm having this problem in 2.6.35-25-generic #44-Ubuntu SMP Fri Jan 21 17:40:44 UTC 2011 x86_64 GNU/Linux too.

Crashing at least daily. Only able to recover with great effort.

jnygaard (jens-olav-nygaard) wrote :

Same here. One way to trigger the problem is to do a "du" on a 500GB partition with a lot of files, both small and large. After a while:

Feb 3 23:10:19 xx kernel: [199664.670378] ata9: hard resetting link
Feb 3 23:10:20 xx kernel: [199665.032664] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 3 23:10:20 xx kernel: [199665.045688] ata9.00: configured for UDMA/133
Feb 3 23:10:20 xx kernel: [199665.045695] ata9.00: device reported invalid CHS sector 0
Feb 3 23:10:20 xx kernel: [199665.045702] ata9: EH complete

I just noticed this after moving my 3 SATA disks from the Intel SATA 3Gbps ports on my P8P67 mainboard (the Intel bug thingy) to the 6 Gbps ports on the same mainboard. The error messages stem from the Marvell ports.
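
The "SStatus" number in the link-up lines can be decoded to see what speed the link actually negotiated. This sketch assumes the standard SATA SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) and that the kernel prints the value in hex. Neither is stated in this thread, but both agree with the lines quoted here, where 123 is paired with 3.0 Gbps, 113 with 1.5 Gbps and 133 with 6.0 Gbps. The function name is illustrative.

```python
# SPD field values, per the assumed SATA SStatus layout.
LINK_SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_sstatus(hex_text):
    """Split an SStatus value, given as printed in the kernel log, into its fields."""
    value = int(hex_text, 16)
    det = value & 0xF          # device detection state
    spd = (value >> 4) & 0xF   # negotiated interface speed
    ipm = (value >> 8) & 0xF   # interface power management state
    return {"det": det, "speed": LINK_SPEEDS.get(spd, "unknown"), "ipm": ipm}
```

This is mainly useful when a workaround is supposed to have lowered the link speed: the SPD nibble shows what was negotiated, independently of the text the kernel prints next to it.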

andornaut (andornaut) wrote :

I'm experiencing a similar issue. The system hangs periodically for about a minute while the HD resets.

Environment:
Ubuntu running in a VirtualBox VM on Windows 7 64-bit.
Asus G53JW Laptop
Intel X25 SSD

Log excerpts:
Jan 17 14:47:58 vm rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="578" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Jan 17 14:48:32 vm kernel: [ 1513.120252] ata3: hard resetting link
Jan 17 14:48:33 vm kernel: [ 1513.470345] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 17 14:48:33 vm kernel: [ 1513.471232] ata3.00: configured for UDMA/133
Jan 17 14:48:33 vm kernel: [ 1513.471241] ata3.00: device reported invalid CHS sector 0
Jan 17 14:48:33 vm kernel: [ 1513.471255] ata3: EH complete

[ 1513.120200] ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[ 1513.120210] ata3.00: failed command: READ FPDMA QUEUED
[ 1513.120222] ata3.00: cmd 60/08:00:30:66:4c/00:00:00:00:00/40 tag 0 ncq 4096 in
[ 1513.120224] res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 1513.120230] ata3.00: status: { DRDY }
[ 1513.120252] ata3: hard resetting link
[ 1513.470345] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1513.471232] ata3.00: configured for UDMA/133
[ 1513.471241] ata3.00: device reported invalid CHS sector 0
[ 1513.471255] ata3: EH complete

Edwin Chiu (edwin-chiu) wrote :

So this appears to be happening on Marvell, JMicron and SB700/800 chips, not good!

What kernel version of archlinux are you running? Could this be a regression from around the 2.6.33 timeframe?

Another solution I've seen (it didn't work for me) was to try pcie_aspm=off in your boot options.

Edwin Chiu (edwin-chiu) wrote :

Tried booting 2.6.31-22-server (from karmic) on a maverick install and same error. I'm not entirely convinced this is a software bug, seems to target the same drive. I have 5 identical drives, and switching them around, so they are on different ports/cables, etc. doesn't seem to make the problem shift. Seems to be the drive...

On an individual basis, I'd say I had some bad drives, but when taking into account other reports, seems to be more than just a bad drive, but on a single system basis, it doesn't add up? If this was a software or hardware (non HD) bug, why does the problem follow the bad drive around? Why don't I get the problem on other drives?

Below is the output from 2.6.31-22, looks like the ata code isn't as robust, as it fails the drive and kicks it. Maverick seems better at recovering the drive so that it's usable.

My "reliable" way of triggering this is to launch a kvm process (tried virtio and ide emulation, same trigger). On the LV that hosts the kvm guest, I was able to dd the entire volume to /dev/null without any read issues...

[ 286.010222] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 286.010242] ata5.00: cmd 25/00:08:20:b4:4d/00:00:78:00:00/e0 tag 0 dma 4096 in
[ 286.010246] res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 286.010253] ata5.00: status: { DRDY }
[ 286.010262] ata5: hard resetting link
[ 291.580170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 291.580180] ata5.00: link online but device misclassifed
[ 296.580129] ata5.00: qc timeout (cmd 0xec)
[ 296.580166] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 296.580172] ata5.00: revalidation failed (errno=-5)
[ 296.580181] ata5: hard resetting link
[ 302.150170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 302.150179] ata5.00: link online but device misclassifed
[ 312.150066] ata5.00: qc timeout (cmd 0xec)
[ 312.150103] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 312.150109] ata5.00: revalidation failed (errno=-5)
[ 312.150116] ata5: limiting SATA link speed to 1.5 Gbps
[ 312.150124] ata5: hard resetting link
[ 317.720136] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 317.720145] ata5.00: link online but device misclassifed
[ 347.720098] ata5.00: qc timeout (cmd 0xec)
[ 347.720135] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 347.720142] ata5.00: revalidation failed (errno=-5)
[ 347.720148] ata5.00: disabled
[ 347.720162] ata5.00: device reported invalid CHS sector 0
[ 347.720176] ata5: hard resetting link
[ 353.290169] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 353.290178] ata5.00: link online but device misclassifed
[ 353.290198] ata5: EH complete
[ 353.290224] sd 4:0:0:0: [sdd] Unhandled error code
[ 353.290229] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 353.290237] end_request: I/O error, dev sdd, sector 2018358304
[ 353.290245] raid10: sdd4: rescheduling sector 1542176
[ 353.290271] sd 4:0:0:0: [sdd] Unhandled error code
[ 353.290275] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 353.290282] end_request: I/O error, dev sdd, sector 22636850...


Matt Cargo (mcargo) wrote :

Affected by the same bug. Running Ubuntu 10.04.2 LTS on hp laptop. See
attached lspci output.

After running fine for a while, the root file system develops errors and
is remounted as read-only. Upon reboot, I get only simple ash shell.
I can fix the long list of file system errors with e2fsck, but I'm
worried about whether this will continue to work, and doing these fixes
wastes time. Any other info supplied on request.

Some history: The first time I upgraded to 10.04, I started having
similar file system problems. Bought a new hard drive, and downgraded
to the previous Ubuntu. No problems for a long time. I decided I should
upgrade, and the problem resurfaced.

affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
71 comments hidden view all 151 comments
P. S. (sadowsky46) wrote :

Shame on me... on my machine this issue was in fact a problem with the HDD: some unreadable sectors. Strangely, the "long" SMART self-test reported no errors. Then I ran the WD proprietary HD checker. It also reported "no error", but after the tool ran, the issue was gone. Apparently it silently remapped the bad blocks to the spare area.
So in the end, also not a SW bug on my side.
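
One hint for telling these cases apart is the per-command Emask in the "res" line. In this thread the FPDMA failures carry Emask 0x4, labelled "timeout", while the READ DMA EXT failure reported earlier carries Emask 0x9, labelled "media error", together with UNC. The sketch below treats Emask as a bitmask using bit names that mirror the kernel's AC_ERR_* flags. Those bit assignments are an assumption taken from the kernel's libata header rather than from this report, and the function name is illustrative.

```python
# Assumed bit meanings, mirroring the AC_ERR_* flags in the kernel's libata header.
ERR_MASK_BITS = [
    (0x001, "device error"),
    (0x002, "HSM violation"),
    (0x004, "timeout"),
    (0x008, "media error"),
    (0x010, "ATA bus error"),
    (0x020, "host bus error"),
    (0x040, "system error"),
    (0x080, "invalid argument"),
    (0x100, "unknown error"),
    (0x200, "no-device hint"),
    (0x400, "NCQ error"),
]


def decode_emask(emask):
    """Return the names of all error bits set in a libata Emask value."""
    return [name for bit, name in ERR_MASK_BITS if emask & bit]
```

Under this reading, 0x9 means the device itself reported an unreadable sector, which points at the disk, whereas a bare 0x4 only says the command never completed and leaves the cause (drive, cable, controller or driver) open.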

Steve Franks (bahamasfranks) wrote :

@penalvch :

It's happening to me right now on 3.1.8, on a month-old Lenovo E420 and a brand-new OCZ Vertex 2 60GB SSD. Is that upstream enough for ya?

Steve

Apr 9 12:01:07 fyre kernel: [ 104.803190] ata1: hard resetting link
Apr 9 12:01:07 fyre kernel: [ 105.129298] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Apr 9 12:01:07 fyre kernel: [ 105.200236] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Apr 9 12:01:07 fyre kernel: [ 105.200246] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
Apr 9 12:01:07 fyre kernel: [ 105.220215] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Apr 9 12:01:07 fyre kernel: [ 105.220223] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
Apr 9 12:01:07 fyre kernel: [ 105.230597] ata1.00: configured for UDMA/133
Apr 9 12:01:07 fyre kernel: [ 105.230965] ata1.00: device reported invalid CHS sector 0
Apr 9 12:01:07 fyre kernel: [ 105.230971] ata1.00: device reported invalid CHS sector 0
Apr 9 12:01:07 fyre kernel: [ 105.230976] ata1.00: device reported invalid CHS sector 0
Apr 9 12:01:07 fyre kernel: [ 105.230982] ata1.00: device reported invalid CHS sector 0
Apr 9 12:01:07 fyre kernel: [ 105.230987] ata1.00: device reported invalid CHS sector 0
Apr 9 12:01:07 fyre kernel: [ 105.231006] ata1: EH complete
steve@fyre:~$ uname -a
Linux fyre 3.1.8-030108-generic-pae #201201061759 SMP Fri Jan 6 23:15:58 UTC 2012 i686 GNU/Linux
steve@fyre:~$

Steve Franks (bahamasfranks) wrote :

@penalvch :

> apport-collect -p linux 550559

"You are not the reporter or subscriber of this problem report, or the report is a duplicate or already closed.
Please create a new report using "apport-bug"."

Then tried apport-bug and it complains that I'm not running a ubuntu kernel, which kinda makes sense since it's upstream like you asked. Go figure. If you want your insider info, you're gonna have to help me fool my system into getting it for you...

Steve

Steve Frank, please execute the following at the Terminal and feel free to subscribe me to it:
ubuntu-bug linux

Thanks!

Steve Frank = Steve Franks

PieroCampa (piero-campa) wrote :

Same problem here.
Sometime I get very long boots.
Very. Very. Long..
I attach my whole dmesg, whereas here I report a significant extract:
     ...
     [ 14.297122] EXT4-fs (sda5): INFO: recovery required on readonly filesystem
     [ 14.297130] EXT4-fs (sda5): write access will be enabled during recovery
     [ 186.461499] EXT4-fs (sda5): recovery complete
     [ 186.934519] EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)
     [ 270.832089] ata1.00: exception Emask 0x0 SAct 0x6003fffe SErr 0x0 action 0x6 frozen
     [ 270.832103] ata1.00: failed command: READ FPDMA QUEUED
     [ 270.832113] ata1.00: cmd 60/00:08:00:10:4d/04:00:0e:00:00/40 tag 1 ncq 524288 in
     [ 270.832114] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     ...

Running Ubuntu Oneiric + LXDE on Dell XPS Studio 1640 Laptop, with following HDD:

    ATA device, with non-removable media
 Model Number: WDC WD3200BJKT-75F4T0
 Serial Number: WD-WXM209AS1908
 Firmware Revision: 11.01A11
 Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5

Thanks for all your work to help us.

PieroCampa, please execute the following at the Terminal and feel free to subscribe me to it:
ubuntu-bug linux

Thanks!

I'm also seeing "failed command: WRITE FPDMA QUEUED" using a GA 990FXA-UD7 board with Crucial m4 SSDs connected to the Southbridge SATA controller (ATI SB950) on oneiric . Sometimes this error would crash the machine (the RAID5 module actually), sometimes it will just reset the SATA link and keep going.

A workaround that solves the issue for me is to go into the BIOS and disable "SATA3.0 Mode" for the southbridge SATA. This reduces the SATA link speed to 3Gbps, although Linux still reports 6Gbps in the kernel log.

I'm using a SATA drive bay from Thermaltake. Maybe the additional connectors in there degrade the SATA link in a way that makes SATA3 unreliable. This is just speculation; I haven't tested with the drives connected directly to the mainboard.
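
For anyone wanting to check which speed each port actually negotiated after changing the BIOS setting, the kernel's "SATA link up" messages can be collected per port. This is only a sketch in Python: the sample log lines are illustrative and not taken from this report, and `negotiated_speeds` is a hypothetical helper name.

```python
import re

# libata prints one line per successful link negotiation, for example:
#   ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
LINK_UP = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")


def negotiated_speeds(dmesg_text):
    """Return {port: speed in Gbps}, keeping the last link-up seen per port."""
    speeds = {}
    for line in dmesg_text.splitlines():
        match = LINK_UP.search(line)
        if match:
            speeds[match.group(1)] = float(match.group(2))
    return speeds


# Illustrative sample only (assumed values, not from this bug report).
SAMPLE = """\
[    1.204] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    1.688] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
"""

print(negotiated_speeds(SAMPLE))
```

Feeding it the saved output of dmesg from before and after the BIOS change would show whether the reported speed really stayed the same.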

Christoph Dwertmann, please execute the following at the Terminal and feel free to subscribe me to it:
ubuntu-bug linux

Thanks!

yemu (yemu) wrote :

The same thing happens here on an OCZ Vertex 3 with an Asrock 770 Extreme 3 board. The system randomly freezes for a couple of seconds and then I see errors in the log.
...
ata6.00: exception Emask 0x0 SAct 0x1fffff SErr 0x0 action 0x6 frozen
ata6.00: failed command: READ FPDMA QUEUED
ata6.00: cmd 60/08:00:d0:d4:e6/00:00:00:00:00/40 tag 0 ncq 4096 in
res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
...
and so on

I'm using Precise upgraded on 26.04.2012 - kernel 3.2.0-23-generic-pae.
y

yemu, please execute the following at the Terminal and feel free to subscribe me to it:
ubuntu-bug linux

Thanks!

yemu (yemu) wrote :

@Christopher: thanks, I've just created new bug report as you said and subscribed you to it.

I have just been hit by this issue, with the exact same messages as the OP. This is on a two-year-old Dell E6410 with Ubuntu 12.04, completely updated.

The system boots but it is so slow it is useless. I'm trying to report a bug from the system but it is so slow that it is becoming almost impossible. I started with ubuntu-bug, but I will try to do it without the GUI and attach the files.

The new bug is bug 1002670; after 45 minutes I was able to send it from the affected system. If this is a bug, it is really serious in my opinion.

If anybody knows of a workaround to stop the kernel checking the disk and make the system usable...

ads (garboge) wrote :

I've had freeze ups for a while now and I seem to remember it occurring after a kernel upgrade a while back. Would generally have to reboot and sometimes boot from a live install CD to fix disk errors. The recent upgrade to Lucid kernel 3.0.0-19 seems to have remedied my situation

yemu (yemu) wrote :

After searching forums for a solution to my problem for the last couple of days, I think I may have found what's causing the freezes (at least for SSD drives). I'm still testing to be 100% sure, but for now I believe the freezes are caused by the driver using NCQ to communicate with the SSD (which is of course pointless: SSD drives don't need to optimize the order of disk operations, and apparently it causes some errors).

The point is that after disabling NCQ for the SSD the errors stopped occurring (I've not encountered any for about 48h; earlier the freezes happened every couple of hours or even more often).

To disable NCQ I added "libata.force=5:noncq" to default kernel options in /etc/default/grub like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=5:noncq"

and updated grub.

I used "5:" because the errors I saw occurred with ata5 in dmesg. In my original report to this bug it was ata6, as you can see above, but it changed to ata5, probably because I was switching ports on my motherboard.

If anyone experiences a similar bug, please try this solution and report here.

credits for the solution go to: http://ubuntuforums.org/showpost.php?p=10480137&postcount=8
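
Since the port number in the workaround has to match the ataN that shows up in dmesg, here is a minimal Python sketch of picking it out of a log and building the parameter. The helper names are made up for illustration; the sample lines are the ata6 excerpt posted earlier in this thread, so the result names port 6, whereas on the later setup described above it would be 5.

```python
import re

# Matches e.g. "ata6.00: failed command: READ FPDMA QUEUED"
FAILED_NCQ = re.compile(
    r"ata(\d+)(?:\.\d+)?: failed command: (?:READ|WRITE) FPDMA QUEUED"
)


def ports_with_ncq_failures(dmesg_text):
    """Return the sorted ata port numbers that logged a failed NCQ command."""
    return sorted({int(m.group(1)) for m in FAILED_NCQ.finditer(dmesg_text)})


def noncq_param(ports):
    """Build a libata.force value that disables NCQ on the given ports."""
    if not ports:
        return ""
    return "libata.force=" + ",".join(f"{port}:noncq" for port in ports)


LOG = """\
ata6.00: exception Emask 0x0 SAct 0x1fffff SErr 0x0 action 0x6 frozen
ata6.00: failed command: READ FPDMA QUEUED
"""

print(noncq_param(ports_with_ncq_failures(LOG)))
```

The resulting string would go into GRUB_CMDLINE_LINUX_DEFAULT as shown above. Port numbers can change when cables are moved, so it is worth re-checking dmesg afterwards.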

I had this problem with 10.04 some years ago. I just updated to 12.04 (and the 3.2.0-26-generic kernel) and it's back. I am using an older OCZ Agility SSD. Booting with NCQ disabled for that drive fixed my problem then and now, in terms of boot time. I still get an error message:

[ 2.895665] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
[ 2.906325] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 2.908085] ACPI: resource piix4_smbus [io 0x0b00-0x0b07] conflicts with ACPI region SOR1 [io 0xb00-0xb0f]
[ 2.908087] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

I have no idea how to handle that one.
Cheers.

Tulasi (murtulasi) wrote :

I have Oneiric on my IBM Lenovo, and today I have this problem of "failed command: READ FPDMA QUEUED" when I am trying to boot. How can I solve this issue? Any pointers will be helpful.

Thanks.

Tulasi, could you please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see https://help.ubuntu.com/community/ReportingBugs#Bug_Reporting_Etiquette . If you do file a new report, please feel free to subscribe me to it. Thank you for your understanding.

Helpful Bug Reporting Links:
https://help.ubuntu.com/community/ReportingBugs#A3._Make_sure_the_bug_hasn.27t_already_been_reported
https://help.ubuntu.com/community/ReportingBugs#Adding_Apport_Debug_Information_to_an_Existing_Launchpad_Bug
https://help.ubuntu.com/community/ReportingBugs#Adding_Additional_Attachments_to_an_Existing_Launchpad_Bug

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Laurent Dinclaux (dreadlox) wrote :

What is wrong with this bug report? There are 130 comments and about 80 affected people, the issue has been narrowed down to NCQ (disabling it is a workaround), and the bug even prevents people facing it from properly installing Ubuntu.

Isn't that "complete" enough?

pepre (me-pepre) wrote :

> Isn't that "complete" enough?

I don't know why Christopher incompletes it all the time; perhaps:

"We don't need to think about things that are not possible."

;-)

Just for completeness:

since installing 12.04.1 server (adding fluxbox) the bug doesn't appear any more.

SW RAID5 with

SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
SATA controller: Marvell Technology Group Ltd. 88SE9123 PCIe SATA 6.0 Gb/s controller (rev 11)

Pepre, if you have a bug in Ubuntu, could you please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see the Ubuntu Bug Control and Ubuntu Bug Squad article:
https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue

and Ubuntu Community article:
https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report may delay your problem being addressed as quickly as possible.

Thank you for your understanding.

Bernhard (baumber) wrote :

Hello,

I have the same problem on an ASUS P7P55D-E PRO mainboard with the Marvell
88SE9123 PCIe SATA 6.0 onboard controller and two WDC WD1002FAEX-00Z3A0 Sata 6.0 disks.

=> READ and WRITE freezes with SATA 6.0Gbps and NCQ; Limiting to SATA 3.0Gbps without NCQ gives me a stable system.
( kernel parameter at grub: libata.force=7:noncq,3.0Gbps,8:noncq,3.0Gbps )

There is a kernel thread with the problem and I added my description to it:

https://bugzilla.kernel.org/show_bug.cgi?id=43160

Best regards, Bernhard
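
The kernel parameter above relies on an option without its own "ID:" prefix applying to the most recently named port, which is how "3.0Gbps" ends up on ports 7 and 8. That is my reading of the libata.force description in the kernel parameter documentation, so treat this Python sketch (with a hypothetical `parse_libata_force` helper) as an illustration of that assumption and not as the kernel's actual parser.

```python
def parse_libata_force(value):
    """Split a libata.force value into (id, option) pairs.

    Assumption: an entry without "ID:" reuses the last ID seen; before any
    ID is given, None stands for "all ports".
    """
    pairs = []
    current_id = None
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            current_id, option = item.split(":", 1)
        else:
            option = item
        pairs.append((current_id, option))
    return pairs


for port, option in parse_libata_force("7:noncq,3.0Gbps,8:noncq,3.0Gbps"):
    print(port, option)
```

Under that assumption both ports get NCQ disabled and the link limited to 3.0Gbps, which matches the stable configuration described in the comment.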

pepre (me-pepre) wrote :

> Christopher M. Penalver (penalvch) wrote on 2012-11-01:

> > Pepre, if you have a bug in Ubuntu, could you please file a
> > new report by executing the following in a terminal:
> > ubuntu-bug linux

Ok, done. See:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1077718

:-)

I have the same problem, but running Debian 6.0.6 with a 2.6 kernel. I have tried two different motherboards, different cables etc., but the problem is always there with the SSD SATA3 drive (which works perfectly in Windows 7 with the same hardware). I have SATA3 HDDs on the same machine that work without problems. So it is either a misunderstanding between the hardware designers and the kernel coders, or many SSD drives are just faulty somehow. I wouldn't be surprised if they all use some common component that causes these problems.

Oh, forgot to mention in comment #135 that the SSD drive is connected to Intel SATA3 port.

00:1f.2 SATA controller: Intel Corporation Device 1e02 (rev 04)

igor (icicimov-gmail) wrote :

Same problem here, since the upgrade to 12.04 I think. It's on a 1TB Seagate drive that I have had running for four and a half years now in a Mythbuntu setup. It's the OS drive unfortunately; I have another WD 2TB which is fine. I started experiencing MythTV freezes and crashes lately, so I thought to do a SMART test and found the errors. AHCI mode was enabled in the BIOS for the drive before install.

00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0])
 Subsystem: Giga-byte Technology Device b002
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 32
 Interrupt: pin A routed to IRQ 42
 Region 0: I/O ports at ff00 [size=8]
 Region 1: I/O ports at fe00 [size=4]
 Region 2: I/O ports at fd00 [size=8]
 Region 3: I/O ports at fc00 [size=4]
 Region 4: I/O ports at fb00 [size=16]
 Region 5: Memory at fe02f000 (32-bit, non-prefetchable) [size=1K]
 Capabilities: <access denied>
 Kernel driver in use: ahci

[ 9397.726911] ata1.00: exception Emask 0x0 SAct 0x1ff SErr 0x0 action 0x0
[ 9397.726918] ata1.00: irq_stat 0x40000008
[ 9397.726926] ata1.00: failed command: READ FPDMA QUEUED
[ 9397.726941] ata1.00: cmd 60/38:20:6e:29:7a/00:00:1b:00:00/40 tag 4 ncq 28672 in
[ 9397.726950] ata1.00: status: { DRDY ERR }
[ 9397.726955] ata1.00: error: { UNC }
[ 9397.749426] ata1.00: configured for UDMA/133
[ 9397.749455] ata1: EH complete
[ 9532.787964] ata1.00: exception Emask 0x0 SAct 0xfc SErr 0x0 action 0x0
[ 9532.787972] ata1.00: irq_stat 0x40000008
[ 9532.787981] ata1.00: failed command: READ FPDMA QUEUED
[ 9532.787996] ata1.00: cmd 60/e0:10:f6:2b:64/00:00:1b:00:00/40 tag 2 ncq 114688 in
[ 9532.788025] ata1.00: status: { DRDY ERR }
[ 9532.788031] ata1.00: error: { UNC }
[ 9532.867447] ata1.00: configured for UDMA/133
[ 9532.867475] ata1: EH complete
[ 9536.952430] ata1.00: exception Emask 0x0 SAct 0x3e SErr 0x0 action 0x0
[ 9536.952438] ata1.00: irq_stat 0x40000008
[ 9536.952446] ata1.00: failed command: READ FPDMA QUEUED
[ 9536.952460] ata1.00: cmd 60/e0:28:f6:2b:64/00:00:1b:00:00/40 tag 5 ncq 114688 in
[ 9536.952469] ata1.00: status: { DRDY ERR }
[ 9536.952474] ata1.00: error: { UNC }
[ 9536.974938] ata1.00: configured for UDMA/133
[ 9536.974964] ata1: EH complete
[ 9541.683446] ata1.00: exception Emask 0x0 SAct 0x1fff SErr 0x0 action 0x0
[ 9541.683454] ata1.00: irq_stat 0x40000008
[ 9541.683463] ata1.00: failed command: READ FPDMA QUEUED
[ 9541.683477] ata1.00: cmd 60/e0:00:f6:2b:64/00:00:1b:00:00/40 tag 0 ncq 114688 in
[ 9541.683487] ata1.00: status: { DRDY ERR }
[ 9541.683492] ata1.00: error: { UNC }
[ 9541.705986] ata1.00: configured for UDMA/133
[ 9541.706020] ata1: EH complete
[48304.968186] ata1.00: exception Emask 0x0 SAct 0x3ff SErr 0x0 action 0x0
[48304.968195] ata1.00: irq_stat 0x40000001
[48304.968203] ata1.00: failed command: READ FPDMA QUEUED
[48304.968218] ata1.00: cmd 60/20:00:9e:73:a2/00:00:1b:00:00/40 tag 0 ncq 16384 in
[48304.968228] ata1.00: status: { DRDY ERR }
[48304.968232] ata1.00: error: { ABRT }
[48304.968238] ata1.00: faile...
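
The "cmd" lines in logs like the one above pack the ATA taskfile registers as hex bytes. As far as I know libata prints them as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and for NCQ commands the sector count sits in the feature fields while the tag is in the upper bits of nsect. Here is a Python sketch under that assumption, with 512-byte logical sectors also assumed; `decode_ncq_cmd` is a hypothetical helper.

```python
import re

# Three slash-separated groups of hex bytes between the command and device bytes.
CMD = re.compile(
    r"cmd (?P<cmd>[0-9a-f]{2})/(?P<lo>[0-9a-f:]{14})/"
    r"(?P<hob>[0-9a-f:]{14})/(?P<dev>[0-9a-f]{2})"
)


def decode_ncq_cmd(line):
    """Decode the taskfile of a READ/WRITE FPDMA QUEUED 'cmd' log line."""
    m = CMD.search(line)
    if m is None:
        raise ValueError("no cmd field found")
    feature, nsect, lbal, lbam, lbah = (
        int(b, 16) for b in m.group("lo").split(":")
    )
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in m.group("hob").split(":")
    )
    # For NCQ the 16-bit sector count lives in feature; 0 means 65536.
    sectors = ((hob_feature << 8) | feature) or 65536
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "command": int(m.group("cmd"), 16),
        "tag": nsect >> 3,          # NCQ tag is bits 7:3 of nsect
        "sectors": sectors,
        "bytes": sectors * 512,     # assumes 512-byte logical sectors
        "lba": lba,
    }


line = "ata1.00: cmd 60/38:20:6e:29:7a/00:00:1b:00:00/40 tag 4 ncq 28672 in"
print(decode_ncq_cmd(line))
```

For the first cmd line in the log above, the decoded tag (4) and byte count (28672) agree with what the kernel printed on the same line. Note that this log reports "error: { UNC }", an uncorrectable read at the decoded LBA, whereas most other logs in this thread show "Emask 0x4 (timeout)", so it may well be a different problem from the NCQ timeouts.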


igor, if you have a bug in Ubuntu, could you please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see the Ubuntu Kernel team article:
https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports

the Ubuntu Bug Control team and Ubuntu Bug Squad team article:
https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue

and Ubuntu Community article:
https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report may delay your problem being addressed as quickly as possible.

Thank you for your understanding.

Phillip Susi (psusi) wrote :

This bug should have expired a while ago due to lack of response from the original reporter. It appears that several other people have piled on different and unrelated issues. If they are still having issues, they should file separate bug reports.

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Vincent (vincent-voyer) wrote :

I would like to say that comment #126 was right, in my case at least.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559/comments/126

Try disabling ncq for your SSD drive before anything else.

I also tried latest kernels (3.7) with no change.

Vincent, if you have a bug in Ubuntu, could you please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see the Ubuntu Kernel team article:
https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports

the Ubuntu Bug Control team and Ubuntu Bug Squad team article:
https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue

and Ubuntu Community article:
https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report may delay your problem being addressed as quickly as possible.

Thank you for your understanding.

tigrez (davide-bernardo) wrote :

I also have this problem; I solved it by using ext2 and ext3 partitions. Maybe an ext4 bug?

tigrez, if you have a bug in Ubuntu, the Ubuntu Kernel team, Ubuntu Bug Control team, and Ubuntu Bug Squad would like you to please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see the Ubuntu Kernel team article:
https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports

the Ubuntu Bug Control team and Ubuntu Bug Squad team article:
https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue

and Ubuntu Community article:
https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report would delay your problem being addressed as quickly as possible.

Thank you for your understanding.

pepre (me-pepre) wrote :

NB: since using lowlatency-kernel and disabling NCQ the problem disappeared too :-)

Adrian Jones (adriqn) wrote :

I have had this problem, with both read and write errors, but I only noticed it recently after doing a new install on a headless server and leaving the screen plugged in. I have not had any crashes, data loss or data corruption.
Interestingly, I have 3 HP ML150 servers but only 2 report this error. The spec of each server is as follows:
server 1
Opteron quad core 2.2
8GB RAM
2 additional PCI-e ethernet cards
4 x 250GB SATA HDD in software RAID10
Ubuntu server 12.04 64bit

server 2
Opteron quad core 2.2
4GB RAM
1 additional pci-e ethernet card
4 x 1.5TB SATA HDD in software RAID10
Ubuntu server 10.04 64bit

server 3
Opteron quad core 2.2
2GB RAM
0 additional pci/pci-e cards
4 x 1.5TB SATA HDD in software RAID10
Ubuntu server 10.04 64bit

The BIOS settings are all identical. Servers 1 & 2 both have this error; the 250GB drives are getting on a bit but the 1.5TB ones are brand new. I have tried changing the cables with no effect. The error has been reported on all 4 drives for each machine.

Server 3 has not had any reports. The only other difference is the PSU has been replaced in server 3 for a cheap one from amazon!

This has led me to think that it is a power-related issue, but since it does not seem to have any impact on performance or data integrity I am not sure I need to be concerned?

My next plan is to take the drives from either server 1 or 2 and put them in server 3 to see if I get any errors.

Adrian Jones, if you have a bug in Ubuntu, the Ubuntu Kernel team, Ubuntu Bug Control team, and Ubuntu Bug Squad would like you to please file a new report by executing the following in a terminal:
ubuntu-bug linux

For more on this, please see the Ubuntu Kernel team article:
https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports

the Ubuntu Bug Control team and Ubuntu Bug Squad team article:
https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue

and Ubuntu Community article:
https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report would delay your problem being addressed as quickly as possible.

Thank you for your understanding.

Dmitriy Altuhov (altuhov.su) wrote :

Same at server HP ProLiant DL320e Gen8 v2, BIOS P80 08/28/2013 with two hard drives ST500NM0011 and ST2000NC001-1DY164

Problems only with ST2000NC001! ST500NM0011 working fine.

Dmitriy Altuhov, as this report is Status Invalid, please file a new report via a terminal:
ubuntu-bug linux

pepre (me-pepre) wrote :

After removing the Marvell card (now using a SiI 3132 with the sata_sil24 module), this error and various other problems (e.g. "gpu has fallen off the bus", early freezes, NIC sometimes not available) are gone. It looks like an incompatibility between the Marvell card and the board (Asrock A770DE+).
