ATA bus error

Bug #1385034 reported by Leoaloha
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

Maybe related to the long-standing bug #550559. A consistent ATA bus error results in a crashed disk. I tried three different SATA cables and three different physical disks. Three other disks in the system are OK. The operating system is on an SSD, /dev/sda. The crashing disk is always the one with /home, /tmp, /var and swap on it. Three different ports were used.

The error is always:

ata3: exception Emask 0x10 SAct 0x0 SErr 0x10202 action 0xe frozen
ata3: irq_stat 0x00400000, PHY RDY changed
ata3: SError: { RecovComm Persist PHYRdyChg }
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata3.00: supports DRM functions and may not be fully accessible
ata3.00: supports DRM functions and may not be fully accessible
ata3.00: configured for UDMA/133
ata3: EH complete
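
For readers decoding the log above: the SErr value is a bitmask, and the names in the "SError: { ... }" line are its set bits. The following is a minimal sketch, assuming the bit positions of the SATA SError register and the abbreviations the kernel prints; decode_serror is a hypothetical helper, not part of any tool mentioned in this report.

```python
# Sketch: map SError bits to the abbreviations seen in the kernel log.
# Bit positions are assumed from the SATA SError register layout;
# decode_serror is a hypothetical helper name.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all bits set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


print(decode_serror(0x10202))  # ['RecovComm', 'Persist', 'PHYRdyChg']
```

The result matches the flags the kernel printed for SErr 0x10202. All three are link-level (PHY and communication) conditions rather than errors reported by the drive.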

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-37-generic 3.13.0-37.64
ProcVersionSignature: Ubuntu 3.13.0-37.64-generic 3.13.11.7
Uname: Linux 3.13.0-37-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.14.1-0ubuntu3.5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: leoaloha 2776 F.... pulseaudio
 /dev/snd/controlC2: leoaloha 2776 F.... pulseaudio
 /dev/snd/controlC0: leoaloha 2776 F.... pulseaudio
CurrentDesktop: X-Cinnamon
Date: Thu Oct 23 18:27:10 2014
HibernationDevice: RESUME=UUID=deb735eb-e665-4f99-bcb3-b156a53694a0
InstallationDate: Installed on 2014-09-19 (34 days ago)
InstallationMedia: Ubuntu 14.04.1 LTS "Trusty Tahr" - Release amd64 (20140722.2)
IwConfig:
 eth0 no wireless extensions.

 eth1 no wireless extensions.

 lo no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. To be filled by O.E.M.
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-37-generic root=UUID=541845e4-c06b-4c59-8586-dcb8c476fb91 ro quiet splash iommu=soft
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-37-generic N/A
 linux-backports-modules-3.13.0-37-generic N/A
 linux-firmware 1.127.7
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/23/2013
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: FB
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: 990FXA-UD5
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrFB:bd01/23/2013:svnGigabyteTechnologyCo.,Ltd.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnGigabyteTechnologyCo.,Ltd.:rn990FXA-UD5:rvrTobefilledbyO.E.M.:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To be filled by O.E.M.
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: patch
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-rc1-utopic/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Leoaloha (lpheobus) wrote :

Tested with the new 3.18 kernel.
There was an error processing linux-headers.
The kernel did not bring up the second monitor and crashed Cinnamon, but it was able to run.
Same error condition as before, with the ATA bus error.

'kernel-bug-exists-upstream'

Leoaloha (lpheobus) wrote :

'kernel-bug-exists-upstream'

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Leoaloha (lpheobus)
tags: added: kernel-bug-exists-upstream
Leoaloha (lpheobus) wrote :

Just got this from the "dmesg" command, and I don't even have RAID turned on:

[ 2105.233121] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 2105.240312] JFS: nTxBlock = 8192, nTxLock = 65536
[ 2105.258658] NTFS driver 2.1.30 [Flags: R/O MODULE].
[ 2105.285705] QNX4 filesystem 0.2.3 registered.
[ 2105.296235] xor: automatically using best checksumming function:
[ 2105.334630] avx : 2825.000 MB/sec
[ 2105.402565] raid6: sse2x1 7258 MB/s
[ 2105.470519] raid6: sse2x2 10996 MB/s
[ 2105.538453] raid6: sse2x4 13018 MB/s
[ 2105.538455] raid6: using algorithm sse2x4 (13018 MB/s)
[ 2105.538456] raid6: using ssse3x2 recovery algorithm
[ 2105.564179] bio: create slab <bio-1> at 1
[ 2105.565200] Btrfs loaded

Leoaloha (lpheobus) wrote :

After the last power-off shutdown and restart, there was no occurrence until about 60000 seconds later (17 hours or so):

[60611.054825] ata3.00: exception Emask 0x10 SAct 0x20000 SErr 0x90202 action 0xe frozen
[60611.054830] ata3.00: irq_stat 0x00400000, PHY RDY changed
[60611.054833] ata3: SError: { RecovComm Persist PHYRdyChg 10B8B }
[60611.054836] ata3.00: failed command: READ FPDMA QUEUED
[60611.054841] ata3.00: cmd 60/08:88:b0:19:81/00:00:1e:00:00/40 tag 17 ncq 4096 in
[60611.054841] res 40/00:88:b0:19:81/00:00:1e:00:00/40 Emask 0x10 (ATA bus error)
[60611.054844] ata3.00: status: { DRDY }
[60611.054848] ata3: hard resetting link
[60617.035732] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[60617.071071] ata3.00: supports DRM functions and may not be fully accessible
[60617.072066] ata3.00: supports DRM functions and may not be fully accessible
[60617.072249] ata3.00: configured for UDMA/133
[60617.087693] ata3: EH complete
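
The numeric fields in these log lines can be cross-checked against each other. The following is a minimal sketch, assuming the SStatus register layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8), the byte order in which the kernel prints the "cmd" taskfile, and 512-byte logical sectors. decode_sstatus, active_tags and decode_ncq_cmd are hypothetical helper names.

```python
# Sketch only: field layouts are assumed from the SATA SStatus register and
# from the order in which the kernel error handler prints taskfile bytes.
# decode_sstatus, active_tags and decode_ncq_cmd are hypothetical helpers.

SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def decode_sstatus(hex_text):
    """Split an SStatus value (logged in hex) into DET, SPD and IPM."""
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,  # 3 = device detected and PHY link established
        "speed": SPEEDS.get((value >> 4) & 0xF, "unknown"),
        "ipm": (value >> 8) & 0xF,  # 1 = interface in active state
    }


def active_tags(sact):
    """List the NCQ tags whose bits are set in SAct."""
    return [tag for tag in range(32) if sact & (1 << tag)]


def decode_ncq_cmd(cmd_text):
    """Decode 'cmd' as command/feature:nsect:lbal:lbam:lbah/hob bytes/device."""
    command, cur, hob, _device = cmd_text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in cur.split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in hob.split(":")
    )
    sectors = (hob_feature << 8) | feature  # NCQ keeps the count in FEATURE
    lba = 0
    for byte in (hob_lbah, hob_lbam, hob_lbal, lbah, lbam, lbal):
        lba = (lba << 8) | byte  # assemble the 48-bit LBA, high byte first
    return {
        "command": int(command, 16),  # 0x60 = READ FPDMA QUEUED
        "tag": nsect >> 3,  # NCQ keeps the tag in bits 7:3 of COUNT
        "sectors": sectors,
        "bytes": sectors * SECTOR_BYTES,
        "lba": lba,
    }


print(decode_sstatus("123"))
print(decode_sstatus("133"))
print(active_tags(0x20000))
print(decode_ncq_cmd("60/08:88:b0:19:81/00:00:1e:00:00/40"))
```

Under these assumptions the values agree with the log text: SStatus 123 and 133 decode to 3.0 Gbps and 6.0 Gbps, SAct 0x20000 is tag 17, and the failed command is an 8-sector (4096-byte) read.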

Leoaloha (lpheobus) wrote :

Just for grins, I changed which power rail the drive was on.
No effect.

When I boot sysresccd with the 3.10.25 kernel, I get the same error.

Summary for now:

Things that were changed
   SATA cable
   SATA port
   Physical disk -- 3 different ones used (all Seagate, though)
          (I have 5 drives in the system: 1 SSD of 180 GB, plus 3 Seagate HDs and 1 Samsung HD, all 2 TB)
   Kernels 3.13, 3.14, 3.18, 3.10
   Power rail on the power supply

Things that stayed the same
   Files on the disk -- used ddrescue and dd to move the files
   Motherboard -- I have a new motherboard of a different brand, but really don't want to open the box
   Power supply -- I do have another power supply

penalvch (penalvch)
tags: added: bios-outdated-f12
removed: patch
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Leoaloha (lpheobus) wrote :

Updated BIOS to FCg.
I have a rev 3 motherboard.

Leoaloha (lpheobus) wrote :

Updated BIOS to version "FCg"

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Leoaloha (lpheobus) wrote :

With the new BIOS there is no improvement; it may even be worse.

Leoaloha (lpheobus) wrote :

Output of sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date:

FCg
10/07/2014

penalvch (penalvch) wrote :

Leoaloha, my mistake as I read the BIOS as F8.

Despite this, did this problem not occur in a release prior to Trusty?

tags: added: latest-bios-fb
removed: bios-outdated-f12
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Leoaloha (lpheobus) wrote :

I don't recall. I moved to another motherboard (the current one) with 3.10 installed, then promptly upgraded to 14.04. This system is currently unreliable and I am moving to an ASUS motherboard. I will let you know if the error persists.

penalvch (penalvch) wrote :

Leoaloha, could you please remove the non-default kernel parameter iommu=soft and advise if this is reproducible?

As well, for regression testing purposes, could you please test 12.04.0 (kernel 3.2.x) via http://old-releases.ubuntu.com/releases/12.04.0/ and advise on the result?
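
Before and after editing the boot configuration, it can help to confirm whether the parameter is actually present on the kernel command line. The following is a minimal sketch; has_param is a hypothetical helper, and the sample string is the ProcKernelCmdLine value from this report.

```python
# Sketch: check whether a kernel command line carries a given parameter.
# has_param is a hypothetical helper; the sample string is the
# ProcKernelCmdLine value from this report.

def has_param(cmdline, param):
    """Return True if param appears as a whole token on the command line."""
    return param in cmdline.split()


sample = (
    "BOOT_IMAGE=/vmlinuz-3.13.0-37-generic "
    "root=UUID=541845e4-c06b-4c59-8586-dcb8c476fb91 ro quiet splash iommu=soft"
)
print(has_param(sample, "iommu=soft"))  # True

# On a running system the live value can be read from /proc/cmdline:
# with open("/proc/cmdline") as f:
#     print(has_param(f.read(), "iommu=soft"))
```

Matching on whole tokens avoids a false positive from a parameter that merely contains the search string.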

Leoaloha (lpheobus) wrote :

I switched to an ASUS motherboard. Uptime of 17 hours and no errors as of yet.
I only switched the board. The disks, files, configuration, processor and power supply are all the same.
I have no way to test the other board, so I suppose you can close this bug.

penalvch (penalvch) wrote :

Leoaloha, this bug report is being closed due to your last comment https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1385034/comments/19 regarding your switch of hardware. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop-down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status. Thank you again for taking the time to report this bug and helping to make Ubuntu better. Please submit any future bugs you may find.

Changed in linux (Ubuntu):
status: Incomplete → Invalid