hardy / ibex - raid5 - ata#: hard resetting link

Bug #263160 reported by q on 2008-08-31

This bug affects 2 people

	Status	Importance	Assigned to
linux (Fedora)	Fix Released	Critical	redhat-bugs #462425
linux (Ubuntu)	Fix Released	Medium	Unassigned
Hardy	Won't Fix	Medium	Bryan Wu
Intrepid	Invalid	Medium	Unassigned

Bug Description

Running 7 disk raid 5 array with the following card:
SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09)

file system is XFS.

Any 'cp' from the system drive (IDE, XFS) to the raid consistently stalls the process (state: D+) and ends in a cold reset. I believe network transfers are suffering from this as well.

Hardy wasn't reporting _any_ of these errors in dmesg or /var/log/messages. I upgraded to Ibex to help track down what was going on and got the following _when_ transferring to the raid.

dmesg:
[11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11285.918567] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11285.918568] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11285.918619] ata9.00: status: { DRDY }
[11285.918635] ata9: hard resetting link
[11286.420039] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11286.460065] ata9.00: max_sectors limited to 256 for NCQ
[11286.520054] ata9.00: max_sectors limited to 256 for NCQ
[11286.520059] ata9.00: configured for UDMA/133
[11286.520077] ata9: EH complete
[11286.520119] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11286.520132] sd 8:0:0:0: [sdd] Write Protect is off
[11286.520134] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11286.520154] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11326.988529] ata8.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11326.988554] ata8.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11326.988555] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11326.988606] ata8.00: status: { DRDY }
[11326.988623] ata8: hard resetting link
[11327.500037] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11327.580053] ata8.00: max_sectors limited to 256 for NCQ
[11327.657199] ata8.00: max_sectors limited to 256 for NCQ
[11327.657202] ata8.00: configured for UDMA/133
[11327.657207] ata8: EH complete
[11327.657257] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[11327.657272] sd 7:0:0:0: [sdc] Write Protect is off
[11327.657273] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[11327.657296] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11377.938532] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11377.938557] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11377.938558] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11377.938608] ata7.00: status: { DRDY }
[11377.938624] ata7: hard resetting link
[11378.440037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11378.520056] ata7.00: max_sectors limited to 256 for NCQ
[11378.600065] ata7.00: max_sectors limited to 256 for NCQ
[11378.600068] ata7.00: configured for UDMA/133
[11378.600073] ata7: EH complete
[11378.600120] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11378.600133] sd 6:0:0:0: [sdb] Write Protect is off
[11378.600135] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11378.600155] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11711.718523] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11711.718548] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11711.718549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11711.718600] ata9.00: status: { DRDY }
[11711.718616] ata9: hard resetting link
[11712.220041] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11712.260058] ata9.00: max_sectors limited to 256 for NCQ
[11712.320057] ata9.00: max_sectors limited to 256 for NCQ
[11712.320066] ata9.00: configured for UDMA/133
[11712.320072] ata9: EH complete
[11712.320112] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11712.320125] sd 8:0:0:0: [sdd] Write Protect is off
[11712.320127] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11712.320148] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11849.328524] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11849.328549] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11849.328549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11849.328600] ata7.00: status: { DRDY }
[11849.328617] ata7: hard resetting link
[11849.830037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11849.910070] ata7.00: max_sectors limited to 256 for NCQ
[11849.990053] ata7.00: max_sectors limited to 256 for NCQ
[11849.990057] ata7.00: configured for UDMA/133
[11849.990069] ata7: EH complete
[11849.990109] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11849.990123] sd 6:0:0:0: [sdb] Write Protect is off
[11849.990125] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11849.990147] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11909.629773] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11909.629797] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11909.629798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11909.629849] ata9.00: status: { DRDY }
[11909.629865] ata9: hard resetting link
[11910.131295] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11910.180068] ata9.00: max_sectors limited to 256 for NCQ
[11910.231316] ata9.00: max_sectors limited to 256 for NCQ
[11910.231319] ata9.00: configured for UDMA/133
[11910.231327] ata9: EH complete
[11910.231381] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
[11910.231394] sd 8:0:0:0: [sdd] Write Protect is off
[11910.231396] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[11910.231417] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11996.729773] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11996.729797] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0 ncq 1536 out
[11996.729798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[11996.729848] ata7.00: status: { DRDY }
[11996.729865] ata7: hard resetting link
[11997.231291] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11997.311308] ata7.00: max_sectors limited to 256 for NCQ
[11997.391306] ata7.00: max_sectors limited to 256 for NCQ
[11997.391316] ata7.00: configured for UDMA/133
[11997.391322] ata7: EH complete
[11997.391366] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
[11997.391378] sd 6:0:0:0: [sdb] Write Protect is off
[11997.391380] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[11997.391400] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
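
For anyone triaging a log like the one above, a rough way to see which ports are affected is to tally the "hard resetting link" lines per port. The following is only a sketch: the function name `count_resets` and the `SAMPLE` string are made up for illustration, and the sample holds just the seven reset lines copied from the dmesg excerpt, not the full log.

```python
import re
from collections import Counter

# The seven "hard resetting link" lines from the dmesg excerpt above.
SAMPLE = """\
[11285.918635] ata9: hard resetting link
[11326.988623] ata8: hard resetting link
[11377.938624] ata7: hard resetting link
[11711.718616] ata9: hard resetting link
[11849.328617] ata7: hard resetting link
[11909.629865] ata9: hard resetting link
[11996.729865] ata7: hard resetting link
"""

# Matches an optional-space kernel timestamp, then the port name.
RESET_RE = re.compile(r"\[\s*\d+\.\d+\]\s+(ata\d+): hard resetting link")


def count_resets(log_text):
    """Return a Counter mapping port name (e.g. 'ata9') to its reset count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for port, n in sorted(count_resets(SAMPLE).items()):
        print(f"{port}: {n} hard reset(s)")
```

On this excerpt the resets are spread over three different ports (ata7, ata8 and ata9) rather than concentrated on one, which is worth keeping in mind when deciding whether a single drive or cable is to blame.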

/var/log/messages:
Aug 30 20:12:43 isis kernel: [11285.918635] ata9: hard resetting link
Aug 30 20:12:43 isis kernel: [11286.420039] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:12:43 isis kernel: [11286.460065] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:12:43 isis kernel: [11286.520054] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:12:43 isis kernel: [11286.520059] ata9.00: configured for UDMA/133
Aug 30 20:12:43 isis kernel: [11286.520077] ata9: EH complete
Aug 30 20:12:43 isis kernel: [11286.520119] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:12:43 isis kernel: [11286.520132] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:12:43 isis kernel: [11286.520154] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:13:24 isis kernel: [11326.988623] ata8: hard resetting link
Aug 30 20:13:24 isis kernel: [11327.500037] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:13:24 isis kernel: [11327.580053] ata8.00: max_sectors limited to 256 for NCQ
Aug 30 20:13:24 isis kernel: [11327.657199] ata8.00: max_sectors limited to 256 for NCQ
Aug 30 20:13:24 isis kernel: [11327.657202] ata8.00: configured for UDMA/133
Aug 30 20:13:24 isis kernel: [11327.657207] ata8: EH complete
Aug 30 20:13:24 isis kernel: [11327.657257] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:13:24 isis kernel: [11327.657272] sd 7:0:0:0: [sdc] Write Protect is off
Aug 30 20:13:24 isis kernel: [11327.657296] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:14:15 isis kernel: [11377.938624] ata7: hard resetting link
Aug 30 20:14:15 isis kernel: [11378.440037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:14:15 isis kernel: [11378.520056] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:14:15 isis kernel: [11378.600065] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:14:15 isis kernel: [11378.600068] ata7.00: configured for UDMA/133
Aug 30 20:14:15 isis kernel: [11378.600073] ata7: EH complete
Aug 30 20:14:15 isis kernel: [11378.600120] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:14:15 isis kernel: [11378.600133] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:14:15 isis kernel: [11378.600155] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:19:48 isis kernel: [11711.718616] ata9: hard resetting link
Aug 30 20:19:49 isis kernel: [11712.220041] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:19:49 isis kernel: [11712.260058] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:19:49 isis kernel: [11712.320057] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:19:49 isis kernel: [11712.320066] ata9.00: configured for UDMA/133
Aug 30 20:19:49 isis kernel: [11712.320072] ata9: EH complete
Aug 30 20:19:49 isis kernel: [11712.320112] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:19:49 isis kernel: [11712.320125] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:19:49 isis kernel: [11712.320148] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:22:06 isis kernel: [11849.328617] ata7: hard resetting link
Aug 30 20:22:06 isis kernel: [11849.830037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:22:06 isis kernel: [11849.910070] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:22:07 isis kernel: [11849.990053] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:22:07 isis kernel: [11849.990057] ata7.00: configured for UDMA/133
Aug 30 20:22:07 isis kernel: [11849.990069] ata7: EH complete
Aug 30 20:22:07 isis kernel: [11849.990109] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:22:07 isis kernel: [11849.990123] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:22:07 isis kernel: [11849.990147] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:23:06 isis kernel: [11909.629865] ata9: hard resetting link
Aug 30 20:23:07 isis kernel: [11910.131295] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:23:07 isis kernel: [11910.180068] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:23:07 isis kernel: [11910.231316] ata9.00: max_sectors limited to 256 for NCQ
Aug 30 20:23:07 isis kernel: [11910.231319] ata9.00: configured for UDMA/133
Aug 30 20:23:07 isis kernel: [11910.231327] ata9: EH complete
Aug 30 20:23:07 isis kernel: [11910.231381] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:23:07 isis kernel: [11910.231394] sd 8:0:0:0: [sdd] Write Protect is off
Aug 30 20:23:07 isis kernel: [11910.231417] sd 8:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 30 20:24:33 isis kernel: [11996.729865] ata7: hard resetting link
Aug 30 20:24:34 isis kernel: [11997.231291] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 30 20:24:34 isis kernel: [11997.311308] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:24:34 isis kernel: [11997.391306] ata7.00: max_sectors limited to 256 for NCQ
Aug 30 20:24:34 isis kernel: [11997.391316] ata7.00: configured for UDMA/133
Aug 30 20:24:34 isis kernel: [11997.391322] ata7: EH complete
Aug 30 20:24:34 isis kernel: [11997.391366] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Aug 30 20:24:34 isis kernel: [11997.391378] sd 6:0:0:0: [sdb] Write Protect is off
Aug 30 20:24:34 isis kernel: [11997.391400] sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
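
The "SStatus 123" value in the "SATA link up" lines can be decoded by hand. As far as I know libata prints the register in hex, and the SATA SStatus layout puts DET in bits 3:0, SPD in bits 7:4 and IPM in bits 11:8. A small sketch under those assumptions (the function name and the lookup tables are illustrative, and the tables list only the common values):

```python
def decode_sstatus(value):
    """Split a SATA SStatus register value into (DET, SPD, IPM)."""
    det = value & 0xF          # bits 3:0, device detection
    spd = (value >> 4) & 0xF   # bits 7:4, negotiated link speed
    ipm = (value >> 8) & 0xF   # bits 11:8, interface power management
    return det, spd, ipm


# Common field values only; anything else is reported as "other".
DET_NAMES = {0: "no device", 3: "device present, phy communication established"}
SPD_NAMES = {1: "Gen 1 (1.5 Gbps)", 2: "Gen 2 (3.0 Gbps)"}
IPM_NAMES = {1: "active", 2: "partial", 6: "slumber"}

if __name__ == "__main__":
    # "123" is taken from the log lines above and parsed as hex.
    det, spd, ipm = decode_sstatus(int("123", 16))
    print("DET:", DET_NAMES.get(det, "other"))
    print("SPD:", SPD_NAMES.get(spd, "other"))
    print("IPM:", IPM_NAMES.get(ipm, "other"))
```

Read this way, the link comes back at the full 3.0 Gbps after every reset, which matches the "SATA link up 3.0 Gbps" text in the same line.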

I've replaced the card and cables and I'm still getting the issue.

This card and raid were working under CentOS last week (2.6.18, 32-bit).
Since then I have replaced the OS (Ubuntu 64-bit), the CPU (Core 2 Duo) and the mobo (Asus P5K Pro).

other info:
1) ubuntu release:
Description: Ubuntu intrepid (development branch)
Release: 8.10
2) package versions:
linux-server:
  Installed: 2.6.27.2.2
mdadm:
  Installed: 2.6.7-3ubuntu4
  Candidate: 2.6.7-3ubuntu4

I'm really at a loss here and not sure what else to do. I stressed the other components of the system in Windows and they seemed fine. I can't tell if it's the card or something in the newer kernels.

Also, these issues are not causing my raid to fail.
q@test:/storage$ sudo mdadm -D /dev/md1
/dev/md1:
        Version : 01.02
  Creation Time : Sat Jan 19 13:29:40 2008
     Raid Level : raid5
     Array Size : 2930302464 (2794.55 GiB 3000.63 GB)
  Used Dev Size : 976767488 (931.52 GiB 1000.21 GB)
   Raid Devices : 7
  Total Devices : 7
Preferred Minor : 1
    Persistence : Superblock is persistent

Intent Bitmap : Internal

    Update Time : Sat Aug 30 20:49:05 2008
          State : active
Active Devices : 7
Working Devices : 7
Failed Devices : 0
  Spare Devices : 0

Layout : left-symmetric
Chunk Size : 128K

           Name : big500raid
           UUID : 51ba59f2:45e85c89:53a81444:b210e1c6
         Events : 46

    Number Major Minor RaidDevice State
       0 8 17 0 active sync /dev/sdb1
       1 8 33 1 active sync /dev/sdc1
       2 8 97 2 active sync /dev/sdg1
       3 8 49 3 active sync /dev/sdd1
       4 8 81 4 active sync /dev/sdf1
       5 8 65 5 active sync /dev/sde1
       7 8 113 6 active sync /dev/sdh1
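
As a sanity check on the numbers above, a RAID 5 array gives up one member's worth of space to parity, so the usable size should be (devices - 1) times the per-member size. The sketch below assumes that "Used Dev Size" is shown here in 512-byte sectors rather than KiB; that is a guess based on the fact that halving it gives the 488383744 blocks reported by the md recovery message further down this thread. The function name is made up for illustration.

```python
def raid5_capacity_kib(member_kib, raid_devices):
    """Usable RAID 5 capacity in KiB: one member's worth of space holds parity."""
    if raid_devices < 2:
        raise ValueError("RAID 5 needs at least 2 devices")
    return member_kib * (raid_devices - 1)


used_dev_sectors = 976767488        # "Used Dev Size" above, assumed to be 512-byte sectors
member_kib = used_dev_sectors // 2  # two 512-byte sectors per KiB
array_kib = raid5_capacity_kib(member_kib, 7)

print(member_kib)                       # 488383744
print(array_kib)                        # 2930302464
print(round(array_kib / 1024 ** 2, 2))  # 2794.55 (GiB)
```

The result matches the reported Array Size of 2930302464 (2794.55 GiB), so the array geometry itself looks consistent.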


q (qr7atgwu) updated the description on 2008-09-02.

Chris (billytwowilly) wrote on 2008-11-06:

I am having the exact same problem in 8.10 (Kubuntu fresh install) with my six-disk raid 5 array. I'm using software raid, so I'm not sure if that is exactly the same. What motherboard are you using? I'm using an Asus P5Q and all drives are plugged into the onboard SATA controller, not the onboard xpress backup ports (I believe it's an Intel SATA controller).

Here's the relevant bit of my /var/log/messages:

Nov  5 12:49:12 serverv2 -- MARK --
Nov  5 13:09:12 serverv2 -- MARK --
Nov  5 13:29:12 serverv2 -- MARK --
Nov  5 13:49:12 serverv2 -- MARK --
Nov  5 14:09:12 serverv2 -- MARK --
Nov  5 14:29:12 serverv2 -- MARK --
Nov  5 14:49:12 serverv2 -- MARK --
Nov  5 15:09:12 serverv2 -- MARK --
Nov  5 15:29:12 serverv2 -- MARK --
Nov  5 15:49:12 serverv2 -- MARK --
Nov  5 15:50:55 serverv2 python: hp-systray(init)[6671]: warning: No hp: or hpfax: devices found in any installed CUPS queue. Exiting.
Nov  5 15:52:07 serverv2 kernel: [12192.326735] type=1503 audit(1225925527.559:4): operation="inode_permission" requested_mask="r::" denied_mask="r::" fsuid=7 name="/proc/6778/net/" pid=6778 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222565] type=1503 audit(1225925528.454:5): operation="inode_permission" requested_mask="r::" denied_mask="r::" fsuid=7 name="/proc/6782/net/" pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222606] type=1503 audit(1225925528.454:6): operation="socket_create" family="ax25" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222614] type=1503 audit(1225925528.454:7): operation="socket_create" family="netrom" sock_type="seqpacket" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222621] type=1503 audit(1225925528.454:8): operation="socket_create" family="rose" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222628] type=1503 audit(1225925528.454:9): operation="socket_create" family="ipx" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222634] type=1503 audit(1225925528.454:10): operation="socket_create" family="appletalk" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222641] type=1503 audit(1225925528.454:11): operation="socket_create" family="econet" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222648] type=1503 audit(1225925528.454:12): operation="socket_create" family="ash" sock_type="dgram" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 15:52:08 serverv2 kernel: [12193.222654] type=1503 audit(1225925528.454:13): operation="socket_create" family="x25" sock_type="seqpacket" protocol=0 pid=6782 profile="/usr/sbin/cupsd"
Nov  5 16:05:21 serverv2 kernel: [12986.184075] ata3: hard resetting link
Nov  5 16:05:21 serverv2 kernel: [12986.184077] ata4: hard resetting link
Nov  5 16:05:21 serverv2 kernel: [12986.668023] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  5 16:05:21 serverv2 kernel: [12986.668709] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov  5 16:05:21 serverv2 kernel: [12986.670396] ata4.00: configured for UDMA/133
Nov  5 16:05:21 serverv2 kernel: [12986.670419] ata4: EH complete
Nov  5 16:05:21 serverv2 kernel: [12986.670494] sd 3:0:0:0: [sdd] 2930277168 512-byte hardware sectors (1500302 MB)
Nov  5 16:05:21 serverv2 kernel: [12986.670517] sd 3:0:0:0: [sdd] Write Protect is off
Nov  5 16:05:21 serverv2 kernel: [12986.670556] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov  5 16:05:21 serverv2 kernel: [12986.670941] ata3.00: configured for UDMA/133
Nov  5 16:05:21 serverv2 kernel: [12986.670952] ata3: EH complete
Nov  5 16:05:21 serverv2 kernel: [12986.670992] sd 2:0:0:0: [sdc] 2930277168 512-byte hardware sectors (1500302 MB)
Nov  5 16:05:21 serverv2 kernel: [12986.671012] sd 2:0:0:0: [sdc] Write Protect is off
Nov  5 16:05:21 serverv2 kernel: [12986.671050] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov  5 16:05:21 serverv2 kernel: [12986.682437] md: super_written gets error=-5, uptodate=0
Nov  5 16:05:21 serverv2 kernel: [12986.704202] md: super_written gets error=-5, uptodate=0
Nov  5 16:05:21 serverv2 kernel: [12986.757591] RAID5 conf printout:
Nov  5 16:05:21 serverv2 kernel: [12986.757598]  --- rd:6 wd:4
Nov  5 16:05:21 serverv2 kernel: [12986.757601]  disk 0, o:1, dev:sda1
Nov  5 16:05:21 serverv2 kernel: [12986.757604]  disk 1, o:1, dev:sdb1
Nov  5 16:05:21 serverv2 kernel: [12986.757606]  disk 2, o:0, dev:sdc1
Nov  5 16:05:21 serverv2 kernel: [12986.757608]  disk 3, o:0, dev:sdd1
Nov  5 16:05:21 serverv2 kernel: [12986.757610]  disk 4, o:1, dev:sde1
Nov  5 16:05:21 serverv2 kernel: [12986.757612]  disk 5, o:1, dev:sdf1
Nov  5 16:05:22 serverv2 kernel: [12986.769512] RAID5 conf printout:
Nov  5 16:05:22 serverv2 kernel: [12986.769520]  --- rd:6 wd:4
Nov  5 16:05:22 serverv2 kernel: [12986.769523]  disk 0, o:1, dev:sda1
Nov  5 16:05:22 serverv2 kernel: [12986.769525]  disk 1, o:1, dev:sdb1
Nov  5 16:05:22 serverv2 kernel: [12986.769527]  disk 2, o:0, dev:sdc1
Nov  5 16:05:22 serverv2 kernel: [12986.769529]  disk 4, o:1, dev:sde1
Nov  5 16:05:22 serverv2 kernel: [12986.769531]  disk 5, o:1, dev:sdf1
Nov  5 16:05:22 serverv2 kernel: [12986.769549] RAID5 conf printout:
Nov  5 16:05:22 serverv2 kernel: [12986.769551]  --- rd:6 wd:4
Nov  5 16:05:22 serverv2 kernel: [12986.769552]  disk 0, o:1, dev:sda1
Nov  5 16:05:22 serverv2 kernel: [12986.769554]  disk 1, o:1, dev:sdb1
Nov  5 16:05:22 serverv2 kernel: [12986.769556]  disk 2, o:0, dev:sdc1
Nov  5 16:05:22 serverv2 kernel: [12986.769558]  disk 4, o:1, dev:sde1
Nov  5 16:05:22 serverv2 kernel: [12986.769560]  disk 5, o:1, dev:sdf1
Nov  5 16:05:22 serverv2 kernel: [12986.789508] RAID5 conf printout:
Nov  5 16:05:22 serverv2 kernel: [12986.789513]  --- rd:6 wd:4
Nov  5 16:05:22 serverv2 kernel: [12986.789516]  disk 0, o:1, dev:sda1
Nov  5 16:05:22 serverv2 kernel: [12986.789518]  disk 1, o:1, dev:sdb1
Nov  5 16:05:22 serverv2 kernel: [12986.789520]  disk 4, o:1, dev:sde1
Nov  5 16:05:22 serverv2 kernel: [12986.789522]  disk 5, o:1, dev:sdf1
Nov  5 16:17:00 serverv2 kernel: [13684.778463] md: md0 still in use.
Nov  5 16:17:51 serverv2 kernel: [13736.418180] ip6_tables: (C) 2000-2006 Netfilter Core Team
Nov  5 16:18:03 serverv2 exiting on signal 15
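
The "rd:6 wd:4" line in the RAID5 conf printout is the key part of this log. Assuming rd is the number of raid devices and wd the number of working devices (that is my reading of the md output, not something stated in this thread), a RAID 5 array survives one missing member and fails at two. A minimal sketch of that check, with an illustrative function name:

```python
import re

CONF_RE = re.compile(r"rd:(\d+) wd:(\d+)")


def raid5_state(line):
    """Classify a 'RAID5 conf printout' summary line, or return None if absent."""
    match = CONF_RE.search(line)
    if match is None:
        return None
    raid_devices, working = int(match.group(1)), int(match.group(2))
    missing = raid_devices - working
    if missing == 0:
        return "clean"
    if missing == 1:
        return "degraded"   # still readable, parity covers one lost member
    return "failed"         # RAID 5 cannot cover two lost members


if __name__ == "__main__":
    line = "Nov  5 16:05:21 serverv2 kernel: [12986.757598]  --- rd:6 wd:4"
    print(raid5_state(line))
```

With two members (sdc1 and sdd1) dropped in the same second, the array in this log is past what RAID 5 can tolerate.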

Chris (billytwowilly) wrote on 2008-11-06:

http://forums.seagate.com/stx/board/message?board.id=ata_drives&thread.id=2390&view=by_date_ascending&page=1

I'm probably hitting the problem described in the thread above; perhaps you are too? Are your drives Seagate drives?

q (qr7atgwu) wrote on 2008-11-08:

2 of my drives are Seagate, another is a Western Digital. They're 500GB drives.

I just did a clean install of 8.10 and it's still happening. This wasn't an issue back in January when I ran RHEL5...

Richard Ayotte (rich-ayotte) wrote on 2008-11-16:

Attachment: sata_errors.txt (62.8 KiB, text/plain)

I have the same problem.

rich@cheetah:~$ sudo hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
	Model Number:       Hitachi HDS721010KLA330                 
	Serial Number:      GTE005PAJXM1PL
	Firmware Revision:  GKAOAB0A
Standards:
	Used: ATA/ATAPI-7 T13 1532D revision 1 
	Supported: 7 6 5 4 & some of 8
Configuration:
	Logical		max	current
	cylinders	16383	16383
	heads		16	16
	sectors/track	63	63
	--
	CHS current addressable sectors:   16514064
	LBA    user addressable sectors:  268435455
	LBA48  user addressable sectors: 1953525168
	device size with M = 1024*1024:      953869 MBytes
	device size with M = 1000*1000:     1000204 MBytes (1000 GB)
Capabilities:
	LBA, IORDY(can be disabled)
	Queue depth: 32
	Standby timer values: spec'd by Standard, no device specific minimum
	R/W multiple sector transfer: Max = 16	Current = 1
	Advanced power management level: disabled
	Recommended acoustic management value: 128, current value: 254
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4 
	     Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	SMART feature set
	    	Security Mode feature set
	   *	Power Management feature set
	   *	Write cache
	   *	Look-ahead
	   *	Host Protected Area feature set
	   *	WRITE_BUFFER command
	   *	READ_BUFFER command
	   *	DOWNLOAD_MICROCODE
	    	Advanced Power Management feature set
	    	Power-Up In Standby feature set
	    	SET_FEATURES required to spinup after power up
	    	Address Offset Reserved Area Boot
	    	SET_MAX security extension
	    	Automatic Acoustic Management feature set
	   *	48-bit Address feature set
	   *	Device Configuration Overlay feature set
	   *	Mandatory FLUSH_CACHE
	   *	FLUSH_CACHE_EXT
	   *	SMART error logging
	   *	SMART self-test
	    	Media Card Pass-Through
	   *	General Purpose Logging feature set
	   *	WRITE_{DMA|MULTIPLE}_FUA_EXT
	   *	64-bit World wide name
	   *	URG for READ_STREAM[_DMA]_EXT
	   *	URG for WRITE_STREAM[_DMA]_EXT
	   *	WRITE_UNCORRECTABLE_EXT command
	   *	Segmented DOWNLOAD_MICROCODE
	   *	SATA-I signaling speed (1.5Gb/s)
	   *	SATA-II signaling speed (3.0Gb/s)
	   *	Native Command Queueing (NCQ)
	   *	Host-initiated interface power management
	   *	Phy event counters
	   *	unknown 76[12]
	    	Non-Zero buffer offsets in DMA Setup FIS
	    	DMA Setup Auto-Activate optimization
	    	Device-initiated interface power management
	    	In-order data delivery
	   *	Software settings preservation
	   *	SMART Command Transport (SCT) feature set
	   *	SCT Long Sector Access (AC1)
	   *	SCT LBA Segment Access (AC2)
	   *	SCT Error Recovery Control (AC3)
	   *	SCT Features Control (AC4)
	   *	SCT Data Tables (AC5)
Security: 
	Master password revision code = 65534
		supported
	not	enabled
	not	locked
	not	frozen
	not	expired: security count
	not	supported: enhanced erase
	340min for SECURITY ERASE UNIT. 
Logical Unit WWN Device Identifier: 5000cca216e930ed
	NAA		: 5
	IEEE OUI	: cca
	Unique ID	: 216e930ed
Checksum: correct
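
The size figures in the hdparm output above can be reproduced from the sector counts, which is a quick way to confirm the drive is reporting its full capacity. This sketch assumes 512-byte logical sectors (the output does not state the sector size) and that hdparm truncates rather than rounds; the helper name is made up for illustration.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def size_in_units(sectors, unit_bytes):
    """Device size in whole units of unit_bytes, truncated."""
    return sectors * SECTOR_BYTES // unit_bytes


lba48_sectors = 1953525168  # "LBA48 user addressable sectors" above
lba28_sectors = 268435455   # "LBA user addressable sectors" above

print(size_in_units(lba48_sectors, 1024 * 1024))  # 953869
print(size_in_units(lba48_sectors, 1000 * 1000))  # 1000204
print(lba28_sectors == 2 ** 28 - 1)               # True
```

Both results agree with the "device size" lines, and the LBA28 figure is simply the 28-bit addressing ceiling.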

q (qr7atgwu) wrote on 2008-12-01:

Pretty fed up with people saying this could be so many different issues. So much so that I finally decided to risk my data to prove it. Read the following.

***___This has got to be the card / chipset / sata_mv driver._____***

Short and simple version of my issues:
    - This does not depend on drive types
    - Appears to be caused by MV88SX6081 chipset
    - Could be a problem in SATA_MV driver
    - I need replacement controller suggestions

Details for all non-believers (it’s not a power / hardware issue):
I moved 5 of the 7 drives to my onboard controller (the mobo has 6 SATA ports; the last one was used by the system drive).
I left 2 of the Western Digital drives on the MV88SX6081 8-port SATA II:
- sdg
- sdh

On advice I received by email, I unplugged everything that wasn't needed. The suspicion was a power problem, given the number of drives I had in the machine. What was left on a Corsair TX750W power supply:
    - mobo (c2d, 4gb ram)
    - 7 sata raid drives - spread across multiple power supply rails
    - 1 sata system drive
    - Super Micro SAT2-MV8 (MV88SX6081 8-port SATA II)
    - intel pcie 10/100/1000 network card

Then I replaced the SATA cables one more time with old cables I knew worked. I also swapped in a brand new controller card (I have a few spares lying around).
I brought everything up and upgraded to:

Then I started to rebuild the raid. Everything went fine, no freezes.
**This was the first indication that this only happens under heavy load on multiple ports, as has been brought up before.
So then I started copying data over. About 180 GB in, the card hard reset both of the drives attached to it and knocked them both out of the raid.
**This was also significantly different from before, when I was using all the ports: this time it worked fine for quite some time, and it wasn't until I was well into the copy that the card finally gave up.
See the attached dmesg and /var/log/messages. This is the 2nd time I’ve had this card degrade my raid and almost give me a heart attack.

The cards are going in the trash at this point. I'm open to suggestions for a possible replacement. I don’t need a hardware raid card, just a decent controller with great *nix support and lots of ports.
::sigh:: I don’t know who to contact, but this is the end of the line for me with this controller and, hopefully, my issues.

Attempting to get my data back as we speak with 2 failed drives in a raid 5... wonderful times.

dmesg of the event:
[ 1061.040118] md: recovery of RAID array md1
[ 1061.040120] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[ 1061.040122] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[ 1061.040126] md: using 128k window, over a total of 488383744 blocks.
[11208.852220] md: md1: recovery done.
[11209.020072] RAID5 conf printout:
[11209.020076]  --- rd:7 wd:7
[11209.020079]  disk 0, o:1, dev:sdd1
[11209.020080]  disk 1, o:1, dev:sdb1
[11209.020081]  disk 2, o:1, dev:sdh1
[11209.020082]  disk 3, o:1, dev:sdc1
[11209.020083]  disk 4, o:1, dev:sdf1
[11209.020084]  disk 5, o:1, dev:sde1
[11209.020085]  disk 6, o:1, dev:sdg1
[19844.431690] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[19844.433148] SGI XFS Quota Management subsystem
[19844.442507] Filesystem "md1": Disabling barriers, trial barrier write failed
[19844.442658] XFS mounting filesystem md1
[19844.893398] Ending clean XFS mount for filesystem: md1
[27027.170016] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[27027.170041] ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[27027.170041]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[27027.170083] ata5.00: status: { DRDY }
[27027.170099] ata5: hard resetting link
[27027.680034] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[27027.720050] ata5.00: max_sectors limited to 256 for NCQ
[27027.780047] ata5.00: max_sectors limited to 256 for NCQ
[27027.780050] ata5.00: configured for UDMA/133
[27027.780055] end_request: I/O error, dev sdg, sector 73
[27027.780073] md: super_written gets error=-5, uptodate=0
[27027.780076] raid5: Disk failure on sdg1, disabling device.
[27027.780077] raid5: Operation continuing on 6 devices.
[27027.780117] ata5: EH complete
[27027.780674] sd 4:0:0:0: [sdg] 976773168 512-byte hardware sectors (500108 MB)
[27027.780800] sd 4:0:0:0: [sdg] Write Protect is off
[27027.780803] sd 4:0:0:0: [sdg] Mode Sense: 00 3a 00 00
[27027.781038] sd 4:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[27057.930015] ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[27057.930039] ata12.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[27057.930040]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[27057.930081] ata12.00: status: { DRDY }
[27057.930098] ata12: hard resetting link
[27058.440033] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[27058.480049] ata12.00: max_sectors limited to 256 for NCQ
[27058.540047] ata12.00: max_sectors limited to 256 for NCQ
[27058.540050] ata12.00: configured for UDMA/133
[27058.540055] end_request: I/O error, dev sdh, sector 71
[27058.540072] md: super_written gets error=-5, uptodate=0
[27058.540075] raid5: Disk failure on sdh1, disabling device.
[27058.540076] raid5: Operation continuing on 5 devices.
[27058.540113] ata12: EH complete
[27058.540754] sd 11:0:0:0: [sdh] 976773168 512-byte hardware sectors (500108 MB)
[27058.540879] sd 11:0:0:0: [sdh] Write Protect is off
[27058.540882] sd 11:0:0:0: [sdh] Mode Sense: 00 3a 00 00
[27058.541070] sd 11:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[27058.584017] RAID5 conf printout:
[27058.584020]  --- rd:7 wd:5
[27058.584022]  disk 0, o:1, dev:sdd1
[27058.584023]  disk 1, o:1, dev:sdb1
[27058.584024]  disk 2, o:0, dev:sdh1
[27058.584025]  disk 3, o:1, dev:sdc1
[27058.584027]  disk 4, o:1, dev:sdf1
[27058.584028]  disk 5, o:1, dev:sde1
[27058.584029]  disk 6, o:0, dev:sdg1
[27061.521245] BUG: soft lockup - CPU#1 stuck for 61s! [smbd:28171]
[27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[27061.521251] CPU 1:
[27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
[27061.521251] Pid: 28171, comm: smbd Not tainted 2.6.27-9-server #1
[27061.521251] RIP: 0010:[<ffffffff802abf0c>]  [<ffffffff802abf0c>] find_get_pages+0x6c/0x110
[27061.521251] RSP: 0018:ffff880129453358  EFLAGS: 00000246
[27061.521251] RAX: ffff880128d89330 RBX: ffff880129453398 RCX: 0000000000000002
[27061.521251] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffe200022e9e80
[27061.521251] RBP: ffff880129453308 R08: ffffe200009df6c8 R09: 0000000000000005
[27061.521251] R10: 0000000000000037 R11: 00000000001c5778 R12: ffffffff802b6b29
[27061.521251] R13: ffff880123a107d0 R14: ffffe20001c6f6c0 R15: 0000000000000286
[27061.521251] FS:  00007fb72cdf6700(0000) GS:ffff88012fc02980(0000) knlGS:0000000000000000
[27061.521251] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[27061.521251] CR2: 00007f1648629000 CR3: 000000012956d000 CR4: 00000000000006e0
[27061.521251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[27061.521251] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[27061.521251]
[27061.521251] Call Trace:
[27061.521251]  [<ffffffff802abee3>] ? find_get_pages+0x43/0x110
[27061.521251]  [<ffffffff802b6984>] ? pagevec_lookup+0x24/0x30
[27061.521251]  [<ffffffffa04e100d>] ? xfs_cluster_write+0xad/0x180 [xfs]
[27061.521251]  [<ffffffffa04e1578>] ? xfs_page_state_convert+0x498/0x760 [xfs]
[27061.521251]  [<ffffffffa04e19a1>] ? xfs_vm_writepage+0x71/0x120 [xfs]
[27061.521251]  [<ffffffff802b9274>] ? pageout+0x124/0x270
[27061.521251]  [<ffffffff802ab06a>] ? page_waitqueue+0xa/0x90
[27061.521251]  [<ffffffff802b986d>] ? shrink_page_list+0x34d/0x530
[27061.521251]  [<ffffffff802b8e49>] ? __isolate_lru_page+0x79/0xb0
[27061.521251]  [<ffffffff802b8f0a>] ? isolate_lru_pages+0x8a/0x220
[27061.521251]  [<ffffffff802b9bf2>] ? shrink_inactive_list+0x1a2/0x4b0
[27061.521251]  [<ffffffff802b9f7b>] ? shrink_zone+0x7b/0x160
[27061.521251]  [<ffffffff802ba0ed>] ? shrink_zones+0x8d/0x150
[27061.521251]  [<ffffffff802ba236>] ? do_try_to_free_pages+0x86/0x2e0
[27061.521251]  [<ffffffff802ba587>] ? try_to_free_pages+0x67/0x70
[27061.521251]  [<ffffffff802b90a0>] ? isolate_pages_global+0x0/0x50
[27061.521251]  [<ffffffff802b28b1>] ? __alloc_pages_internal+0x241/0x510
[27061.521251]  [<ffffffff802d565d>] ? alloc_pages_current+0xad/0x110
[27061.521251]  [<ffffffff802ac477>] ? __page_cache_alloc+0x67/0x80
[27061.521251]  [<ffffffff802ad0b3>] ? __grab_cache_page+0x63/0xb0
[27061.521251]  [<ffffffff80316a59>] ? block_write_begin+0x89/0xf0
[27061.521251]  [<ffffffffa04e04ca>] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
[27061.521251]  [<ffffffffa04e0040>] ? xfs_get_blocks+0x0/0x20 [xfs]
[27061.521251]  [<ffffffff802ab7ac>] ? generic_perform_write+0xbc/0x1c0
[27061.521251]  [<ffffffff802ad512>] ? generic_file_buffered_write+0x92/0x170
[27061.521251]  [<ffffffffa04e92d3>] ? xfs_write+0x6b3/0x9b0 [xfs]
[27061.521251]  [<ffffffff80385a69>] ? apparmor_socket_recvmsg+0x19/0x20
[27061.521251]  [<ffffffff803aaf70>] ? memset_c+0x20/0x30
[27061.521251]  [<ffffffffa04e4c88>] ? xfs_file_aio_write+0x58/0x60 [xfs]
[27061.521251]  [<ffffffff802e9559>] ? do_sync_write+0xf9/0x140
[27061.521251]  [<ffffffff802e9699>] ? do_sync_read+0xf9/0x140
[27061.521251]  [<ffffffff80266fb0>] ? autoremove_wake_function+0x0/0x40
[27061.521251]  [<ffffffff80386821>] ? aa_file_permission+0x21/0xf0
[27061.521251]  [<ffffffff80386948>] ? apparmor_file_permission+0x28/0x30
[27061.521251]  [<ffffffff803613e6>] ? security_file_permission+0x16/0x20
[27061.521251]  [<ffffffff802e9c1b>] ? vfs_write+0xcb/0x130
[27061.521251]  [<ffffffff802e9d1a>] ? sys_pwrite64+0x9a/0xa0
[27061.521251]  [<ffffffff8021285a>] ? system_call_fastpath+0x16/0x1b
[27061.521251]
[27095.080066] RAID5 conf printout:
[27095.080071]  --- rd:7 wd:5
[27095.080074]  disk 0, o:1, dev:sdd1
[27095.080076]  disk 1, o:1, dev:sdb1
[27095.080077]  disk 2, o:0, dev:sdh1
[27095.080079]  disk 3, o:1, dev:sdc1
[27095.080080]  disk 4, o:1, dev:sdf1
[27095.080082]  disk 5, o:1, dev:sde1
[27095.080090] RAID5 conf printout:
[27095.080091]  --- rd:7 wd:5
[27095.080092]  disk 0, o:1, dev:sdd1
[27095.080093]  disk 1, o:1, dev:sdb1
[27095.080094]  disk 2, o:0, dev:sdh1
[27095.080095]  disk 3, o:1, dev:sdc1
[27095.080097]  disk 4, o:1, dev:sdf1
[27095.080098]  disk 5, o:1, dev:sde1
[27095.140011] RAID5 conf printout:
[27095.140017]  --- rd:7 wd:5
[27095.140019]  disk 0, o:1, dev:sdd1
[27095.140022]  disk 1, o:1, dev:sdb1
[27095.140024]  disk 3, o:1, dev:sdc1
[27095.140026]  disk 4, o:1, dev:sdf1
[27095.140027]  disk 5, o:1, dev:sde1
[27095.140511] Buffer I/O error on device md1, logical block 455870845
[27095.140545] lost page write due to I/O error on md1
[27095.140550] Buffer I/O error on device md1, logical block 455870846
[27095.140567] lost page write due to I/O error on md1
[27095.140569] Buffer I/O error on device md1, logical block 455870847
[27095.140585] lost page write due to I/O error on md1
[27095.140587] Buffer I/O error on device md1, logical block 455870848
[27095.140604] lost page write due to I/O error on md1
[27095.140606] Buffer I/O error on device md1, logical block 455870849
[27095.140622] lost page write due to I/O error on md1
[27095.140624] Buffer I/O error on device md1, logical block 455870850
[27095.140641] lost page write due to I/O error on md1
[27095.140642] Buffer I/O error on device md1, logical block 455870851
[27095.140659] lost page write due to I/O error on md1
[27095.140661] Buffer I/O error on device md1, logical block 455870852
[27095.140677] lost page write due to I/O error on md1
[27095.140679] Buffer I/O error on device md1, logical block 455870853
[27095.140696] lost page write due to I/O error on md1
[27095.140697] Buffer I/O error on device md1, logical block 455870854
[27095.140714] lost page write due to I/O error on md1
[27095.141327] I/O error in filesystem ("md1") meta-data dev md1 block 0xaeaa9810       ("xlog_iodone") error 5 buf count 12288
[27095.141359] xfs_force_shutdown(md1,0x2) called from line 1056 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_log.c.  Return address = 0xffffffffa04c80d3
[27095.141380] Filesystem "md1": Log I/O Error Detected.  Shutting down filesystem: md1
[27095.141407] Please umount the filesystem, and rectify the problem(s)
[27100.140015] Filesystem "md1": xfs_log_force: error 5 returned.
[27113.440011] Filesystem "md1": xfs_log_force: error 5 returned.
[27143.440010] Filesystem "md1": xfs_log_force: error 5 returned.
[27173.440009] Filesystem "md1": xfs_log_force: error 5 returned.
[27203.440012] Filesystem "md1": xfs_log_force: error 5 returned.
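
For anyone triaging a similar log, the sequence above (command timeout, hard link reset, md dropping the disk) can be pulled out mechanically. The following is only a rough Python sketch and not part of the original report: the names `summarize` and `raid5_survives` are made up for illustration, and the patterns assume the message wording of this 2.6.27 kernel.

```python
import re

# Patterns follow the libata and md message wording seen in the dmesg dump above.
STAMP = r"\[\s*(?P<t>\d+\.\d+)\]\s+"
RESET = re.compile(STAMP + r"(?P<port>ata\d+): hard resetting link")
FAILED = re.compile(STAMP + r"raid5: Disk failure on (?P<dev>\w+), disabling device")
CONF = re.compile(STAMP + r"--- rd:(?P<rd>\d+) wd:(?P<wd>\d+)")


def summarize(lines):
    """Collect link resets, md disk failures, and the last rd/wd counts seen."""
    resets, failures, last_conf = [], [], None
    for line in lines:
        if m := RESET.search(line):
            resets.append((float(m["t"]), m["port"]))
        elif m := FAILED.search(line):
            failures.append((float(m["t"]), m["dev"]))
        elif m := CONF.search(line):
            last_conf = (int(m["rd"]), int(m["wd"]))
    return resets, failures, last_conf


def raid5_survives(rd, wd):
    """RAID 5 tolerates at most one missing member (rd = raid disks, wd = working disks)."""
    return rd - wd <= 1


if __name__ == "__main__":
    # A few lines copied from the dump above; a real run would read a saved dmesg file.
    sample = [
        "[27027.170099] ata5: hard resetting link",
        "[27027.780076] raid5: Disk failure on sdg1, disabling device.",
        "[27057.930098] ata12: hard resetting link",
        "[27058.540075] raid5: Disk failure on sdh1, disabling device.",
        "[27058.584020]  --- rd:7 wd:5",
    ]
    resets, failures, conf = summarize(sample)
    print("resets:", resets)
    print("failures:", failures)
    print("array still usable:", raid5_survives(*conf))
```

Run against the dump above, this would list resets on ata5 and ata12, failures of sdg1 and sdh1 about 31 seconds apart, and a final `rd:7 wd:5`, which is one missing member too many for RAID 5.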

/var/log/messages:
Nov 30 18:39:24 isis kernel: [ 1061.040118] md: recovery of RAID array md1
Nov 30 18:39:24 isis kernel: [ 1061.040120] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Nov 30 18:39:24 isis kernel: [ 1061.040122] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Nov 30 18:39:24 isis kernel: [ 1061.040126] md: using 128k window, over a total of 488383744 blocks.
Nov 30 19:02:08 isis -- MARK --
Nov 30 19:22:08 isis -- MARK --
Nov 30 19:42:08 isis -- MARK --
Nov 30 20:02:08 isis -- MARK --
Nov 30 20:22:08 isis -- MARK --
Nov 30 20:42:08 isis -- MARK --
Nov 30 21:02:08 isis -- MARK --
Nov 30 21:22:08 isis -- MARK --
Nov 30 21:28:32 isis kernel: [11208.852220] md: md1: recovery done.
Nov 30 21:28:32 isis kernel: [11209.020072] RAID5 conf printout:
Nov 30 21:28:32 isis kernel: [11209.020076]  --- rd:7 wd:7
Nov 30 21:28:32 isis kernel: [11209.020079]  disk 0, o:1, dev:sdd1
Nov 30 21:28:32 isis kernel: [11209.020080]  disk 1, o:1, dev:sdb1
Nov 30 21:28:32 isis kernel: [11209.020081]  disk 2, o:1, dev:sdh1
Nov 30 21:28:32 isis kernel: [11209.020082]  disk 3, o:1, dev:sdc1
Nov 30 21:28:32 isis kernel: [11209.020083]  disk 4, o:1, dev:sdf1
Nov 30 21:28:32 isis kernel: [11209.020084]  disk 5, o:1, dev:sde1
Nov 30 21:28:32 isis kernel: [11209.020085]  disk 6, o:1, dev:sdg1
Nov 30 21:42:08 isis -- MARK --
Nov 30 22:02:08 isis -- MARK --
Nov 30 22:22:08 isis -- MARK --
Nov 30 22:42:08 isis -- MARK --
Nov 30 23:02:08 isis -- MARK --
Nov 30 23:22:08 isis -- MARK --
Nov 30 23:42:08 isis -- MARK --
Nov 30 23:52:27 isis kernel: [19844.431690] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
Nov 30 23:52:27 isis kernel: [19844.433148] SGI XFS Quota Management subsystem
Nov 30 23:52:27 isis kernel: [19844.442507] Filesystem "md1": Disabling barriers, trial barrier write failed
Nov 30 23:52:27 isis kernel: [19844.442658] XFS mounting filesystem md1
Dec  1 00:22:08 isis -- MARK --
Dec  1 00:42:08 isis -- MARK --
Dec  1 01:02:08 isis -- MARK --
Dec  1 01:22:08 isis -- MARK --
Dec  1 01:42:08 isis -- MARK --
Dec  1 01:52:10 isis kernel: [27027.170099] ata5: hard resetting link
Dec  1 01:52:10 isis kernel: [27027.680034] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec  1 01:52:11 isis kernel: [27027.720050] ata5.00: max_sectors limited to 256 for NCQ
Dec  1 01:52:11 isis kernel: [27027.780047] ata5.00: max_sectors limited to 256 for NCQ
Dec  1 01:52:11 isis kernel: [27027.780050] ata5.00: configured for UDMA/133
Dec  1 01:52:11 isis kernel: [27027.780073] md: super_written gets error=-5, uptodate=0
Dec  1 01:52:11 isis kernel: [27027.780117] ata5: EH complete
Dec  1 01:52:11 isis kernel: [27027.780674] sd 4:0:0:0: [sdg] 976773168 512-byte hardware sectors (500108 MB)
Dec  1 01:52:11 isis kernel: [27027.780800] sd 4:0:0:0: [sdg] Write Protect is off
Dec  1 01:52:11 isis kernel: [27027.781038] sd 4:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec  1 01:52:41 isis kernel: [27057.930098] ata12: hard resetting link
Dec  1 01:52:41 isis kernel: [27058.440033] ata12: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec  1 01:52:41 isis kernel: [27058.480049] ata12.00: max_sectors limited to 256 for NCQ
Dec  1 01:52:41 isis kernel: [27058.540047] ata12.00: max_sectors limited to 256 for NCQ
Dec  1 01:52:41 isis kernel: [27058.540050] ata12.00: configured for UDMA/133
Dec  1 01:52:41 isis kernel: [27058.540072] md: super_written gets error=-5, uptodate=0
Dec  1 01:52:41 isis kernel: [27058.540113] ata12: EH complete
Dec  1 01:52:41 isis kernel: [27058.540754] sd 11:0:0:0: [sdh] 976773168 512-byte hardware sectors (500108 MB)
Dec  1 01:52:41 isis kernel: [27058.540879] sd 11:0:0:0: [sdh] Write Protect is off
Dec  1 01:52:41 isis kernel: [27058.541070] sd 11:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec  1 01:52:41 isis kernel: [27058.584017] RAID5 conf printout:
Dec  1 01:52:41 isis kernel: [27058.584020]  --- rd:7 wd:5
Dec  1 01:52:41 isis kernel: [27058.584022]  disk 0, o:1, dev:sdd1
Dec  1 01:52:41 isis kernel: [27058.584023]  disk 1, o:1, dev:sdb1
Dec  1 01:52:41 isis kernel: [27058.584024]  disk 2, o:0, dev:sdh1
Dec  1 01:52:41 isis kernel: [27058.584025]  disk 3, o:1, dev:sdc1
Dec  1 01:52:41 isis kernel: [27058.584027]  disk 4, o:1, dev:sdf1
Dec  1 01:52:41 isis kernel: [27058.584028]  disk 5, o:1, dev:sde1
Dec  1 01:52:41 isis kernel: [27058.584029]  disk 6, o:0, dev:sdg1
Dec  1 01:52:44 isis kernel: [27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
Dec  1 01:52:44 isis kernel: [27061.521251] CPU 1:
Dec  1 01:52:44 isis kernel: [27061.521251] Modules linked in: xfs aes_x86_64 aes_generic ecb crypto_blkcipher ecryptfs ipv6 af_packet iptable_filter ip_tables x_tables ac sbp2 parport_pc lp parport loop psmouse pcspkr serio_raw iTCO_wdt iTCO_vendor_support evdev button intel_agp snd_hda_intel snd_pcm shpchp snd_timer pci_hotplug snd soundcore snd_page_alloc ext3 jbd mbcache sd_mod crc_t10dif sg pata_acpi pata_marvell usbhid hid ohci1394 ieee1394 sata_mv ata_generic ata_piix libata scsi_mod dock sky2 e1000e ehci_hcd uhci_hcd usbcore raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear md_mod dm_mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
Dec  1 01:52:44 isis kernel: [27061.521251] Pid: 28171, comm: smbd Not tainted 2.6.27-9-server #1
Dec  1 01:52:44 isis kernel: [27061.521251] RIP: 0010:[<ffffffff802abf0c>]  [<ffffffff802abf0c>] find_get_pages+0x6c/0x110
Dec  1 01:52:44 isis kernel: [27061.521251] RSP: 0018:ffff880129453358  EFLAGS: 00000246
Dec  1 01:52:44 isis kernel: [27061.521251] RAX: ffff880128d89330 RBX: ffff880129453398 RCX: 0000000000000002
Dec  1 01:52:44 isis kernel: [27061.521251] RDX: 0000000000000003 RSI: 0000000000000000 RDI: ffffe200022e9e80
Dec  1 01:52:44 isis kernel: [27061.521251] RBP: ffff880129453308 R08: ffffe200009df6c8 R09: 0000000000000005
Dec  1 01:52:44 isis kernel: [27061.521251] R10: 0000000000000037 R11: 00000000001c5778 R12: ffffffff802b6b29
Dec  1 01:52:44 isis kernel: [27061.521251] R13: ffff880123a107d0 R14: ffffe20001c6f6c0 R15: 0000000000000286
Dec  1 01:52:44 isis kernel: [27061.521251] FS:  00007fb72cdf6700(0000) GS:ffff88012fc02980(0000) knlGS:0000000000000000
Dec  1 01:52:44 isis kernel: [27061.521251] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec  1 01:52:44 isis kernel: [27061.521251] CR2: 00007f1648629000 CR3: 000000012956d000 CR4: 00000000000006e0
Dec  1 01:52:44 isis kernel: [27061.521251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec  1 01:52:44 isis kernel: [27061.521251] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec  1 01:52:44 isis kernel: [27061.521251]
Dec  1 01:52:44 isis kernel: [27061.521251] Call Trace:
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802abee3>] ? find_get_pages+0x43/0x110
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b6984>] ? pagevec_lookup+0x24/0x30
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e100d>] ? xfs_cluster_write+0xad/0x180 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e1578>] ? xfs_page_state_convert+0x498/0x760 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e19a1>] ? xfs_vm_writepage+0x71/0x120 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b9274>] ? pageout+0x124/0x270
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ab06a>] ? page_waitqueue+0xa/0x90
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b986d>] ? shrink_page_list+0x34d/0x530
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b8e49>] ? __isolate_lru_page+0x79/0xb0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b8f0a>] ? isolate_lru_pages+0x8a/0x220
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b9bf2>] ? shrink_inactive_list+0x1a2/0x4b0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b9f7b>] ? shrink_zone+0x7b/0x160
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ba0ed>] ? shrink_zones+0x8d/0x150
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ba236>] ? do_try_to_free_pages+0x86/0x2e0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ba587>] ? try_to_free_pages+0x67/0x70
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b90a0>] ? isolate_pages_global+0x0/0x50
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802b28b1>] ? __alloc_pages_internal+0x241/0x510
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802d565d>] ? alloc_pages_current+0xad/0x110
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ac477>] ? __page_cache_alloc+0x67/0x80
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ad0b3>] ? __grab_cache_page+0x63/0xb0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff80316a59>] ? block_write_begin+0x89/0xf0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e04ca>] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e0040>] ? xfs_get_blocks+0x0/0x20 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ab7ac>] ? generic_perform_write+0xbc/0x1c0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802ad512>] ? generic_file_buffered_write+0x92/0x170
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e92d3>] ? xfs_write+0x6b3/0x9b0 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff80385a69>] ? apparmor_socket_recvmsg+0x19/0x20
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff803aaf70>] ? memset_c+0x20/0x30
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffffa04e4c88>] ? xfs_file_aio_write+0x58/0x60 [xfs]
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802e9559>] ? do_sync_write+0xf9/0x140
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802e9699>] ? do_sync_read+0xf9/0x140
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff80266fb0>] ? autoremove_wake_function+0x0/0x40
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff80386821>] ? aa_file_permission+0x21/0xf0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff80386948>] ? apparmor_file_permission+0x28/0x30
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff803613e6>] ? security_file_permission+0x16/0x20
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802e9c1b>] ? vfs_write+0xcb/0x130
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff802e9d1a>] ? sys_pwrite64+0x9a/0xa0
Dec  1 01:52:44 isis kernel: [27061.521251]  [<ffffffff8021285a>] ? system_call_fastpath+0x16/0x1b
Dec  1 01:52:44 isis kernel: [27061.521251]
Dec  1 01:53:18 isis kernel: [27095.080066] RAID5 conf printout:
Dec  1 01:53:18 isis kernel: [27095.080071]  --- rd:7 wd:5
Dec  1 01:53:18 isis kernel: [27095.080074]  disk 0, o:1, dev:sdd1
Dec  1 01:53:18 isis kernel: [27095.080076]  disk 1, o:1, dev:sdb1
Dec  1 01:53:18 isis kernel: [27095.080077]  disk 2, o:0, dev:sdh1
Dec  1 01:53:18 isis kernel: [27095.080079]  disk 3, o:1, dev:sdc1
Dec  1 01:53:18 isis kernel: [27095.080080]  disk 4, o:1, dev:sdf1
Dec  1 01:53:18 isis kernel: [27095.080082]  disk 5, o:1, dev:sde1
Dec  1 01:53:18 isis kernel: [27095.080090] RAID5 conf printout:
Dec  1 01:53:18 isis kernel: [27095.080091]  --- rd:7 wd:5
Dec  1 01:53:18 isis kernel: [27095.080092]  disk 0, o:1, dev:sdd1
Dec  1 01:53:18 isis kernel: [27095.080093]  disk 1, o:1, dev:sdb1
Dec  1 01:53:18 isis kernel: [27095.080094]  disk 2, o:0, dev:sdh1
Dec  1 01:53:18 isis kernel: [27095.080095]  disk 3, o:1, dev:sdc1
Dec  1 01:53:18 isis kernel: [27095.080097]  disk 4, o:1, dev:sdf1
Dec  1 01:53:18 isis kernel: [27095.080098]  disk 5, o:1, dev:sde1
Dec  1 01:53:18 isis kernel: [27095.140011] RAID5 conf printout:
Dec  1 01:53:18 isis kernel: [27095.140017]  --- rd:7 wd:5
Dec  1 01:53:18 isis kernel: [27095.140019]  disk 0, o:1, dev:sdd1
Dec  1 01:53:18 isis kernel: [27095.140022]  disk 1, o:1, dev:sdb1
Dec  1 01:53:18 isis kernel: [27095.140024]  disk 3, o:1, dev:sdc1
Dec  1 01:53:18 isis kernel: [27095.140026]  disk 4, o:1, dev:sdf1
Dec  1 01:53:18 isis kernel: [27095.140027]  disk 5, o:1, dev:sde1
Dec  1 01:53:18 isis kernel: [27095.140545] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140567] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140585] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140604] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140622] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140641] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140659] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140677] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140696] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.140714] lost page write due to I/O error on md1
Dec  1 01:53:18 isis kernel: [27095.141359] xfs_force_shutdown(md1,0x2) called from line 1056 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_log.c.  Return address = 0xffffffffa04c80d3
Dec  1 01:53:23 isis kernel: [27100.140015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:53:36 isis kernel: [27113.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:54:06 isis kernel: [27143.440010] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:54:36 isis kernel: [27173.440009] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:55:06 isis kernel: [27203.440012] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:55:36 isis kernel: [27233.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:56:06 isis kernel: [27263.440011] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:56:36 isis kernel: [27293.440010] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:57:06 isis kernel: [27323.440016] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:57:36 isis kernel: [27353.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:58:06 isis kernel: [27383.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 01:58:36 isis kernel: [27413.440016] Filesystem "md1": xfs_log_force: error 5 returned.
^^^^^^continues this for a while
Dec  1 02:12:06 isis kernel: [28223.440015] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:12:36 isis kernel: [28253.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:13:06 isis kernel: [28283.440014] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:13:36 isis kernel: [28313.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:06 isis kernel: [28343.440013] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:36 isis kernel: [28373.440012] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28395.820448] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28395.820456] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28395.820462] xfs_force_shutdown(md1,0x1) called from line 420 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_rw.c.  Return address = 0xffffffffa04decc3
Dec  1 02:14:59 isis kernel: [28395.820466] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28395.820468] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28395.820471] xfs_force_shutdown(md1,0x1) called from line 420 of file /build/buildd/linux-2.6.27/fs/xfs/xfs_rw.c.  Return address = 0xffffffffa04decc3
Dec  1 02:14:59 isis kernel: [28396.669470] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28396.669487] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28396.669517] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28396.669525] Filesystem "md1": xfs_log_force: error 5 returned.
Dec  1 02:14:59 isis kernel: [28396.669635] Filesystem "md1": xfs_log_force: error 5 returned.

q (qr7atgwu) wrote on 2008-12-01:

Sorry, I forgot to include the kernel version that I upgraded to:
Linux isis 2.6.27-9-server #1 SMP Thu Nov 20 22:56:07 UTC 2008 x86_64 GNU/Linux

Kytrix (kytrix) wrote on 2009-01-09:

I got it working with my SATA2 drive on nForce4 by disabling the disk write cache and NCQ.

look here for details:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/301893/comments/10
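
The workaround described here comes down to two per-disk settings. A hedged sketch follows; it only builds the command strings and runs nothing. `hdparm -W0` and the `queue_depth` sysfs attribute are the usual knobs for the write cache and NCQ on libata devices, but whether they match the exact steps in the linked comment is an assumption, and `workaround_commands` is a hypothetical helper name.

```python
def workaround_commands(disk):
    """Return the shell commands for one disk, e.g. disk='sdg' (no /dev/ prefix)."""
    if not disk.isalnum():
        raise ValueError(f"unexpected disk name: {disk!r}")
    return [
        # hdparm -W0 switches the drive's write cache off.
        f"hdparm -W0 /dev/{disk}",
        # A queue depth of 1 leaves no room for queued commands, which disables NCQ.
        f"echo 1 > /sys/block/{disk}/device/queue_depth",
    ]


if __name__ == "__main__":
    # sdg and sdh are the two disks that dropped out in the log earlier in this report.
    for disk in ("sdg", "sdh"):
        for cmd in workaround_commands(disk):
            print(cmd)
```

Both commands need root, and the sysfs setting does not persist across a reboot, so it would have to be reapplied at boot.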

Dan Helfman (witten-torsion) wrote on 2009-02-04:

I just wanted to point out that this issue appears to have been fixed by a one-line change to the sata_mv kernel source by Mark Lord. Discussion is here towards the bottom of the bug report page:

https://bugzilla.redhat.com/show_bug.cgi?id=462425

And the patch itself is here:

https://bugzilla.redhat.com/attachment.cgi?id=329048

I haven't confirmed that the fix works yet. Reportedly with the fix in place, you can safely re-enable your disk write cache.

Bug Watch Updater (bug-watch-updater) on 2009-02-22

Changed in linux:
status:	Unknown → In Progress

Pitabred (ubuntu-pitabred) wrote on 2009-02-27:

I just wanted to add that I'm also seeing this bug with 4 Hitachi drives in a RAID5 array on an ATI SB700/SB800 chipset (64bit Intrepid Mythbuntu, fully updated, generic kernel). So it's not chipset specific. I'm going to compile a kernel with the above-mentioned patch, but I can cause the error at will with a large data copy, so it will be apparent whether the fix works or not. I can provide any logs anyone needs for debugging, and will be watching changes to this bug.

Pitabred (ubuntu-pitabred) wrote on 2009-02-28:

#10

Just wanted to comment that after getting the kernel compiled (generic except for the patch mentioned by Dan Helfman above), the drives do not crash under a load that would previously have brought them down. I'll continue testing, but I have high hopes for it.

TJ (tj) wrote on 2009-03-16:

#11

The patch from Mark Lord is included in Jaunty as commit

c42fae333255b08b8d4bc03e5853023145208d45 sata_mv: fix 8-port timeouts on 508x/6081 chips

Other non-Marvell chipsets may be affected by similar bugs in other drivers.

Changed in linux (Ubuntu):
importance:	Undecided → Medium
status:	New → Fix Released

TJ (tj) wrote on 2009-03-16:

#12

Stefan, is this a candidate for back-porting to Hardy/Intrepid?

Stefan Bader (smb) wrote on 2009-03-17:

#13

The patch fixes a real bug and is isolated to sata_mv. So thumbs up. It has to go through the paperwork, though. I will see that this gets done.

Stefan Bader (smb) on 2009-03-17

Changed in linux (Ubuntu Hardy):
assignee:	nobody → cooloney
importance:	Undecided → Medium
status:	New → Triaged
Changed in linux (Ubuntu Intrepid):
assignee:	nobody → cooloney
importance:	Undecided → Medium
status:	New → Triaged

krot (ubuntu-communitare) wrote on 2009-04-16:

#14

Same problem with an ATI SB700/SB800 SATA controller running a RAID-5 on Ubuntu 8.10. Under heavy load the system stalls and I see these errors:

[255670.268058] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[255670.268082] ata5.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
[255670.268085] res 40/00:00:2f:7b:a8/00:00:ae:00:00/e0 Emask 0x4 (timeout)
[255670.268092] ata5.00: status: { DRDY }
[255670.268103] ata5: hard resetting link
[255670.752537] ata5: softreset failed (device not ready)
[255670.752551] ata5: failed due to HW bug, retry pmp=0
[255670.916053] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[255671.611176] ata5.00: configured for UDMA/133
[255671.611226] ata5: EH complete

Driver used seems to be libata, not sata_mv.

If needed I can provide more information.
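
A note on the driver question above: libata is the shared ATA layer, and a low-level driver such as sata_mv or ahci sits underneath it, so seeing libata messages does not by itself say which driver owns the port. One way to check is sketched below in Python. It assumes the kernel exposes a `proc_name` attribute under `/sys/class/scsi_host/hostN/`, and `host_drivers` is a hypothetical helper name. The host number is the first field in lines like `sd 4:0:0:0: [sdg]`.

```python
from pathlib import Path


def host_drivers(sysfs_root="/sys/class/scsi_host"):
    """Map each SCSI host (host0, host1, ...) to its low-level driver name."""
    drivers = {}
    for host in sorted(Path(sysfs_root).glob("host*")):
        attr = host / "proc_name"
        # Skip hosts that do not expose the attribute rather than failing.
        if attr.is_file():
            drivers[host.name] = attr.read_text().strip()
    return drivers


if __name__ == "__main__":
    for host, driver in host_drivers().items():
        print(host, driver)
```

If the host behind the failing disk reports something other than sata_mv, the Marvell patch discussed in this report would not be expected to help.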

Bryan Wu (cooloney) wrote on 2009-04-29:

#15

It needs a big change to the Hardy kernel, so we won't fix this issue in the Hardy kernel.

-Bryan

Changed in linux (Ubuntu Hardy):
status:	Triaged → Won't Fix

Bryan Wu (cooloney) wrote on 2009-04-29:

#16

Stefan already checked the patch into the Intrepid kernel. Changed the status.

-Bryan

Changed in linux (Ubuntu Intrepid):
status:	Triaged → Fix Committed

Jonathan Heard (jon-launchpad-jeh) wrote on 2009-05-11:

#17

I am seeing a very similar issue with a VIA VT6421 SATA controller (non-RAID BIOS).
Jaunty with kernel 2.6.28-11-server on 32-bit i386 (Pentium 4).
I have two disks (WDC WD3200AAJS-00L7A0) configured in RAID 1 using 'md' software RAID.
Linear operations, like rebuilding the RAID mirror, work like a dream with no errors, but random access causes lots of errors like those above (*both* drives give lots of errors). The easiest way to reproduce is simply to apt-get install a package: even for just a few megs of data, the disks go nuts.

This is brand-new hardware: new HBA, new disks, new SATA cards. I've tried two different PSUs and refuse to believe that this is a power issue when a brand-new 280W PSU has only the Pentium 4 motherboard and the two disks attached. Maybe the VIA driver has adopted the broken code from the Marvell driver and needs fixing too?

Sure enough - If I disable the write cache on the disks, the problem is gone. As it happens I want the write cache disabled anyway but this was very concerning when I first installed the box.

Example of Errors:
-------------------------
[316730.629755] ata4.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
[316730.629793] ata4.00: BMDMA stat 0x5
[316730.629820] ata4: SError: { UnrecovData Proto TrStaTrns }
[316730.629853] ata4.00: cmd c8/00:18:af:f1:51/00:00:00:00:00/e0 tag 0 dma 12288 in
[316730.629855] res 51/84:07:c0:f1:51/84:01:00:00:00/e0 Emask 0x12 (ATA bus error)
[316730.629948] ata4.00: status: { DRDY ERR }
[316730.629974] ata4.00: error: { ICRC ABRT }
[316730.630021] ata4: hard resetting link
[316730.980054] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[316731.040809] ata4.00: configured for UDMA/33
[316731.040825] ata4: EH complete
[316731.045995] sd 3:0:0:0: [sdb] 625142448 512-byte hardware sectors: (320 GB/298 GiB)
[316731.046472] sd 3:0:0:0: [sdb] Write Protect is off
[316731.046475] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[316731.046649] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[316762.000273] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[316762.000315] ata3.00: cmd c8/00:20:2f:f3:51/00:00:00:00:00/e0 tag 0 dma 16384 in
[316762.000317] res 40/00:00:56:f1:51/00:00:00:00:00/e0 Emask 0x4 (timeout)
[316762.000409] ata3.00: status: { DRDY }
[316762.000442] ata3: hard resetting link
[316762.350049] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[316762.390408] ata3.00: configured for UDMA/133
[316762.390422] ata3: EH complete
[316762.412082] sd 2:0:0:0: [sda] 625142448 512-byte hardware sectors: (320 GB/298 GiB)
[316762.412267] sd 2:0:0:0: [sda] Write Protect is off
[316762.412271] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[316762.438540] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
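
The two exceptions in this log carry different Emask values (0x12 and 0x4), so they are not the same kind of failure. Below is a rough Python sketch for splitting an Emask into its component bits. The bit names are the AC_ERR_* constants as I recall them from include/linux/libata.h; treat them as an assumption and check them against the kernel version in use. The helper name decode_emask is hypothetical.

```python
# Split a libata Emask value into its error-class bits.
# Bit names are recalled from include/linux/libata.h (assumption: verify
# against your kernel source before relying on them).

AC_ERR_BITS = [
    (0x001, "AC_ERR_DEV"),       # device reported an error
    (0x002, "AC_ERR_HSM"),       # host state machine violation
    (0x004, "AC_ERR_TIMEOUT"),   # command timed out
    (0x008, "AC_ERR_MEDIA"),     # media error
    (0x010, "AC_ERR_ATA_BUS"),   # ATA bus error
    (0x020, "AC_ERR_HOST_BUS"),  # host bus error
    (0x040, "AC_ERR_SYSTEM"),    # host internal error
    (0x080, "AC_ERR_INVALID"),   # invalid argument
    (0x100, "AC_ERR_OTHER"),     # unknown error
]


def decode_emask(value):
    """Return the names of all error bits set in an Emask value."""
    return [name for bit, name in AC_ERR_BITS if value & bit]


if __name__ == "__main__":
    for emask in (0x12, 0x4):
        print(hex(emask), decode_emask(emask))
```

Under these assumptions, 0x12 on ata4 combines a bus error with a state machine violation (consistent with the ICRC flag in the same exception), while 0x4 on ata3 is a plain timeout like the ones in the original report.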

draknet (n638jl66) wrote on 2009-06-02:

#18

I still see this problem on Ubuntu 9.04 running 2.6.28-11-generic with an ATI SB700/SB800 SATA controller running a RAID-5.

This is not fixed in Jaunty.

Andrew Davison (darkinnit) wrote on 2009-06-14:

#19

I also still see this problem on Ubuntu Server 9.04 running 2.6.28-11. Is this the patched Intrepid kernel, or do I need to enable backports or do some other thing to resolve this issue?

I have a RAID 5 running with 3 disks across two controllers:
VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)

/var/log/messages:
[ 590.274538] ata1: hard resetting link
[ 590.594161] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 590.819220] ata1.00: configured for UDMA/100
[ 590.819266] ata1: EH complete
[ 590.848089] sd 0:0:0:0: [sda] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 590.854831] sd 0:0:0:0: [sda] Write Protect is off
[ 590.861069] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 607.360924] ata1: hard resetting link
[ 607.679134] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 607.695647] ata1.00: configured for UDMA/100
[ 607.695693] ata1: EH complete
[ 607.700588] sd 0:0:0:0: [sda] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 607.700662] sd 0:0:0:0: [sda] Write Protect is off
[ 607.700764] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 637.277044] ata2: hard resetting link
[ 637.596839] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 637.829733] ata2.00: configured for UDMA/100
[ 637.829780] ata2: EH complete
[ 637.844617] sd 1:0:0:0: [sdb] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 637.853532] sd 1:0:0:0: [sdb] Write Protect is off
[ 637.858736] ata2: hard resetting link
[ 638.176875] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 638.194087] ata2.00: configured for UDMA/100
[ 638.194139] ata2: EH complete
[ 638.198664] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 638.201640] sd 1:0:0:0: [sdb] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 638.201737] sd 1:0:0:0: [sdb] Write Protect is off
[ 638.201967] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
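
With two controllers involved, it helps to see which ports are actually resetting. Below is a small Python sketch that tallies "hard resetting link" messages per ATA port from saved dmesg or /var/log/messages text. The function name count_hard_resets is hypothetical, and the sample is a few lines copied from the log above.

```python
import re
from collections import Counter

# Matches lines such as "[ 590.274538] ata1: hard resetting link".
RESET_RE = re.compile(r"\b(ata\d+): hard resetting link")


def count_hard_resets(log_text):
    """Count 'hard resetting link' events per ATA port in a log dump."""
    return Counter(match.group(1) for match in RESET_RE.finditer(log_text))


SAMPLE = """\
[ 590.274538] ata1: hard resetting link
[ 590.594161] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 607.360924] ata1: hard resetting link
[ 637.277044] ata2: hard resetting link
[ 637.858736] ata2: hard resetting link
"""

if __name__ == "__main__":
    for port, resets in sorted(count_hard_resets(SAMPLE).items()):
        print(port, resets)
```

Mapping each ataN port back to its controller (VIA or Silicon Image here) still has to be done from the boot messages; the sketch only does the counting.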

*However*, disabling the write cache does not fix this for me:

/var/log/messages:
[ 1387.362851] sd 0:0:0:0: [sda] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 1387.363165] sd 0:0:0:0: [sda] Write Protect is off
[ 1387.363330] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 1387.363638] sd 0:0:0:0: [sda] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 1387.363748] sd 0:0:0:0: [sda] Write Protect is off
[ 1387.363897] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 1416.321442] ata2: hard resetting link
[ 1416.641237] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1416.874078] ata2.00: configured for UDMA/100
[ 1416.874124] ata2: EH complete
[ 1416.914095] ata2.00: configured for UDMA/100
[ 1416.914149] ata2: EH complete
[ 1416.928420] sd 1:0:0:0: [sdb] 1250263728 512-byte hardware sectors: (640 GB/596 GiB)
[ 1416.938974] sd 1:0:0:0: [sdb] Write Protect is off
[ 1416.939283] sd 1:0:0:0: [sdb] Write cache: di...
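
Since write-cache state matters for comparing these reports, it can be read straight off the sd lines. Below is a minimal Python sketch that records the last reported write-cache state per disk. The name write_cache_state is made up for illustration, and the sample lines are copied from the logs above, including the truncated final line, which is deliberately not matched.

```python
import re

# Matches "[sda] Write cache: enabled" or "[sdb] Write cache: disabled".
WC_RE = re.compile(r"\[(sd[a-z]+)\] Write cache: (enabled|disabled)")


def write_cache_state(log_text):
    """Return the most recent write-cache state reported for each disk."""
    state = {}
    for disk, value in WC_RE.findall(log_text):
        state[disk] = value  # later log lines overwrite earlier ones
    return state


SAMPLE = """\
[ 638.201967] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1387.363330] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 1416.939283] sd 1:0:0:0: [sdb] Write cache: di...
"""

if __name__ == "__main__":
    print(write_cache_state(SAMPLE))
```

Because the last line is cut off in the excerpt, the sketch leaves sdb at its earlier "enabled" value rather than guessing; the full log would be needed to confirm the cache was off on both disks when ata2 reset.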

Bug attachments

sata_errors.txt

Remote bug watches

redhat-bugs #462425 [CLOSED ERRATA]

Changed in linux (Fedora):
status:	In Progress → Fix Released

Changed in linux (Ubuntu Intrepid):
assignee:	Bryan Wu (cooloney) → nobody
status:	Fix Committed → Invalid

Changed in linux (Fedora):
importance:	Unknown → Critical
