Comment 39 for bug 550559

Edwin Chiu (edwin-chiu) wrote :

I tried booting 2.6.31-22-server (from karmic) on a maverick install and got the same error. I'm not entirely convinced this is a software bug, since it seems to target the same drive. I have 5 identical drives, and switching them around so that they are on different ports, cables, etc. doesn't make the problem move to another port. It seems to be the drive...

Taken on its own, I'd say I simply had some bad drives. Taking the other reports into account, it seems to be more than just a bad drive, yet on a single system that doesn't add up. If this were a software or (non-HD) hardware bug, why does the problem follow the bad drive around? Why don't I get the problem on the other drives?

Below is the output from 2.6.31-22. It looks like the ata code there isn't as robust, as it fails the drive and kicks it out of the array. Maverick seems better at recovering the drive so that it stays usable.

My "reliable" way of triggering this is to launch a kvm process (I tried virtio and ide emulation, same trigger). Yet on the LV that hosts the kvm guest, I was able to dd the entire volume to /dev/null without any read issues...
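
For anyone who wants to repeat the sequential read check in a scriptable way, here is a rough Python sketch of the same idea as the dd run: read everything from start to end, throw the data away, and report how far it got before any I/O error. The function name read_through and the 1 MiB chunk size are my own choices for illustration, and the path in the usage comment is a made-up placeholder, not the real LV.

```python
def read_through(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and discard the data, like dd to /dev/null.

    Returns (bytes_read, error). error is None when the whole file or
    block device was read without the kernel reporting an I/O error.
    """
    bytes_read = 0
    try:
        # buffering=0 gives a raw file object, so each read() is one
        # read request of up to chunk_size bytes.
        with open(path, "rb", buffering=0) as src:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:  # empty result means end of file/device
                    break
                bytes_read += len(chunk)
    except OSError as exc:
        # A failed read (e.g. EIO from a dropped drive) ends up here.
        return bytes_read, exc
    return bytes_read, None


# Hypothetical usage (placeholder path, needs read permission):
# total, err = read_through("/dev/mapper/example-guest")
# print(total, err)
```

A clean pass here only says that sequential reads succeed, which matches what I saw with dd. It says nothing about the access pattern the kvm guest generates.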

[ 286.010222] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 286.010242] ata5.00: cmd 25/00:08:20:b4:4d/00:00:78:00:00/e0 tag 0 dma 4096 in
[ 286.010246] res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 286.010253] ata5.00: status: { DRDY }
[ 286.010262] ata5: hard resetting link
[ 291.580170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 291.580180] ata5.00: link online but device misclassifed
[ 296.580129] ata5.00: qc timeout (cmd 0xec)
[ 296.580166] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 296.580172] ata5.00: revalidation failed (errno=-5)
[ 296.580181] ata5: hard resetting link
[ 302.150170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 302.150179] ata5.00: link online but device misclassifed
[ 312.150066] ata5.00: qc timeout (cmd 0xec)
[ 312.150103] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 312.150109] ata5.00: revalidation failed (errno=-5)
[ 312.150116] ata5: limiting SATA link speed to 1.5 Gbps
[ 312.150124] ata5: hard resetting link
[ 317.720136] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 317.720145] ata5.00: link online but device misclassifed
[ 347.720098] ata5.00: qc timeout (cmd 0xec)
[ 347.720135] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 347.720142] ata5.00: revalidation failed (errno=-5)
[ 347.720148] ata5.00: disabled
[ 347.720162] ata5.00: device reported invalid CHS sector 0
[ 347.720176] ata5: hard resetting link
[ 353.290169] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 353.290178] ata5.00: link online but device misclassifed
[ 353.290198] ata5: EH complete
[ 353.290224] sd 4:0:0:0: [sdd] Unhandled error code
[ 353.290229] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 353.290237] end_request: I/O error, dev sdd, sector 2018358304
[ 353.290245] raid10: sdd4: rescheduling sector 1542176
[ 353.290271] sd 4:0:0:0: [sdd] Unhandled error code
[ 353.290275] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 353.290282] end_request: I/O error, dev sdd, sector 2263685000
[ 353.290293] end_request: I/O error, dev sdd, sector 2263685000
[ 353.290299] md: super_written gets error=-5, uptodate=0
[ 353.290306] raid10: Disk failure on sdd4, disabling device.
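
To make the retry pattern in the log above easier to see, here is a small Python sketch that counts the hard resets per port and measures how long the kernel waited between each "SATA link up" and the following "qc timeout (cmd 0xec)" (0xec is the ATA IDENTIFY DEVICE command). The function name summarize and the regular expressions are my own assumptions about the dmesg line format, and SAMPLE is just an excerpt of the lines pasted above.

```python
import re
from collections import Counter

# "[ 286.010262] ata5: hard resetting link" -> timestamp and message
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
# "ata5: ..." or "ata5.00: ..." -> port name and the rest of the message
PORT_RE = re.compile(r"^(ata\d+)(?:\.\d+)?: (.*)$")


def summarize(log_text):
    """Return (hard resets per port, IDENTIFY wait times in seconds)."""
    resets = Counter()
    identify_waits = []
    last_link_up = {}
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line:
            continue
        port = PORT_RE.match(line.group("msg"))
        if not port:
            continue
        ts = float(line.group("ts"))
        name, text = port.groups()
        if text == "hard resetting link":
            resets[name] += 1
        elif text.startswith("SATA link up"):
            last_link_up[name] = ts
        elif text.startswith("qc timeout (cmd 0xec)") and name in last_link_up:
            # Time from link up until the IDENTIFY command timed out.
            identify_waits.append(round(ts - last_link_up.pop(name), 2))
    return resets, identify_waits


SAMPLE = """\
[ 286.010262] ata5: hard resetting link
[ 291.580170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 296.580129] ata5.00: qc timeout (cmd 0xec)
[ 296.580181] ata5: hard resetting link
[ 302.150170] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 312.150066] ata5.00: qc timeout (cmd 0xec)
[ 312.150124] ata5: hard resetting link
[ 317.720136] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 347.720098] ata5.00: qc timeout (cmd 0xec)
"""

resets, identify_waits = summarize(SAMPLE)
print(dict(resets))      # {'ata5': 3}
print(identify_waits)    # [5.0, 10.0, 30.0]
```

So the kernel gives the drive roughly 5, 10 and then 30 seconds to answer IDENTIFY, drops the link to 1.5 Gbps before the last attempt, and only then disables ata5.00.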