Comment 6 for bug 256316

wateenellende (fpbeekhof) wrote : Re: [Bug 256316] Re: kernel 2.6.26 reports massive filesystem errors on RAID5 device; but on 2.6.24 it is fine

Well, no luck.

$ uname -a
Linux DeathStar 2.6.27-1-generic #1 SMP Sat Aug 23 23:19:01 UTC 2008 x86_64 GNU/Linux
$ cat */*.avi > /dev/null
cat: Amadeus (DVDivX)/Amadeus.The_Directors_Cut_(1984).AC3.CD1.MoonAge.ShareReacto.avi: Input/output error
cat: Auberge Espagnol/AubergeEspagnol-CD2.avi: Input/output error
... I stopped the test at this point.

From syslog:
Aug 29 09:02:00 DeathStar kernel: [ 1207.904014] attempt to access beyond end of device
Aug 29 09:02:00 DeathStar kernel: [ 1207.904014] md0: rw=0, want=7771191736, limit=2929692160
Aug 29 09:02:00 DeathStar kernel: [ 1207.910440] attempt to access beyond end of device
Aug 29 09:02:00 DeathStar kernel: [ 1207.911275] md0: rw=0, want=7771191736, limit=2929692160
Aug 29 09:03:31 DeathStar kernel: [ 1299.012020] attempt to access beyond end of device
Aug 29 09:03:31 DeathStar kernel: [ 1299.012020] md0: rw=0, want=11222412024, limit=2929692160
Aug 29 09:03:31 DeathStar kernel: [ 1299.022440] attempt to access beyond end of device
Aug 29 09:03:31 DeathStar kernel: [ 1299.023546] md0: rw=0, want=11222412024, limit=2929692160

As I posted in the corresponding kernel bugzilla, I suspect that the
hpt374 driver in libata is returning bogus data when reading from disk.
That would include bogus filesystem metadata, which in turn makes the
kernel try to read from block addresses that do not exist on the
device, producing the "attempt to access beyond end of device" errors
shown in the syslog. This theory has not been proven; it is just my
best guess.
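
To put numbers on how far outside the device these requests fall, here
is a small Python sketch that pulls the want/limit pairs out of the
syslog lines and compares them. It assumes that both values are counted
in 512-byte sectors, which is my understanding of this kernel message
but is not stated in the log itself. The names parse_overruns and
SECTOR_BYTES are made up for this illustration.

```python
import re

# Assumption: "want" and "limit" are expressed in 512-byte sectors.
SECTOR_BYTES = 512

# Two of the md0 lines from the syslog excerpt in this comment.
LOG = """\
Aug 29 09:02:00 DeathStar kernel: [ 1207.904014] md0: rw=0, want=7771191736, limit=2929692160
Aug 29 09:03:31 DeathStar kernel: [ 1299.012020] md0: rw=0, want=11222412024, limit=2929692160
"""

# \s+ between the fields tolerates lines that a mail client has wrapped.
PATTERN = re.compile(r"(\w+): rw=(\d+),\s+want=(\d+),\s+limit=(\d+)")


def parse_overruns(text):
    """Return one dict per 'want/limit' log entry found in text."""
    entries = []
    for match in PATTERN.finditer(text):
        want = int(match.group(3))
        limit = int(match.group(4))
        entries.append({
            "device": match.group(1),
            "want": want,
            "limit": limit,
            # How many times larger the requested offset is than the device.
            "ratio": want / limit,
            # Distance past the end of the device, under the sector assumption.
            "excess_bytes": (want - limit) * SECTOR_BYTES,
        })
    return entries


if __name__ == "__main__":
    for e in parse_overruns(LOG):
        print(f"{e['device']}: want is {e['ratio']:.2f}x the device size, "
              f"{e['excess_bytes'] / 1e12:.2f} TB past the end")
```

Requests that land several times past the end of the array look more
like corrupted block pointers than an off-by-a-few sizing error, which
fits the bogus-metadata guess but does not prove it.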