EXT3-fs error corruption

Bug #200747 reported by Fred
Affects         Status    Importance  Assigned to  Milestone
Debian          Invalid   Undecided   Unassigned
linux (Ubuntu)  Invalid   Undecided   Unassigned

Bug Description

I am running Ubuntu 8.04 "Hardy Heron" (alpha).

MSI P35 Neo
Samsung SpinPoint T166 500 GB (7200 rpm, SATA-2, 16 MB cache)
Samsung SpinPoint T166 250 GB (7200 rpm, SATA-2, 16 MB cache)

sda1 = Windows XP
sdb1 = Linux

So I am listening to MP3 music in Rhythmbox 0.11.4, surfing with Mozilla Firefox 3.0b3, and perhaps doing some other stuff, like maybe running apt-get update, upgrade, clean, autoclean, and autoremove in a terminal.

And then my computer becomes unresponsive. The HDD LED on the computer case stays lit the whole time.

Then I press Ctrl+Alt+F2 and try to log in. I type my username and press Enter, and it waits there for a while, then it starts spitting out error messages and won't let me in.

Like:
EXT3-fs error (device sdb1): ext3_get_inode_loc: unable to read inode block - inode=1599522, block=3211298
And it can spam the console with that kind of stuff.

Then I shut down the computer (by PSU switch), restarted, and it borked out again.
I shut down the computer by PSU switch again and ran HUTIL v2.10 by Samsung, a disk diagnostic tool that checks SMART and does disk spin-up, spin-down, a read surface scan, and a couple of other diagnostics. It reported no errors and said the disk was okay.

Then I rebooted into my old Ubuntu 7.10 "Gutsy Gibbon" LiveCD and ran some commands to fix it.
$ sudo e2fsck -pcf /dev/sdb1
$ sudo badblocks -sv /dev/sdb1

After that, computer worked again.
But today (next day), my computer borked out again.
In the same way and started spitting out error messages...

[20061.478996] journal commit I/O error
[numbers] EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2399041 offset 0

and stuff like:
[numbers] EXT3-fs error (device sdb1): ext3_reserve_inode_write: Journal has aborted

It happened when I was using the 2.6.24-11-386 kernel.
I am scared; I don't know whether my hard disk is broken. Samsung's HUTIL said the hard disk was okay.
But now it has happened more than once, and maybe it's a bug in the kernel or the ext3 file system? I don't know. =/
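
The EXT3-fs messages quoted in this report come in two shapes, with the function name following either a colon or the word "in". As a rough triage aid, the Python sketch below tallies such lines per device and kernel function. The regular expression and the `tally_ext3` helper are illustrative assumptions based only on the lines quoted here, not a complete grammar for ext3 messages.

```python
import re
from collections import Counter

# Matches both message shapes seen in this report:
#   "EXT3-fs error (device sdb1): ext3_find_entry: reading directory ..."
#   "EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted"
EXT3_RE = re.compile(
    r"EXT3-fs (?P<level>error|warning) \(device (?P<dev>\w+)\)"
    r"(?::| in) (?P<func>\w+): (?P<msg>.*)"
)

def tally_ext3(lines):
    """Count EXT3-fs errors/warnings per (device, kernel function)."""
    counts = Counter()
    for line in lines:
        m = EXT3_RE.search(line)
        if m:
            counts[(m.group("dev"), m.group("func"))] += 1
    return counts

# Sample lines copied from this report (timestamps dropped).
sample = [
    "[20061.478996] journal commit I/O error",
    "EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2399041 offset 0",
    "EXT3-fs error (device sdb1): ext3_reserve_inode_write: Journal has aborted",
    "EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted",
]
for (dev, func), n in sorted(tally_ext3(sample).items()):
    print(dev, func, n)
```

If every hit names the same device while the other disk stays quiet, that points at one drive, cable, or port rather than at the ext3 code in general.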

Fred (eldmannen+launchpad) wrote :

Mar 9 16:46:06 ubuntu kernel: [ 25.735768] ext3_orphan_cleanup: deleting unreferenced inode 1485173
Mar 9 16:46:06 ubuntu kernel: [ 25.735786] ext3_orphan_cleanup: deleting unreferenced inode 1501809
Mar 9 16:46:06 ubuntu kernel: [ 25.735794] ext3_orphan_cleanup: deleting unreferenced inode 1501680
Mar 9 16:46:06 ubuntu kernel: [ 25.735799] ext3_orphan_cleanup: deleting unreferenced inode 1501678
Mar 9 16:46:06 ubuntu kernel: [ 25.735803] ext3_orphan_cleanup: deleting unreferenced inode 1501677
Mar 9 16:46:06 ubuntu kernel: [ 25.735808] ext3_orphan_cleanup: deleting unreferenced inode 1501676
Mar 9 16:46:06 ubuntu kernel: [ 25.735812] ext3_orphan_cleanup: deleting unreferenced inode 1501675
Mar 9 16:46:06 ubuntu kernel: [ 25.735816] ext3_orphan_cleanup: deleting unreferenced inode 1501674
Mar 9 16:46:06 ubuntu kernel: [ 25.735821] ext3_orphan_cleanup: deleting unreferenced inode 1501673
Mar 9 16:46:06 ubuntu kernel: [ 25.735825] ext3_orphan_cleanup: deleting unreferenced inode 1501672
Mar 9 16:46:06 ubuntu kernel: [ 25.735830] ext3_orphan_cleanup: deleting unreferenced inode 1501671
Mar 9 16:46:06 ubuntu kernel: [ 25.735834] ext3_orphan_cleanup: deleting unreferenced inode 1501669
Mar 9 16:46:06 ubuntu kernel: [ 25.735838] ext3_orphan_cleanup: deleting unreferenced inode 1501668
Mar 9 16:46:06 ubuntu kernel: [ 25.735843] ext3_orphan_cleanup: deleting unreferenced inode 1501667
Mar 9 16:46:06 ubuntu kernel: [ 25.746330] ext3_orphan_cleanup: deleting unreferenced inode 1501666
Mar 9 16:46:06 ubuntu kernel: [ 25.746336] ext3_orphan_cleanup: deleting unreferenced inode 1338406
Mar 9 16:46:06 ubuntu kernel: [ 25.750996] ext3_orphan_cleanup: deleting unreferenced inode 1504210
Mar 9 16:46:06 ubuntu kernel: [ 25.751001] ext3_orphan_cleanup: deleting unreferenced inode 1504199
Mar 9 16:46:06 ubuntu kernel: [ 25.751006] ext3_orphan_cleanup: deleting unreferenced inode 1504191
Mar 9 16:46:06 ubuntu kernel: [ 25.756010] ext3_orphan_cleanup: deleting unreferenced inode 2872358
Mar 9 16:46:06 ubuntu kernel: [ 25.756019] ext3_orphan_cleanup: deleting unreferenced inode 1338610
Mar 9 16:46:06 ubuntu kernel: [ 25.756024] ext3_orphan_cleanup: deleting unreferenced inode 1338441
Mar 9 16:46:06 ubuntu kernel: [ 25.762751] ext3_orphan_cleanup: deleting unreferenced inode 2415376

Jos van Hees (jos-vanhees) wrote :

I've got the exact same thing on a freshly installed Debian server (stable).

Fred (eldmannen+launchpad) wrote :

ata2.00: revalidation failed (errno=-5)

Buffer I/O error on device sdb1, logical block 3948797

ext3_abort called
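
The `errno=-5` in the libata line is a negated Linux error number. A small Python sketch can decode it, assuming the kernel value maps onto the same numbering that userspace sees on Linux (which holds for the low-numbered codes). `describe_kernel_errno` is a hypothetical helper name.

```python
import errno
import os

def describe_kernel_errno(value):
    """Map a kernel-style negative errno (e.g. -5) to its symbolic name and text."""
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

name, text = describe_kernel_errno(-5)
print(name, "-", text)
```

On Linux, 5 is `EIO`, a generic I/O failure. On its own it does not say whether the disk, the cable, or the controller is at fault.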

Fred (eldmannen+launchpad) wrote :

ubuntu@ubuntu:~$ sudo e2fsck -pcfv /dev/sdb1

  208452 inodes used (6.83%)
    3681 non-contiguous inodes (1.8%)
         # of inodes with ind/dind/tind blocks: 10598/124/0
 1374766 blocks used (22.53%)
       0 bad blocks
       1 large file

  161219 regular files
   25946 directories
     132 character device files
      26 block device files
       3 fifos
     575 links
   21073 symbolic links (19614 fast symbolic links)
      44 sockets
--------
  209018 files

Fred (eldmannen+launchpad) wrote :

ubuntu@ubuntu:~$ sudo badblocks -sv /dev/sdb1
Checking blocks 0 to 24410735
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
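
As a sanity check, the block counts from badblocks and e2fsck can be compared. The sketch below assumes badblocks used its default 1024-byte block size (no `-b` option was given) and that the filesystem uses 4096-byte blocks. The second value is an assumption, since the thread never states it.

```python
BADBLOCKS_BLOCK_SIZE = 1024   # badblocks default when -b is not given
FS_BLOCK_SIZE = 4096          # assumption: ext3 block size, not stated in the thread

badblocks_count = 24410735 + 1            # "Checking blocks 0 to 24410735"
partition_bytes = badblocks_count * BADBLOCKS_BLOCK_SIZE

fs_blocks_used = 1374766                  # "1374766 blocks used (22.53%)" from e2fsck
fs_used_fraction = 0.2253
fs_total_blocks = fs_blocks_used / fs_used_fraction
fs_bytes = fs_total_blocks * FS_BLOCK_SIZE

print(f"partition per badblocks: {partition_bytes / 1e9:.2f} GB")
print(f"filesystem per e2fsck:   {fs_bytes / 1e9:.2f} GB")
print(f"ratio: {fs_bytes / partition_bytes:.4f}")
```

Under those assumptions both tools describe a partition of roughly 25 GB, so the clean badblocks pass did cover the whole of sdb1 and not just part of it.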

Fred (eldmannen+launchpad) wrote :

Some stuff I found in syslog or something.

Mar 13 15:18:38 ubuntu kernel: [ 21.754929] libata version 3.00 loaded.

Mar 13 15:18:38 ubuntu kernel: [ 22.303875] ata_piix 0000:00:1f.2: version 2.12
Mar 13 15:18:38 ubuntu kernel: [ 22.303880] ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]

Mar 13 15:18:38 ubuntu kernel: [ 22.303964] scsi0 : ata_piix
Mar 13 15:18:38 ubuntu kernel: [ 22.303998] scsi1 : ata_piix
Mar 13 15:18:38 ubuntu kernel: [ 22.304923] ata1: SATA max UDMA/133 cmd 0xc000 ctl 0xbc00 bmdma 0xb480 irq 20
Mar 13 15:18:38 ubuntu kernel: [ 22.304925] ata2: SATA max UDMA/133 cmd 0xb880 ctl 0xb800 bmdma 0xb488 irq 20
Mar 13 15:18:38 ubuntu kernel: [ 22.483700] ata1.00: ATA-8: SAMSUNG HD501LJ, CR100-10, max UDMA7
Mar 13 15:18:38 ubuntu kernel: [ 22.483704] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Mar 13 15:18:38 ubuntu kernel: [ 22.491698] ata1.00: configured for UDMA/133
Mar 13 15:18:38 ubuntu kernel: [ 22.671491] ata2.00: ATA-8: SAMSUNG HD252KJ, CM100-11, max UDMA7
Mar 13 15:18:38 ubuntu kernel: [ 22.671494] ata2.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Mar 13 15:18:38 ubuntu kernel: [ 22.679489] ata2.00: configured for UDMA/133
Mar 13 15:18:38 ubuntu kernel: [ 22.679581] scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD501LJ CR10 PQ: 0 ANSI: 5
Mar 13 15:18:38 ubuntu kernel: [ 22.679686] scsi 1:0:0:0: Direct-Access ATA SAMSUNG HD252KJ CM10 PQ: 0 ANSI: 5
Mar 13 15:18:38 ubuntu kernel: [ 22.679732] ata_piix 0000:00:1f.5: MAP [ P0 -- P1 -- ]

Mar 13 15:18:38 ubuntu kernel: [ 22.679796] scsi2 : ata_piix
Mar 13 15:18:38 ubuntu kernel: [ 22.679829] scsi3 : ata_piix
Mar 13 15:18:38 ubuntu kernel: [ 22.680672] ata3: SATA max UDMA/133 cmd 0xb000 ctl 0xac00 bmdma 0xa480 irq 20
Mar 13 15:18:38 ubuntu kernel: [ 22.680674] ata4: SATA max UDMA/133 cmd 0xa880 ctl 0xa800 bmdma 0xa488 irq 20
Mar 13 15:18:38 ubuntu kernel: [ 22.950905] usb 1-1: new low speed USB device using uhci_hcd and address 3
Mar 13 15:18:38 ubuntu kernel: [ 22.999018] ata3.00: ATAPI: TSSTcorpCD/DVDW SH-S183A, SB02, max UDMA/33
Mar 13 15:18:38 ubuntu kernel: [ 22.999021] ata3.00: applying bridge limits
Mar 13 15:18:38 ubuntu kernel: [ 23.130099] usb 1-1: configuration #1 chosen from 1 choice
Mar 13 15:18:38 ubuntu kernel: [ 23.170836] ata3.00: configured for UDMA/33
Mar 13 15:18:38 ubuntu kernel: [ 23.337835] scsi 2:0:0:0: CD-ROM TSSTcorp CD/DVDW SH-S183A SB02 PQ: 0 ANSI: 5

Mar 13 15:18:38 ubuntu kernel: [ 23.355624] Driver 'sd' needs updating - please use bus_type methods
Mar 13 15:18:38 ubuntu kernel: [ 23.356389] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Mar 13 15:18:38 ubuntu kernel: [ 23.356399] sd 0:0:0:0: [sda] Write Protect is off
Mar 13 15:18:38 ubuntu kernel: [ 23.356401] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Mar 13 15:18:38 ubuntu kernel: [ 23.356414] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 13 15:18:38 ubuntu kernel: [ 23.356452] sd 0:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB)
Mar 13 15:18:38 ubuntu kernel: [ 23.356460] sd 0:0:0:0: [sda] Write Protect is ...


Fred (eldmannen+launchpad) wrote :

Now, I was listening to music in Rhythmbox.
The music was even on sda5, not sdb1.

I was a little bit afk, then came back, and saw applications dim as I sat down.

System was going haywire again.

[numbers] EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
[numbers] Remounting filesystem read-only
[numbers] EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted

Then I boot the 7.10 LiveCD.
ubuntu@ubuntu:/var/log$ cat debug | grep -i "ext3"
Mar 14 17:44:26 ubuntu kernel: [ 207.614203] ext3_orphan_cleanup: deleting unreferenced inode 1485348
Mar 14 17:44:26 ubuntu kernel: [ 207.614227] ext3_orphan_cleanup: deleting unreferenced inode 1485353
ubuntu@ubuntu:/var/log$

ubuntu@ubuntu:/var/log$ cat kern.log | grep -i "ext3"
Mar 14 17:44:26 ubuntu kernel: [ 46.415302] EXT3-fs: INFO: recovery required on readonly filesystem.
Mar 14 17:44:26 ubuntu kernel: [ 46.415305] EXT3-fs: write access will be enabled during recovery.
Mar 14 17:44:26 ubuntu kernel: [ 207.614198] EXT3-fs: sdb1: orphan cleanup on readonly fs
Mar 14 17:44:26 ubuntu kernel: [ 207.614203] ext3_orphan_cleanup: deleting unreferenced inode 1485348
Mar 14 17:44:26 ubuntu kernel: [ 207.614227] ext3_orphan_cleanup: deleting unreferenced inode 1485353
Mar 14 17:44:26 ubuntu kernel: [ 207.614251] EXT3-fs error (device sdb1): ext3_free_branches: Read failure, inode=1485353, block=3000002
Mar 14 17:44:26 ubuntu kernel: [ 207.617633] EXT3-fs error (device sdb1): ext3_get_inode_loc: unable to read inode block - inode=1485246, block=2981893
Mar 14 17:44:26 ubuntu kernel: [ 207.621596] EXT3-fs warning (device sdb1): ext3_orphan_get: bad orphan inode 1485246! e2fsck was run?
Mar 14 17:44:26 ubuntu kernel: [ 207.621600] ext3_test_bit(bit=125, block=2981889) = -1
Mar 14 17:44:26 ubuntu kernel: [ 207.621607] EXT3-fs: sdb1: 2 orphan inodes deleted
Mar 14 17:44:26 ubuntu kernel: [ 207.621608] EXT3-fs: recovery complete.
Mar 14 17:44:26 ubuntu kernel: [ 207.621811] EXT3-fs: mounted filesystem with ordered data mode.
Mar 14 17:44:26 ubuntu kernel: [ 207.622146] EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0

ubuntu@ubuntu:/var/log$ cat messages | grep -i "ext3"
Mar 14 17:44:26 ubuntu kernel: [ 46.415302] EXT3-fs: INFO: recovery required on readonly filesystem.
Mar 14 17:44:26 ubuntu kernel: [ 46.415305] EXT3-fs: write access will be enabled during recovery.
Mar 14 17:44:26 ubuntu kernel: [ 207.614198] EXT3-fs: sdb1: orphan cleanup on readonly fs
Mar 14 17:44:26 ubuntu kernel: [ 207.621596] EXT3-fs warning (device sdb1): ext3_orphan_get: bad orphan inode 1485246! e2fsck was run?
Mar 14 17:44:26 ubuntu kernel: [ 207.621600] ext3_test_bit(bit=125, block=2981889) = -1
Mar 14 17:44:26 ubuntu kernel: [ 207.621607] EXT3-fs: sdb1: 2 orphan inodes deleted
Mar 14 17:44:26 ubuntu kernel: [ 207.621608] EXT3-fs: recovery complete.
Mar 14 17:44:26 ubuntu kernel: [ 207.621811] EXT3-fs: mounted filesystem with ordered data mode.

ubuntu@ubuntu:/var/log$ cat syslog | grep -i "ext3"
Mar 14 17:44:26 ubuntu kernel: [ 46.415302] EXT3-fs: INFO: recovery required on readonly filesystem.
Mar 14 17:44:26 ubuntu kernel: [ ...
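
The numbers in the kern.log lines above are at least self-consistent with a standard ext3 layout. The sketch below assumes a geometry that the thread never states: 4 KiB blocks, 32768 blocks per group, 16320 inodes per group, 128-byte inodes, and a block group without a superblock backup. These values are assumptions picked because they reproduce the logged values. `dumpe2fs -h /dev/sdb1` would show the real ones.

```python
# Assumed geometry (not stated in the thread).
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 16320
INODE_SIZE = 128
BLOCK_SIZE = 4096

def locate_inode(inode):
    """Return (group, bit index in inode bitmap, inode bitmap block, inode table block).

    Assumes a group with no superblock backup, laid out as:
    block bitmap, inode bitmap, inode table.
    """
    group, index = divmod(inode - 1, INODES_PER_GROUP)
    group_start = group * BLOCKS_PER_GROUP
    inode_bitmap_block = group_start + 1
    inode_table_start = group_start + 2
    inodes_per_block = BLOCK_SIZE // INODE_SIZE
    table_block = inode_table_start + index // inodes_per_block
    return group, index, inode_bitmap_block, table_block

print(locate_inode(1485246))   # → (91, 125, 2981889, 2981893)
```

Under those assumptions, inode 1485246 maps to bit 125 of the inode bitmap in block 2981889 and to inode-table block 2981893. Those match the `bit=125, block=2981889` and `block=2981893` values in the log, which suggests the kernel was asking for sensible locations and the reads themselves failed.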


Jos van Hees (jos-vanhees) wrote : Re: [Bug 200747] Re: EXT3-fs error corruption

That doesn't sound too good.

I've got no idea what's wrong with it; it's been working fine here for over
two days now :x.

Have you tried reconnecting all the cables and/or checking whether they're
broken?

Though that sounds unlikely to me since you're using SATA disks.
Also try an e2fsck -f <disk>

This will force a check on your disk. Do the errors stop after one or two
times?
Jos


Fred (eldmannen+launchpad) wrote :

Seems things have gone from bad to worse.

This has happened a couple of times, but I could always reboot to the LiveCD and run fsck and badblocks.

ubuntu@ubuntu:~$ sudo e2fsck -pcfv /dev/sdb1
e2fsck: Attempt to read block from filesystem resulted in short read while trying to open /dev/sdb1
Could this be a zero-length partition?

ubuntu@ubuntu:~$ badblocks -sv /dev/sdb1
...
24410733
24410734
24410735
done
Pass completed, 24410736 bad blocks found.

It says ALL my blocks are bad! :(

Fred (eldmannen+launchpad) wrote :

Things have gone from strange to stranger.

You would think that if my file system is so fsck'ed up that it won't even run fsck, and badblocks marks every single block bad, then it must be really bad.

The strange thing is, I popped out the LiveCD, restarted the computer, and it boots Ubuntu.
If my system can't run fsck, if every single block on the disk is bad, why can I run Ubuntu?
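
One plausible reading, which the thread never confirms, is that the drive had stopped responding at that point (compare the earlier `ata2.00: revalidation failed` message), so every read returned an error regardless of the state of the platters. That would also explain why the same disk boots normally after a power cycle. The Python sketch below only encodes that rule of thumb. The function name and wording are illustrative, not output from any tool.

```python
def classify_badblocks(total_blocks, bad_blocks):
    """Very rough reading of a badblocks result; the categories are illustrative."""
    if bad_blocks == 0:
        return "no bad blocks"
    if bad_blocks == total_blocks:
        return "every read failed: suspect the link, controller, or device node first"
    return "some bad blocks: media damage is plausible"

# Range printed by the earlier, clean run: "Checking blocks 0 to 24410735".
total = 24410735 + 1
print(classify_badblocks(total, 24410736))
print(classify_badblocks(total, 0))
```

A bad-block count equal to the size of the partition is not what scattered media damage looks like, so checking dmesg for ATA link errors is a reasonable next step before writing off the disk.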

Fred (eldmannen+launchpad) wrote :

Jos van Hees,
No, I haven't tried to reconnect the cables on the SATA disks.

Yes, I always ran e2fsck with the -f parameter from the LiveCD.
I ran with the -pcfv parameters.

Fred (eldmannen+launchpad) wrote :

Now the system crashed again.

I restarted to the LiveCD.
This time however [unlike last time], e2fsck actually works.

ubuntu@ubuntu:~$ sudo e2fsck -pcfv /dev/sdb1

  208340 inodes used (6.83%)
    3797 non-contiguous inodes (1.8%)
         # of inodes with ind/dind/tind blocks: 10564/121/0
 1356469 blocks used (22.23%)
       0 bad blocks
       1 large file

  160919 regular files
   26099 directories
     132 character device files
      26 block device files
       3 fifos
     576 links
   21116 symbolic links (19658 fast symbolic links)
      36 sockets
--------
  208907 files

Fred (eldmannen+launchpad) wrote :

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HD252KJ
Serial Number: S0NJJDPP900694
Firmware Version: CM100-11
User Capacity: 250,059,350,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Not recognized. Minor revision code: 0x52
Local Time is: Sat Mar 15 00:49:22 2008 CET

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
     was never started.
     Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
     without error or no self-test has ever
     been run.
Total time to complete Offline
data collection: (4461) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
     Auto Offline data collection on/off support.
     Suspend Offline collection upon new
     command.
     Offline surface scan supported.
     Self-test supported.
     No Conveyance Self-test supported.
     Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
     power-saving mode.
     Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
     General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 76) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
  3 Spin_Up_Time 0x0007 100 100 015 Pre-fail Always - 5440
  4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 396
  5 Reallocated_Sector_Ct 0x0033 253 253 010 Pre-fail Always - 0
  7 Seek_Error_Rate 0x000f 253 253 051 Pre-fail Always - 0
  8 Seek_Time_Performance 0x0025 253 253 015 Pre-fail Offline - 0
  9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 1754
 10 Spin_Retry_Count 0x0033 253 253 051 Pre-fail Always - 0
 11 Calibration_Retry_Count 0x0012 253 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 207
 13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 332734940
187 Unknown_Attribute 0x0032 253 253 000 Old_age Always - 0
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always ...
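
To read the attribute table above mechanically, a short Python sketch can split each row and flag attributes whose normalized VALUE has dropped to or below THRESH, which is the condition smartctl itself uses to mark an attribute as failed. The helper names are hypothetical, and the sample holds only three rows copied from the output above. Truncated rows (fewer than ten columns) are skipped.

```python
def parse_smart_rows(text):
    """Parse 'smartctl -A' style attribute rows into dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A full row has at least 10 columns and starts with a numeric attribute ID.
        if len(parts) >= 10 and parts[0].isdigit():
            rows.append({
                "id": int(parts[0]),
                "name": parts[1],
                "value": int(parts[3]),
                "worst": int(parts[4]),
                "thresh": int(parts[5]),
                "type": parts[6],
                "raw": parts[9],
            })
    return rows

def failing(rows):
    """Names of attributes whose normalized value has reached the threshold."""
    return [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]

sample = """\
  1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
  5 Reallocated_Sector_Ct 0x0033 253 253 010 Pre-fail Always - 0
 13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 332734940
"""
rows = parse_smart_rows(sample)
print(failing(rows))   # → []
```

None of the sampled attributes is at its threshold, which agrees with the PASSED self-assessment. The large raw value for Read_Soft_Error_Rate is vendor-specific, and the smartctl warning about `-F samsung` suggests it should not be read at face value.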


drdaz (drdaz7) wrote :

I've started experiencing the same recently; I have a machine that's always on, and it freezes in the same manner (with lit HD light), usually after 2-5 days of uptime. The inode errors are then fixed (either on bootup or manually via a bootable Ubuntu SD card) and the process repeats.

My disks are ~6 month old 500GB WD Caviar drives and the / partition (sadly the one exhibiting the problem - others all come up clean post-crash) is also ext3 format. I *believe* the issue began after the most recent kernel upgrade (current: 2.6.22-14-generic), so I'll try downgrading again to see if that makes any difference.

drdaz (drdaz7) wrote :

It just occurred to me that the kernel version I provided is probably a bit useless - it's 2.6.22-14.52.

Jos van Hees (jos-vanhees) wrote :

It happened to me twice - with one day in between. After the second time, I switched the drives again, found out that the drive was working and gave it another try.

The server now has over three weeks of uptime without getting that error again. The kernel version is still the same (2.6.18-6-686). I never had it with the other drive, now mounted as secondary, running Debian sid.

Let's hope for me that the problem stays away.

Fred (eldmannen+launchpad) wrote :

I haven't experienced this problem in a while...
Maybe some update fixed it, I don't know...
I had also opened the box, and pressed on the cable a little, maybe that fixed it... I don't know...

Maybe the problem is just hiding and will bother me again soon... lets hope not...

Thiarley (thiarley) wrote :

I have the same problem... it started today.
I was copying some torrents and then Azureus started to show error messages...
I don't know what to do.

Jos van Hees (jos-vanhees) wrote :

I haven't had this problem ever since.

I therefore do not know the solution. You might want to reconnect your harddrives, as I did that too while testing for the problem.

Good luck!

Jos

00arthuryu (you-talkingto-me-43) wrote :

I have had this problem ever since upgrading to Hardy.
It has the same sort of symptoms,
but the error messages that come up are slightly different:

ata validation failed
fat: fat read failed blocknr
buffer I/O error on device sda11 logicalblock 0 (lots of these)
buffer I/O error on device sda10 logicalblock 0 (lots of these)

kahuuna (kahuuna) wrote :

I can attest to having this same bug too. Haven't been able to boot for a few days, I hope resocketing sata cables and doing some fsck's from the livecd actually allow me to boot :(

drdaz (drdaz7) wrote :

FYI, after pulling my machine apart and reassembling it (thus resocketing
the SATA cables), I have not experienced this problem.

Ian Weisser (ian-weisser) wrote :

Marking 'Invalid' on Debian - no evidence this is an upstream bug, and not linked to a Debian bug. Happy to reopen if the evidence appears.

Ian Weisser (ian-weisser) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. You reported this bug a while ago and there hasn't been any activity in it recently.
We were wondering is this still an issue for you?
Can you try with latest Ubuntu release?
What steps have you taken to ensure it's not defective hardware or a bad connector?
Do you have any ideas how we can reliably reproduce this bug?
Thanks in advance.

Fred (eldmannen+launchpad) wrote :

I do not have this problem anymore. I am running Ubuntu 8.10 "Intrepid Ibex" (beta) and haven't had these problems for a very long time.
Strangely enough, I read somewhere that someone tried the beta and experienced ext3-fs errors...

drdaz (drdaz7) wrote :

I have not experienced this issue since I reassembled my machine - it
seems likely that it was, in my case at least, hardware-related.

On Oct 11, 2008, at 9:59 PM, IanW wrote:

> Thank you for taking the time to report this bug and helping to make
> Ubuntu better. You reported this bug a while ago and there hasn't
> been any activity in it recently.
> We were wondering is this still an issue for you?
> Can you try with latest Ubuntu release?
> What steps have you taken to ensure it's not defective hardware or a
> bad connector?
> Do you have any ideas how we can reliably reproduce this bug?
> Thanks in advance.
>
> ** Changed in: linux (Ubuntu)
> Sourcepackagename: None => linux
> Status: New => Incomplete

Ian Weisser (ian-weisser) wrote :

This bug report is being closed due to your last comment regarding this being fixed with an update. For future reference you can manage the status of your own bugs by clicking on the current status in the yellow line and then choosing a new status in the revealed drop down box. You can learn more about bug statuses at https://wiki.ubuntu.com/Bugs/Status . Thank you again for taking the time to report this bug and helping to make Ubuntu better. Feel free to submit any future bugs you may find.

Changed in linux:
status: Incomplete → Invalid
Fred (eldmannen+launchpad) wrote :

Today I experienced this again (after a long time of not having this problem).

ata2.00: revalidation failed (errno=-5)
and something about SRST failed (errno=-16) or something.

Running Jaunty Jackalope (beta).

Linux darkstar 2.6.28-11-generic

Maybe it is a kernel regression or something?
