Repetitive massive filesystem corruption
This problem has been ongoing since Kubuntu 9.10 but I was unable to take the time to properly diagnose the error during that product cycle (and I hoped a newer kernel would solve the problem). I can also report that the problem is reproducible in openSUSE 11.2, so this is possibly even a mainline kernel problem.
In short, I experience massive filesystem corruption on ext4. In 9.10 I could also reproduce the corruption using ext3, but I have not yet had a chance to test ext3 under 10.04. I was also able to reproduce the problem on 9.10 by mounting from Live CD and creating a filesystem on an internal hard drive. The symptoms on 10.04 are so far identical, but I haven't yet been able to repeat all the testing I did under 9.10. I have attached an example fsck log from current 10.04.
Corruption is sometimes detected when booting; other times the filesystem switches to read-only while the system is running. The latter is what happened on the first boot of my current installation of 10.04. A subsequent boot left me able to perform all system updates. Then after rebooting the system the filesystem switched to read-only within a few minutes of logging in. The only cure for the read-only filesystem is to boot from CD and run e2fsck manually.
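For anyone scripting the manual e2fsck run from the CD, the exit status says whether errors were fixed or left behind. The following is a minimal sketch, not part of my original testing; the function name decode_e2fsck_status is made up for illustration, and the flag meanings are taken from the EXIT CODE section of the e2fsck(8) man page.

```python
from typing import List

# Bit flags from the "EXIT CODE" section of e2fsck(8).
# The exit status is the sum (bitwise OR) of the conditions that applied.
E2FSCK_EXIT_FLAGS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_e2fsck_status(status: int) -> List[str]:
    """Return the human-readable conditions encoded in an e2fsck exit status.

    Hypothetical helper for illustration only; it does not run e2fsck itself.
    """
    if status == 0:
        return ["no errors"]
    # Collect every flag whose bit is set in the status value.
    return [text for bit, text in E2FSCK_EXIT_FLAGS.items() if status & bit]


if __name__ == "__main__":
    # Example: a status of 5 combines flag 1 and flag 4.
    for line in decode_e2fsck_status(5):
        print(line)
```

A status with bit 4 set means the filesystem still has uncorrected errors, so another pass is needed before mounting it read-write again.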
Since October, I have performed some hardware troubleshooting (many, many times):
1) Hitachi Drive Fitness Test, which always reports Disposition Code 00: no errors;
2) badblocks: always reports no bad blocks found, regardless of the blocksize, etc.
SMART reports the drive status as good, with no serious errors reported. Memtest 86+ says the RAM is good. The machine runs flawlessly under 9.04 or earlier using ext3, and NTFS produces no similar corruption under either Windows XP or Windows 7.
As I enter this bug report, the system has been running with no errors for the last 30 minutes, and there's no way to predict if or when corruption will reappear. As a further troubleshooting step I have turned write caching off using the Hitachi Feature Tool; I previously did this using hdparm.conf. IMHO write caching should only be suspected after an unclean shutdown, not for corruption during normal system operation.
I should add that I've seen bug reports with quite similar symptoms on 64-bit systems, but I'm running 32-bit on this system, with an ICH4-M controller. The corruption problem hasn't surfaced on any of my other systems, which include both Intel and VIA controllers.
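To confirm that the write-cache setting actually stuck after a reboot, the drive can be queried with "hdparm -W" and the reply checked. The following is a rough sketch only: parse_write_cache is a made-up helper, /dev/sda is a placeholder device name, and the sample text is an assumption about what hdparm typically prints, not output captured from this machine.

```python
import re
from typing import Optional


def parse_write_cache(hdparm_output: str) -> Optional[bool]:
    """Return True if write caching is reported on, False if off,
    or None if no write-caching line is found.

    Hypothetical helper; assumes a line of the form
    'write-caching =  0 (off)' in the text passed in.
    """
    match = re.search(r"write-caching\s*=\s*(\d)", hdparm_output)
    if match is None:
        return None
    return match.group(1) == "1"


if __name__ == "__main__":
    # Assumed sample of what `hdparm -W /dev/sda` prints; not a real capture.
    sample = "/dev/sda:\n write-caching =  0 (off)\n"
    print(parse_write_cache(sample))
```

Returning None when the line is missing keeps "could not tell" separate from "caching is off", which matters if the drive or controller does not report the setting.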
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
USER PID ACCESS COMMAND
CRDA: Error: [Errno 2] No such file or directory
Card hw:0 'I82801DBICH4'
Mixer name : 'SigmaTel STAC9752,53'
Components : 'AC97a:83847652'
Controls : 38
Simple ctrls : 24
Date: Sat Feb 27 07:45:15 2010
DistroRelease: Ubuntu 10.04
Frequency: This has only happened once.
InstallationMedia: Kubuntu 10.04 "Lucid Lynx" - Alpha i386 (20100225)
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 07cc:0301 Carry Computer Eng., Co., Ltd 6-in-1 Card Reader
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Gateway Gateway M350WVN
no product info available
0: phy0: Wireless LAN
Soft blocked: no
Hard blocked: no
Uname: Linux 2.6.32-14-generic i686
dmi.board.name: Gateway M350WVN
dmi.board.version: Rev 1.0
dmi.product.name: Gateway M350WVN
|tags:||added: kernel-fs kernel-needs-review|
|Changed in linux (Ubuntu):|
|importance:||Undecided → High|
|status:||Incomplete → Triaged|
|tags:||added: kernel-candidate kernel-reviewed|
|Changed in linux (Ubuntu Lucid):|
|assignee:||Andy Whitcroft (apw) → nobody|
|Changed in linux (Ubuntu Lucid):|
|status:||Incomplete → Confirmed|