Repetitive massive filesystem corruption
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | High | Andy Whitcroft | |
Lucid | Won't Fix | High | Unassigned | |
Bug Description
This problem has been ongoing since Kubuntu 9.10 but I was unable to take the time to properly diagnose the error during that product cycle (and I hoped a newer kernel would solve the problem). I can also report that the problem is reproducible in openSUSE 11.2, so this is possibly even a mainline kernel problem.
In short, I experience massive filesystem corruption on ext4. In 9.10 I could also reproduce the corruption using ext3, but I have not yet had a chance to test ext3 under 10.04. I was also able to reproduce the problem on 9.10 by booting from the Live CD and creating a filesystem on an internal hard drive. The symptoms on 10.04 are so far identical, but I haven't yet been able to perform all the testing I did under 9.10. I have attached an example fsck log from current 10.04.
Corruption is sometimes detected when booting; other times the filesystem switches to read-only while the system is running. The latter is what happened on the first boot of my current installation of 10.04. A subsequent boot left me able to perform all system updates. Then after rebooting the system the filesystem switched to read-only within a few minutes of logging in. The only cure for the read-only filesystem is to boot from CD and run e2fsck manually.
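For anyone trying to catch the moment a filesystem flips to read-only, here is a minimal sketch, assuming a Linux system where /proc/mounts uses the usual whitespace-separated layout (device, mount point, filesystem type, options). The function name `read_only_mounts` and its `fs_types` parameter are made up for illustration; this is not part of any Ubuntu tooling and was not used to produce the attached logs.

```python
def read_only_mounts(mounts_text, fs_types=("ext3", "ext4")):
    """Return (device, mount_point) pairs for the given filesystem
    types that are currently mounted read-only.

    mounts_text is the raw content of /proc/mounts (hypothetical helper,
    written for this report only).
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        # Match the exact "ro" flag, not substrings such as errors=remount-ro.
        if fs_type in fs_types and "ro" in options.split(","):
            hits.append((device, mount_point))
    return hits
```

Feeding it the contents of /proc/mounts (for example `read_only_mounts(open("/proc/mounts").read())`) from a periodic job would at least narrow down when the switch happens, which may help correlate it with kernel messages.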
Since October, I have performed some hardware troubleshooting (many, many times):
1) Hitachi Drive Fitness Test, which always reports Disposition Code 00: no errors;
2) badblocks: always reports no bad blocks found, regardless of the blocksize, etc.
SMART reports the drive status as good, with no serious errors reported. Memtest 86+ says the RAM is good. The machine runs flawlessly under 9.04 or earlier using ext3, and NTFS produces no similar corruption under either Windows XP or Windows 7.
As I enter this bug report, I've been running with no errors for the last 30 minutes, and there's no way to predict if or when corruption will reappear. As a further troubleshooting step I have turned write caching off using the Hitachi Feature Tool; I previously did this using hdparm.conf. IMHO write caching shouldn't be blamed for corruption during normal system operation anyway, only for corruption after an unclean shutdown.
I should add that I've seen bug reports with quite similar symptoms on 64-bit systems, but I should emphasize that I'm running 32-bit on this system, with an ICH4-M controller. The corruption problem hasn't surfaced on any of my other systems, which include both Intel and VIA controllers.
ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'I82801DBICH4'
Mixer name : 'SigmaTel STAC9752,53'
Components : 'AC97a:83847652'
Controls : 38
Simple ctrls : 24
Date: Sat Feb 27 07:45:15 2010
DistroRelease: Ubuntu 10.04
Frequency: This has only happened once.
HibernationDevice: RESUME=
InstallationMedia: Kubuntu 10.04 "Lucid Lynx" - Alpha i386 (20100225)
Lsusb:
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 07cc:0301 Carry Computer Eng., Co., Ltd 6-in-1 Card Reader
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Gateway Gateway M350WVN
Package: linux-image-
PccardctlIdent:
Socket 0:
no product info available
PccardctlStatus:
Socket 0:
no card
ProcCmdLine: BOOT_IMAGE=
ProcEnviron:
LANGUAGE=
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcVersionSign
Regression: No
RelatedPackageV
Reproducible: No
RfKill:
0: phy0: Wireless LAN
Soft blocked: no
Hard blocked: no
SourcePackage: linux
TestedUpstream: No
Uname: Linux 2.6.32-14-generic i686
dmi.bios.date: 04/23/2004
dmi.bios.vendor: Gateway
dmi.bios.version: 34.01.00
dmi.board.name: Gateway M350WVN
dmi.board.vendor: Gateway
dmi.board.version: Rev 1.0
dmi.chassis.
dmi.chassis.type: 10
dmi.chassis.vendor: Gateway
dmi.chassis.
dmi.modalias: dmi:bvnGateway:
dmi.product.name: Gateway M350WVN
dmi.product.
dmi.sys.vendor: Gateway
tags: | added: karmic |
tags: | removed: needs-upstream-testing |
tags: | added: kernel-fs kernel-needs-review |
Changed in linux (Ubuntu): | |
importance: | Undecided → High |
status: | Incomplete → Triaged |
tags: | added: kernel-candidate kernel-reviewed; removed: kernel-needs-review |
Changed in linux (Ubuntu Lucid): | |
assignee: | Andy Whitcroft (apw) → nobody |
Changed in linux (Ubuntu Lucid): | |
status: | Incomplete → Confirmed |
Once again, the filesystem switched to read-only while the system was in use. The resulting fsck log is attached. This is using the standard 2.6.32-14-generic kernel immediately after a fresh install. Total uptime before the problem was discovered was under five minutes.