in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4
- Karmic (9.10)
- Bug #453579
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Linux | Invalid | Undecided | Unassigned | | |
Release Notes for Ubuntu | Fix Released | Undecided | Unassigned | | |
linux (Ubuntu) | Invalid | Critical | Surbhi Palande | | |
Karmic | Invalid | Critical | Unassigned | | |
Bug Description
There are worrying reports of filesystem corruption on ext4 in karmic. Scott says:
12:36 < Keybuk> this whole ext4 thing is worrying me
12:36 < Keybuk> I just downloaded an iso image, md5sum didn't match
12:36 < Keybuk> downloaded it into an ext3 partition, matched just fine
12:59 < Keybuk> and I know mvo has seen bugs with corrupted .debs in /var/cache/
12:59 < Keybuk> which seems to imply its any file large enough to use lots of extents
I'm opening this bug report so that this bug gets tracked & triaged for karmic. If we're unable to isolate the issue, we should consider rolling back to ext3 as the default filesystem in the installer.
ProblemType: Bug
Architecture: amd64
ArecordDevices:
**** List of CAPTURE Hardware Devices ****
card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xee240000 irq 17'
Mixer name : 'Analog Devices AD1981'
Components : 'HDA:11d41981,
Controls : 20
Simple ctrls : 11
Date: Fri Oct 16 16:01:26 2009
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=
MachineType: LENOVO 6371CTO
Package: linux-image-
PccardctlIdent:
Socket 0:
no product info available
PccardctlStatus:
Socket 0:
no card
ProcCmdLine: root=/dev/
ProcEnviron:
PATH=(custom, user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcVersionSign
RelatedPackageV
SourcePackage: linux
Uname: Linux 2.6.31-13-generic x86_64
WpaSupplicantLog:
dmi.bios.date: 12/27/2006
dmi.bios.vendor: LENOVO
dmi.bios.version: 7IET23WW (1.04 )
dmi.board.name: 6371CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.
dmi.modalias: dmi:bvnLENOVO:
dmi.product.name: 6371CTO
dmi.product.
dmi.sys.vendor: LENOVO
ls
Steve Langasek (vorlon) wrote : | #1 |
- AlsaDevices.txt (461 bytes, text/plain; charset="utf-8")
- AplayDevices.txt (273 bytes, text/plain; charset="utf-8")
- BootDmesg.txt (51.4 KiB, text/plain; charset="utf-8")
- CRDA.txt (257 bytes, text/plain; charset="utf-8")
- Card0.Amixer.values.txt (2.2 KiB, text/plain; charset="utf-8")
- Card0.Codecs.codec.0.txt (7.6 KiB, text/plain; charset="utf-8")
- CurrentDmesg.txt (70.9 KiB, text/plain; charset="utf-8")
- Dependencies.txt (1.5 KiB, text/plain; charset="utf-8")
- IwConfig.txt (613 bytes, text/plain; charset="utf-8")
- Lspci.txt (13.0 KiB, text/plain; charset="utf-8")
- Lsusb.txt (376 bytes, text/plain; charset="utf-8")
- PciMultimedia.txt (601 bytes, text/plain; charset="utf-8")
- ProcCpuinfo.txt (1.4 KiB, text/plain; charset="utf-8")
- ProcInterrupts.txt (1.5 KiB, text/plain; charset="utf-8")
- ProcModules.txt (4.0 KiB, text/plain; charset="utf-8")
- RfKill.txt (128 bytes, text/plain; charset="utf-8")
- UdevDb.txt (110.8 KiB, text/plain; charset="utf-8")
- UdevLog.txt (236.7 KiB, text/plain; charset="utf-8")
- WifiSyslog.txt (15.0 KiB, text/plain; charset="utf-8")
- XsessionErrors.txt (2.9 KiB, text/plain; charset="utf-8")
Changed in linux (Ubuntu): | |
importance: | Undecided → Critical |
milestone: | none → ubuntu-9.10 |
Steve Langasek (vorlon) wrote : | #2 |
Changed in linux (Ubuntu Karmic): | |
status: | New → Triaged |
Steve Beattie (sbeattie) wrote : apport-collect data | #3 |
AplayDevices: aplay: device_list:223: no soundcards found...
Architecture: amd64
ArecordDevices: arecord: device_list:223: no soundcards found...
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/dsp', '/dev/snd/by-path', '/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
CheckboxSubmission: 138b721e3738d95
CheckboxSystem: 558fbfb2a125871
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=
IwConfig:
lo no wireless extensions.
eth0 no wireless extensions.
MachineType: Shuttle Inc SA76
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=
ProcEnviron:
SHELL=bash
PATH=(custom, user)
LANG=en_US.UTF-8
ProcVersionSign
RelatedPackageV
linux-
linux-firmware 1.23
RfKill:
Uname: Linux 2.6.31-14-generic x86_64
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 05/04/2009
dmi.bios.vendor: Phoenix Technologies, LTD
dmi.bios.version: 6.00 PG
dmi.board.name: FA76
dmi.board.vendor: Shuttle Inc
dmi.board.version: V10
dmi.chassis.type: 3
dmi.chassis.vendor: Shuttle Inc
dmi.chassis.
dmi.modalias: dmi:bvnPhoenixT
dmi.product.name: SA76
dmi.product.
dmi.sys.vendor: Shuttle Inc
Steve Beattie (sbeattie) wrote : AlsaDevices.txt | #4 |
Steve Beattie (sbeattie) wrote : BootDmesg.txt | #5 |
Steve Beattie (sbeattie) wrote : Card0.Amixer.info.txt | #6 |
Steve Beattie (sbeattie) wrote : Card0.Amixer.values.txt | #7 |
Steve Beattie (sbeattie) wrote : Card0.Codecs.codec.0.txt | #8 |
Steve Beattie (sbeattie) wrote : CurrentDmesg.txt | #9 |
Steve Beattie (sbeattie) wrote : Lspci.txt | #10 |
Steve Beattie (sbeattie) wrote : Lsusb.txt | #11 |
Steve Beattie (sbeattie) wrote : PciMultimedia.txt | #12 |
Steve Beattie (sbeattie) wrote : ProcCpuinfo.txt | #13 |
Steve Beattie (sbeattie) wrote : ProcInterrupts.txt | #14 |
Steve Beattie (sbeattie) wrote : ProcModules.txt | #15 |
Steve Beattie (sbeattie) wrote : UdevDb.txt | #16 |
Steve Beattie (sbeattie) wrote : UdevLog.txt | #17 |
Steve Beattie (sbeattie) wrote : WifiSyslog.txt | #18 |
Steve Beattie (sbeattie) wrote : XsessionErrors.txt | #19 |
tags: | added: apport-collected |
Steve Beattie (sbeattie) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #20 |
I did a fresh install from the karmic alt amd64 cd build 20091016 onto ext4 on LVM. After the post-install update and the installation of a limited amount of additional software, I ran debsums -a on the system and noticed the following things:
- debsums claims that the following packages don't have an md5sums at all: bogofilter,
- the following files were reported as failing their debsums check:
/var/lib/
/usr/share/
/var/lib/
the last is expected (I believe) but not the first two.
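For readers who want to repeat this kind of check by hand, here is a minimal sketch of a checksum-list verification. It is not the debsums implementation; it only assumes a list in the `md5sum`-style layout of one `<hex digest> <path>` pair per line, with paths relative to a root directory. The names `md5_of` and `failed_checks` are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_checks(md5sums_text, root="/"):
    """Return the listed paths whose current MD5 differs from the recorded one.

    Assumption: each non-empty line is "<hex digest><whitespace><relative path>".
    Files that are missing are reported as failures as well.
    """
    failures = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, relative = line.split(None, 1)
        relative = relative.strip()
        target = Path(root) / relative
        if not target.is_file() or md5_of(target) != expected:
            failures.append(relative)
    return failures
```

A failure reported this way only says that the file no longer matches the recorded digest; as the following comments point out, that can have explanations other than filesystem corruption.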
Steve Langasek (vorlon) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #21 |
On Sat, Oct 17, 2009 at 05:08:49PM -0000, Steve Beattie wrote:
> - the following files were reported as failing their debsums check:
> /var/lib/
> /usr/share/
> /var/lib/
> the last is expected (I believe) but not the first two.
The second is a bug in gnome-pilot, I guess it hasn't been rebuilt since we
fixed the translations-
The first could have any number of other explanations besides filesystem
corruption.
The missing .md5sums files are interesting/
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://
<email address hidden> <email address hidden>
Colin Watson (cjwatson) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #22 |
I don't think the missing .md5sums files are intrinsically worrying. I've looked at several of them and they're genuinely missing. installation-
Since md5sums files are created in debian/rules rather than by dpkg-deb, they're merely very widespread rather than actually universal ...
Scott James Remnant (Canonical) (canonical-scott) wrote : | #23 |
I'm just using plain old ext4 on SSD
Scott James Remnant (Canonical) (canonical-scott) wrote : | #24 |
Here's an example of what I mean:
warcraft scott% wget -q http://
warcraft scott% md5sum karmic-
91e4f415767a456
warcraft root# sync
warcraft scott% md5sum karmic-
26c3177ae594a37
I assume the change is that the file is no longer in the page cache
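The sequence above (checksum, sync, checksum again) can be sketched in a way that targets the page cache more directly. This is only an illustration under assumptions: it runs on Linux, it uses `os.posix_fadvise` with `POSIX_FADV_DONTNEED`, which is an advisory hint the kernel may ignore, and the function names are invented for this example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_from_page_cache(path):
    """Write back dirty pages, then ask the kernel to evict the file's cached pages."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages cannot be evicted until they are written back
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # length 0 means "to end of file"
    finally:
        os.close(fd)


def hashes_before_and_after_eviction(path):
    """Hash the file, try to evict it from the page cache, and hash it again."""
    before = md5_of(path)
    drop_from_page_cache(path)
    after = md5_of(path)
    return before, after
```

If the two digests differ for a file that nobody modified in between, the data read back from disk does not match what was in memory, which is the symptom described here.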
Scott James Remnant (Canonical) (canonical-scott) wrote : apport-collect data | #25 |
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xf6ffc000 irq 21'
Mixer name : 'SigmaTel STAC9228'
Components : 'HDA:83847616,
Controls : 29
Simple ctrls : 19
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=
MachineType: Dell Inc. XPS M1330
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=
ProcEnviron:
LANG=en_GB.UTF-8
PATH=(custom, user)
SHELL=/bin/zsh
LC_COLLATE=C
ProcVersionSign
RelatedPackageV
linux-
linux-firmware 1.24
Uname: Linux 2.6.31-14-generic x86_64
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
dmi.bios.date: 12/26/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A15
dmi.board.name: 0U8042
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.
dmi.product.name: XPS M1330
dmi.sys.vendor: Dell Inc.
Scott James Remnant (Canonical) (canonical-scott) wrote : AlsaDevices.txt | #26 |
Scott James Remnant (Canonical) (canonical-scott) wrote : AplayDevices.txt | #27 |
Scott James Remnant (Canonical) (canonical-scott) wrote : ArecordDevices.txt | #28 |
Scott James Remnant (Canonical) (canonical-scott) wrote : BootDmesg.txt | #29 |
Scott James Remnant (Canonical) (canonical-scott) wrote : Card0.Amixer.values.txt | #30 |
Scott James Remnant (Canonical) (canonical-scott) wrote : Card0.Codecs.codec.0.txt | #31 |
Scott James Remnant (Canonical) (canonical-scott) wrote : CurrentDmesg.txt | #32 |
Scott James Remnant (Canonical) (canonical-scott) wrote : IwConfig.txt | #33 |
Scott James Remnant (Canonical) (canonical-scott) wrote : Lspci.txt | #34 |
Scott James Remnant (Canonical) (canonical-scott) wrote : Lsusb.txt | #35 |
Scott James Remnant (Canonical) (canonical-scott) wrote : PciMultimedia.txt | #36 |
Scott James Remnant (Canonical) (canonical-scott) wrote : ProcCpuinfo.txt | #37 |
Scott James Remnant (Canonical) (canonical-scott) wrote : ProcInterrupts.txt | #38 |
Scott James Remnant (Canonical) (canonical-scott) wrote : ProcModules.txt | #39 |
Scott James Remnant (Canonical) (canonical-scott) wrote : RfKill.txt | #40 |
Scott James Remnant (Canonical) (canonical-scott) wrote : UdevDb.txt | #41 |
Scott James Remnant (Canonical) (canonical-scott) wrote : UdevLog.txt | #42 |
Scott James Remnant (Canonical) (canonical-scott) wrote : WifiSyslog.txt | #43 |
Scott James Remnant (Canonical) (canonical-scott) wrote : XsessionErrors.txt | #44 |
Nick Lowe (nick-int-r) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #45 |
http://
--- inode.c.orig 2009-10-05 18:18:51.000000000 +0200
+++ inode.c 2009-10-18 13:16:45.728112813 +0200
@@ -5164,6 +5164,9 @@
} else {
struct ext4_iloc iloc;
+ if (inode-
+ return 0;
+
err = ext4_get_
if (err)
return err;
Scott James Remnant (Canonical) (canonical-scott) wrote : | #46 |
Nick: what was the context of that Bugzilla reference, there's no bug# included
Eamonn Sullivan (eamonn-sullivan) wrote : | #47 |
(In case Nick doesn't respond quickly) The patch is referenced in http://
Near the end.
Tomás Reyes (trcecilio) wrote : | #48 |
The reference to that patch is in Comment #90
John Johansen (jjohansen) wrote : | #49 |
- patch reworked for 2.6.31 kernel (448 bytes, text/plain)
The code path that is being patched in the ext4_write_inode() function is new to 2.6.32 and does not exist in Karmic.
It is possible (though unlikely) that the read-only non-journaled case calling ext4_force_commit is causing the corruption: in the 2.6.32 patch this case is short-circuited and returns without doing anything. However, in 2.6.32 this code path is short circuiting on sync_dirty_
I have attached the patch reworked for 2.6.31 short circuiting the read only non-journaled case but I need to evaluate the code more.
John Johansen (jjohansen) wrote : | #50 |
Since I haven't reproduced this error yet, I would like to get a better handle on what people are seeing here. Is it file system corruption (errors that show up in fsck), or file corruption where fsck does not report any errors? Also, do the corrupted files have the correct size, and is it possible to run a compare between a corrupt file and a good file so we can get a handle on the location where the corruption starts?
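The comparison asked for here can be sketched as follows. This is an illustrative helper, not something posted in the thread; `first_difference` is a made-up name, and the two paths stand for a known-good copy and a suspect copy of the same file.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if the files match.

    If one file is a strict prefix of the other, the offset returned is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                # Narrow the mismatch down to a single byte inside this chunk.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended without a mismatch
            offset += len(chunk_a)
```

Dividing the returned offset by 1024 * 1024 gives the position in MiB, which makes it easy to see whether the damage starts near a round boundary.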
John Johansen (jjohansen) wrote : | #51 |
I have placed a test kernel with the above patch at
http://
It would be good to know if this clears up the corruption problems, and/or if the warning and stack trace show up in the logs, whether or not the corruption problem is fixed.
papukaija (papukaija) wrote : | #52 |
Just to confirm, is this an issue only with the 2.6.31-14.46 kernel, or does it also occur with the 2.6.31-14.48 kernel?
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #53 |
On Wed, 2009-10-21 at 20:07 +0000, John Johansen wrote:
> Since I haven't reproduced this error yet I would like to get a better
> handle on what people are seeing here. Is it file system corruption
> (errors that show up in fsck), or file corruption where fsck does not
> report any errors. Also for files that are corrupted do they have the
> correct size and is possible to run a compare between a corrupt file and
> a good file so we can get a handle on the location the corruption is
> starts.
>
The corruption is not detected by fsck.
In my testing, the files maintained the same size, but the data changed
in them. The data started being different at around 512MB into the
file.
Scott
--
Scott James Remnant
<email address hidden>
Carl Englund (englundc) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #54 |
I'm a little worried about this one too, so I gave testing it a shot. Created a ~20GiB ext4 filesystem and copied a 1.2GiB file there. Compared with md5sum and the checksum was the same. Running Karmic RC with 2.6.31-14.
mabovo (mabovo) wrote : | #55 |
Sorry if I am hijacking this bug, but it seems there is something related here, if I am not totally wrong:
I am using 9.10 with Mac2,1, ext4 on sda3.
When trying to copy a DVD image like Snow or Leopard.iso (approx. 7.5 GB) to an external HD (fat32), Nautilus stops the process in the middle and displays the following error message: "Error writing the file: File too big"
The same iso can be copied into another partition of my internal hd like sda4 (NTFS/Windows7) without errors.
mabovo (mabovo) wrote : apport-collect data | #56 |
Architecture: i386
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0x90440000 irq 22'
Mixer name : 'SigmaTel STAC9221 A1'
Components : 'HDA:83847680,
Controls : 21
Simple ctrls : 13
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=
MachineType: Apple Inc. MacBook2,1
Package: linux (not installed)
ProcCmdLine: BOOT_IMAGE=
ProcEnviron:
SHELL=/bin/bash
LANG=pt_BR.UTF-8
LANGUAGE=
ProcVersionSign
RelatedPackageV
linux-
linux-firmware 1.24
Uname: Linux 2.6.31-14-generic i686
UserGroups: adm admin cdrom dialout lpadmin netdev plugdev sambashare
dmi.bios.date: 06/27/07
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MB21.88Z.
dmi.board.
dmi.board.name: Mac-F4208CAA
dmi.board.vendor: Apple Inc.
dmi.board.version: PVT
dmi.chassis.
dmi.chassis.type: 10
dmi.chassis.vendor: Apple Inc.
dmi.chassis.
dmi.modalias: dmi:bvnAppleInc
dmi.product.name: MacBook2,1
dmi.product.
dmi.sys.vendor: Apple Inc.
mabovo (mabovo) wrote : AlsaDevices.txt | #57 |
mabovo (mabovo) wrote : AplayDevices.txt | #58 |
mabovo (mabovo) wrote : ArecordDevices.txt | #59 |
mabovo (mabovo) wrote : BootDmesg.txt | #60 |
mabovo (mabovo) wrote : Card0.Amixer.values.txt | #61 |
mabovo (mabovo) wrote : Card0.Codecs.codec.0.txt | #62 |
mabovo (mabovo) wrote : CurrentDmesg.txt | #63 |
mabovo (mabovo) wrote : IwConfig.txt | #64 |
mabovo (mabovo) wrote : Lspci.txt | #65 |
mabovo (mabovo) wrote : Lsusb.txt | #66 |
mabovo (mabovo) wrote : PciMultimedia.txt | #67 |
mabovo (mabovo) wrote : ProcCpuinfo.txt | #68 |
mabovo (mabovo) wrote : ProcInterrupts.txt | #69 |
mabovo (mabovo) wrote : ProcModules.txt | #70 |
mabovo (mabovo) wrote : RfKill.txt | #71 |
mabovo (mabovo) wrote : UdevDb.txt | #72 |
mabovo (mabovo) wrote : UdevLog.txt | #73 |
mabovo (mabovo) wrote : WifiSyslog.txt | #74 |
mabovo (mabovo) wrote : XsessionErrors.txt | #75 |
Steve Langasek (vorlon) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #76 |
mabovo, your issue is completely unrelated.
Changed in linux (Ubuntu Karmic): | |
milestone: | ubuntu-9.10 → karmic-updates |
Manuel Bua (manuel-bua) wrote : | #77 |
Since it seems the fix has been planned for karmic-updates, should we expect ext3 to be used as the default fs when installing Karmic?
I'm quite worried about the impact this bug could have on new users migrating to Ubuntu.
Steve Langasek (vorlon) wrote : | #78 |
Setting the milestone does not mean that a fix is planned for karmic-updates. So far, it doesn't appear that Scott's original problem is reproducible for anyone else. We will stay on this bug to try to confirm it and find a fix, but we aren't going to change the default fs for a bug that only one person is seeing.
Steven Post (redalert-commander) wrote : | #79 |
@mabovo: a regular fat32 filesystem only supports files up to about 4GB, which explains your problem.
I haven't experienced this on ext4 yet, but I did notice some corruption on ext3 a while back. It was nothing important, and it could have been corruption introduced during the transfer of the file. I don't know how you downloaded it, but that might be a clue (http, ftp, bittorrent, ...?), since not every protocol has the same level of corruption checking.
Still, I'm afraid it is in the filesystem: in my case with ext3 it was a torrent, and the first hash check passed, but a month later it didn't.
bert (xbert) wrote : | #80 |
You can add me as a second user seeing the problem. My original report is here:
https:/
I've seen the bug with two independent installations to ext4. In my case, a fsck does seem to repair the problem, making a non-bootable system bootable again. The occurrence of disk errors is sporadic.
You asked about RAID early in the thread. I have a raid controller on my mobo, which is currently not being used. The SATA drive is plugged directly into the main connections, and is reported as /dev/sda
I wouldn't mention this at all, but for the fact that some live CD versions of linux (gnuparted LiveCD, for example) gave me fits when they recognized the RAID controller, tried to associate the drive with the RAID device, and therefore prevented me from reformatting the drive. Made me wonder if there might be some quirky interplay deep in the device stack leading to false positive RAID detections.
As reported in the original bug, I am running Kubuntu 9.10 rc 64-bit on intel quad core machine and an intel x25-m ssd. (I don't think this is one of the infamous intel SSD bugs because an alternative OS ran w/o problems)
Steve Langasek (vorlon) wrote : | #81 |
> I've seen the bug with two independent installations to ext4. In my
> case, a fsck does seem to repair the problem, making a non-bootable
> system bootable again. The occurrence of disk errors is sporadic.
That doesn't sound at all like the bug Scott has described.
Scott Kitterman (kitterman) wrote : | #82 |
Proposed release note:
There have been some reports of data corruption with fresh (not upgraded) ext4 file systems with large files (over 512MB). The issue is under investigation. Users who routinely manipulate large files may want to consider using ext3 file systems until this issue is resolved.
Neumarke (nospam1-neumarke) wrote : | #83 |
On the issue of how many people are seeing this problem, and I hope I'm not misunderstanding the relationships between bugs here:
This bug is "assigned to" linux-kernel-bugs #14354 in which Linus Torvalds himself claims to be seeing filesystem corruption, starting here:
http://
Are these bugs related or not?
Steve Langasek (vorlon) wrote : | #84 |
Neumarke,
The relation to that upstream bug is tenuous at best. The upstream bug:
- is reported against a newer kernel than the one we're shipping
- is reported to only happen when ext4 is on top of the DM layer, whereas Scott's case was ext4 on a raw device
- is reported in connection with an unclean shutdown and subsequent fsck, whereas Scott reported corruption of files without an unclean shutdown (but no mention in this bug of whether the corruption requires an intervening reboot/fsck to appear - Scott, please clarify)
So that upstream bug link should be dropped; it really doesn't look like the same bug.
Changed in linux: | |
importance: | Unknown → Undecided |
status: | Unknown → New |
Steve Langasek (vorlon) wrote : | #85 |
Documented at <https:/
There have been some reports of data corruption with fresh (not upgraded) ext4 file systems using the Ubuntu 9.10 kernel when writing to large files (over 512MB). The issue is under investigation, and if confirmed will be resolved in a post-release update. Users who routinely manipulate large files may want to consider using ext3 file systems until this issue is resolved. (453579)
Changed in ubuntu-release-notes: | |
status: | New → Fix Released |
Kai Blin (kai.blin) wrote : | #86 |
Steve, I can confirm that in my setup.
Test is easy, as described by Scott.
I've copied over the first iso I found on my PC to my fileserver running an ext4 /data partition. Then I had some fun with md5sum:
kai@woodstock:
138468d380b84e6
kai@woodstock:
d6b2bc09fc4df18
kai@woodstock:
kai@woodstock:
91cf62eee1e1597
kai@woodstock:
d11a45c61466f2b
kai@woodstock:
f8e682be3590d48
kai@woodstock:
d11a45c61466f2b
kai@woodstock:
d11a45c61466f2b
Note that it seems to have stabilized on d11a45c61466f2b
That's certainly a fun one.
Kai Blin (kai.blin) wrote : | #87 |
Oh, I forgot to mention that d11a45c61466f2b
Bob McElrath (bob+ubuntu) wrote : | #88 |
I have seen problems like this with large files on multiple fs's and ultimately it was a RAM problem. Scott, can you run memtester and/or memtest86 at bootup to verify that you don't have bad RAM? Is your CPU overclocked? CPU errors can also be detected with burn* programs (cpuburn package). A rare RAM problem can cause bitflips that you wouldn't notice except in large files.
Lemmiwinks (lemmiwinks) wrote : | #89 |
A few days ago, a video file in my home folder, which was over 300MB in size, became unusable. Nautilus says the file has 0 bytes. When I try to open it, every player reports that the stream does not contain any data.
Unfortunately I cannot tell exactly when the corruption happened or what caused it.
If there are more reports like mine, I would suggest withdrawing karmic completely...
Lemmiwinks (lemmiwinks) wrote : | #90 |
Forgot to mention that I actually have an Ext3 file system, which I upgraded to Ext4 soon after Jaunty was released, with no problems at all.
Martin Pitt (pitti) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #91 |
Lemmiwinks [2009-10-29 18:02 -0000]:
> Forgot to mention, that I've got actually an Ext3 file system, which I
> updated to Ext4 soon after Jaunty was released, with no problems at all.
Scott, did you also upgrade yours to ext4, or was that a clean
mkfs.ext4?
Martin Jackson (mhjacks) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #92 |
I have an ext4 fs that I created in jaunty as a fresh ext4 fs (during the jaunty beta cycle).
The fs is on lvm and is close to 1 TB in size...it's 92% full with mp4 files in frequent use, and I have not yet seen this issue.
I upgraded this machine to karmic just over a week ago.
Mackenzie Morgan (maco.m) wrote : | #93 |
Lemmiwinks: Sounds more like the old 0-byte bug that was in Jaunty's ext4. Scott's bug keeps the files the same size.
I'm getting this too with the .isos I downloaded today. Mine is not a ext3 --> ext4 conversion. It was formatted as ext4 by the Karmic alpha 3 or 4 installer. Unlike Scott and Ian, I am not using an SSD.
Zsync'd iso:
098824768ee3d46
Torrented iso:
290ef766fdef0bd
Should be:
5a996e0d794e355
according to http://
Axos (sancroff) wrote : | #94 |
I just finished a clean install of 9.10 (new default partitions).
I ran the following commands:
openssl rand -out foo 629145600
md5sum foo
sync
md5sum foo
cp foo bar
md5sum foo bar
sync
md5sum foo bar
openssl rand -out foo2 1073741824
md5sum foo2
sync
md5sum foo2
cp foo2 bar2
md5sum foo2 bar2
md5sum foo bar foo2 bar2 > sums
# rebooted the system
md5sum foo bar foo2 bar2
All the sums were consistent. No variation. Either my system doesn't have the problem -or- there is something else which triggers it. For instance, maybe the files need to be some odd size rather than a clean multiple of 1 MB. The sizes I used above were 600 * 1024 * 1024 and 1024 * 1024 * 1024. I'll retry the test with an additional 17 bytes added to the file sizes to see if that makes any difference. I'll post again if it does.
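The procedure above can be parameterised by file size so that odd sizes are easy to try. The following is a rough sketch under assumptions, not a replacement for the commands listed: it uses `os.urandom` instead of `openssl rand`, the file names `foo` and `bar` simply mirror the ones above, and the function names are invented for illustration.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_random_file(path, size, chunk_size=1 << 20):
    """Fill `path` with exactly `size` random bytes."""
    remaining = size
    with open(path, "wb") as handle:
        while remaining > 0:
            block = os.urandom(min(chunk_size, remaining))
            handle.write(block)
            remaining -= len(block)


def copy_and_compare(directory, size):
    """Create a random file, copy it, and return three checksums.

    The checksums are: the source before sync, the source after sync,
    and the copy after a second sync. All three should be equal.
    """
    source = os.path.join(directory, "foo")
    copy = os.path.join(directory, "bar")
    write_random_file(source, size)
    first = md5_of(source)
    os.sync()
    second = md5_of(source)
    shutil.copyfile(source, copy)
    os.sync()
    third = md5_of(copy)
    return first, second, third
```

For the retry mentioned above, the size argument would be something like `600 * 1024 * 1024 + 17`. Note that a sync does not by itself remove the file from the page cache, so a reboot (as in the listing above) or an explicit cache drop is still needed to force a read from disk.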
David Warde-Farley (david-warde-farley) wrote : | #95 |
@Kai Blin: Can you please confirm the kernel version this was happening with?
Kai Blin (kai.blin) wrote : | #96 |
I'm seeing this on 2.6.31.4 of the beagleboard armel kernel from Launchpad. However, this might be a false alarm on my side, pointing at a hardware issue instead. I've reformatted the partition to ext3 and I'm still seeing similar effects. This is an external USB drive, which might be one part of the issue.
Sorry about the noise, this bug looked like a perfect match.
Changed in linux: | |
importance: | Undecided → Unknown |
status: | New → Unknown |
Changed in linux: | |
importance: | Unknown → Undecided |
status: | Unknown → New |
importance: | Undecided → Unknown |
status: | New → Unknown |
nutznboltz (nutznboltz-deactivatedaccount) wrote : | #97 |
J. Antonio Romero (nsdragon) wrote : | #98 |
I am confused about this bug. All comments speak about freshly-created ext4 filesystems, as well as the Karmic Release Notes. But what about already-present filesystems? Right now my / is ext3 and /home is ext4 on Jaunty. If I do a dist-upgrade to Karmic, will I be affected? What about converting / later from ext3 to ext4? And what if I install Karmic from scratch, but leaving /home untouched?
Mackenzie Morgan (maco.m) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #99 |
We don't know the cause, so that's hard to answer. So far it seems that ext3
--> ext4 conversions are safe. Kind of makes sense, since the on-disk system
is a bit different. As to whether created-by-Karmic or in-use-by-Karmic is
the trouble here, we don't know yet. I think only 3 people so far have hit
this, and we all were running unstable for development reasons, so we had
created-by-Karmic filesystems. It's going to take more people reproducing it
to find out if created-
fimbulvetr (fimbulvetr) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #100 |
I tried to reproduce this on my Latitude D630 with an intel x25-m, 9.10 fresh format/install mounted raw, and was unable to.
Immediately after grabbing ubuntu-
Starcraftmazter (starcraftmazter) wrote : | #101 |
fimbulvetr - did you try editing the file?
On that note, I'm doing a fresh install of 9.10 with ext4 on my laptop around the start of next week, and I'm wondering if anyone can suggest some methods to try and reproduce the bug. So far I'm thinking about obtaining a very large file, copying it around the HD, and modifying it.
I'm wondering if changing the file around the 536870912th byte would be a useful thing to do?
tellapu (tellapu) wrote : | #102 |
Thanks so much for working on this critical issue. I'll wait to install Karmic until it is fixed, so please hurry up :-) I often have large files (around 1 GB).
GonzO (gonzo) wrote : | #103 |
I think Steve was right at post #84: the link to the linux kernel bug should be dropped, as all of the circumstances of this bug are different from the one in the link. How did this upstream link get re-established?
Changed in linux: | |
status: | Unknown → Confirmed |
Axos (sancroff) wrote : | #104 |
OK, the bug is in kernel 2.6.32. Kosmic, er, Karmic Koala is 2.6.31. No wonder I wasn't able to reproduce it.
Changed in linux: | |
importance: | Unknown → Undecided |
status: | Confirmed → New |
aldebx (aldebx) wrote : | #105 |
@ Kai Blin,
it should be made clear that testing with an external USB drive is not at all a reliable test. I've gone through _several_ USB drives that systematically corrupted large files regardless of the hard disk, filesystem and host computer used. This happens especially with large-capacity hard disks plugged into cheap USB controllers (although it also happened to me once with an average one).
@Starcraftmazter
since MD5 sums, like all other hashes, are designed to show whether a file has been tampered with or corrupted, the sum would _definitely_ change if you edited the file! The hash (MD5, SHA, etc.) HAS TO change after you edit the file. Otherwise you would have been lucky enough to find a weakness in the hash algorithm.
Kai Blin (kai.blin) wrote : | #106 |
@aldebx
Dunno, connecting all of my drives to all of my other boxes, I don't see any issues like that. However, I think I've already identified the system used as the real cause of my particular issue.
Bryan Quigley (bryanquigley) wrote : | #107 |
Did everyone affected do a memtest like the suggestion earlier (https:/
I have had the EXACT same symptoms ('cept it was a 4.7 GB ISO), on ext3 and it was a SINGLE bad line in memtest. So please run memtest Scott. Or anyone else affected by the changing md5sums. Thanks!
(Why would it work on ext3 and not ext4? maybe because ext4 reads faster and would be more likely to trip the bad memory)
Starcraftmazter (starcraftmazter) wrote : | #108 |
@aldebx
Of course I realise this; perhaps I need to elaborate on my idea. Since the error apparently occurs when large files are edited, a test should be devised whereby changes are made to a large file and saved, then undone and saved again, and the before and after checksums compared to see whether there is in fact a problem with writing large files.
Furthermore, since the problem allegedly happens around the 512MB mark, my idea is to write a program that takes X blocks before and after this point and swaps them. X must be even to ensure every block is swapped with another. I am thinking of swapping 1000 blocks before the point with 1000 blocks after it. Using fsync and running the program twice should ensure that both changes are written and that the second undoes the first. Thus, if two hashes of the file are taken, one before and one after the experiment, they will be identical if no problems occurred and different if there is in fact a problem.
So my question is, would this be a good test to do? I will probably have time to do it tomorrow.
unggnu (unggnu) wrote : | #109 |
I also had this issue but I can't really nail it down. The explanation that it only has something to do with kernel 2.6.32-rc* makes sense. I used that kernel several times on my two systems. I got different md5 sums for the same file, and if I played a video that was affected, the player stopped, the hard disk ran continuously and the system hung for a minute or more. It looked like a part was missing and the driver was searching for it.
I even shot down my whole testing system. I have a small testing partition to test Linux outside the VM. After installing some xorg-edgers packages and restarting, I got an fsck problem which asked for confirmation, some inode problem. After confirmation it asked again several times, so I ran it with -y, which deleted a lot of files. Afterwards the system didn't boot anymore.
I have installed grub on the partition instead of the MBR. Maybe that resulted in this problem under some special circumstances, but I have done this for a long time and never had a problem with it.
So 2.6.32 being the cause makes more sense, since I have been using ext4 for quite some time. I haven't had a problem since sticking with the default kernel, but this kind of problem doesn't just pop up: you don't realize that a huge file has changed until you actually check or use the whole thing.
Matt (twister-vertex-cc) wrote : | #110 |
This might be a stupid question, but Karmic does ship with kernel 2.6.31-14.48 and not 14.46, right? Can anybody elaborate?
I did a fresh install with newly created ext4 partitions and have not yet encountered anything. Well, I didn't really try to produce an error, since this is my production machine. I have md5summed entire folders with gigabytes of data after moving them ... no errors.
Ramon (ram130-gmail) wrote : | #111 |
Well, I am not sure, but here's a copy of what mine said after a clean install: "Linux 2.6.31-
Starcraftmazter (starcraftmazter) wrote : | #112 |
Hello.
I have written a C program to implement the test I described above. Currently, it checks 100x 8K blocks around the 512MB mark by swapping them with each other, back to front. Running the program twice should thus result in an identical file. Using this program, you can check the read/write ability of your filesystem, in particular around the 512MB mark in a file (before, at, and after it).
http://
If you don't have gcc installed (it isn't by default):
sudo apt-get install build-essential
To compile:
gcc -o tester tester.c
Further, I wrote a perl script to simplify testing:
http://
To run:
chmod +x tester.pl
./tester.pl
If those links disappear at some stage, the programs can be found here:
http://
Both my friend and I have run the test on the Ubuntu iso itself. I am using a 64bit install of 9.10 and he is using a 32bit install of 9.10. The kernel used for our tests is the default 2.6.31-14-generic. We are both on ext4.
Both of our tests came up fine: read/write works perfectly and the before/after hashes are the same, hence we could not observe any problem.
I would encourage anyone experiencing this problem to run the above tests and see what happens, in an effort to isolate the problem.
Cheers
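For readers who cannot reach the linked files, the round-trip test described in this comment can be sketched roughly as follows. This is not the tester.c or tester.pl from the links; it is an independent minimal sketch in Python, and the names `swap_blocks` and `file_sha256` as well as the default parameters are assumptions chosen to mirror the description (100 blocks of 8K on each side of the 512MB mark, swapped back to front).

```python
import hashlib
import os

BLOCK_SIZE = 8 * 1024        # 8K blocks, as described in the comment
BLOCK_COUNT = 100            # blocks taken on each side of the mark
MARK = 512 * 1024 * 1024     # the 512MB offset


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def swap_blocks(path, mark=MARK, block_size=BLOCK_SIZE, count=BLOCK_COUNT):
    """Swap the i-th block before `mark` with the i-th block after it.

    Calling this twice with the same arguments restores the original file.
    """
    span = block_size * count
    if mark < span or os.path.getsize(path) < mark + span:
        raise ValueError("file too small for the requested window")
    with open(path, "r+b") as f:
        for i in range(count):
            before = mark - (i + 1) * block_size
            after = mark + i * block_size
            f.seek(before)
            a = f.read(block_size)
            f.seek(after)
            b = f.read(block_size)
            f.seek(before)
            f.write(b)
            f.seek(after)
            f.write(a)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the changes to disk
```

A run would hash the file, call `swap_blocks` twice, and hash again; the two hashes should match. Note that reads shortly after a write may be served from the page cache, so a matching hash does not by itself prove that the on-disk data is intact.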
black (blackborn) wrote : | #113 |
I have a freshly installed ubuntu 9.10 with 2 newly created ext4 partitions (45GiB for / and 870GiB for /home). I did not encounter any problem so far. (The /home drive contains ~500GiB of films) Also the tester program of comment #112 doesn't reveal any problems. So I'm lucky for now and will report if the problem emerges.
Starcraftmazter (starcraftmazter) wrote : | #114 |
Furthermore, I should state that the full kernel version in the final version of Ubuntu (which we did our testing on) is 2.6.31-14.48 and not 14.46. Is there a fix from 46 to 48?
Steve Langasek (vorlon) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #115 |
On Tue, Nov 03, 2009 at 11:01:04AM -0000, Starcraftmazter wrote:
> Furthermore, I should state the full kernel version that the final
> version of Ubuntu (which we did our testing on) is 2.6.31-14.48 and not
> 14-46. Is there a fix from 46 to 48?
No. The bug title reflects the version of the kernel on which the error was
first seen.
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://
<email address hidden> <email address hidden>
Starcraftmazter (starcraftmazter) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #116 |
@Steve Langasek
I understand that, though what I'm wondering is whether there was any change from 46 to 48, which could have fixed this issue.
Mackenzie Morgan (maco.m) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #117 |
I had -14.48 when I hit it.
grof (grofardel) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #118 |
I have a fresh installation of Karmic Koala, and I've already had a corrupted fs twice.
Ubuntu does not boot and complains about an fs it cannot mount.
I have to run fsck in order to repair things.
But the perl script above (by Starcraftmazter) said that the hashes are equal.
muadnem (brownj23-deactivatedaccount) wrote : | #119 |
Is this not fixed via https:/
If so, will the iso images for Karmic be updated anytime soon, or will this only be available post-install?
Maybe off base, sorry if so.
muadnem (brownj23-deactivatedaccount) wrote : | #120 |
I guess it would help if I pasted the right link... Please ignore the previous link.
muadnem (brownj23-deactivatedaccount) wrote : | #121 |
Here is my reasoning..
"One change that we did make between 2.6.31 and 2.6.32 is that we enable journal checksums by default."
"by default" suggests that a 2.6.31 could be built with journal checksums enabled?
And maybe I'm reading wrong but it doesn't look DM specific..
Starcraftmazter (starcraftmazter) wrote : | #122 |
Hmmm are the two issues (corrupted fs and corrupted large files) related?
Steve Langasek (vorlon) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #123 |
On Tue, Nov 03, 2009 at 11:29:09PM -0000, Starcraftmazter wrote:
> Hmmm are the two issues (corrupted fs and corrupted large files)
> related?
No.
muadnem (brownj23-deactivatedaccount) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #124 |
Woops. I missed the 'corruption is not detected by fsck', part. Seems like, with the elusive nature of this bug, everyone should be reporting their memtest and fsck status.
Steve Langasek (vorlon) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #125 |
On Wed, Nov 04, 2009 at 01:06:08AM -0000, muadnem wrote:
> Woops. I missed the 'corruption is not detected by fsck', part. Seems
> like, with the elusive nature of this bug, everyone should be reporting
> their memtest and fsck status.
In general, people who aren't actually seeing the bug described here should
not be reporting anything. All that does is make the bug log harder to
extract information from.
Leonardo Montecchi (lmontecchi) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #126 |
Is this bug somehow related to the following:
http://
http://
?
pdecat (pdecat) wrote : | #127 |
I too would recommend a full memtest to anyone encountering data corruption.
unggnu (unggnu) wrote : | #128 |
I could confirm it again with the default Ubuntu kernel. I was downloading a compilation of files with BitTorrent when the battery ran out. There was no visible problem afterwards and the files seemed to finish downloading fine later on, but I got errors. So I started a rehash of the compilation and it found around 40 defective chunks which needed to be redownloaded. Afterwards the file check worked fine, so I guess the problem might have something to do with crashes/blackouts in combination with ext4.
It is acceptable for some recently saved data to be gone after a crash, but the loss should at least be recognized through the journal and marked as such.
Brian Rogers (brian-rogers) wrote : | #129 |
The BitTorrent crash scenario doesn't indicate a bug. The only way for an application to know about uncommitted writes is to scan the file (for example by rehashing in this instance). To avoid doing this every time it's started, it saves a record of what parts have been downloaded. In a crash, this record may be more up to date than what's actually saved to the disk.
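To make the rehash step concrete, here is a rough sketch of what a client does when it rescans a file. It assumes fixed-size pieces and a list of expected SHA-1 digests (BitTorrent's original protocol hashes pieces with SHA-1); the function name `find_bad_pieces` is hypothetical and not taken from any real client.

```python
import hashlib


def find_bad_pieces(path, piece_length, expected_digests):
    """Return the indexes of pieces whose on-disk SHA-1 differs from the expected one."""
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(expected_digests):
            piece = f.read(piece_length)  # the last piece may be shorter
            if hashlib.sha1(piece).digest() != expected:
                bad.append(index)
    return bad
```

After a crash, pieces the client had recorded as complete but whose data never reached the disk would show up in this list and be downloaded again, which matches the roughly 40 defective chunks reported above.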
Bryan Quigley (bryanquigley) wrote : | #130 |
For BitTorrent (I'm assuming you are using Transmission), check out this bug:
https:/
Ramon (ram130-gmail) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #131 |
I restarted my system the hard way and it would not boot. So I checked it
from another install with fsck and it found some errors. I'm beginning to
worry about possible future corruption. What is the status of this
problem?
Mackenzie Morgan (maco.m) wrote : | #132 |
Ramon:
That's not what this bug is about. Improperly rebooting runs the risk
of breaking your system on *any* filesystem. This bug is about
*individual files* which are very large becoming corrupt and NOT
having any effect on fsck.
Ramon (ram130-gmail) wrote : | #133 |
Yeah, that's true. Have you experienced any more corruption since?
On Sat, Nov 7, 2009 at 3:20 PM, Mackenzie Morgan <email address hidden> wrote:
> Ramon:
> That's not what this bug is about. Improperly rebooting runs the risk
> of breaking your system on *any* filesystem. This bug is about
> *individual files* which are very large becoming corrupt and NOT
> having any effect on fsck.
Mackenzie Morgan (maco.m) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #134 |
I haven't downloaded any more large files since then, on the basis that it'd be a
waste of bandwidth.
On Saturday 07 November 2009 4:12:05 pm Ramon wrote:
> yea thats true. Have you experience any more corruption since?
>
> On Sat, Nov 7, 2009 at 3:20 PM, Mackenzie Morgan <email address hidden>
>
> wrote:
> > Ramon:
> > That's not what this bug is about. Improperly rebooting runs the risk
> > of breaking your system on *any* filesystem. This bug is about
> > *individual files* which are very large becoming corrupt and NOT
> > having any effect on fsck.
Pete Graner (pgraner) wrote : Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #135 |
Looks like this is mostly solved upstream by reverting patch d0646f7b636d067
@apw could you or csurbhi build a test kernel and post here so folks can test?
Thanks
~pete
Changed in linux: | |
importance: | Undecided → Unknown |
status: | New → Unknown |
Steve Langasek (vorlon) wrote : | #136 |
No, this is not upstream bug #14354. There is no overlap between the described problems.
Changed in linux: | |
importance: | Unknown → Undecided |
status: | Unknown → New |
summary: |
- corruption of large files reported with linux 2.6.31-14.46 on ext4
+ in-place corruption of large files *without fsck or reboot* reported
+ with linux 2.6.31-14.46 on ext4 |
Piscium (piscium) wrote : | #137 |
I am running Karmic 9.10 on an old Pentium 4 computer with PATA drives. I copied a 2.6 GByte file from an ext4 partition to another ext4 partition, then to an ext2 partition, then back to the original ext4. No problem. All files have the same md5.
unggnu (unggnu) wrote : | #138 |
"Improperly rebooting runs the risk of breaking your system on *any* filesystem."
Sorry, but this is not true imho. I have never had a similar problem with ext3. Yes, there are some file systems like XFS which just delete the data of a whole file if the computer crashes or is restarted, but I guess the goal, especially with a journaling file system, should be to prevent errors like this. Not to mention that with XFS the size of the file is then zero, so you don't assume that everything is fine.
Like I said, it is acceptable for recently changed data to be lost after a hard reboot, but this has to be detected and dealt with by the file system so that corrupt files are not saved without anyone even realizing it.
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 453579] Re: corruption of large files reported with linux 2.6.31-14.46 on ext4 | #139 |
On Thu, 2009-10-29 at 14:57 +0000, Bob McElrath wrote:
> I have seen problems like this with large files on multiple fs's and
> ultimately it was a RAM problem. Scott, can you run memtester and/or
> memtest86 at bootup to verify that you don't have bad RAM? Is your CPU
> overclocked? CPU errors can also be detected with burn* programs
> (cpuburn package). A rare RAM problem can cause bitflips that you
> wouldn't notice except in large files.
>
Running memtest was one of the first things I did ;-) Likewise I
performed a read/write test of the drive and it was fine
Scott
--
Scott James Remnant
<email address hidden>
Scott James Remnant (Canonical) (canonical-scott) wrote : | #140 |
On Thu, 2009-10-29 at 18:33 +0000, Martin Pitt wrote:
> Lemmiwinks [2009-10-29 18:02 -0000]:
> > Forgot to mention, that I've got actually an Ext3 file system, which I
> > updated to Ext4 soon after Jaunty was released, with no problems at all.
>
> Scott, did you also upgrade your's to ext4, or was that a clean
> mkfs.ext4?
>
Clean ext4
Scott
The Loeki (the-loeki) wrote : | #141 |
I see there's a lot of discussion going on about this bug, I'll just drop in my 5c:
I cleanly installed/updated a x64 Karmic w/ext4 filesystems on my MacBook 5,1 (no other OSes installed).
Due to the nature of my work I downloaded over a dozen ISOs from different HTTP/FTP servers, none of which failed their MD5 sums.
Yesterday, I cleanly installed/updated a x32 Karmic on some measly Centrino/Pentium M 1.6 GHz laptop and copied, amongst others, 7 DVD ISOs to it from an external NTFS hard drive, with no apparent issue or corruption (though I'd have to check the MD5s to make absolutely sure).
So for me, this issue doesn't seem to exist. Then again, both filesystems are clean-and-simple layouts of 3 primary partitions (10GB /, swap of 1 to 1.5 times RAM, remainder /home, handmade during install), with no hw/sw RAID, LVM or whatever.
Andrew M. (ender-neo) wrote : | #142 |
I think I actually *have* seen this...
My setup:
Fileserver: 1.5TB array on JFS with a bunch of big files (videos mostly, some ISOs and the like, but none of these files *ever* change)
one client with 400GB internal SATA HDD on ext4, running Karmic with 2.6.31-15-generic AMD64 kernel
one client with 1.5TB external USB HDD on ext4, running Karmic with 2.6.31-15-generic i386 kernel
I invoke rsync from the clients as
rsync -axvc root@server:
On *both* clients, I had to run rsync about 3 times until I no longer saw changes.
That is to say, differences in the files still existed during the 2nd rsync, which simply shouldn't happen.
Also worth noting that another Karmic machine with ext3 and a ppc kernel doesn't see this problem. Ran rsync once, and then the second pass didn't change any files.
There doesn't seem to be any rhyme or reason to what files were corrupt or wrong on the clients after the first sync, they weren't the same between the two clients and there didn't seem to be any correlation between increased size and frequency or anything. The "giantest" files (8 GB's or so) transferred correctly the first time.
I've still got this elaborate test set up in place, and I'm *very*VERY* keen to get this worked out so that I can move to Karmic on the server (no way in hell I'm upgrading until this gets sorted out!!!)
Please let me know if any more specific tests would help or anything.
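One more specific test that might help narrow this down: compare a corrupted copy on a client against the known-good copy from the server block by block, and record where they differ. The sketch below is only an illustration; the function name `differing_blocks` and the 4096-byte default block size are assumptions, not something used elsewhere in this report.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the byte offsets of blocks that differ between two files.

    If one file is shorter, the missing tail counts as a difference.
    """
    offsets = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break
            if a != b:
                offsets.append(offset)
            offset += block_size
    return offsets
```

The resulting offsets would show whether the damage clusters at particular positions in the file (for example near a fixed offset) or is scattered, and whether whole blocks or only a few bytes are affected.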
Andrew M. (ender-neo) wrote : | #143 |
more info from my setup:
I have done memtest on all boxes; everything is fine.
The network is wired (not that I think it should matter; SSH would barf if packets were coming in with errors).
The smallest file for which I saw corruption was about 120MB.
The incidence seemed to be about 6 corrupt files per iteration of {delete everything on client, reboot client, run rsync, run rsync again to see what checksums were wrong}, during which at least 400GB and about 10,000 files were transferred.
Andrew M. (ender-neo) wrote : | #144 |
oh yes, one more thing,
all fs's were created & formatted by the Karmic installer, using the release media
Ramon (ram130-gmail) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #145 |
Wow, now that's a test! I think Karmic corrupted my Windows 7 and two data
partitions. I installed Karmic on a brand new, 21-day-old 500GB hard drive
and had been transferring files for 2 weeks from my failing 320GB drive. After
that was done I tried booting back into Windows 7, and it failed. Karmic
crashes occasionally for no reason! I decided to run startup repair, which
found no problems; then I ran chkdsk, which found corrupted files on each
partition! To top it off, Disk Utility is now reporting that my hard drive has
a bad sector! Someone help me before I go insane.
On Nov 11, 2009 9:26 PM, "Andrew M." <email address hidden> wrote:
oh yes, one more thing,
all fs's were created & formatted by the Karmic installer, using the release
media
Mackenzie Morgan (maco.m) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #146 |
Er...this is only for ext4. Win7 does not run on ext4. Sounds like that bad
sector is to blame. Just because it's new doesn't mean it's not broken.
Ramon (ram130-gmail) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #147 |
I used Karmic to create the partitions: 2 NTFS, swap and an ext4. The bad
sector didn't show up until I was copying files. So far it says one
bad sector. It just seems ironic that this corruption problem is here and
then this happens.
On Thu, Nov 12, 2009 at 1:39 AM, Mackenzie Morgan <email address hidden> wrote:
> Er...this is only for ext4. Win7 does not run on ext4. Sounds like that
> bad
> sector is to blame. Just because it's new doesn't mean it's not broken.
unggnu (unggnu) wrote : | #148 |
@Ramon
If you use Karmic you can check your whole hard disk with SMART. Check the bad sector count after an extended test. If there are any, it is more likely that this was the cause than Karmic.
I guess this should be done by everyone who has problems and has already run memtest.
unggnu (unggnu) wrote : | #149 |
The tool to check S.M.A.R.T. in Karmic is called "Disk Utility" (gnome-disk-utility), and before Karmic it was also possible with the smartmontools.
Ramon (ram130-gmail) wrote : | #150 |
That's what I had used to find out I had a bad sector in the first place. I
tried running the short test and the other one that wasn't extended. All
failed at about 90%, before they could complete. I tried again a while ago
and it stopped running after 10 seconds, saying it cannot read. I just ran a
memtest; it's clean. Windows 7 finally booted, but it says it cannot access
my desktop and shows a lot of errors now, something about read errors and
corrupted files. It's just problem after problem. I guess everyone else here
is better off than me right now?
On Nov 12, 2009 3:20 AM, "unggnu" <email address hidden> wrote:
The tool to check S.M.A.R.T in Karmic is called "Disk Utility" (gnome-
disk-utility) and it is also possible before with the smartmontools.
unggnu (unggnu) wrote : | #151 |
So it looks like your hard disk is defective if even the SMART tests fail. Only the extended test checks every sector, so it should be preferred. If the hardware is failing, this has nothing to do with this bug.
Nicky (nickygillette) wrote : | #152 |
The differing checksums affect me too. I'm using the final release version of 9.10 on a laptop with an Intel Celeron.
I did RAM tests with a live copy of 8.04, for many hours with no errors, so I don't think it's bad memory.
8.04 works (with default ext3), 8.04 alt w/ full disk encryption works, also with normal checksums.
9.10 gives errors (with default ext4), w/ home directory encryption, I also get bad checksums.
9.10 live CD also gives me errors when I haven't even installed it when I use:
sudo shred -xvfz -n0 /dev/sdb # A USB flash disk I wanted to wipe
It failed at 4.7GB and started saying something about sync errors.
This same file operation works fine with the 8.04 Live CD.
I wonder if it's not just an ext4 problem, but a problem in the way 9.10 handles large files.
Nicky (nickygillette) wrote : | #153 |
I meant that it happens with or without encryption on 9.10 in the comment above, if it was unclear.
Ramon (ram130-gmail) wrote : | #154 |
I just installed Karmic on a 1TB SATA drive and on my 16GB flash drive. Let's
see how it goes! Both are ext4. If it lets me down, then you all have some
serious problems.
On Wed, Nov 18, 2009 at 1:13 AM, Nicky Gillette <email address hidden>wrote:
> I meant that it happens with or without encryption on 9.10 in the
> comment above, if it was unclear.
Ramon (ram130-gmail) wrote : | #155 |
And yes, I am doing large transfers of files over 16GB. How do you check
the checksums?
On Wed, Nov 18, 2009 at 2:50 AM, Ram'on McNally <email address hidden> wrote:
> i just installed karmic on a 1TB sata....and my flash drive 16GB...lets
> see how it goes! both EXT4..if it lets me down, then yall got sum serious
> problems..
>
>
> On Wed, Nov 18, 2009 at 1:13 AM, Nicky Gillette <email address hidden>wrote:
>
>> I meant that it happens with or without encryption on 9.10 in the
>> comment above, if it was unclear.
Starcraftmazter (starcraftmazter) wrote : | #156 |
@Ramon
md5sum filename
or
sha256sum filename
Jens Janssen (jayjay) wrote : | #157 |
for checking whole drive and a lot of files I use md5deep:
md5deep -lrk ./* > data.md5
sort -k 2 data.md5 > data.md5.sort
diff data.md5.sort data.md5.sort.old
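The same before/after comparison can be sketched in Python for anyone without md5deep installed. This is only an illustration, not part of md5deep: the function names `build_manifest` and `changed_files` are made up here, and it hashes with the standard `hashlib` module.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks so large files are fine."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its MD5 digest."""
    manifest = {}
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(directory, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def changed_files(old_manifest, new_manifest):
    """Return sorted paths present in both manifests whose digests differ."""
    return sorted(
        path
        for path, digest in new_manifest.items()
        if path in old_manifest and old_manifest[path] != digest
    )
```

Build one manifest, keep it, build another later and pass both to `changed_files`. A file that changed without anyone writing to it is a candidate for the corruption discussed here.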
Phương Võ (clarious) wrote : | #158 |
I don't know if this is related or not: from a conversation I had before, ext4 divides files into chunks with power-of-two sizes to prevent long-term free space fragmentation, so an 800 MB file would be written as a 512 MB chunk, then a 256 MB chunk, and so on...
Scott James Remnant (Canonical) (canonical-scott) wrote : | #159 |
I've not been able to reproduce this with the most recent kernel packages
Scott James Remnant (Canonical) (canonical-scott) wrote : | #160 |
well, and andy kernel
Donald Ray Crocker Jr. (dcrockerjr) wrote : | #161 |
i wonder if some of the problems people are experiencing are due to a documentation bug.
http://
makes reference to the ext4 wiki
http://
the ext4 wiki under "For people who are running Ubuntu " recommends modified util-linux packages from
ftp://ftp.
or
ftp://ftp.
Since those referenced files date back to 07/17/2008, it seems like some of the problems could come from people installing old packages.
Mackenzie Morgan (maco.m) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #162 |
I have not installed those packages, and I really doubt Scott has either since
I think he maintains them.
Oliver Seemann (os-oebs) wrote : | #163 |
I believe I might have hit this bug. I copied a 3GB iso file from NFS to a local EXT4 partition and noticed that the sha1sum is off (I only checked because the burned dvd behaved strangely). I copied the file again and then it got the correct sum.
I still have both files and will keep them for a while in case they can be of help in analyzing the issue.
I just finished a full memtest86 run and it passed fine.
Some more info:
- Upgraded from jaunty, the fs was created as ext4 by jaunty
- Kernel: 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:05:01 UTC 2009 x86_64 GNU/Linux
- SATA hdd, no SSD
- Nothing related in dmesg
- Booted from karmic live disc and ran e2fsck /dev/sda1, no errors found.
- ('e2fsck -n /dev/sda1' on mounted fs does report errors, but I assume that is because it is mounted?)
Later I updated the kernel and I have 2.6.31-15-generic #50 running now and copied a number of 3gb isos again. Now again one of the 4 files has an incorrect hash. So this update did not fix the bug, but I did not see anything related in the change log anyway.
Scott, what kernel versions are you referring to, that you cannot reproduce this anymore?
Let me know if I can provide any further information.
cybernet (cybernet2u) wrote : | #164 |
Was the bug solved?
Desh Danz (nicoluno) wrote : | #165 |
I'd like to know it too.....
Ramon (ram130-gmail) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #166 |
i believe not...some recent updates must have corrected the problem. You
guys experiencing any corruption lately?
On Sat, Dec 12, 2009 at 7:41 AM, Desh Danz <email address hidden> wrote:
> I'd like to know it too.....
>
tags: | added: 2.6.31.8 |
Leonardo Montecchi (lmontecchi) wrote : | #167 |
I have not encountered this bug so far, with the latest Karmic updates.
I have tried to reproduce it, making about 20 copies of a file which is ~2 GB, but all the md5sums matched perfectly. I have also tried with some copies of a ~3.5 GB file, with same results.
Leonardo Montecchi (lmontecchi) wrote : | #168 |
I forgot to mention that I'm using Ubuntu 64bit and that my ext4 partitions were created under Jaunty
Goffi (goffi) wrote : | #169 |
I experience the same problem. When I copy files on the local disk (I made a quick script to copy a file 100 times and check the md5sum) everything is fine. But I have quite often had corrupted files (bad md5) for a while now, and memcheck is OK.
I run karmic with 2.6.31-15-generic kernel, my data partition is an encrypted ext4 one.
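A copy-and-check loop like the one described above could look roughly like this in Python. The actual script was not posted, so this is a guess at its shape; the name `copy_and_verify` and the `copy_NNN.bin` file names are invented for the sketch.

```python
import hashlib
import os
import shutil


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, work_dir, rounds=100):
    """Copy source into work_dir `rounds` times.

    Returns the list of round numbers whose copy hashed differently
    from the source. Each copy is deleted after it has been checked.
    """
    expected = md5_of_file(source)
    mismatches = []
    for round_number in range(rounds):
        target = os.path.join(work_dir, "copy_%03d.bin" % round_number)
        shutil.copyfile(source, target)
        if md5_of_file(target) != expected:
            mismatches.append(round_number)
        os.remove(target)
    return mismatches
```

One caveat: hashing a file straight after writing it will usually be served from the page cache, so a clean result here does not prove that the data on disk is correct.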
Goffi (goffi) wrote : | #170 |
I forgot: fsck is OK, and my corrupted files come from network
Ernst (ernst-blaauw) wrote : | #171 |
Goffi, I think your problem is related to the network part. I have a bug
report about file corruption using samba:
File corruption after copying files via samba from Karmic to Karmic
https:/
Maybe your problem is similar?
On Wed, Dec 23, 2009 at 15:26, Goffi <email address hidden> wrote:
> I forgot: fsck is OK, and my corrupted files come from network
>
Goffi (goffi) wrote : | #172 |
Ernst> yes probably, but I don't use Samba at all (I only have my gnu/linux netbook here), and as the problem described here is really similar to mine, I was wondering whether ext4 might be involved (maybe something different happens when files come from the network?).
Michael Lazarev (milaz) wrote : | #173 |
@Goffi: so, all files in which you noticed corruption, come from
network? If not Samba, how do you actually get them? Rsync? Torrent? I
believe these details could help investigate the problem.
Goffi (goffi) wrote : | #174 |
I got them by downloading through wget, firefox and chromium (no error during download).
I had issues for example with qtmoko (~ 90 Mb, I had to download it 3 times from 3 different servers before getting the right MD5), or the navit Australian map from cloudmade.com (~45 Mb, downloaded 3 times from the same server, each time with a different md5 and a corrupted zip checksum; I finally wrote a small python script to build a clean file from the 3 corrupted ones).
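The script itself was not posted. One plausible way to rebuild a clean file from three damaged downloads is a byte-wise majority vote, which works as long as no two copies are corrupted at the same offset. The sketch below assumes that approach; `majority_vote` is a name made up for illustration.

```python
from collections import Counter


def majority_vote(copies):
    """Rebuild data from several equal-length byte strings.

    For every offset, keep the byte value that more than half of the
    copies agree on. Raises ValueError if the lengths differ or if some
    offset has no strict majority.
    """
    if len({len(copy) for copy in copies}) != 1:
        raise ValueError("all copies must have the same length")
    repaired = bytearray(copies[0])
    for offset, column in enumerate(zip(*copies)):
        value, count = Counter(column).most_common(1)[0]
        if count * 2 <= len(copies):
            raise ValueError("no majority at offset %d" % offset)
        repaired[offset] = value
    return bytes(repaired)
```

Pure Python is slow for files of tens of megabytes, so in practice one would only vote on the offsets that `cmp -l` reports as different. The result should still be checked against a published checksum.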
Goffi (goffi) wrote : | #175 |
By the way, I made a cmp of the qtmoko files, I have 2 bytes which differ:
% md5sum qtmoko-debian-v15*
5381503d377dc27
80d61c5c70f982f
b35355cccc3e93a
% cmp -l qtmoko-
16070033 64 164
16070039 257 255
% cmp -l qtmoko-
12701898 0 100
12701904 47 45
and for the navit map (the first one is the file I got with my script):
% md5sum australia.
bb13d0594e67cbb
1d365eecbb2d7df
73d37d697826735
f5987b008cc9b51
% cmp -l australia.
4113279 316 336
4113281 320 331
4113282 5 7
4113287 220 260
4113291 206 205
4113292 174 14
8754187 154 54
8754193 371 373
% cmp -l australia.
11878799 121 21
11878801 312 172
11878803 340 200
% cmp -l australia.
29183695 341 101
29183699 327 365
Michael Lazarev (milaz) wrote : | #176 |
I tried to reproduce this bug with australia.
# I got the first copy with firefox, and the second with wget
> wget http://
> diff -bq australia.
> md5sum australia.
77fe45bf71779e9
77fe45bf71779e9
By the way, if you downloaded that file after 15 December 2009, md5sum
should be like above.
Jakob Unterwurzacher (jakobunt) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #177 |
Am 2009-12-24 01:17, schrieb Goffi:
> By the way, I made a cmp of the qtmoko files, I have 2 bytes which
> differ:
Excellent! Scott, could you also post a cmp -l of a corrupted vs a good
file?
Now, let's have some binary... (Note that cmp -l output is octal)
% cmp -l qtmoko-
16070033 64 164
16070039 257 255
064 = 00110100
164 = 01110100
257 = 10101111
255 = 10101101
% cmp -l qtmoko-
12701898 0 100
12701904 47 45
000 = 00000000
100 = 01000000
047 = 00100111
045 = 00100101
You have single bit flips on the second and on the seventh bit.
This looks so much like broken memory it virtually has to be broken memory.
OTOH, if everybody sees the second and the seventh bit flip then
probably ext4 is doing something very stupid. Again, please post your
cmp -l results!
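The octal-to-binary comparison above can be automated. `cmp -l` prints a decimal byte offset followed by the two differing byte values in octal, so XOR-ing the two values shows which bits changed. A small Python sketch (the helper name `flipped_bits` is invented here; bits are counted from the most significant one, as in the comment above):

```python
def flipped_bits(cmp_line):
    """Parse one line of `cmp -l` output.

    The line holds a decimal byte offset and two octal byte values.
    Returns (offset, positions), where positions lists the bits that
    differ, counting bit 1 as the most significant bit of the byte.
    """
    offset_text, first_text, second_text = cmp_line.split()
    difference = int(first_text, 8) ^ int(second_text, 8)
    positions = [bit for bit in range(1, 9) if difference & (1 << (8 - bit))]
    return int(offset_text), positions


if __name__ == "__main__":
    for line in ["16070033 64 164", "16070039 257 255",
                 "12701898 0 100", "12701904 47 45"]:
        print(flipped_bits(line))
```

A list with exactly one position means a single-bit flip at that offset. Several positions in one byte, or long runs of differing bytes, point to something other than isolated bit errors.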
Goffi (goffi) wrote : | #178 |
Well, in my case I can exonerate ext4: I made two new downloads of the australia map, one on my data partition (encrypted ext4), and one on my root partition (ext3), and both were corrupted :(.
Michael Lazarev> I downloaded my first map before 15, and I think the version I obtained with my script is OK as I have no CRC check error when unzipping, and it works fine with navit on my freerunner.
Jakob Unterwurzacher> yeah, that sounds like memory corruption, but I ran memtest (it was my first reaction) for 3 hours and it told me that my memory is ok. I will try to run it again during the night. But I don't understand why I have this issue when I download, but (apparently) not on my local disk. Memory corruption would affect my whole system, wouldn't it?
Goffi (goffi) wrote : | #179 |
I ran memtest86+ for 9 hours, 10 pass, and still no error...
Goffi (goffi) wrote : | #180 |
My issue seems to be related not to ext4 or my memory but to my swap. I tried downloading the same 30 Mb file twice without swap, and this time both copies had the same md5.
In addition, I tried to fill my swap partition with zeros, and I have an error:
% sudo dd if=/dev/zero of=/dev/sda5 bs=1024
22293+0 records in
22293+0 records out
22828032 bytes (23 MB) copied, 16.7581 s, 1.4 MB/s
1331109+0 records in
1331109+0 records out
1363055616 bytes (1.4 GB) copied, 254.542 s, 5.4 MB/s
dd: writing `/dev/sda5': Input/output error
2931829+0 records in
2931828+0 records out
3002191872 bytes (3.0 GB) copied, 471.766 s, 6.4 MB/s
zsh: exit 1 sudo dd if=/dev/zero of=/dev/sda5 bs=1024
Is there any check done on swap partition ? Can the kernel detect errors on it ? Is there a way to avoid bad clusters with swap partitions ?
I had also an issue (scrambled screen when booting) which disappeared but I can't be sure it was solved by the swap deactivation, as I tried several things at the same time (replacing kdm by gdm, removing splash at boot, and maybe an upgrade solved the problem).
Goffi (goffi) wrote : | #181 |
The tests on my partitions don't seem to find any problem:
% sudo smartctl -l selftest /dev/sda5
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1790
% sudo badblocks -sv /dev/sda5
Checking blocks 0 to 2931830
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
Z149 (graphics149) wrote : | #182 |
I am also having problems possibly ext4 related.
After clearing up some big files, the 'free space' reported in my ext4 / filesystem did not increase.
Emptying the wastebasket and sensible checks have found no cause.
New files added and then deleted and emptied from the wastebasket used up some of my last precious free space and I've not got that space back.
'big' files includes a couple of zipped backups of 0.1 and 0.2 GB, and a monstrous 3.1GB archive.
Nothing ordinary cleared the space. fsck did not help.
could it be ext4?
============
Desktop Ubuntu 9 with linux 2.6.31-17
40GB IDE sata drive
Michael Lazarev (milaz) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #183 |
@Z149: try "Applications-
also can be run from command line as "baobab". Push "Scan Filesystem"
button to see where the space goes to.
TragicWarrior (bryan-christ) wrote : | #184 |
I have experienced data corruption on 2 different systems using ext4 on flash media. One of the drives was an Intel SSD drive and the other was a SanDisk Cruzer USB flash drive. I reproduced the problem several times with both of these drives on two different hardware systems. Here's how I reproduced the problem:
1. Install Ubuntu 9.10 32-bit on USB flash drive...
- /boot ext2 500MB (primary)
- swap 500MB (primary)
- / ext4 rest-of-drive (primary)
2. Install latest updates with Update Manager.
3. Reboot and observe corruption.
I have repeated a similar experiment on Fedora 12 with no file-system corruption.
Changed in ubuntu-release-notes: | |
status: | Fix Released → Fix Committed |
status: | Fix Committed → Fix Released |
description: | updated |
Øyvind Stegard (oyvindstegard) wrote : | #185 |
There's a fair amount of ext4-fixes in the latest 2.6.31-18.55-kernel in karmic-proposed, according to the changelog. I suppose it would be worth testing with that kernel for the people who experience this bug.
Changed in ubuntu-release-notes: | |
status: | Fix Released → Incomplete |
Steve Langasek (vorlon) wrote : | #186 |
This is not incomplete. The issue is documented in the release notes.
Changed in ubuntu-release-notes: | |
status: | Incomplete → Fix Released |
TragicWarrior (bryan-christ) wrote : | #187 |
This is an unfortunate chicken-and-egg scenario. Assuming the latest kernel in karmic-proposed does fix the problem, how does one safely upgrade their system since there is a likelihood the very update itself will get corrupted? The only certain solution would be to (gasp) re-master the Karmic ISO images with a point-release so that fresh installs are guaranteed usable.
Ryan C. Underwood (nemesis-icequake) wrote : Re: [Bug 453579] Re: in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4 | #188 |
Why would the kernel update get corrupted unless the archive or any of
the files it contains are several hundred megabytes in size?
--
Ryan C. Underwood, <email address hidden>
TragicWarrior (bryan-christ) wrote : | #189 |
Ryan,
I believe the large file aspect of the bug is an incorrect characterization. If you take a look at comment #184, you will see that I have reproduced the bug on much smaller files.
Ryan C. Underwood (nemesis-icequake) wrote : | #190 |
You did not say anything about reproducing the bug on smaller files. To my knowledge this would be the first report of a file smaller than 100MB being corrupted by this bug.
Steve Langasek (vorlon) wrote : | #191 |
On Wed, Jan 27, 2010 at 07:49:06PM -0000, TragicWarrior wrote:
> I believe the large file aspect of the bug is an incorrect
> characterization. If you take a look at comment #184, you will see that
> I have reproduced the bug on much smaller files.
No, you have reproduced some *other* corruption problem that doesn't fit the
profile of the original bug report. Please file a separate bug.
--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://
<email address hidden> <email address hidden>
TragicWarrior (bryan-christ) wrote : | #192 |
Steve,
The original posting mentions files that are 512MB (comment #53). Later it is assessed at 300MB (comment #89). Then it was whittled down to 120MB (comment #143). Then it went to 45MB (comment #174). I don't think it would be ideal to open a new bug since so much data has been captured here. Why not just re-characterize the bug to match the collected data? In my case, the largest file that I think would have come down in Update Manager would be OpenOffice ~100MB.
Steve Langasek (vorlon) wrote : | #193 |
On Wed, Jan 27, 2010 at 08:29:43PM -0000, TragicWarrior wrote:
> Why not just re-characterize the bug to match the collected data?
Because the data is not related to the bug that was reported, and it's not
appropriate to hijack bug reports for unrelated issues.
TragicWarrior (bryan-christ) wrote : | #194 |
Steve, I would hardly call changing the description
from:
"in-place corruption of large files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4"
to:
"in-place corruption of files *without fsck or reboot* reported with linux 2.6.31-14.46 on ext4"
hijacking. It's not as if we are talking about night and day here. In this case, the original reporter simply didn't know the problem also manifests on smaller files (< 512MB). Perhaps it is easier to reproduce on larger files, but the evidence now shows that it is a problem on files of 45+ MB.
Roland (roland1979) wrote : | #195 |
I can confirm this bug with current karmic kernel:
2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 17:01:44 UTC 2009 x86_64 GNU/Linux
Steps to reproduce:
Download same file with 2 sources in parallel. I took Opera, and wget.
wget http://
Opera saved to http://
Results:
roland@pdbxe100:~$ md5sum ubuntu-
8790491bfa9d00f
roland@pdbxe100:~$ md5sum ubuntu-
3f979c279665cc7
roland@pdbxe100:~$ ls -l ubuntu-
-rw-r--r-- 1 roland roland 723488768 2009-10-28 22:14 ubuntu-
-rw-r--r-- 1 roland roland 723488768 2010-01-28 15:35 ubuntu-
roland@pdbxe100:~$
Using cmp I found that there were NO differences?!
roland@pdbxe100:~$ cmp ubuntu-
I wondered, and compared again via md5sum:
roland@pdbxe100:~$ md5sum ubuntu-
8790491bfa9d00f
roland@pdbxe100:~$ md5sum ubuntu-
8790491bfa9d00f
roland@pdbxe100:~$
So after accessing the files a second time, they seemed to have synced, flushed after delay .. or whatever.
These are my ext4 flags:
has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
I created the filesystem manually via mkfs.ext4 /dev/sda4.
Øyvind Stegard (oyvindstegard) wrote : | #196 |
I've yet to see any feedback about the 2.6.31-18 kernel
(karmic-proposed) in this critical bug report, and I find that rather
strange. The proposed -18-kernel has been out for a while now and I count
80+ ext4-fixes in the changelog, including a fix for a data corruption
scenario.
https:/
http://
stable release from which 2.6.31-18 has patches)
https:/
Oliver Seemann (os-oebs) wrote : | #197 |
I believe I must withdraw my bug report. I have a test case that reproduces the problem, but it does not seem to be related to ext4, as it turned out today.
I read about 2.6.31-18 here and updated yesterday. But also with the new kernel I could reproduce the problem. Wondering about that I created an XFS partition and repeated the test on it ... also positive. So my problem is somewhere else.
The test is copying 4 big files totaling 11gb via nfs from an old Dapper box to a local partition. One of the files always ended up with a mismatching sha1 sum. The "cmp -l" output is always a single contiguous 128 byte block at some random offset. The values don't seem to be affected by single bit flips (110010 -> 101010, 11111101 -> 11010101, 10100 -> 101010, 1110 -> 11010111, 11110001 -> 11010010). Memtest86 also ran fine, at least on this box, I did not yet test the Dapper one.
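To tell a contiguous damaged block like this apart from scattered single-bit flips, the differing offsets can be grouped into runs. A rough Python sketch, operating on two byte strings already read into memory (for multi-gigabyte files one would read matching chunks instead); `differing_runs` is a hypothetical helper name:

```python
def differing_runs(first, second):
    """Return (start_offset, length) for each contiguous run of differing bytes.

    Offsets are 0-based. Only the common prefix of the two inputs is
    compared if their lengths differ.
    """
    runs = []
    run_start = None
    for offset, (a, b) in enumerate(zip(first, second)):
        if a != b:
            if run_start is None:
                run_start = offset
        elif run_start is not None:
            runs.append((run_start, offset - run_start))
            run_start = None
    if run_start is not None:
        runs.append((run_start, min(len(first), len(second)) - run_start))
    return runs
```

A single `(offset, 128)` entry would match the pattern described above, whereas many runs of length 1 would look more like the bit flips seen in the earlier `cmp -l` output.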
Changed in linux (Ubuntu): | |
assignee: | nobody → Surbhi Palande (csurbhi) |
Surbhi Palande (csurbhi) wrote : | #198 |
@scott, do you still see this bug ? I tested this by doing both an upgrade and a fresh install + updates and did not seem to run into it. The md5sum works just fine. If this is still a problem, then I will post a debug kernel if you are willing to try ?
Surbhi Palande (csurbhi) wrote : | #199 |
Can anyone else confirm that this is still a bug in Karmic which is reproducible by the following steps mentioned in the original report:
1) download an iso
2) compare the md5sum
Thanks !
Surbhi Palande (csurbhi) wrote : | #200 |
Also, the result of this quick test from anyone who sees this bug, would be appreciated. If you have a ext3 fs/any other fs on some partition(or a sufficiently large file which is formated as a fs other than ext4) then please do the following:
A) ensure that your blocksize is 4096 bytes by looking at the output of dumpe2fs -h <partition which has ext4>
B) from the same output see if you can find "extent" in the line which has "Filesystem features"
C) post the output dumpe2fs -h <partition which has ext4>
if blocksize is 4096 bytes then:
1) download the iso on this ext3/other filesystem
2) dd if=<iso name> of=/dev/<ext*4* partition>/<some file name> bs=512MB count=1
3) the md5sum should be: faf49ac5a653e33
(Note that bs=512MB writes 512000000 bytes... if you write 536870912 bytes (i.e 4096 * 131072) then the md5sum should be this:
bcbc14f5bfc9229
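For reference, the checksum of just the first N bytes of the iso can also be computed without dd. The Python sketch below is an illustration only; `md5_of_prefix` and the two constants are names made up here. The constants spell out the two sizes mentioned above: dd's `MB` suffix means 1000*1000 bytes, while 4096 * 131072 is 512 MiB.

```python
import hashlib

DD_512MB = 512 * 1000 * 1000   # bs=512MB count=1 -> 512000000 bytes
DD_512MIB = 4096 * 131072      # 536870912 bytes (512 MiB)


def md5_of_prefix(path, byte_count, chunk_size=1 << 20):
    """Return the hex MD5 of the first byte_count bytes of a file.

    If the file is shorter than byte_count, the whole file is hashed.
    """
    digest = hashlib.md5()
    remaining = byte_count
    with open(path, "rb") as handle:
        while remaining > 0:
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

Running `md5_of_prefix` with `DD_512MB` on the original iso and on the file written to the ext4 partition should give the same digest if the copy is intact.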
Surbhi Palande (csurbhi) wrote : | #201 |
Also forgot to mention that the above comments apply for the following iso image:
http://
8790491bfa9d00f
Jordan (jordanu) wrote : | #202 |
**WARNING** Do not run the dd command in comment #200 **WARNING**
The command should read "dd if=<iso name> of=/mountpoint/
Pointing of= to anything in /dev is wrong, and you should always be very careful when using dd. Though unlikely, trying to follow the instructions in comment #200 as currently written could lead you to accidentally overwrite the beginning of your ext4 partition with the contents of the iso, making all of the files on that partition difficult to recover, and overwriting many of them permanently.
Surbhi Palande (csurbhi) wrote : | #203 |
Please use the following safer command to dd:
dd if=<iso name> of=/<mount-
Do avoid using the /dev path given in #200, as Jordan pointed out.
Thanks Jordan :)
Surbhi Palande (csurbhi) wrote : | #204 |
@TragicWarrior, can you let me know if you encounter the bug with an iso image ? Also are you still encountering the bug of a corrupted update on a (i assume safe) reboot ?
Miklos Juhasz (mjuhasz) wrote : | #205 |
I have downloaded the iso and calculated the checksums with the current (2.6.31-19) and the proposed kernel (2.6.31-20) as well. Both of them matched.
$ wget http://
$ md5sum ubuntu-
8790491bfa9d00f
Scott James Remnant (Canonical) (canonical-scott) wrote : | #206 |
I'm going to mark this bug as Invalid (I'm the original reporter)
I've not been able to replicate it on production hardware, and not been able to replicate it on the hardware where I was originally able to replicate it with karmic as it existed at release time.
Therefore I can only conclude that the problem was with faulty hardware, exacerbated by a kernel issue that was fixed before karmic was released.
If you are a user still experiencing problems with the ext4 (or any other) filesystem, including those resulting in fsck errors, then you don't have the same bug that I reported so should report a new bug. Don't open this one unless you've snuck into my house and stolen my laptop <g>
Changed in linux (Ubuntu): | |
status: | Triaged → Invalid |
Changed in linux (Ubuntu Karmic): | |
status: | Triaged → Invalid |
Changed in linux: | |
status: | New → Invalid |
MillenniumBug (millenniumbug) wrote : | #207 |
So the warning should be removed from the Release Notes...?
http://
MillenniumBug (millenniumbug) wrote : | #208 |
Seems to have been removed. Thankyou, someone.
Changed in ubuntu-release-notes: | |
status: | Fix Released → Incomplete |
Changed in linux (Ubuntu Karmic): | |
status: | Invalid → Incomplete |
Changed in linux (Ubuntu): | |
status: | Invalid → Confirmed |
Changed in linux (Ubuntu Karmic): | |
status: | Incomplete → Confirmed |
Changed in linux (Ubuntu): | |
status: | Confirmed → Incomplete |
Steve Langasek (vorlon) wrote : | #209 |
do not change the status of this bug.
Changed in ubuntu-release-notes: | |
status: | Incomplete → Fix Released |
Changed in linux (Ubuntu): | |
status: | Incomplete → Invalid |
Changed in linux (Ubuntu Karmic): | |
status: | Confirmed → Invalid |
DjznBR (djzn-br) wrote : | #210 |
* * * I JUST HIT THIS BUG * * *
Yes, I just did it...
I have bought a new SEAGATE HDD, part number ST3500418AS. Formatted as ext4, with / (40GB), swap (5GB), /home (220GB) and ntfs (220GB).
I installed ubuntu 10.04 and installed all updates.
Then I downloaded the ISO for 10.04.1 via Transmission bit-torrent client.
I burned the CD with Brasero.
Upon installation, it stuck in "Ubuntu" screen. Told it to check the CD and there were errors.
To my surprise, the ISO file MD5SUM was mismatching in the ext4 partition.
Then I turned on Transmission again and made it "RECHECK" the file, the file got the correct MD5SUM.
I believe I have hit this bug just now, because I was using ext3 for my home partition on my previous HDD, and ext4 only for the root partition. Now, problems have arisen a couple of minutes JUST AFTER an ISO download on a freshly formatted ext4.
I would consider not marking this bug as invalid.
papukaija (papukaija) wrote : | #211 |
@DjznBR: No, you haven't reproduced this bug. Lucid is using 2.6.32 while this bug is about kernel 2.6.31. In addition, did you read comment 206, especially this: "Therefore I can only conclude that the problem was with faulty hardware, exacerbated by a kernel issue that was fixed before karmic was released."? Please open a new bug for your issue.
DjznBR (djzn-br) wrote : | #212 |
I believe the bug title has been changed once or twice, but let me re-quote here what Scott reported:
"There are worrying reports of filesystem corruption on ext4 in karmic. Scott says:
12:36 < Keybuk> this whole ext4 thing is worrying me
12:36 < Keybuk> I just downloaded an iso image, md5sum didn't match
12:36 < Keybuk> downloaded it into an ext3 partition, matched just fine
12:59 < Keybuk> and I know mvo has seen bugs with corrupted .debs in /var/cache/
12:59 < Keybuk> which seems to imply its any file large enough to use lots of extents"
Well, that's exactly what happened on a fresh Lucid install, using ext4 partition.
It may be neither an issue with ext4 itself, nor an issue with kernel version or patch.
I think this is related to the "Transmission" application, because reports are that the corruption takes place when torrents are downloaded, and this is exactly what happened. In some way it may be that Transmission is not handling ext4 well. And it's very subtle, since a "file recheck" on finished torrents may just reconstruct the proper MD5SUM.
Mackenzie Morgan (maco.m) wrote : | #213 |
It's not Transmission's fault. I'm a KDE user (so, I use KTorrent),
and I was affected back when this bug was filed (no problems since
though).
Ben Lau (benlau) wrote : | #214 |
The bug should also affect 10.04. I have a fresh install 10.04 AMD64 (with data copy from old harddisk , stored on /home) . The result of md5sum on ubuntu-
$ md5sum ubuntu-
adbe2aa291535c9
$ md5sum ubuntu-
735b22e87a77e5c
$ md5sum ubuntu-
9fb810608e96ba3
$ uname -a
Linux benlau-desktop 2.6.32-24-generic #41-Ubuntu SMP Thu Aug 19 01:38:40 UTC 2010 x86_64 GNU/Linux
papukaija (papukaija) wrote : | #215 |
@Ben (and everyone else who thinks that he/she has reproduced this bug): No, you haven't reproduced this bug. Lucid is using 2.6.32 while this bug is about kernel 2.6.31. In addition, did you read comment 206, especially this: "Therefore I can only conclude that the problem was with faulty hardware, exacerbated by a kernel issue that was fixed before karmic was released."? Please open a new bug for your issue.
papukaija: could you please update the bug description to point to pertinent bugs for other kernel versions? I'm seeing what I suspect to be ext4 corruption on multi-CPU systems (I think all amd64) on various kernels, on both small and large files. Where and how should I report this? So far, this bug seems the closest match.
Other ext4 bugs I have looked at: bug #438379 (pretty exclusively about suspend/resume problem), bug #317781 (seems to focus on 0-byte files; certainly seems closer to what I am looking at in bug #582341).
papukaija (papukaija) wrote : | #217 |
@era: Unfortunately I won't edit this bug's title nor reopen it due to the reasons mentioned in comment 215, which refers to comment 206. You should report your issue to Launchpad against the linux package. You can do so by running the following command from a Terminal (Applications-
ubuntu-bug linux
Please try to provide as much information as possible in the bug description:
1) The majority of kernel bugs are hardware specific so be sure to note what hardware/device is being used.
2) Document any known steps to reproduce the bug.
3) Also note whether the bug exists in previous kernel versions of Ubuntu or if it's a regression from previous kernel versions.
4) Finally, it will help if you can test the latest development Ubuntu kernel version as well as the latest upstream mainline kernel[1].
More detailed instructions to file a bug are available at: https:/
[1]: https:/
Thanks in advance.
Emanem (em4n3m) wrote : | #218 |
I think I'm suffering from the same issue.
I was archiving my home in 1 tar file from 1 disk to another (all EXT4) and then it got stuck. I had to restart the computer and eventually it proceeded, but I have to say, after I copy large files, the chances I can't read/open other large files are high.
Basically as long as I don't copy/manipulate large files I don't have particular issues; as soon as I try to do such operations I have to restart my pc.
I'm using:
2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
I have to say I have a pretty vanilla Ubuntu, no customization. I'm thinking about Ext4 issue because 6 months ago all my disks were Ext2 and never had an issue, but now looks like an issue after another.
Did a memcheck and it seems definitely ok.
Unfortunately it's very hard to reproduce systematically.
Cheers
There are several open bug reports upstream regarding ext4 corruption, but it's not clear which, if any, are related to the problems being observed.
http://bugzilla.kernel.org/show_bug.cgi?id=14354 is one bug that appears to be linked to the use of the DM layer - if you're following up to this bug report, please indicate whether your ext4 fs is sitting on top of a dm-crypt, LVM, or RAID device.
That bug also mentions using auto_da_alloc=0 as a boot option to work around; we should check whether that boot option makes a difference for users seeing this bug.