Bug #330824 “Soft lockups (freezes) when deleting files from ext...” : Bugs : linux package : Ubuntu

Revision history for this message

Martin Vysny (vyzivus) wrote on 2009-02-27:

#1

I have exactly the same problem with 2.6.28-8.26. The problem started to appear only recently (2-4 days ago). It manifests only when deleting files; it never triggers when adding files. It occurs regardless of whether X is running. Interestingly, the problem occurs on the 32-bit kernel only; 64-bit 2.6.28-8.26 does not seem to be affected.

Michał Zając (quintasan) wrote on 2009-02-28:

#2

I've encountered it more than 7 times (today 3 times).
The first time, it happened while moving my /home (4GB) to /mnt/Data; I had to reset the computer and lost some data (not very important, thankfully). Today I tried to clean the pbuilder environment with "ARCH=amd64 DIST=jaunty sudo pbuilder --clean", and after restarting my .kde directory was gone.

It seems the freeze occurs when moving or deleting big portions of data. Can anyone else confirm it?

Linux nightwalker 2.6.28-8-generic #26-Ubuntu SMP Wed Feb 25 04:27:53 UTC 2009 x86_64 GNU/Linux

dnyaga (daniel-nyaga) wrote on 2009-03-14:

#3

I had experienced the same, and reported it at https://bugs.launchpad.net/ubuntu/+bug/334581. I have had to hard reset my computer four times today.

The circumstances were the same all 4 times: I was copying large directories between different ext4 partitions (using nautilus) when the system locked up. The directories in question have tens of thousands of small sized files. I am going to mark bug 334581 as a duplicate of this one so that we can focus our discussion and testing on one bug report.

dnyaga (daniel-nyaga) wrote on 2009-03-14:

#4

Alarming frequency of kernel freezes when working with directories that have lots of tiny files: see https://bugs.launchpad.net/ubuntu/+source/subversion/+bug/342164. That bug reporter's system froze and was hard reset, and ext4 had not written the newest file to disk.

Question: what is causing all these freezes?

Changed in linux:
status:	New → Confirmed

Agent N2O (agentn2o) wrote on 2009-03-15:

#5

I have experienced something similar to the first poster: I installed Ubuntu 9.04 alpha 5 last week on a newly formatted ext4 partition. While setting the system up, I was installing the latest package updates but kept running into an error saying the drive was full (it was actually at 20% of 160 GB). I tried moving and deleting files off the drive; nothing worked. Eventually a reboot solved this, but I don't know why.

Upgraded to kernel 2.6.28-9 (alpha 6) on Friday. This weekend I went about converting 2 x 1 TB data drives to ext4 (from ext3), and all went well initially, but I wanted to get the full extent (no pun intended) of the ext4 file structure, so I was cutting and pasting data back and forth between the drives using nautilus, and the OS kept freezing. Eventually I figured out that copying and pasting was fine but deleting was the culprit. I tried deleting in nautilus and that hung the OS. Tried in a terminal, same thing. I booted into recovery mode, dropped to the root prompt, went about deleting these files, and got a series of these: "BUG: soft lockup - CPU#0 stuck for 61s!"

In the end I managed to completely clear one drive off so I reformatted it and then transferred everything back and then reformatted the other. Now both TB drives have "native" ext4 partitions and I can delete from those drives without hangs or freezes.

dnyaga (daniel-nyaga) wrote on 2009-03-15:

#6

From Agent N2O's comments above, it appears that the freezing occurs where ext3 partitions were converted to ext4 partitions. I have 3 converted ext4 partitions and one fresh/new one. I will try to test that theory a little tonight.

To the other reporters: were your ext4 partitions new or converted?

dnyaga (daniel-nyaga) wrote on 2009-03-15:

#7

Same behavior independently reported here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/340628

The reporter of bug 340628 provided a stack trace.

Agent N2O (agentn2o) wrote on 2009-03-15:

#8

Forgot to mention that the freezing didn't happen ALL the time with the ext4 deletes. When I was cutting and pasting it would get a few mins in and freeze, and elsewise I was able to delete some files but others made it freeze. I suspect it may have been LARGE files but I do not have solid proof of that.

Brian J. Murrell (brian-interlinx) wrote on 2009-03-15: Re: [Bug 330824] Re: ext4 or 2.6.28 is completely freeze my system

#9

On Sun, 2009-03-15 at 20:23 +0000, Agent N2O wrote:
> Forgot to mention that the freezing didn't happen ALL the time with the
> ext4 deletes. When I was cutting and pasting it would get a few mins in
> and freeze, and elsewise I was able to delete some files but others made
> it freeze. I suspect it may have been LARGE files but I do not have
> solid proof of that.

In my bug, 340628, duped to this bug, the cause was almost certainly a
race. I had multiple deletes going on in the filesystem at the same
time.

FWIW, this is not a problem with 2.6.27-12 from Intrepid which I am
currently using with Jaunty due to this issue.

Agent N2O (agentn2o) wrote on 2009-03-15:

#10

It looks like the converted vs native ext4 filesystem info I gave earlier was a RED HERRING! I just got another system freeze deleting files off my EXT4 partition that I had reformatted (using mkfs.ext4) yesterday. I have just dropped down to a root shell on the recovery mode to see if I can figure out which specific file (size?, type?) causes problems.

Agent N2O (agentn2o) wrote on 2009-03-15:

#11

Well, I could not reproduce the latest system freeze. Certainly the frequency of the system freezing from EXT4 deletes is much, much lower on this new native EXT4 partition as opposed to the converted version. I am going to do some more spring cleaning to see if it will freeze up again.

Agent N2O (agentn2o) wrote on 2009-03-16:

#12

3 more nautilus delete freezes to report (all from same "native" EXT4 partition):

1. deleted a folder with a bunch of video files totalling 7.5 GB
2. deleted 16 folders and files totalling 1.1 GB
3. deleted 6 folders and files totalling 1.6 GB

In all 3 cases it froze immediately after I said yes to the "are you sure" prompt, and I was able to carry out the exact same delete after the reboot without issue.

dnyaga (daniel-nyaga) wrote on 2009-03-16:

#13

The freezes I initially reported occurred when I was moving large folders between ext4 partitions (moves between partitions involve deletes). When I am doing this kind of re-organizing, I usually have several move operations going on concurrently. Could it be that the bug is triggered more easily when there are multiple delete/move operations going on concurrently?

Last night I moved 70GB of data between 2 ext4 partitions. All 70 GB was moved in one sequential operation. The computer did not freeze. I dropped one of the ext4 partitions, re-created it, then moved the data back. The machine still did not freeze.

This evening I will "manufacture" some data that I can afford to lose then move it helter skelter between several ext4 partitions, making sure that there is a large number of moves active at any particular time.

dnyaga (daniel-nyaga) wrote on 2009-03-16:

#14

Just had another freeze. I was deleting a virtual machine snapshot (a relatively large file). When I rebooted, I was able to finish deleting.

Pauli Virtanen (pauli-virtanen) wrote on 2009-03-18:

#15

Photo of a stack trace from SysRq+L (28.9 KiB, image/png)

Confirming that similar regular freezing occurs only on my machine with an ext4 FS converted from ext3. Typically the freeze occurs under high disk activity; I believe that each time the freeze happened, I had an rsync job traversing the whole of /home, which contains a large number of small files.

I managed to get a SysRq+L stack trace, when the freeze occurred. (Photo attached; the machine was unresponsive, so can't attach it as text.) The trace is quite similar to that reported in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/340628
It might be of note that the system did not respond to SysRq commands at first, and only responded after a few minutes.

These freezes occur very frequently, typically within a few hours of uptime. This bug severely affects viability of using ext4 partitions (if the problem really has to do with ext4).

Probably unrelated information: freezes occur both with the non-free Nvidia driver and the free Xorg nv driver.

yaztromo (tromo) wrote on 2009-03-23:

#16

Posting to confirm same bug. Happens when emptying lots of files from the recycle bin, or doing a big rm -r *

Message is something like "BUG: soft locking - CPU#0 stuck for 61s!"

Xubuntu 9.04 and ext4 file system

yaztromo (tromo) wrote on 2009-03-23:

#17

I should add that my file system is new and not a convert from ext3.

davidnottingham (david-hill-home) wrote on 2009-03-25:

#18

I have experienced this on a daily basis, whenever I try to empty the Trash folder. There are several large files in the Trash. This is on an x86_64 system running Ubuntu, and as mentioned above, it happens both under GNOME and via the command line (using rm -rf).

Xavier Fung (xavier114fch) wrote on 2009-03-26:

#19

Same thing happened to me when I use kdesvn-build to build KDE SVN. Usually it truncates the .svn/entries file just like what has been reported before:

kde-devel@xavier:~$ cd kdesvn/kdesupport
kde-devel@xavier:~/kdesvn/kdesupport$ svn up
svn: Working copy '.' locked
svn: run 'svn cleanup' to remove locks (type 'svn help cleanup' for details)
kde-devel@xavier:~/kdesvn/kdesupport$ svn cleanup
svn: Can't read file 'soprano/includes/Error/.svn/entries': End of file found

A whole-system lockup is the end result, and it needs a hard reset.

Eric Sandeen (sandeen-ubuntu) wrote on 2009-03-26:

#20

When it freezes, attaching the output of sysrq-w, either via

# echo w > /proc/sysrq-trigger
# dmesg > dmesg.txt

or doing the keyboard combination, would probably be helpful for getting to the bottom of what appears to be a deadlock.

Andrius Štikonas (stikonas) wrote on 2009-03-27:

#21

Vanilla kernel 2.6.29-rc8 works well for me. So either this problem was fixed in kernel 2.6.29-rc8, or the problem is caused by Ubuntu kernel patches.

yaztromo (tromo) wrote on 2009-03-27:

#22

Reproducing this bug to get a trace corrupted my system so badly that not even an Ubuntu Jaunty CD will boot without locking the system hard. I'm now stuck on my laptop since I have no way to rescue it!

Theodore Ts'o (tytso) wrote on 2009-03-27:

#23

ext4: fix locking typo in mballoc which could cause soft lockup hangs (1.4 KiB, text/plain)

I'm not sure this patch will fix the problem (since I haven't been able to reproduce it yet), but it is at least plausible that this reported "brown paper bag" bug might be responsible for this failure mode.

I've also had one person (irc handle SuperSquirrel) tell us on #ext4 that when he went to a stock 2.6.29 kernel, he could no longer reproduce the problem, which he could reproduce reliably before. If this is true, then the patch I've attached may not be the solution, and it may be caused by something else in the Ubuntu specific kernel. (Although there was one person who reported a problem very similar to the one reported here on the linux-ext4 list that I don't think was using an Ubuntu kernel, so I'm not sure what to make of this "I went to stock 2.6.29 and it went away" report.)

The patch which I've attached fixes a real bug, and it will be headed to the stable kernel series as soon as it gets accepted upstream, and I'd strongly encourage Ubuntu to pick up this patch. Whether this patch fixes the rm -rf --> soft lockup problem is a different story.

Theodore Ts'o (tytso) wrote on 2009-03-28:

#24

One more thing for folks who can reproduce this to test, from the #ext4 IRC channel:

(09:27:06 PM) SuperSquirrel: I am using ubuntu stock kernel on jaunty now and hasnt frozen since i turned app armor off.
(09:29:18 PM) SuperSquirrel: i have deleted 30000 Files in one directory

Can anyone else confirm that if they disable apparmor, the problem goes away?

joijioj (fdjsio-deactivatedaccount-deactivatedaccount) wrote on 2009-03-28:

#25

Hello I am "SuperSquirrel" on the IRC.

I have compiled a 2.6.29 Kernel last night and my system has not hung yet. I have also tested the stock kernel in ubuntu jaunty alpha with app armor deleted and my system has not hung up yet. So i think the problem lies with apparmor somewhere as some ext4 developer said on IRC yesterday.

yaztromo (tromo) wrote on 2009-03-28:

#26

Simply unloading Apparmor service doesn't help. Is there a quick way to disable apparmor in the kernel too?

Vague guess but does this bug have any relevance? http://osdir.com/ml/file-systems.ext4/2008-01/msg00083.html

It seems to be something that was fixed in 2.6.29.

Theodore Ts'o (tytso) wrote on 2009-03-28:

#27

@21:
>Vanilla kernel 2.6.29-rc8 works well for me. So either this problem was fixed in kernel 2.6.29-rc8, or the problem is caused by Ubuntu kernel patches.

Any chance you can try a vanilla 2.6.28 kernel and see if you can reproduce the problem there? Other very interesting test points would be 2.6.28-rc5, and 2.6.28-rc7. Potential fixes that might have fixed this are:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ba4439165f0f0d25b2fe065cf0c1ff8130b802eb

and

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7ce9d5d1f3c8736511daa413c64985a05b2feee3

The first patch, which I suspect is more likely the fix, was merged into 2.6.28.8 and 2.6.28-rc6. The second patch was merged into 2.6.28-rc8, and isn't in a 2.6.28.y series yet, although it is in the for_stable branch of the ext4 git tree.

Hence it would be interesting to see if the problem is present in 2.6.28-rc5, and fixed in 2.6.28-rc6. (And thanks to whoever can do the test, since I haven't been able to figure out how to replicate it on my systems yet.)

Theodore Ts'o (tytso) wrote on 2009-03-28:

#28

@26:
>Vague guess but does this bug have any relevance?
>http://osdir.com/ml/file-systems.ext4/2008-01/msg00083.html

I don't think so. The date on that is January 2008, and that patch was integrated long ago.

>It seems to be something that was fixed in 2.6.29.

So you've independently confirmed that it was fixed in stock 2.6.29? If so, then I think we have two people who have confirmed that it was fixed in 2.6.29, and one person who has reported it fixed in 2.6.28-rc8. (See my previous note for potential patches that might have fixed this issue.)

Theodore Ts'o (tytso) wrote on 2009-03-28:

#29

Apparmor seems less likely to be the cause, as does any of Ubuntu's "sauce" patches. I have a report from someone who is using a completely stock kernel who has seen this bug on 2.6.28, 2.6.28.4, and 2.6.29-rc6 (which if confirmed rules out my "most likely fix" in comment #27 above). Since apparmor isn't in a stock mainstream kernel, it now looks like the problem may have been fixed sometime between 2.6.28-rc6 and 2.6.28-rc8.

(I would appreciate it if others could confirm this, though: while at least some people seem to be able to trigger this very easily, others seem to trigger it only on the order of once a month or so. So if one of you Gentle Readers who have been able to reliably reproduce this hang can check whether it is present in stock 2.6.29-rc6 but apparently fixed in 2.6.29-rc8, I would be most grateful for the independent confirmation.)

Thanks to all who have been helping to work this bug!

Gabriel Thörnblad (gabriel-thornblad) wrote on 2009-03-28:

#30

Just to make things absolutely clear:
the kernel versions you would like us to test are 2.6.29-rc6 and 2.6.29-rc8? There have been numerous references to 2.6.28-rc kernels above as well, which has got me all confused.

Theodore Ts'o (tytso) wrote on 2009-03-28:

#31

@30: Gabriel,

Yes, that's correct; if you could test 2.6.29-rc6 and 2.6.29-rc8, I would be much obliged.

Sorry for the other references to other -rc kernels. I'm gathering information from other sources, including updates from this Launchpad comment stream, and each time I can get more information about "I can reproduce the problem on kernel <foo>" and "The problem seems to go away on kernel version <bar>", we get more information. The object here is to find out which patch actually solves the problem, so I can make a recommendation to the Ubuntu kernel devs to backport that individual patch --- since at this late date it is highly unlikely they will suddenly move Ubuntu Jaunty to use the just-released 2.6.29 kernel.

Thanks, regards,

yaztromo (tromo) wrote on 2009-03-28:

#32

@Theodore,

I just tested 2.6.29-rc6 sourced from http://kernel.ubuntu.com/~kernel-ppa/mainline/

My usual test, which involves deleting 40 GB of video files and reliably crashed 2.6.28, hasn't crashed 2.6.29-rc6 yet. Since I may have just gotten lucky, I'll do some more testing tomorrow.

If I can't get rc6 to crash is there much point in testing rc8?

yaztromo (tromo) wrote on 2009-03-29:

#33

Update: After doing even more testing this morning, I'm 99% sure 2.6.29-rc6 isn't affected by this bug.

dpr (dpr-aha) wrote on 2009-03-29:

#34

Hi, I could not reproduce the bug in 2.6.29-rc6 or 2.6.29 final from the same source (http://kernel.ubuntu.com/~kernel-ppa/mainline/). But I can reproduce it in 2.6.28.9 as well as in the latest ubuntu kernel.

Theodore Ts'o (tytso) wrote on 2009-03-30:

#35

Hmm. So two people have said they haven't been able to reproduce the bug in 2.6.29-rc6. Unfortunately, one poster on the linux-ext4 list claims that he experienced the problem (including getting his file system corrupted) while running that version, 2.6.29-rc6.
I'll have to ask him to confirm this. Also, all of the most likely bug fixes in 2.6.29-rc6 were forward ported to 2.6.28.8 (and thus would have been in 2.6.28.9).

So we have some contradictory data out there. I'm not sure how to reconcile these reports.

Can those folks who say they aren't seeing a problem with 2.6.29-rc6 try with 2.6.29-rc4 and 2.6.29-rc5, to see if they can trigger the problem there?

yaztromo (tromo) wrote on 2009-03-30:

#36

I haven't tried with rc5 but I can't trigger the lockup in rc3 or rc4 at all (after much trying too!). Going back to ubuntu 2.6.28 I can still trigger it almost immediately.

http://kernel.ubuntu.com/~kernel-ppa/mainline/ doesn't have any more built kernels lower than rc3 so I'm stuck unless someone can point me to a tutorial on compiling rc1 from source.

Andrius Štikonas (stikonas) wrote on 2009-03-30:

#37

@36
download tarball from kernel.org
tar xf linux-*.tar.bz2
fakeroot make-kpkg --initrd linux_image

Andrius Štikonas (stikonas) wrote on 2009-03-30:

#38

@36
I made mistake in instructions:
tar xf linux-2.6.29-rc*.tar.bz2
cd linux-2.6.29-rc*
make menuconfig
fakeroot make-kpkg --initrd kernel_image

I am now compiling rc2. Will tell the result in a few hours.

Theodore Ts'o (tytso) wrote on 2009-03-30:

#39

@yaztromo,

Can you tell me what you do to try to reproduce the problem? As I mentioned, I haven't been able to reproduce it myself, so I've had to rely on other people's bug reports. If there's someone who is familiar with "git bisect", it would be really useful to try to do a "git bisect start v2.6.29 v2.6.28 -- fs/ext4 fs/jbd2", reversing the sense of "git bisect good" and "git bisect bad" (i.e., if you can reproduce it, call it "git bisect good", and if you can't reproduce the soft lockup, call it "git bisect bad"). It would probably require half a dozen builds or so, but at the end of it, it would point us at a patch which apparently fixed the bug. (There are 91 commits involving either the fs/ext4 or fs/jbd2 directories between .28 and .29, and log base 2 of 91 is about 6.5; so it will require approximately 7 git bisect tests in order to localize things down to a single commit.)

Again, this is mostly useful so we can tell the Ubuntu kernel devs which patch to backport for the official Ubuntu Jaunty kernel. (Fedora 11 is going to be using 2.6.29, so they won't see this issue.) So unless someone can help me reproduce it on my test system (which is a 1Gig netbook with a 5400 rpm drive running Ubuntu 8.10 with an updated kernel), I really will need someone who can reproduce it and who knows how to drive git and do kernel builds out of a git source tree to localize this down.
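
The "about 7 tests" estimate for 91 commits can be checked with a small simulation. This is only a sketch under stated assumptions: it treats the 91 commits as a strictly linear history (the real kernel history has merges, so git's own step count can differ slightly), it assumes the newest commit is already known to be fixed, and `bisect_first_fixed` is a hypothetical helper name, not anything provided by git.

```python
import math


def bisect_first_fixed(is_fixed, n_commits):
    """Binary-search commits 0..n_commits-1 (oldest first) for the first
    commit where is_fixed(index) is True.

    Assumes every commit before the fix still hangs, every commit from the
    fix onward does not, and the newest commit is known to be fixed.
    Returns (index_of_first_fixed_commit, number_of_tests_run).
    """
    lo, hi = 0, n_commits - 1  # invariant: first fixed commit is in [lo, hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one kernel build plus one reproduction attempt
        if is_fixed(mid):
            hi = mid  # fix is at mid or earlier
        else:
            lo = mid + 1  # fix is after mid
    return lo, tests


N_COMMITS = 91  # commits touching fs/ext4 or fs/jbd2 between .28 and .29

# Try every possible position of the fixing commit and record the test count.
test_counts = [
    bisect_first_fixed(lambda i, fix=fix: i >= fix, N_COMMITS)[1]
    for fix in range(N_COMMITS)
]

if __name__ == "__main__":
    print("log2(91) = %.2f" % math.log2(N_COMMITS))
    print("tests needed: min %d, max %d" % (min(test_counts), max(test_counts)))
```

Under these assumptions the search never needs more tests than the ceiling of log base 2 of 91, which matches the estimate in the comment above.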

Carey Underwood (cwillu) wrote on 2009-03-30: Re: [Bug 330824] Re: Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

#40

hang.py (612 bytes, text/x-python; charset=US-ASCII; name="hang.py")

I've been able to reproduce this consistently on my desktop (2.5gb
ram, amd@1.6ghz singlecore, 7200rpm drive) by writing half a meg to a
couple thousand different files sequentially, dropping the cache,
deleting them, and starting over. Usually the machine hardlocks
partway into the second cycle. Under 2.6.29, the test completes fine
with no intermittent hanging or otherwise. I haven't tried any other
kernels yet.

My laptop (1gb ram, intel@1.6ghz, 5400rpm drive) hangs intermittently
on the same workload, but doesn't hardlock consistently.
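
For readers without access to the attachment, the workload described above (write about half a megabyte to a couple thousand files in sequence, drop the cache, delete them, start over) might look roughly like the following. This is a sketch reconstructed from the description only, not the attached hang.py: the function names, file naming, defaults and cycle count are all assumptions. Dropping the cache via /proc/sys/vm/drop_caches requires root on Linux, so it is off by default here.

```python
import os
import tempfile


def write_delete_cycle(directory, n_files=2000, size=512 * 1024, drop_caches=False):
    """Run one cycle: write n_files files of `size` bytes, optionally drop
    the page cache, then delete every file. Returns the number deleted."""
    payload = b"\0" * size
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, "hang-%05d.dat" % i)
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)

    os.sync()  # flush dirty pages so the cache can actually be dropped
    if drop_caches:
        # Linux only, needs root: "3" drops page cache, dentries and inodes.
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")

    for path in paths:
        os.remove(path)
    return len(paths)


def run(directory, cycles=3, **kwargs):
    """Repeat the write/drop/delete cycle; returns the total files deleted."""
    total = 0
    for _ in range(cycles):
        total += write_delete_cycle(directory, **kwargs)
    return total


if __name__ == "__main__":
    # Tiny, harmless parameters in a temporary directory. To exercise a real
    # ext4 partition, point `directory` at it and use the defaults, on a
    # machine you can afford to hard reset.
    with tempfile.TemporaryDirectory() as scratch:
        print(run(scratch, cycles=2, n_files=10, size=1024))
```

On an affected 2.6.28 kernel the report above says the machine usually locks up partway into the second cycle, so only run the full-size version on data and hardware you can afford to lose.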

Tim Gardner (timg-tpi) on 2009-04-03

Changed in linux (Ubuntu):
assignee:	nobody → timg-tpi
importance:	Undecided → Medium
status:	Confirmed → In Progress

Saivann Carignan (oxmosys) on 2009-04-15

affects:	ubuntu-website → ubuntu-release-notes

Steve Langasek (vorlon) on 2009-04-16

Changed in ubuntu-release-notes:
status:	New → Fix Released

Carey Underwood (cwillu) on 2009-05-03

description:	updated

Saivann Carignan (oxmosys) on 2009-05-30

Changed in linux (Ubuntu Karmic):
status:	In Progress → Fix Released

Stefan Bader (smb) on 2009-07-03

Changed in linux (Ubuntu Jaunty):
status:	In Progress → Fix Committed

Martin Pitt (pitti) on 2009-07-08

tags:	added: verification-needed

martinm1000 (martinmiller-gmail) wrote on 2009-07-13:

#231

I am on .28-14;

I didn't know about the python script; I'll try it after work.

Franz Dietzmann (tdk-le) wrote on 2009-07-13:

#232

I just read through all the comments (I hope), and did not find this mentioned, so I thought it might be helpful..

I had the problem for a long time, but didn't bother too much. Now it got annoying and after some searching I installed mainline 2.6.30 to see if it would work.
As has been mentioned here before, it does, but unfortunately my UMTS didn't work anymore, so I just deleted my Trash and went back to .28
After logging in I found I had 10GB more space on my Home-Partition (the Trash only had ~1GB in it) The partition is only 40 GB total, so that's a lot. I checked if something was missing, but didn't find anything, which was strange.

I ran baobab just out of curiosity and there I found 5GB in ~/.local/share/Trash/expunged/
On closer inspection these were all files I supposedly deleted a long time ago, when the freeze appeared afterwards. I have no idea how they got there, I'm just a user...but maybe that info can point someone into the right direction.

Derek (bugs-m8y) wrote on 2009-07-13: Re: [Bug 330824] Re: Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

#233

On Mon, 13 Jul 2009, Franz Dietzmann wrote:

> I just read through all the comments (I hope), and did not find this
> mentioned, so I thought it might be helpful..
>
> I had the problem for a long time, but didn't bother too much. Now it got annoying and after some searching I installed mainline 2.6.30 to see if it would work.
> As has been mentioned here before it does, but unfortunatly my UMTS didn't work anymore, so I just deleted my Trash and went back to .28
> After logging in I found I had 10GB more space on my Home-Partition (the Trash only had ~1GB in it) The partition is only 40 GB total, so that's a lot. I checked if something was missing, but didn't find anything, which was strange.
>
> I ran baobab just out of curiosity and there I found 5GB in ~/.local/share/Trash/expunged/
> On closer inspection these were all files I supposedly deleted a long time ago, when the freeze appeared afterwards. I have no idea how they got there, I'm just a user...but maybe that info can point someone into the right direction.

I'm sure that this is just one of the many ways to trigger this ext4 thing, still, interested me even if not the cause of the bug.

http://ubuntuforums.org/showthread.php?t=1196171&page=2
Found this thread which seems to be same issue.

Appears that this is related to permission/ownership - so presumably you deleted read-only files.

I can imagine that might happen if, for example, the files were copied off a CD and had default read-only permissions.

I'm surprised nautilus doesn't handle this more gracefully.

Franz Dietzmann (tdk-le) wrote on 2009-07-13:

#234

I highly doubt that it had something to do with permissions, as there were really all kinds of files (audio, video, documents..) from different sources (downloads, self-made..).

I didn't mean this to be a cause of the bug, but rather a result and maybe an indicator to where things might be going wrong.

martinm1000 (martinmiller-gmail) wrote on 2009-07-14:

#235

Yep, I crashed using hang.py :

Linux lantea 2.6.28-14-generic #46-Ubuntu SMP Wed Jul 8 07:21:34 UTC 2009 i686 GNU/Linux

Filesystem Type Size Used Avail Use% Mounted on
/dev/sda5 ext4 90G 76G 9.9G 89% /

Didn't crash with 10GB of 100GB.

/dev/sda5 ext4 90G 82G 4.2G 96% /

Yep, crashed on round 3.

;-(

Stephan Frank (sfrank) wrote on 2009-07-15: still freezes with 2.6.28-14

#236

Hallo,

I'm sorry to say that my system still hard locks with the new 2.6.28-14
kernel in jaunty when I rsync my home partition (ext3) with my backup
partition (ext4). It does not matter whether I use 'rsync -av --delete
...' or only 'rsync -av ...'. The latter just takes a little longer
for the freeze to happen. This is on an AMD Athlon 64 Processor 3700+.

The weird thing is that I have access to another system with an Intel
Quad-Core CPU that is fully ext4 but runs without a hitch. I think that
suggests that we are really running into a timing/race problem here.

Best regards,
Stephan

Luke Maurer (luke-maurer) wrote on 2009-07-16:

#237

Huh. My system's also a single-core Athlon 64, and I'm getting it even worse (a single "rm" hangs). Is it possible that this is a race condition that's *more* likely on a single-core box? Seems like we've exhausted every other theory :-)

Jared Heath (jared-heath) wrote on 2009-07-16:

#238

It happened very frequently on my Dual Core i86 based system (never got more than 5 single rm commands off without a hang before I went to the higher kernel), so it certainly can happen often on multi-core systems.

Your theory on race conditions is interesting though--it certainly exhibits the behavior of a race that goes infinite and does not get caught.

Colin Sindle (csindle) wrote on 2009-07-16: Re: [Bug 330824] Re: Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

#239

Apologies, this is a qualitative post --- but now that people are talking
about different processors, I'll contribute some fluffy info.

That said, I experienced many "freezes" per day on my Core Solo laptop when
doing "dangerous" operations (svn update, rsync, rm, etc.). Then I swapped
to a Core 2 Duo, and when doing these same operations, I got about the same
number of "freezes", only now they recovered faultlessly (so far...) after a
second or two.
After an upgrade to 2.6.30-020630-generic #020630 from the Ubuntu Kernel-ppa
mainline, (to solve unrelated HP laptop sound issues), I have not
experienced any more "freezes" temporary, or otherwise.

c.

2009/7/16 Jared Heath <email address hidden>

> It happened very frequently on my Dual Core i86 based system (never got
> more than 5 single rm commands off without a hang before I went to the
> higher kernel) so it certanly can happen on multi-core systems often.
>
> Your theory on race conditions is interesting though--it certainly
> exhibits the behavior of a race that goes infinite and does not get
> caught.
>
>

Stephan Frank (sfrank) wrote on 2009-07-16: Re: [Bug 330824] Re: Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

#240

Colin Sindle wrote:
> After an upgrade to 2.6.30-020630-generic #020630 from the Ubuntu Kernel-ppa
> mainline, (to solve unrelated HP laptop sound issues), I have not
> experienced any more "freezes" temporary, or otherwise.

I have as well now manually switched to the 2.6.30-020630 kernel and the
freezes are gone...

Best regards,
Stephan

martinm1000 (martinmiller-gmail) wrote on 2009-07-16:

#241

I'm going to reboot, after installing 2.6.30 + newer (185.18.14) NVidia drivers
see https://bugs.launchpad.net/ubuntu/+source/nvidia-common/+bug/384639/comments/8
to do it with NVidia working ;-)

Hoping this will solve the crash problem. I would suggest to others to try the same, since the problem was
apparently solved and NOBODY decided to just backport the damn patches from the more recent kernels... I mean, it's been MONTHS, and I'm not running Linux to have random crashes.

Borph (borph) wrote on 2009-07-17:

#242

Full acknowledgement!

For me, I installed Kubuntu Jaunty fresh with native ext4 and an external backup drive, also ext4. Actually it was because of a system crash in which I lost my complete partition. So I want to have the backup system working now before I proceed! But I was stuck because of this ext4 bug; the system froze very often!

I'm just a user and didn't want to experiment!! Ext4 is not the default fs on ubuntu I read above, ok but I really regret that I chose this during graphical installation! Sorry that I didn't read the full release notes, I had no idea that it is that experimental!

Anyway, now I'm stuck, as I don't want to re-format my disks, especially not for an issue which doesn't occur in the mainline kernel. So I decided to tweak the system and get kernel 2.6.29 (2.6.30 seems to have other problems..), following:

http://www.ramoonus.nl/2009/03/24/linux-kernel-2629-installation-guide-for-ubuntu-and-debian-linux/

But this doesn't put it in GRUB, so you have to change your menu.lst and do update-grub and update-initramfs.

Well, no crashes so far, even copying about 30gig. I actually removed the "nodelalloc" mount option, still stable so far.

I really recommend getting a newer kernel (>= 2.6.29), especially because this is just an Ubuntu problem and Ted Ts'o is probably busy fixing more important stuff :) But the Ubuntu guys should indeed provide an automatic update for the _really_ inexperienced people!

Revision history for this message

JoseStefan (josestefan) wrote on 2009-07-17:

#243

I've also been using the Karmic kernels on Jaunty (and the new nvidia drivers) as suggested by martinm1000. Unfortunately, it seems to require also updating the graphics drivers, in my case nvidia.

I applied this temporary fix a while back, seeing that this is taking too long to fix. I also vote for a backport as a temporary fix, instead of having inexperienced users jump through hoops. Most of the solutions posted so far seem to mess with 3D acceleration, requiring either an update to the video drivers or a manual installation. That is another reason why I think a backport would be preferable.

I understand package policy would make it difficult for kernel 2.6.29 or newer to make it into jaunty. But isn't that what "jaunty-backports" is for? Using mainline kernels or getting karmic packages is not exactly a 1 click installation, and in fact could break your system. A backport on the other hand can be enabled using the GUI. And could provide an easier fix for those who need it.

The solution I adopted is very similar to having a backport:
1) Add a pin, by editing /etc/apt/preferences
Package: *
Pin: release a=karmic
Pin-Priority: 50

2) Append karmic to your sources.list:
deb http://us.archive.ubuntu.com/ubuntu/ karmic main restricted

3) Update your repositories.
sudo apt-get update

4) Use apt or synaptic to get the packages you want.
linux-image-2.6.31-3-generic
linux-headers-2.6.31-3-generic
linux-headers-2.6.31-3
nvidia-glx-180
nvidia-kernel-common
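
The four steps above can be sketched as a script. This is only a sketch under stated assumptions: it stages the two fragments in a scratch directory (the name karmic-pin-staging is made up) instead of editing /etc/apt directly, so they can be reviewed first. As far as apt_preferences documents it, a priority below 100 means Karmic versions are not pulled in as upgrades; they are only chosen when nothing else provides the package or when one is requested explicitly.

```shell
#!/bin/sh
# Stage the pin and the extra source for review; nothing under /etc is touched.
set -eu
stage="${STAGE_DIR:-./karmic-pin-staging}"   # hypothetical scratch location
mkdir -p "$stage"

# Step 1: the pin from the comment above (priority 50, below the default 500).
cat > "$stage/preferences" <<'EOF'
Package: *
Pin: release a=karmic
Pin-Priority: 50
EOF

# Step 2: the extra repository line.
cat > "$stage/karmic.list" <<'EOF'
deb http://us.archive.ubuntu.com/ubuntu/ karmic main restricted
EOF

# Steps 3 and 4 are left to the reader, since they need root.
echo "Review the files in $stage, then merge them by hand:"
echo "  preferences -> append to /etc/apt/preferences"
echo "  karmic.list -> append to /etc/apt/sources.list"
echo "Then: sudo apt-get update"
echo "Then: sudo apt-get install linux-image-2.6.31-3-generic"
```

For a package that also exists in Jaunty, such as nvidia-glx-180, the Karmic version probably has to be requested explicitly, for example with `sudo apt-get install nvidia-glx-180/karmic`. `apt-cache policy <package>` shows which version apt would pick.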

Revision history for this message

Borph (borph) wrote on 2009-07-17: Re: [Bug 330824] Re: Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

#244

2009/7/17 JoseStefan <email address hidden>:
> I've also been using the Karmic kernels on Jaunty (and the new nvidia
> drivers) as suggested by martinm1000. Unfortunately, it seems to require
> also updating the graphics drivers, in my case nvidia.

Because I'm using Nvidia, too, and read about some problems, I went
for 2.6.29 and it worked, I have 3D.

> I understand package policy would make it difficult for kernel 2.6.29 or
> newer to make it into jaunty. But isn't that what "jaunty-backports" is
> for? Using mainline kernels or getting karmic packages is not exactly a
> 1 click installation, and in fact could break your system. A backport on
> the other hand can be enabled using the GUI. And could provide an easier
> fix for those who need it.

I actually didn't even enable jaunty-proposed or jaunty-backport, I
wanted just a normal failsafe ubuntu. It took so much time to figure
out it's actually ext4 causing the troubles! There should be an update
even for users who got scared with the sentence "if you enable
'proposed' or 'backport', your system maybe not stable anymore!".

Your "pin" sounds promising, I will try this. But with care, as it's
currently running! :)

Peter

Revision history for this message

enb (elitenoobboy) wrote on 2009-07-23:

#245

Updating the kernel fixed this for me. Thanks JoseStefan for the easy instructions. I think one of the hardest parts of troubleshooting this is that it only seems to happen on certain hardware configurations. Initially I thought it was a hardware glitch of some kind, because it did not happen on any other computers with almost the same software setup.

Revision history for this message

Wei-Yee Chan (chanweiyee) wrote on 2009-07-29:

#246

This sounds similar to a problem that I experienced yesterday.

I did a fresh installation of Ubuntu 9.04 recently and formatted every partition to ext4. Yesterday, I was moving huge video files from my home directory to a removable USB hard disk (formatted to ext4 as well) when the system froze permanently (i.e. all hard disks stopped running completely). I did this with a couple of my other removable USB hard disks and the same thing happened many times.

The problem can be replicated by copying or moving files within the same IDE hard disk as well. Just a while ago, the system froze when I emptied Trash.

The computer has Windows XP installed, and no such problem occurs when I'm running it.

However, as far as I know, I have not experienced any data loss.

With reference to a few of the comments made above, I have more than 40 GB free on every partition at any time, so the locking up seems unrelated to the amount of free hard disk space that one has.

Revision history for this message

getaceres (getaceres) wrote on 2009-07-29:

#247

I've installed the kernel in Jaunty proposed some days ago and since then I haven't had any hang. My system seems much more stable now.

Revision history for this message

Keith Moyer (keithmoyer) wrote on 2009-08-11:

#248

I have the -14 kernel, and just hit this bug again last night (actually caused me to lose a fair amount of data).

Are people still looking into this? By most accounts, the "fix committed" doesn't fix the problem.

Revision history for this message

Borph (borph) wrote on 2009-08-11:

#249

@getaceres:
Which kernel version are you using exactly?
Mine is 2.6.29-020629-generic, manually installed. I would prefer to have a system with standard components, but I don't want to risk losing my data again.

Revision history for this message

Igor Tarasov (tarasov-igor) wrote on 2009-08-11:

#250

I've tried using the latest kernel from proposed (2.6.28-15) but I had two lockups, though they might not be that easy to provoke. So the bug is not fixed, and I am back on 2.6.29-02062906

Revision history for this message

Xavier Guillot (valeryan-24) wrote on 2009-08-12:

#251

Since the last updates, it has worked better: I could permanently delete files in Nautilus without crashing.

But once while doing this I got a freeze, and twice more while copying or cutting and pasting files (around 9 GB), in the first case on a partition with a lot of space available.

Since yesterday, because of this recurring problem (and the risk of losing important data), I have installed Karmic alpha 3...

Arnaud Faucher (arnaud-faucher) on 2009-08-16

Changed in linux (Ubuntu Jaunty):
status:	Fix Committed → Confirmed

Steve Langasek (vorlon) on 2009-08-17

tags:

added: verification-failed
removed: verification-needed

Revision history for this message

Launchpad Janitor (janitor) wrote on 2009-08-17:

#252

This bug was fixed in the package linux - 2.6.28-15.48

---------------
linux (2.6.28-15.48) jaunty-proposed; urgency=low

[ Andy Whitcroft ]

  * SAUCE: pnp: add PNP resource range checking function
    - LP: #349314
  * SAUCE: i915: enable MCHBAR if needed
    - LP: #349314

[ Brad Figg ]

  * SAUCE: Add information to recognize Toshiba Satellite Pro M10 Alps
    Touchpad
    - LP: #330885

[ Colin Ian King ]

  * Input: atkbd - add forced release keys quirk for Samsung Q45
    - LP: #347623

[ Manoj Iyer ]

  * SAUCE: Added quirk to enable the installer to recognize NetXen NIC.
    - LP: #389603

[ Stefan Bader ]

  * SAUCE: input: Blacklist digitizers from joydev.c
    - LP: #300143

[ Tim Gardner ]

  * Revert "SAUCE: md: wait for possible pending deletes after stopping an
    array"
    - LP: #334994

[ Upstream Kernel Changes ]

  * bonding: Fix updating of speed/duplex changes
    - LP: #371651
  * net: fix sctp breakage
    - LP: #371651
  * ipv6: don't use tw net when accounting for recycled tw
    - LP: #371651
  * ipv6: Plug sk_buff leak in ipv6_rcv (net/ipv6/ip6_input.c)
    - LP: #371651
  * netfilter: nf_conntrack_tcp: fix unaligned memory access in tcp_sack
    - LP: #371651
  * xfrm: spin_lock() should be spin_unlock() in xfrm_state.c
    - LP: #371651
  * bridge: bad error handling when adding invalid ether address
    - LP: #371651
  * bas_gigaset: correctly allocate USB interrupt transfer buffer
    - LP: #371651
  * USB: EHCI: add software retry for transaction errors
    - LP: #371651
  * USB: fix USB_STORAGE_CYPRESS_ATACB
    - LP: #371651
  * USB: usb-storage: increase max_sectors for tape drives
    - LP: #371651
  * USB: gadget: fix rndis regression
    - LP: #371651
  * USB: add quirk to avoid config and interface strings
    - LP: #371651
  * cifs: fix buffer format byte on NT Rename/hardlink
    - LP: #371651
  * b43: fix b43_plcp_get_bitrate_idx_ofdm return type
    - LP: #371651
  * Add a missing unlock_kernel() in raw_open()
    - LP: #371651
  * x86, PAT, PCI: Change vma prot in pci_mmap to reflect inherited prot
    - LP: #371651
  * security/smack: fix oops when setting a size 0 SMACK64 xattr
    - LP: #371651
  * x86, setup: mark %esi as clobbered in E820 BIOS call
    - LP: #371651
  * dock: fix dereference after kfree()
    - LP: #371651
  * mm: define a UNIQUE value for AS_UNEVICTABLE flag
    - LP: #371651
  * mm: do_xip_mapping_read: fix length calculation
    - LP: #371651
  * vfs: skip I_CLEAR state inodes
    - LP: #371651
  * net/netrom: Fix socket locking
    - LP: #371651
  * kprobes: Fix locking imbalance in kretprobes
    - LP: #371651
  * netfilter: {ip, ip6, arp}_tables: fix incorrect loop detection
    - LP: #371651
  * ALSA: hda - add missing comma in ad1884_slave_vols
    - LP: #371651
  * SCSI: libiscsi: fix iscsi pool error path
    - LP: #371651
  * SCSI: libiscsi: fix iscsi pool error path again
    - LP: #371651
  * posixtimers, sched: Fix posix clock monotonicity
    - LP: #371651
  * sched: do not count frozen tasks toward load
    - LP: #371651
  * spi: spi_write_then_read() bugfixes
    - LP: #371651
  * powerpc: Fix data-corrupting bug in __futex_atomic_op
    - LP: #371651
  * hpt366: fix HPT370 DMA timeouts
    - LP: #371651
  * pata_hpt37x: fix HPT370 DMA timeouts
    - LP: #371651
  * mm: pass correct mm when growing stack
    - LP: #371651
  * SCSI: sg: fix races during device removal
    - LP: #371651
  * SCSI: sg: fix races with ioctl(SG_IO)
    - LP: #371651
  * SCSI: sg: avoid blk_put_request/blk_rq_unmap_user in interrupt
    - LP: #371651
  * usb gadget: fix ethernet link reports to ethtool
    - LP: #371651
  * USB: ftdi_sio: add vendor/project id for JETI specbos 1201 spectrometer
    - LP: #371651
  * USB: fix oops in cdc-wdm in case of malformed descriptors
    - LP: #371651
  * USB: usb-storage: augment unusual_devs entry for Simple Tech/Datafab
    - LP: #371651
  * Input: gameport - fix attach driver code
    - LP: #371651
  * r8169: Reset IntrStatus after chip reset
    - LP: #371651
  * hugetlbfs: return negative error code for bad mount option
    - LP: #371651
  * block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb
    - LP: #371651
  * anon_inodes: use fops->owner for module refcount
    - LP: #371651
  * KVM: x86: Reset pending/inject NMI state on CPU reset
    - LP: #371651
  * KVM: call kvm_arch_vcpu_reset() instead of the kvm_x86_ops callback
    - LP: #371651
  * KVM: MMU: Extend kvm_mmu_page->slot_bitmap size
    - LP: #371651
  * KVM: VMX: Move private memory slot position
    - LP: #371651
  * KVM: SVM: Set the 'g' bit of the cs selector for cross-vendor migration
    - LP: #371651
  * KVM: SVM: Set the 'busy' flag of the TR selector
    - LP: #371651
  * KVM: MMU: Fix aliased gfns treated as unaliased
    - LP: #371651
  * KVM: Fix cpuid leaf 0xb loop termination
    - LP: #371651
  * KVM: Fix cpuid iteration on multiple leaves per eac
    - LP: #371651
  * KVM: Prevent trace call into unloaded module text
    - LP: #371651
  * KVM: Really remove a slot when a user ask us so
    - LP: #371651
  * KVM: x86 emulator: Fix handling of VMMCALL instruction
    - LP: #371651
  * KVM: set owner of cpu and vm file operations
    - LP: #371651
  * KVM: Advertise the bug in memory region destruction as fixed
    - LP: #371651
  * KVM: MMU: check for present pdptr shadow page in walk_shadow
    - LP: #371651
  * KVM: MMU: handle large host sptes on invlpg/resync
    - LP: #371651
  * KVM: mmu_notifiers release method
    - LP: #371651
  * KVM: PIT: fix i8254 pending count read
    - LP: #371651
  * KVM: x86: disable kvmclock on non constant TSC hosts
    - LP: #371651
  * KVM: x86: fix LAPIC pending count calculation
    - LP: #371651
  * KVM: VMX: Flush volatile msrs before emulating rdmsr
    - LP: #371651
  * ath9k: implement IO serialization
    - LP: #371651
  * ath9k: AR9280 PCI devices must serialize IO as well
    - LP: #371651
  * md: fix deadlock when stopping arrays
    - LP: #334994
  * block: include empty disks in /proc/diskstats
    - LP: #371651
  * powerpc: Sanitize stack pointer in signal handling code
    - LP: #371651
  * fs core fixes
    - LP: #371651
  * fix ptrace slowness
    - LP: #371651
  * crypto: ixp4xx - Fix handling of chained sg buffers
    - LP: #371651
  * PCI: fix incorrect mask of PM No_Soft_Reset bit
    - LP: #371651
  * b44: Use kernel DMA addresses for the kernel DMA API
    - LP: #371651
  * thinkpad-acpi: fix LED blinking through timer trigger
    - LP: #371651
  * Linux 2.6.28.10
    - LP: #371651
  * ext4: fix locking typo in mballoc which could cause soft lockup hangs
    - LP: #330824, #371651
  * V4L/DVB (9667): Fixed typo in sizeof() causing NULL pointer OOPS
    - LP: #316405
  * ALSA: hdsp - poll for iobox
    - LP: #363003
  * revalidate parent inode when rmdir done within that directory
    - LP: #317274
  * ext4: Fix race in ext4_inode_info.i_cached_extent
    - LP: #389555
  * V4L/DVB (9848): gspca: Webcam 06f8:3004 added in sonixj.
    - LP: #374122
  * kernel/resource.c: fix sign extension in reserve_setup()
    - LP: #370003
  * iwl3945: release resources before shutting down
    - LP: #345710
  * iwl3945: use cancel_delayed_work_sync to cancel rfkill_poll
    - LP: #345710

-- Stefan Bader <stefan.bader@canonical.com>   Mon, 01 Jun 2009 17:25:15 +0200

Changed in linux (Ubuntu Jaunty):
status:	Confirmed → Fix Released

Revision history for this message

Steve Langasek (vorlon) wrote on 2009-08-18:

#253

verification failed, but the patch doesn't appear to have introduced regressions, so the updated kernel has been published to jaunty-updates. Resetting for the next pass.

Changed in linux (Ubuntu Jaunty):
status:	Fix Released → Confirmed
tags:	removed: verification-failed

Revision history for this message

Phil Norbeck (ptn107) wrote on 2009-08-25:

#254

logs.tar.bz2 (66.9 KiB, application/octet-stream)

I can reproduce this every single time when deleting large files from ext3 partitions as well as ext4. I have also noticed that it is easier to reproduce when the working partition is low on free space. In my case, though, when reviewing the log files, each soft lockup instance has lines in common relating to 'eCryptfs'. My other kernels, 2.6.29.6 and 2.6.30.5, do not have this problem.

Logs attached.

Ubuntu 9.04 x86_64
Linux phil-desktop 2.6.28-15-generic #49-Ubuntu SMP Tue Aug 18 19:25:34 UTC 2009 x86_64 GNU/Linux

Revision history for this message

santiago (santiagozky) wrote on 2009-08-30:

#255

I'm running a fully updated Jaunty and I am still experiencing lockups when deleting large files/directories. Any idea when a fix will be released for Jaunty?

Revision history for this message

Theodore Ts'o (tytso) wrote on 2009-08-30:

#256

At this point, it seems pretty clear to me that no one is really working on this for Jaunty; if you must use Jaunty, the only thing I can suggest is to use a mainline kernel --- any mainline kernel, whether it is 2.6.28, 2.6.29, or 2.6.30 will work fine. The problem seems to be in Canonical's backports of patches to the 2.6.28 kernel, and the only people who could work on it are busy working on the Karmic release and/or the Karmic kernel. Those of us (like myself) who are working on the upstream ext4 are busy working on the latest set of improvements and bug fixes that will go into 2.6.31 or 2.6.32.

For those of you who need some proprietary drivers, I'm sorry to say, the only thing you can really do is wait for them to become ported to the Karmic kernel (or port them yourself).

Revision history for this message

Andrew Berry (andrewberry) wrote on 2009-09-01:

#257

Is there a list somewhere of notable patches / features which Canonical has integrated into their kernel? I'd like to switch to a mainline kernel to avoid this bug (which is still affecting me), but want to be sure I'm not missing anything critical which Canonical has changed.

Revision history for this message

papukaija (papukaija) wrote on 2009-09-04:

#258

Should we close this bug for Jaunty as no one is working for it (see comment 256) ?

Revision history for this message

Saivann Carignan (oxmosys) wrote on 2009-09-04:

#259

No, Jaunty is still supported (it's still the latest release) and the bug is still confirmed, therefore closing it would be inappropriate. It also wouldn't help developers track the bug and work on it later.

Revision history for this message

tiagolp (tiagolp) wrote on 2009-09-09:

#260

Mounting the ext4 filesystem with the mount options "sync,barrier=1" seems to solve the problem in my case (2.6.28-15-generic).
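
For anyone wanting to try this, here is a sketch of what the fstab entry might look like, plus a small helper for checking which options are actually in effect. The device name and mount point are placeholders, and ext4_mount_options is a hypothetical helper name. Note that sync forces synchronous writes, which can slow I/O down considerably.

```shell
# Example /etc/fstab line (placeholders: /dev/sdXN and /mnt/data):
#   /dev/sdXN  /mnt/data  ext4  defaults,sync,barrier=1  0  2

# Print "mountpoint: options" for every ext4 filesystem listed in a
# mounts-format file (defaults to /proc/mounts).
ext4_mount_options() {
    awk '$3 == "ext4" { print $2 ": " $4 }' "${1:-/proc/mounts}"
}

# Usage after remounting or rebooting:
#   ext4_mount_options
```

Reading /proc/mounts rather than /etc/fstab shows what the kernel is using right now, which is what matters when testing a workaround.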

Revision history for this message

Logicwax (logicwax) wrote on 2009-09-10:

#261

Thanks tiagolp! I can confirm as well that mounting my native ext4 with the "sync,barrier=1" option in my fstab solves the problem on Jaunty.

Revision history for this message

Logicwax (logicwax) wrote on 2009-09-10:

#262

Actually I'm sorry, I take that back. I was trying to rm -rf over 1.3TB of data, made up of over 17,000 subdirectories, each containing a dozen or so files.

I too had complete system lock-up when I would try deleting them (moving and copying was fine).

I tried to move the directories in blocks of about 100 or so to another directory, then tried deleting those. I had the same lockup issues.

The method that tiagolp proposed helped a lot, but didn't completely solve my problem. While I can now delete about 100 or so directories at a time, I still can't delete the entire 17k directory tree without a full lock-up.

for the record, I'm running jaunty 32-bit, 2.6.28-15-generic. ext4 native on a LVM volume spanned across two 1.5TB sata drives on a silicon image SATA pci card.
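
A rough sketch of a scaled-down version of this workload, for anyone who wants to test a kernel: build a tree of many small files, flush it to disk, then delete it in one go. The defaults (100 directories of 12 files, in ./lockup-test) are made up for illustration, and whether this triggers the lockup at all will depend on hardware and kernel. On an affected kernel it may freeze the machine, so only run it where that is acceptable.

```shell
#!/bin/sh
# Create many small files, sync, then remove the whole tree at once.
set -eu

make_tree() {   # make_tree DIR NUM_DIRS FILES_PER_DIR
    d=1
    while [ "$d" -le "$2" ]; do
        mkdir -p "$1/dir$d"
        f=1
        while [ "$f" -le "$3" ]; do
            # A few bytes per file so that the files are not empty.
            echo "filler" > "$1/dir$d/file$f"
            f=$((f + 1))
        done
        d=$((d + 1))
    done
}

count_files() {   # count_files DIR
    find "$1" -type f | wc -l | tr -d ' '
}

target="${TARGET:-./lockup-test}"   # hypothetical default location
make_tree "$target" "${DIRS:-100}" "${FILES:-12}"
echo "created $(count_files "$target") files in $target"
sync                                # write everything out before deleting
rm -rf "$target"
echo "deleted $target"
```

To get closer to the case described above, something like `TARGET=/mnt/data/lockup-test DIRS=17000 FILES=12 sh script.sh` could be used, with TARGET pointing at the ext4 volume under test.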

Revision history for this message

Andrew Berry (andrewberry) wrote on 2009-10-05:

#263

It seems to me that this is fixed in the patches committed from #418197. Can anyone else confirm? I was able to delete around 2.6 million links and files in a single rm -rf, which would previously cause a lockup in a minute or two.

Revision history for this message

Andrew Berry (andrewberry) wrote on 2009-10-05:

#264

Link since comments don't autolink to bug numbers: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/418197

Revision history for this message

Rene (g.xrc) wrote on 2009-10-16:

#265

Since I upgraded to
Linux rgm 2.6.28-15-generic #52-Ubuntu SMP Wed Sep 9 10:49:34 UTC 2009 i686 GNU/
no freeze when deleting big files together (> 1GB)
no "BUG: soft locking - CPU#0 stuck for 61s! [uic: 5356]"
That means 2 PCs had the problem, and both are now fixed!
Previously I had to switch to mainline kernel (I chose 2.6.30.6).
Thank you.
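
To check the same two things on another machine, a small sketch: print the running kernel release and count soft-lockup reports in the kernel log. The kernel normally words these reports as "BUG: soft lockup"; the log path /var/log/kern.log is the usual Ubuntu location but is an assumption here, and count_soft_lockups is a made-up helper name.

```shell
# Count "BUG: soft lockup" lines in a kernel log file.
# Usage: count_soft_lockups [LOGFILE]
count_soft_lockups() {
    log="${1:-/var/log/kern.log}"   # assumption: default Ubuntu kernel log
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -c 'BUG: soft lockup' "$log" || true
}

uname -r    # running kernel release, to compare with the fixed version
```

Running `count_soft_lockups` before and after a large delete gives a quick indication of whether new lockups were logged; `dmesg | grep 'soft lockup'` shows the messages themselves.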

Andrius Štikonas (stikonas) on 2009-10-16

Changed in linux (Ubuntu Jaunty):
status:	Confirmed → Fix Released

Revision history for this message

ViPeRaY (mail-erayyilmaz) wrote on 2010-01-07:

#266

It seems like the fix has been released for this but I am still having this problem. I can copy large files (around 15-20 gig) to an NTFS hard drive and there is no problem. However when I try to copy the same files to an internal hard drive which uses ext4, the system freezes. I am using Karmic with kernel 2.6.31-16-generic.

My question is, how do I get the fix? I get auto updates but do I have to manually install the fix? And where are the patch files located?

Thanks,

Revision history for this message

enb (elitenoobboy) wrote on 2010-01-07:

#267

"However when I try to copy same files to an internal hard drive which uses ext4, the system freezes."

This would be a different bug, as this bug only occurs when removing files.

"My question is, how do I get the fix?"

It looks like the latest karmic kernel release is 2.6.31-17. You might want to try installing that.

If that doesn't work, and assuming that it really is a kernel problem and not caused by something else, you could try the 2.6.32 kernel from lucid's repository, though since lucid is still in alpha stages, it might be best to find out if it really is being caused by the kernel first.

Revision history for this message

hoover (uwe-schuerkamp) wrote on 2011-01-23:

#268

I have experienced a similar bug removing largish video files (about 4GB or so) from an internal SATA drive formatted with an xfs filesystem.

Sometimes when doing an "rm -rf" on a directory on that file system, the rm will hang and remain pegged at 100% cpu usage. As opposed to other posters in this thread, I don't see any suspicious messages in dmesg about hangs or timeouts, and usually I'm able to "rm -rf" the directory in question from another terminal session without a hang.

The only thing that kills the rm is a reboot, kill -9, Ctrl-C and so on all won't work on that process.

Please let me know if you need any further logs, I'm running kernel 2.6.32-27-generic #49-Ubuntu SMP Wed Dec 1 23:52:12 UTC 2010 i686 GNU/Linux on Linux Mint10 which is based on Maverick 10.10.

Revision history for this message

reini (rrumberger) wrote on 2011-01-24:

#269

Since this report is about ext4 and you're having problems with xfs, you really should open a separate report...

Ubuntu
linux package

Soft lockups (freezes) when deleting files from ext4 partitions on 2.6.28

	Status	Importance	Assigned to
Release Notes for Ubuntu	Fix Released	Undecided	Unassigned
linux (Ubuntu)	Fix Released	Medium	Tim Gardner
Jaunty	Fix Released	Medium	Tim Gardner
Karmic	Fix Released	Medium	Tim Gardner
