Hardy Freeze/Lockup with kernel 2.6.24-16-generic up to 2.6.24-19-generic

Bug #239021 reported by DJ Dallas
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Hi

I am experiencing complete lockups of my system at seemingly random intervals. The time from system start to crash varies from minutes to hours. The applications running vary completely, as do system, processor, memory, swap, and network load (no wireless, only LAN). The screen freezes and there is no response to any keyboard or mouse input. The keyboard status lights stay as they were before the crash (not flashing, etc.) and the optical sensor on the mouse stays at the same intensity (it normally dims or brightens depending on whether it is in use). Both are USB, and unplugging and reconnecting them leaves them without power. The only way to recover is to hold in the chassis power button. Restart is normal with no noticeable BIOS faults.

The only thing which seems to happen in every crash is that the crash takes place during a low level of disk activity. The disk activity fails to register on the task-bar system monitor that I set up (why is that?). The disk activity lasts several seconds at a time and seems to occur every couple of minutes regardless of whether the machine is in use or not. The machine doesn't crash every time this happens. The activity sound is different to the rapid clicking sound when the disk is reading and writing in multiple locations (i.e. opening an app or saving a document). The sound is more of a low background buzz, similar to when defragmenting a drive in Windows or running long disk checks. I have no idea what is accessing the drive during this time. The disk activity continues for at least a few seconds after the freeze.

I cannot make the computer crash or reproduce the fault. There are also no crash reports in any of the suggested locations. The machine is fully updated and has been through every kernel and package released, including those in Hardy proposed. I have tried changing what BIOS settings are changeable.

Odd things I've noticed:
During startup there is a considerable delay while "HALD" loads. The Ubuntu splash screen progress bar stops, and after a few seconds changes to a text output. A few seconds later it marks HALD as "OK" and continues loading.
During one crash I had system monitor open and 4 instances of the "hald-addon-stor" process were visible on the list (there are more because of the card reader, but the list was sorted by CPU activity). During the disk activity mentioned, the processes changed rapidly from sleeping to uninterruptible 3 times and then everything froze. The machine crashed again a few minutes later, but the process list was sorted alphabetically and hald seemed to stay sleeping.
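
The sleeping-to-uninterruptible transitions described above could be recorded to a file instead of watched by eye. The following is only a sketch: the function names and the log file name are made up for illustration, and it assumes a procps-style `ps` that accepts `-eo pid=,stat=,comm=`.

```shell
# Print only the processes whose state starts with "D" (uninterruptible sleep).
# Reads "PID STAT COMMAND" lines on stdin, so it works on live or saved ps output.
filter_dstate() {
    awk '$2 ~ /^D/ { print $1, $2, $3 }'
}

# Sample once per second and append any hits to a log with a timestamp.
# The default log name is an arbitrary choice for this sketch.
watch_dstate() {
    log="${1:-dstate.log}"
    while :; do
        ps -eo pid=,stat=,comm= | filter_dstate | while read -r line; do
            echo "$(date '+%H:%M:%S') $line"
        done >> "$log"
        sync    # push the log to disk in case the next freeze is near
        sleep 1
    done
}
```

Note that the `sync` call itself causes regular disk writes, so running this may change how often the freeze occurs.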

Gutsy was completely stable on the same system, although it was on an IDE drive rather than SATA. However, the Heron Live-CD would crash even if both drives were disconnected. The Heron install is a clean install from the alternate install CD, not an upgrade.

Things I'm still going to try:
Removing pci modem card (not currently in use anyway).
Removing pci TV tuner and video capture card (also not currently in use).
Downgrading from latest "Vista" BIOS to last released "XP" BIOS.
Trying latest kernel in kernel-ppa
Downgrading to Gutsy kernel.

System specs:
Intel Core 2 Duo E6600 2.4GHz
1 GB DDR2 533 MHz RAM
Seagate Barracuda 7200.10 320 GB Hard Disk SATA2
MSI MS-7301 Motherboard
Onboard VIA LAN, Sound and Firewire
Motorola MSP2900-W(M) modem
Asustek TV-7131/FM Hybrid TV Tuner (with video capture)
Nvidia 8600 GT 256MB RAM

I have different hardware to Bug #204996 and others, hence the new report. Anyway, I see that Bug #204996 is marked as fix released against Intrepid. I had no intention of upgrading to Intrepid and wanted to stay with a LTS release. Is there a fix for Hardy users, and if kernels 25 and 26 do solve the problem, will they ever be released for Hardy?

Thanks,
Dave

Tags: cft-2.6.27
DJ Dallas (djdallas) wrote :

Forgot to mention that I have also tried removing the Nvidia drivers and disabling Compiz. Tried Kubuntu Heron before the final clean install of Ubuntu Heron, but it crashed more frequently.

Thanks,
Dave

Lukáš Chmela (lukaschmela) wrote :

The same happens to me, and Gutsy was stable for me too. I have read a few similar threads and I think it really is a bug in the 2.6.24 Linux kernel. I also saw an error report on my server variant of Hardy. It was printed on the screen, but it was too long and, as in the desktop variant, it wasn't logged anywhere. It just printed a backtrace of a kernel error.

Supersaiyan_IV (saiyan-iv) wrote :

I can confirm that the 2.6.24-16-generic (hardy) Linux kernel running under Gutsy had no such issues. The freezes appeared only when I reinstalled with Hardy. Just like DJ Dallas, I cannot reproduce the bug at will, nor can I trace it in any way. The only difference is that the heavy freeze doesn't actually crash the computer. If I have music on while the system freezes, the sound loops 5-6 times, then the CPU calms down and the system works normally without any need to restart. This occurs about 2-3 times per hour without any pattern, apart from the 100% CPU load and the complete unresponsiveness of keyboard and touchpad.

Supersaiyan_IV (saiyan-iv) wrote :

Under the assumption that this is related to the hard drive (SATA), I tried changing fstab to not use the realtime function, but to no avail. The 2.6.25-7/8 kernels did not bring any improvement either.

However, there is a major clue (and an ugly fix):
echo 1 > /proc/sys/vm/block_dump
This triggers repetitive debug writes to disk and so works like a "keep-alive". The funny thing is that the freezes cease when block_dump is enabled. If the type of freeze that you experience is akin to mine, then this should work, but it is not something I would recommend. Also, I haven't tested this long enough to be sure, but 1h+ without a freeze is a great improvement in my case.
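
For anyone who wants to try the workaround above for a limited time and then switch it off again, here is a minimal sketch. It assumes root privileges and a kernel that exposes /proc/sys/vm/block_dump (the 2.6.24 kernels discussed here do; much later kernels dropped it). The function name `set_block_dump` and the `BLOCK_DUMP_PATH` variable are hypothetical, and the variable exists only so the function can be tried against a scratch file.

```shell
# Path of the block_dump switch; overridable for dry runs on a scratch file.
BLOCK_DUMP_PATH="${BLOCK_DUMP_PATH:-/proc/sys/vm/block_dump}"

# Hypothetical helper: turn block I/O debug logging on (1) or off (0).
set_block_dump() {
    case "$1" in
        0|1) ;;
        *) echo "usage: set_block_dump 0|1" >&2; return 2 ;;
    esac
    if [ ! -w "$BLOCK_DUMP_PATH" ]; then
        echo "cannot write $BLOCK_DUMP_PATH (not root, or no block_dump support)" >&2
        return 1
    fi
    echo "$1" > "$BLOCK_DUMP_PATH"
}

# Example (as root):
#   set_block_dump 1    # start logging block I/O to the kernel log
#   dmesg | tail        # see which processes are touching the disk
#   set_block_dump 0    # stop logging again
```

While logging is on, the kernel log should also show which process is behind the unexplained disk activity mentioned in the original report, which may be more useful than the keep-alive side effect.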

tsultana (tsultana) wrote :

I have the same problem and also think it is a problem with the 2.6.24 kernel. I have tried 2.6.24-17, -18, and -19 with the same results.

I have disabled compiz, nvidia proprietary drivers, and Open GL screensavers but still have freezes. Trying Alt+SysRq REISUB does not work since my mouse and keyboard are both USB and I get no response from them.
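
As a reminder of what the REISUB sequence asks the kernel to do, and of how to check whether SysRq is enabled at all, here is a small sketch. The function names are made up for illustration; /proc/sys/kernel/sysrq is the standard kernel switch. None of this helps if the keyboard itself is dead, as described above.

```shell
# Describe what each key of the Alt+SysRq REISUB sequence requests.
sysrq_meaning() {
    case "$1" in
        r|R) echo "switch keyboard from raw mode" ;;
        e|E) echo "send SIGTERM to all processes except init" ;;
        i|I) echo "send SIGKILL to all processes except init" ;;
        s|S) echo "sync all mounted filesystems" ;;
        u|U) echo "remount all filesystems read-only" ;;
        b|B) echo "reboot immediately" ;;
        *) echo "not part of REISUB"; return 1 ;;
    esac
}

# Succeeds when the SysRq switch holds a non-zero value,
# i.e. at least some SysRq functions are allowed.
sysrq_enabled() {
    f="${1:-/proc/sys/kernel/sysrq}"
    [ -r "$f" ] && [ "$(cat "$f")" != "0" ]
}

# Example:
#   sysrq_enabled && echo "SysRq is enabled" || echo "SysRq is disabled"
#   for k in R E I S U B; do echo "$k: $(sysrq_meaning "$k")"; done
```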

I previously had Gutsy without problems. I upgraded from the desktop-i386 disk.

I have reinstalled Gutsy on a separate partition and it is working OK (2.6.22-14).

ASRock 4CoreDual-SATA2 motherboard with onboard audio, serial, parallel, midi, and game ports disabled.
IDE Drives Only (not using any SATA)
E6600 Dual Core 2.4 GHz
NVidia GeForce 7600 GS AGP
Soundblaster Audigy LS PCI
PCI USB/Firewire Card

I attached a sysinfo dump of my configuration.

Tony

exidez (exidez) wrote :

Same problem as above.
I have upgraded to kernel 2.6.24-19-generic and still no fix.
However, I CAN reproduce this bug. Every time I load up GnuCash (my finance program) and it loads the tip-of-the-day information box, the system freezes. I also noticed that when I clicked the information box (that little yellow light bulb) on the top right panel, it would begin loading the dialog box and then freeze halfway. The information in the box is not shown, just the gray box.
What's the deal with that? I don't even know where to start with this!

exidez (exidez) wrote :

Another way I can reproduce this bug:

Hardy also freezes when I open a terminal and press the down arrow to move to a more recently used command. It only freezes if there are no more commands to cycle through.
For example, I can press up 4 times to cycle back to the fourth most recently used command, then press down three times to cycle to the last used command. I can then press down again to get the clear line where I started, but if I press down once more (the 5th time) it freezes (there are no more commands beyond this point anyway).

Does this happen to anyone else???

tsultana (tsultana) wrote :

I use GnuCash and tried your steps above but do not get a lockup. I also tried moving through the command list in the terminal but did not experience any problems there either.

I still have disabled compiz, NVidia proprietary drivers, and Open GL screensavers. Since disabling all of those I have experienced lockups only 4 or 5 times in the past 3 weeks.

When the lockups occurred they tended to happen in spurts with at least a couple back to back (within a 1/2 to 1 hour of each other).

tsultana (tsultana) wrote :

I have tried compiling the 2.6.25 kernel and running it under Hardy. Whether or not it introduced other problems of its own, it still locked up within an hour of boot.

I used the directions from Negative Zero: Ubuntu: Kernel 2.6.25 on Hardy at
http://blog.gunbladeiv.com/2008/05/ubuntu-kernel-2625-on-hardy.html
for compiling the kernel. I added ALSA and Intel HDA when I compiled, for sound.

Has anyone else tried 2.6.25 under Hardy?

tsultana (tsultana) wrote :

I have not had any lockups lately. I re-enabled Nvidia and Compiz but turned off ACPI by changing the display inactive setting to Never. I am back to running 2.6.24.

An update 2 days ago also updated ACPI, so this problem might be corrected.

My computer has now been up and running for 1 day 1 h 43 min.

Miguel Branco (mpbbranco) wrote :

Seems like I have the same issue. I had some rare lockups in the past, but since Hardy 2.6.24 (and I guess even more since the update to .1, or -19) the lockups have been more frequent. The hard drive LED starts blinking fast, as for a low-level routine, and nothing happens: complete freeze. Only Alt+SysRq+B works, and in general when I reboot I get a kernel panic for out of sync. Then nothing works, including the power button, and I have to take the battery out of my laptop!

Today it happened twice, for some reason when opening the Trash folder in Nautilus and trying to run the script that gives root rights in the window, to restore a resilient file. I was able to "reproduce" it (not intentionally, since I never imagined this could be the cause). The other times seemed more linked to use of Firefox.

Is it related to ACPI? Should I turn it off?

Scott A (scalexan) wrote :

I too have this problem. It seems to happen at random times and I can't reproduce it. I first thought that it was a heat issue, but I ran some tests where I really loaded the system and ran up the temps, and nothing happened. There are no logs to show what happened. I've attached my sysinfo output.

Dbenitez (diegobenitezc) wrote :

I am also getting lockups, but at really long intervals (between 2 and 10 days between lockups) ... I've tried a bunch of things to reproduce it but it's solid as a rock (until it locks) ... this is on 64-bit Hardy Heron with 2.6.24-19, Intel Q6600 ... I will attach log files the next time it locks up, though I don't think anything useful is showing up in any of them ...

Do the lockups happen more frequently to 32-bit users than 64-bit? Could that be a relevant factor?

tsultana (tsultana) wrote :

My run of good luck changed with the first lockup in 2 weeks. I think the last update for ACPI mostly corrected my problem so I think this is a multi-symptom issue.

I am still using the Nvidia proprietary drivers and Compiz and have ACPI turned on, with the display set to sleep after 11 minutes on AC (desktop). I am back to running 32-bit 2.6.24.20 since my attempt with 2.6.25 showed no improvement.

DJ Dallas (djdallas) wrote :

Hi again

I've pretty much given up on Hardy. Occasionally I try something else which has been mentioned by others, but nothing has worked for me so far. I have decided not to compile my own kernel based on the feedback of others, and will wait for the new kernel in Intrepid Ibex or Hardy Backports.

Crashes occur on SATA, IDE, USB, and RAM drives with all disks unplugged. So the disk activity that I mentioned seems to be a dead end and may not be relevant. Sorry about that.

I installed and updated the 8.04.1 release and was nearly convinced that the issue was resolved. The machine was left on and ran for several days with no issues. Once I started using it to do some fairly normal desktop work (Open Office, Firefox etc), it began to lock-up repeatedly.

The 2.6.25 kernel was solid in Fedora 9 and OpenSUSE 11 when installed on my machine. Nvidia (proprietary drivers) and Compiz all seemed fine.

I haven't looked very hard at machine specs in this thread, but Intel 6600 seems to feature more than once. Also noticed in this thread are newer model Nvidia cards. I know that these specs are not relevant in some of the other crash report threads, but maybe this really is a different issue. Something about introducing Nvidia modules into 2.6.24 kernel on the Intel 6600? Might be significant, might not....

Good luck!

tsultana (tsultana) wrote :

Could this be a USB issue? My keyboard and mouse are USB so when the lockup occurs I lose both of them. If anyone is running PS2 mouse and keyboards, have you been able to get your system out of the lockup?

I am going to put a PS2 keyboard on my computer and plug my mouse into a PS2 adapter to see if it helps. Does anyone know though if PS2 is separate hardware from USB?

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

tsultana (tsultana) wrote :

Where is 2.6.27 for Hardy? I went to Software Sources to add backport and Normal releases but neither made this update visible.

Leann Ogasawara (leannogasawara) wrote :

Sorry for any confusion, 2.6.27 is only available in the Intrepid repository. If you are unfamiliar with how to enable the Intrepid repository, I'd suggest waiting to test with the Alpha5 LiveCD. Thanks.

DJ Dallas (djdallas) wrote :

Hi again.

Sorry it's taken a while, but I've tested the new kernel as per the request, and the short answer is my problem is fixed by the 2.6.27 kernel. The rest of this write-up is what I did and found during testing.

In Hardy I simply installed the linux-image-2.6.27-2-generic deb package from the Intrepid archive. The freezing/lockups stopped and the system was completely stable. There were obviously a few minor issues, because installing linux-restricted-modules etc wasn't an option. Restarting Hardy with the standard kernel brought back the frequent crashes.

Intrepid Alpha5 was also stable apart from a few bug related application crashes (but not system crashes). Updating the packages to the latest in the repository, which included the updated 2.6.27-3 kernel, was also fine and I haven't had a single crash.

Installing the Nvidia restricted drivers (and so enabling Compiz) had no detrimental effect. I am also happy to report that for the first time, I am able to use the suspend and hibernate options! (Only works with Nvidia restricted drivers installed.)

Alpha5 has won me over to Intrepid, but for anyone with the same problem as mine and preferring an LTS release, it would seem that a backport of the 2.6.27 kernel would fix the problem in Hardy.

Let me know if there's anything else needed concerning this bug. I'm happy to mark this bug as solved on release of Intrepid.

Thanks to the many developers involved and keep up the good work!

David

Leann Ogasawara (leannogasawara) wrote :

Thanks for the update DJ. I'm marking this "Fix Released" for Intrepid. Thanks.

Changed in linux:
status: New → Fix Released
eschwab (eschwab) wrote :

I have been experiencing a hang or two a week in my server farm seemingly due to this issue, headless Dell 1950s, SAS drives...previously stable under Feisty.

Load averages will start to creep up, top will indicate increasing iowait, but iostat shows very low io %util. Machine will become unresponsive, nothing logged to dmesg or syslogs, nothing logged on console.
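
Since nothing reaches dmesg or syslog, one option is to log load average and iowait to a file at intervals, so there is at least a trail leading up to the hang. The following is a rough sketch, not a tested monitoring tool: the function names are invented, and it assumes the usual Linux /proc/stat layout in which the aggregate "cpu" line lists user, nice, system, idle, iowait, irq, softirq and steal counters in that order.

```shell
# Percentage of CPU time since boot spent in iowait, computed from the
# aggregate "cpu" line of /proc/stat (fed on stdin).
iowait_percent() {
    awk '$1 == "cpu" {
        total = 0
        # user, nice, system, idle, iowait, irq, softirq, steal
        for (i = 2; i <= 9; i++) total += $i
        if (total > 0) printf "%.1f\n", 100 * $6 / total
    }'
}

# One timestamped line with the 1-minute load average and cumulative iowait.
snapshot() {
    printf '%s load=%s iowait=%s%%\n' "$(date '+%F %T')" \
        "$(cut -d' ' -f1 /proc/loadavg)" "$(iowait_percent < /proc/stat)"
}

# Example: append a line every 10 seconds
#   while :; do snapshot >> hang-trail.log; sleep 10; done
```

The iowait figure here is cumulative since boot, so it is the change between successive lines that shows iowait creeping up, not the absolute value.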

I too would love to see Hardy/LTS for 2.6.27.
