Oops (seemingly at random)

Bug #85500 reported by Glyph Lefkowitz
Affects                        Status      Importance  Assigned to  Milestone
linux-source-2.6.17 (Ubuntu)   Won't Fix   Undecided   Unassigned
linux-source-2.6.20 (Ubuntu)   Invalid     Undecided   Unassigned

Bug Description

Binary package hint: linux-image-2.6.17-11-generic

I clicked a mouse button, the machine froze, and I noticed this in /var/log/kern.log upon reboot. I hope it will prove useful. The machine was otherwise quiescent (and this is the only Oops visible in its log). I recently upgraded to edgy and it has seemed stable for several days.

Feb 16 00:27:21 legion kernel: [17349761.120000] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
Feb 16 00:27:21 legion kernel: [17349761.120000] printing eip:
Feb 16 00:27:21 legion kernel: [17349761.120000] c02dabf0
Feb 16 00:27:21 legion kernel: [17349761.120000] *pde = 00000000
Feb 16 00:27:21 legion kernel: [17349761.120000] Oops: 0002 [#1]
Feb 16 00:27:21 legion kernel: [17349761.120000] SMP
Feb 16 00:27:21 legion kernel: [17349761.120000] Modules linked in: nls_utf8 vfat fat usb_storage libusual joydev binfmt_misc rfcomm hidp l2cap speedstep_lib cpufreq_userspace cpufreq_stats freq_table cpufreq_powersave cpufreq_ondemand cpufreq_conservative video tc1100_wmi sbs sony_acpi pcc_acpi i2c_ec hotkey dev_acpi button battery container ac asus_acpi ntfs ipv6 nls_cp437 cifs dm_mod md_mod fuse sr_mod sbp2 lp af_packet snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_ac97_codec snd_ac97_bus snd_util_mem snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_usb_audio snd_pcm_oss snd_mixer_oss snd_seq snd_pcm snd_page_alloc snd_usb_lib snd_rawmidi snd_hwdep tsdev nvidia snd_timer snd_seq_device hci_usb sg i2c_core bluetooth usbhid snd emu10k1_gp gameport soundcore r1000 r8169 shpchp floppy psmouse serio_raw pci_hotplug parport_pc parport hw_random intel_agp agpgart evdev pcspkr ext3 jbd ohci1394 ieee1394 ehci_hcd uhci_hcd usbcore ide_generic sd_mod ata_piix libata scs
Feb 16 00:27:21 legion kernel: _mod ide_cd cdrom piix generic thermal processor fan fbcon tileblit font bitblit softcursor vesafb capability commoncap
Feb 16 00:27:21 legion kernel: [17349761.120000] CPU: 1
Feb 16 00:27:21 legion kernel: [17349761.120000] EIP: 0060:[_spin_lock+0/16] Tainted: P VLI
Feb 16 00:27:21 legion kernel: [17349761.120000] EFLAGS: 00010202 (2.6.17-11-generic #2)
Feb 16 00:27:21 legion kernel: [17349761.120000] EIP is at _spin_lock+0x0/0x10
Feb 16 00:27:21 legion kernel: [17349761.120000] eax: 00000044 ebx: c7a0ce80 ecx: c0371594 edx: 00000001
Feb 16 00:27:21 legion kernel: [17349761.120000] esi: c7a0cf70 edi: 00000000 ebp: df84ff24 esp: df84ff08
Feb 16 00:27:21 legion kernel: [17349761.120000] ds: 007b es: 007b ss: 0068
Feb 16 00:27:21 legion kernel: [17349761.120000] Process kswapd0 (pid: 161, threadinfo=df84e000 task=dff61560)
Feb 16 00:27:21 legion kernel: [17349761.120000] Stack: c016e7e4 00000080 c7a0ce80 00000072 c0184d63 00000000 00000072 ce9602d0
Feb 16 00:27:21 legion kernel: [17349761.120000] cdd57aa0 00006aa4 00000081 dfffeaa0 000000d0 c0154c17 00013d21 0000664c
Feb 16 00:27:21 legion kernel: [17349761.120000] 00000000 00000008 00000000 00000000 00000080 00035473 00000000 00000003
Feb 16 00:27:21 legion kernel: [17349761.120000] Call Trace:
Feb 16 00:27:21 legion kernel: [17349761.120000] <c016e7e4> remove_inode_buffers+0x34/0x80 <c0184d63> shrink_icache_memory+0x1b3/0x240
Feb 16 00:27:21 legion kernel: [17349761.120000] <c0154c17> shrink_slab+0x117/0x180 <c01559f4> kswapd+0x354/0x440
Feb 16 00:27:21 legion kernel: [17349761.120000] <c0136180> autoremove_wake_function+0x0/0x50 <c01556a0> kswapd+0x0/0x440
Feb 16 00:27:21 legion kernel: [17349761.120000] <c0101005> kernel_thread_helper+0x5/0x10
Feb 16 00:27:21 legion kernel: [17349761.120000] Code: f0 fe 0a 79 1b a9 00 02 00 00 74 0b fb f3 90 80 3a 00 7e f9 fa eb e9 f3 90 80 3a 00 7f e2 eb f7 c3 8d 76 00 8d bc 27 00 00 00 00 <f0> fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 90 f0 81 28 00 00
Feb 16 00:27:21 legion kernel: [17349761.120000] EIP: [_spin_lock+0/16] _spin_lock+0x0/0x10 SS:ESP 0068:df84ff08
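
For readers decoding the trace: the number after "Oops:" is the x86 page-fault error code, printed in hexadecimal. The following is a minimal sketch, assuming the standard i386 layout of the low three bits (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name is made up for illustration and is not part of any kernel tool.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # Bit 0: set for a protection violation, clear when no page was mapped.
        "cause": "protection fault" if code & 0x1 else "page not present",
        # Bit 1: set for a write access, clear for a read.
        "access": "write" if code & 0x2 else "read",
        # Bit 2: set when the CPU was in user mode, clear for kernel mode.
        "mode": "user" if code & 0x4 else "kernel",
    }


# "Oops: 0002" from the log above.
info = decode_oops_error_code(int("0002", 16))
print(info)
# → {'cause': 'page not present', 'access': 'write', 'mode': 'kernel'}
```

Read this way, 0002 is a kernel-mode write to an unmapped page, which is consistent with the fault address 00000044 being the value in eax when _spin_lock was entered.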

Tags: kernel-oops
Cristian Aravena Romero (caravena) wrote :

Thanks for taking the time to report this bug. Unfortunately we can't fix it, because your description didn't include enough information.

Please include the following additional information, if you have not already done so (please pay attention to lspci's additional options), as required by the Ubuntu Kernel Team:
1. Please include the output of the command "uname -a" in your next response. It should be one, long line of text which includes the exact kernel version you're running, as well as the CPU architecture.
2. Please run the command "dmesg > dmesg.log" and attach the resulting file "dmesg.log" to this bug report.
3. Please run the command "lspci -vvnn > lspci-vvnn.log" and attach the resulting file "lspci-vvnn.log" to this bug report.
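
The three steps above can be gathered in one pass. This is only a sketch: the commands are the ones listed above, while the file name "uname-a.log", the `collect` helper and its `run` parameter are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# Commands copied from the list above, keyed by output file name.
# "uname-a.log" is an assumed name; the request only asks for the output.
COMMANDS = {
    "uname-a.log": ["uname", "-a"],
    "dmesg.log": ["dmesg"],
    "lspci-vvnn.log": ["lspci", "-vvnn"],
}


def collect(outdir, commands=COMMANDS, run=subprocess.run):
    """Run each command, save its stdout under outdir, return the paths written."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, argv in commands.items():
        try:
            result = run(argv, capture_output=True, text=True, check=False)
        except FileNotFoundError:
            # The tool is not installed on this machine; skip it.
            continue
        path = outdir / name
        path.write_text(result.stdout)
        written.append(path)
    return written
```

Calling `collect("kernel-bug-info")` would write the logs into a directory of that (arbitrary) name, ready to attach to the report one file at a time.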

For your reference, the full description of procedures for kernel-related bug reports is available here: <http://wiki.ubuntu.com/DebuggingKernelProblems> Thanks!

Changed in linux-source-2.6.17:
status: Unconfirmed → Needs Info
Alexandre Payment (alp) wrote :

I also got this problem with Feisty. Before Feisty this PC ran Dapper and was very stable, but since Feisty it crashes/freezes a lot.

kswapd0 is involved most of the time.

uname -a:
Linux mars 2.6.20-15-generic #2 SMP Sun Apr 15 07:36:31 UTC 2007 i686 GNU/Linux

Sitsofe Wheeler (sitsofe) wrote :

Glyph:
Do you get these crashes if you disable the nvidia binary drivers?

Changed in linux-source-2.6.20:
status: Unconfirmed → Needs Info
Alexandre Payment (alp) wrote :

Sitsofe:

For my part, I can't disable the nvidia binary driver because the xorg nv driver freezes my system after logging in.
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-nv/+bug/45797

Is using vesa the only way I can disable the nvidia binary driver?

Alexandre Payment (alp) wrote :

Maybe this is related: http://lkml.org/lkml/2007/1/13/29

Alexandre Payment (alp) wrote :

2 days with vesa and no Oops. So maybe the nvidia binary drivers are part of the problem.

I would like to try the nv driver, but because of the problem mentioned in bug 45797 I can't.

Are there any other drivers I can try besides vesa and nv?

Sitsofe Wheeler (sitsofe) wrote :

Alexandre:
Alas no. I'll have a think about bug #45797 but I don't have any truly good ideas other than playing around with vga= settings and disabling splash in grub.

Changed in linux-source-2.6.20:
status: Needs Info → Unconfirmed
Sitsofe Wheeler (sitsofe) wrote :

Alexandre:
I suspect your issue is different to the one reported in this bug. Please file a new bug report for your issue.

Sitsofe Wheeler (sitsofe) wrote :

Glyph:
Can you reproduce this problem in Feisty?

Changed in linux-source-2.6.20:
status: Unconfirmed → Needs Info
Glyph Lefkowitz (glyph) wrote :

Sitsofe:

Sadly, reproducing the problem is extremely difficult. The machine continues to crash at regular intervals under Feisty.

Disabling the nvidia drivers seems to increase the amount of time before the crash, but doesn't stop it. Given that I can't deterministically reproduce the problem, though, I have only the vague impression of the passage of time, and only a few incidents. Collecting data on a crash that happens only once every 2-3 days is extremely time-consuming.

I haven't yet had the opportunity to downgrade the machine to Dapper to test it with.

Thanks for your investigations, but I am prepared to write this off as some kind of obscure hardware failure. None of my other machines (many of them configured nearly identically) have had this problem.

Sitsofe Wheeler (sitsofe) wrote :

Glyph:
OK thanks for following up on this. Intermittent errors like this are often a sign of hardware failure. I would recommend reading http://people.redhat.com/davej/hardware-problems.txt and making your way down the checklist of tests (especially running memtest for a day). When you do please report back...

Glyph Lefkowitz (glyph) wrote :

Sitsofe:

No problem. I would really like to get to the bottom of it if there is in fact a software issue to help with here.

I've already done that checklist (among others) and there is nothing wrong with the machine that these tools can detect.

I am beginning to suspect that the problem may be bad *video* memory, which causes the graphics driver to interfere with the rest of the kernel (especially since the crash with the 'nv' driver still happens, but in a different way). Is there any tool to test for that specifically?

Sitsofe Wheeler (sitsofe) wrote :

Glyph:
None that I know of. For problems like that you wind up having to swap out individual pieces of hardware and see whether the problem goes away. If it's a software issue you need to find the sequence of events that usually triggers the problem...

Changed in linux-source-2.6.20:
status: Needs Info → Unconfirmed
Changed in linux-source-2.6.17:
status: Needs Info → Unconfirmed
Thomas Hallgren (thomas-h) wrote :

This seems very similar to my experience with feisty. I had been using feisty since the middle of January and I think the problem started around kernel version 2.6.20-9.

I have not seen the problem after switching back to edgy. I am currently using Fedora 7 test 4, and I don't get any random oopses or crashes here either. So I don't think it is a hardware problem.

I filed a bug report about a month ago:

https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/108953

Philip Aston (philipa) wrote :

I'm also getting a similar oops, about once a month.

Feisty, Linux version 2.6.20-16-generic (root@terranova) (gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)) #2 SMP Thu Jun 7 20:19:32 UTC 2007

I've run memtest86 with everything turned on, but failed to find any hardware memory problems.

Example oops:

Jun 16 10:06:29 paston01 kernel: [131511.196000] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
Jun 16 10:06:29 paston01 kernel: [131511.196000] printing eip:
Jun 16 10:06:29 paston01 kernel: [131511.196000] c02ee810
Jun 16 10:06:29 paston01 kernel: [131511.196000] *pde = 00000000
Jun 16 10:06:29 paston01 kernel: [131511.196000] Oops: 0002 [#1]
Jun 16 10:06:29 paston01 kernel: [131511.196000] SMP
Jun 16 10:06:29 paston01 kernel: [131511.196000] Modules linked in: battery ac thermal fan button tg3 ipw3945 ieee80211 ieee80211_crypt snd_rtctimer nls_iso8859_1 nls_cp437 vfat fat option ppp_deflate bsd_comp ppp_async ppp_generic slhc usbserial sr_mod cdrom usb_storage libusual ohci_hcd ipt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables x_tables binfmt_misc nlvcard mishim(P) ppdev rfcomm l2cap ipv6 af_packet acpi_cpufreq cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand freq_table cpufreq_conservative tc1100_wmi pcc_acpi dev_acpi sony_acpi video sbs i2c_ec dock container asus_acpi backlight deflate zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkcipher aes xcbc sha256 sha1 crypto_null af_key fuse parport_pc lp parport snd_hda_intel snd_hda_codec snd_pcm_oss snd_pcm snd_mixer_oss snd_seq_dummy snd_seq_oss joydev snd_seq_midi snd_rawmidi snd_seq_midi_event nvidia(P) hci_usb snd_seq snd_timer snd_seq_device irda bluetooth pcmci
Jun 16 10:06:29 paston01 kernel: i2c_core crc_ccitt iTCO_wdt iTCO_vendor_support snd soundcore psmouse serio_raw intel_agp yenta_socket rsrc_nonstatic pcmcia_core snd_page_alloc agpgart shpchp pci_hotplug evdev tsdev ext3 jbd mbcache sg sd_mod ata_piix ata_generic libata scsi_mod generic ehci_hcd uhci_hcd usbcore processor fbcon tileblit font bitblit softcursor vesafb capability commoncap
Jun 16 10:06:29 paston01 kernel: [131511.196000] CPU: 0
Jun 16 10:06:29 paston01 kernel: [131511.196000] EIP: 0060:[_spin_lock+0/16] Tainted: P VLI
Jun 16 10:06:29 paston01 kernel: [131511.196000] EFLAGS: 00010202 (2.6.20-16-generic #2)
Jun 16 10:06:29 paston01 kernel: [131511.196000] EIP is at _spin_lock+0x0/0x10
Jun 16 10:06:29 paston01 kernel: [131511.196000] eax: 00000044 ebx: c003fcc0 ecx: c03ab7b4 edx: 00000001
Jun 16 10:06:29 paston01 kernel: [131511.196000] esi: c003fdac edi: 00000000 ebp: c2b51ef0 esp: c2b51ec4
Jun 16 10:06:29 paston01 kernel: [131511.196000] ds: 007b es: 007b ss: 0068
Jun 16 10:06:29 paston01 kernel: [131511.196000] Process kswapd0 (pid: 211, ti=c2b50000 task=c2b11a90 task.ti=c2b50000)
Jun 16 10:06:29 paston01 kernel: [131511.196000] Stack: c0198dc4 c15745c0 c01ecc1d e774942c e774942c 00000080 c003fcc0 00000019
Jun 16 10:06:29 paston01 kernel: [131511.196000] c018a7ad 00000000 00000019 c003fb48 e882b688 00016cd8 00000082 dfffea80
Jun 16 10:06:29 paston01 kernel: ...
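
When several people report "similar" oopses, it can help to compare a few fields mechanically. A rough sketch, assuming the log layout shown in this report; `oops_signature` and its field names are hypothetical, and the sample lines are copied from the trace above with the syslog prefix shortened.

```python
import re


def oops_signature(log_text: str) -> dict:
    """Pull a few comparable fields out of a 2.6-era i386 Oops log."""
    patterns = {
        "fault_address": r"at virtual address ([0-9a-f]{8})",
        "error_code": r"Oops: ([0-9a-f]{4})",
        "eip": r"EIP is at (\S+)",
        "process": r"Process (\S+) \(pid",
    }
    signature = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, log_text)
        # Missing fields are recorded as None rather than raising.
        signature[field] = match.group(1) if match else None
    return signature


sample = """\
kernel: [131511.196000] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
kernel: [131511.196000] Oops: 0002 [#1]
kernel: [131511.196000] EIP is at _spin_lock+0x0/0x10
kernel: [131511.196000] Process kswapd0 (pid: 211, ti=c2b50000 task=c2b11a90 task.ti=c2b50000)
"""
print(oops_signature(sample))
```

The trace in the original bug description has the same four values (fault address 00000044, error code 0002, EIP at _spin_lock+0x0/0x10, process kswapd0), so the two reports would compare equal on these fields.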


Alexandre Payment (alp) wrote :

I was finally able to use the nv driver on the system where the problem occurs.
Sadly, the Oops still happens frequently.

Leann Ogasawara (leannogasawara) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Now that the 7.10 Gutsy Gibbon release of Ubuntu is out, we were wondering if you can still reproduce this issue. Could you please download and try the new version of Ubuntu from http://www.ubuntu.com/getubuntu/download and report back your results. If the issue is still present in the new release, please attach the following information:

* uname -a > uname-a.log
* cat /proc/version_signature > version.log
* dmesg > dmesg.log
* sudo lspci -vvnn > lspci-vvnn.log

Please be sure to attach each file as a separate attachment. For more information regarding the kernel team bug policy, please refer to https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks again and we appreciate your help and feedback.

Changed in linux-source-2.6.20:
status: New → Incomplete
wolfger (wolfger) wrote :

We are closing this bug report because it lacks the information we need to investigate the problem, as described in the previous comments. Please reopen it if you can give us the missing information, and don't hesitate to submit bug reports in the future. To reopen the bug report you can click on the current status, under the Status column, and change the Status back to "New". Thanks again!

Changed in linux-source-2.6.20:
status: Incomplete → Invalid
Andrew Starr-Bochicchio (andrewsomething) wrote :

The 18-month support period for Edgy Eft 6.10 has reached its end of life. As a result, we are closing the linux-source-2.6.17 Edgy Eft kernel task. However, Hardy Heron 8.04 was recently released. It would be helpful if you could test the new release and verify whether this is still an issue - http://www.ubuntu.com/getubuntu/download . If the issue still exists, please add the Hardy kernel "linux" task to the bug report. This can be done by clicking on the "Also affects distribution" link in the Actions area on the left-hand side of the bug report. Select "Ubuntu" as the Distribution and type in "linux" for the Source Package Name. Thanks.

Changed in linux-source-2.6.17:
status: New → Won't Fix