general protection fault: 0000 during resume

Bug #328440 reported by Matt Zimmerman
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

This happened while I was trying to diagnose bug 328035 by starting up a second X server and then executing a suspend/resume cycle.

[26481.877123] PM: Finishing wakeup.
[26481.877124] Restarting tasks ... done.
[26482.238410] general protection fault: 0000 [#1] SMP
[26482.238415] last sysfs file: /sys/devices/virtual/vc/vcsa63/dev
[26482.238418] Dumping ftrace buffer:
[26482.238420] (ftrace buffer empty)
[26482.238422] CPU 0
[26482.238423] Modules linked in: isofs udf crc_itu_t iwlagn btusb aes_x86_64 aes_generic sierra hfs lirc_atiusb lirc_dev ati_remote nls_iso8859_1 nls_cp437 vfat fat i915 drm bridge stp bnep kvm_intel kvm tun acpi_cpufreq input_polldev sbp2 ppdev parport_pc lp parport joydev snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss arc4 snd_seq_midi ecb thinkpad_acpi snd_rawmidi snd_seq_midi_event nvram snd_seq iwlcore snd_timer pcmcia snd_seq_device led_class psmouse snd mac80211 serio_raw pcspkr sdhci_pci soundcore yenta_socket rsrc_nonstatic pcmcia_core ricoh_mmc sdhci iTCO_wdt iTCO_vendor_support snd_page_alloc cfg80211 video output intel_agp usb_storage ohci1394 ieee1394 ehci_hcd e1000e uhci_hcd fbcon tileblit font bitblit softcursor fuse [last unloaded: iwlagn]
[26482.238467] Pid: 29831, comm: Xorg Not tainted 2.6.28-7-generic #20-Ubuntu
[26482.238469] RIP: 0010:[<ffffffff802b564e>] [<ffffffff802b564e>] set_page_dirty+0x2e/0xe0
[26482.238476] RSP: 0018:ffff8800af547ae8 EFLAGS: 00010286
[26482.238478] RAX: ffff018000002124 RBX: ffffe200015afff8 RCX: 8000000063249067
[26482.238479] RDX: ffffe20000000000 RSI: 0000000000000001 RDI: ffffe200015afff8
[26482.238481] RBP: ffff8800af547b08 R08: 00000000fffffff5 R09: ffff8800af547ca8
[26482.238483] R10: 0000000000000002 R11: 0000000000000000 R12: ffff880001022840
[26482.238484] R13: ffff8800af547ca8 R14: 00000000073b0000 R15: ffff8800b302ad80
[26482.238486] FS: 0000000000000000(0000) GS:ffffffff80a7d000(0000) knlGS:0000000000000000
[26482.238488] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[26482.238490] CR2: 00007f51a00e7300 CR3: 0000000000201000 CR4: 00000000000026a0
[26482.238491] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[26482.238493] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[26482.238495] Process Xorg (pid: 29831, threadinfo ffff8800af546000, task ffff8800b4015980)
[26482.238496] Stack:
[26482.238497] ffff8800af547b08 ffffe200015afff8 ffff880001022840 ffff8800af547ca8
[26482.238500] ffff8800af547b98 ffffffff802bfbf8 ffff8800af547b38 ffffe2000139b540
[26482.238503] ffffe20002728930 8000000063249067 ffffe20002728940 0000000007400000
[26482.238507] Call Trace:
[26482.238508] [<ffffffff802bfbf8>] zap_pte_range+0x3a8/0x420
[26482.238512] [<ffffffff802c0cea>] unmap_page_range+0x2da/0x360
[26482.238514] [<ffffffff802c156f>] unmap_vmas+0x16f/0x2a0
[26482.238517] [<ffffffff802c62ad>] exit_mmap+0xad/0x180
[26482.238519] [<ffffffff8024ba48>] mmput+0x38/0xd0
[26482.238523] [<ffffffff8024fe86>] exit_mm+0x116/0x150
[26482.238525] [<ffffffff80681841>] ? _spin_lock_irq+0x11/0x20
[26482.238530] [<ffffffff80251e3c>] do_exit+0x16c/0x3b0
[26482.238532] [<ffffffff802520c2>] do_group_exit+0x42/0xc0
[26482.238535] [<ffffffff8025d6dc>] get_signal_to_deliver+0x1ac/0x3a0
[26482.238539] [<ffffffff80212625>] ? sysret_signal+0x3d/0x67
[26482.238542] [<ffffffff80212230>] do_signal+0x70/0x1e0
[26482.238544] [<ffffffff802e5d20>] ? __fput+0x170/0x1f0
[26482.238547] [<ffffffff80681841>] ? _spin_lock_irq+0x11/0x20
[26482.238550] [<ffffffff80259ff1>] ? sigprocmask+0x91/0x110
[26482.238552] [<ffffffff8025a5e2>] ? sys_rt_sigprocmask+0x82/0x120
[26482.238555] [<ffffffff80681841>] ? _spin_lock_irq+0x11/0x20
[26482.238557] [<ffffffff80212625>] ? sysret_signal+0x3d/0x67
[26482.238559] [<ffffffff802123dd>] do_notify_resume+0x3d/0x40
[26482.238561] [<ffffffff802129c7>] ptregscall_common+0x67/0xb0
[26482.238564] Code: e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 66 83 3f 00 48 8b 47 18 0f 88 a7 00 00 00 a8 01 75 37 48 85 c0 74 32 <48> 8b 40 58 48 c7 c2 c0 9e 30 80 48 89 df 48 8b 40 20 48 85 c0
[26482.238590] RIP [<ffffffff802b564e>] set_page_dirty+0x2e/0xe0
[26482.238593] RSP <ffff8800af547ae8>
[26482.238596] ---[ end trace 6522d7efc1fbcd39 ]---
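
A note on reading the register dump above: the instruction marked <48> in the Code: line, 48 8b 40 58, decodes to mov 0x58(%rax),%rax, and the RAX value it dereferences, ffff018000002124, does not look like a valid kernel pointer. On x86-64 with 48-bit virtual addresses, bits 63 through 47 must all be equal (a "canonical" address), and a load through a non-canonical address raises a general protection fault rather than a page fault. The Python sketch below checks this. is_canonical is a hypothetical helper written for illustration, and the 48-bit width is an assumption about this CPU.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)                # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


rax = 0xFFFF018000002124  # RAX from the oops above (the pointer being dereferenced)
rdi = 0xFFFFE200015AFFF8  # RDI from the oops above (first argument to set_page_dirty)

print(is_canonical(rax))  # False
print(is_canonical(rdi))  # True
```

This only shows why the CPU raised the fault, not how the pointer came to hold that value.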

ProblemType: Bug
Architecture: amd64
DistroRelease: Ubuntu 9.04
Package: linux-image-2.6.28-7-generic 2.6.28-7.20
ProcCmdLine: root=UUID=305dde78-d20a-4248-aaf4-09447b7c5791 ro quiet splash
ProcEnviron:
 LC_COLLATE=C
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
ProcVersionSignature: Ubuntu 2.6.28-7.20-generic
SourcePackage: linux

Revision history for this message
Matt Zimmerman (mdz) wrote :

After the fault, the system was unusable (X never came back up and the text consoles were corrupted) and had to be rebooted.

Changed in linux:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Andy Whitcroft (apw) wrote :

@Matt -- could you attach an X log file for this machine, /var/log/Xorg.0.log so we can see which extensions it is loading. Thanks.

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 328440] Re: general protection fault: 0000 during resume

On Tue, Feb 17, 2009 at 11:06:57AM -0000, Andy Whitcroft wrote:
> @Matt -- could you attach an X log file for this machine,
> /var/log/Xorg.0.log so we can see which extensions it is loading.
> Thanks.

Here's a recent one I attached to another bug:

http://launchpadlibrarian.net/21903422/XorgLog.txt

The extensions are:

(II) Loading extension MIT-SCREEN-SAVER
(II) Loading extension XFree86-VidModeExtension
(II) Loading extension XFree86-DGA
(II) Loading extension DPMS
(II) Loading extension XVideo
(II) Loading extension XVideo-MotionCompensation
(II) Loading extension X-Resource
(II) Loading extension DOUBLE-BUFFER
(II) Loading extension GLX
(II) Loading extension RECORD
(II) Loading extension XFree86-DRI
(II) Loading extension DRI2
(II) Initializing built-in extension Generic Event Extension
(II) Initializing built-in extension SHAPE
(II) Initializing built-in extension MIT-SHM
(II) Initializing built-in extension XInputExtension
(II) Initializing built-in extension XTEST
(II) Initializing built-in extension BIG-REQUESTS
(II) Initializing built-in extension SYNC
(II) Initializing built-in extension XKEYBOARD
(II) Initializing built-in extension XC-MISC
(II) Initializing built-in extension SECURITY
(II) Initializing built-in extension XINERAMA
(II) Initializing built-in extension XFIXES
(II) Initializing built-in extension RENDER
(II) Initializing built-in extension RANDR
(II) Initializing built-in extension COMPOSITE
(II) Initializing built-in extension DAMAGE

--
 - mdz

Revision history for this message
Matt Zimmerman (mdz) wrote :

I just saw this happen again. I had (apparently successfully) resumed from S3 and was starting to work in my session (I started a download of updates) when I noticed that mutt segfaulted. dmesg showed a series of oopses starting with the one below. Only this one was successfully logged, and the system quickly became unusable and had to be power cycled.

Feb 18 09:26:32 perseus kernel: [86097.605776] general protection fault: 0000 [#1] SMP
Feb 18 09:26:32 perseus kernel: [86097.605789] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
Feb 18 09:26:32 perseus kernel: [86097.605795] Dumping ftrace buffer:
Feb 18 09:26:32 perseus kernel: [86097.605801] (ftrace buffer empty)
Feb 18 09:26:32 perseus kernel: [86097.605804] CPU 1
Feb 18 09:26:32 perseus kernel: [86097.605809] Modules linked in: i915 drm bridge stp bnep kvm_intel kvm tun acpi_cpufreq input_polldev sbp2 ppdev parport_pc lp parport joydev snd_hda_intel arc4 snd_pcm_oss ecb snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss iwlagn snd_seq_midi pcmcia lirc_atiusb snd_rawmidi snd_seq_midi_event snd_seq thinkpad_acpi snd_timer iwlcore lirc_dev snd_seq_device led_class mac80211 nvram video psmouse yenta_socket rsrc_nonstatic pcmcia_core ricoh_mmc sdhci_pci sdhci output iTCO_wdt snd soundcore serio_raw intel_agp iTCO_vendor_support snd_page_alloc ati_remote pcspkr cfg80211 ohci1394 ieee1394 ehci_hcd uhci_hcd e1000e fbcon tileblit font bitblit softcursor fuse
Feb 18 09:26:32 perseus kernel: [86097.605922] Pid: 26970, comm: mutt Not tainted 2.6.28-7-generic #20-Ubuntu
Feb 18 09:26:32 perseus kernel: [86097.605927] RIP: 0010:[<ffffffff8031254d>] [<ffffffff8031254d>] do_mpage_readpage+0x3d/0x5e0
Feb 18 09:26:32 perseus kernel: [86097.605944] RSP: 0018:ffff88000f299b48 EFLAGS: 00010296
Feb 18 09:26:32 perseus kernel: [86097.605948] RAX: 03e9c04008000008 RBX: ffffe200015b0228 RCX: ffff88000f299ca0
Feb 18 09:26:32 perseus kernel: [86097.605953] RDX: 0000000000000001 RSI: ffffe200015b0228 RDI: 0000000000000000
Feb 18 09:26:32 perseus kernel: [86097.605958] RBP: ffff88000f299c08 R08: ffff88000f299c28 R09: ffff88000f299c98
Feb 18 09:26:32 perseus kernel: [86097.605963] R10: ffffe200015b0228 R11: 0000000000000001 R12: ffff88006690a2d8
Feb 18 09:26:32 perseus kernel: [86097.605968] R13: ffff88000f299c28 R14: 0000000000000002 R15: 0000000000000005
Feb 18 09:26:32 perseus kernel: [86097.605974] FS: 00007fc3099b4710(0000) GS:ffff8800be003c80(0000) knlGS:0000000000000000
Feb 18 09:26:32 perseus kernel: [86097.605979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 18 09:26:32 perseus kernel: [86097.605984] CR2: 00000000026ea000 CR3: 00000000501f0000 CR4: 00000000000026a0
Feb 18 09:26:32 perseus kernel: [86097.605989] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 18 09:26:32 perseus kernel: [86097.605994] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 18 09:26:32 perseus kernel: [86097.606000] Process mutt (pid: 26970, threadinfo ffff88000f298000, task ffff88007ad5acc0)
Feb 18 09:26:32 perseus kernel: [86097.606004] Stack:
Feb 18 09:26:32 perseus kernel: [86097.606008] ffff88000f299c98 ffff88000f299ca0 000000018023b06f ffffe200...


Steve Beattie (sbeattie)
Changed in linux:
assignee: nobody → canonical-kernel-team
Revision history for this message
Rocko (rockorequin) wrote :

I get something like this frequently while doing a shutdown cycle with the 2.6.28 kernel, including up to 2.6.28-11-generic#37 amd64.

The symptoms are:

1. I choose restart or shutdown.

2. The usplash screen shows and shutdown seems to progress normally.

3. Near the end, the screen changes and either goes blank or it shows the end of a fault message.

4. If I control-alt-delete, it continues shutdown and restarts.

The log might show something like the attached.

I used to think the nvidia module was responsible, because the fault always seemed to occur if I did a suspend/resume cycle and then tried to reboot, but not if I rebooted without a suspend/resume cycle first. I then added nvidia to the list of modules to remove in /etc/pm/config.d/modules, and for a while this seemed to fix things. It no longer works, though.

Currently, it does it with all of the nvidia 180.37, 180.41, and 185.13 drivers.
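
For reference, a sketch of the kind of change described above. It assumes the file uses the pm-utils SUSPEND_MODULES variable, which the comment does not state explicitly. add_suspend_module is a hypothetical helper that only edits text in memory; it does not touch /etc/pm/config.d/modules itself.

```python
import re

# Assumed format: a single line of the form SUSPEND_MODULES="mod1 mod2"
SETTING = re.compile(r'^SUSPEND_MODULES="([^"]*)"$', re.MULTILINE)


def add_suspend_module(config_text, module):
    """Return config_text with `module` listed in SUSPEND_MODULES."""
    match = SETTING.search(config_text)
    if match is None:
        # No existing setting: append a new line, adding a newline first if needed.
        sep = "" if config_text.endswith("\n") or not config_text else "\n"
        return f'{config_text}{sep}SUSPEND_MODULES="{module}"\n'
    modules = match.group(1).split()
    if module not in modules:
        modules.append(module)
    new_line = 'SUSPEND_MODULES="' + " ".join(modules) + '"'
    return SETTING.sub(lambda _m: new_line, config_text, count=1)


print(add_suspend_module("", "nvidia"), end="")
# SUSPEND_MODULES="nvidia"
```

As reported above, this workaround stopped helping, so it should not be read as a fix.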

Revision history for this message
Rocko (rockorequin) wrote :

Here's Xorg.0.log.

By the way, suspend/resume usually does work properly on this PC, although X sometimes restarts on resume. It's the shutdown that is crashing. It would be a big problem if I were trying to reboot remotely.

Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: canonical-kernel-team → stefan-bader-canonical
Steve Beattie (sbeattie)
tags: added: regression-release
removed: regression-potential
Revision history for this message
Ben Prescott (ben.prescott) wrote :

I couldn't find many hits at all for this issue.

I'm not getting much apart from the dump; there is no 'general protection fault' line. But since I upgraded a machine to 9.04 and piled on a bunch of upgrades and additions, every shutdown has gone the same way: the GNOME session ends, the screens go blank, the system beeps, and then it hangs. I can't get to a virtual terminal. Numlock still responded, so I concluded it was a UI problem.

I can ssh in from another system and interact in a limited fashion; 'ps' hangs. init 0 does nothing; poweroff -f worked the first time, but not the second.

I'm running the nvidia driver offered by Ubuntu.

 2.6.28-11-generic #42-Ubuntu SMP

first time ..

May 31 21:57:46 akira kernel: [ 2820.596735] NFSD: starting 90-second grace period
May 31 22:11:22 akira -- MARK --
May 31 22:21:23 akira kernel: [ 4237.473900] Dumping ftrace buffer:
May 31 22:21:23 akira kernel: [ 4237.473940] (ftrace buffer empty)
May 31 22:21:23 akira kernel: [ 4237.473978] CPU 3
May 31 22:21:23 akira kernel: [ 4237.474048] Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc binfmt_misc bridge stp bnep video output input_polldev lp snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss tvaudio snd_seq_midi snd_rawmidi bttv snd_seq_midi_event ir_common compat_ioctl32 videodev v4l1_compat i2c_algo_bit snd_seq snd_timer snd_seq_device v4l2_common videobuf_dma_sg iTCO_wdt videobuf_core snd iTCO_vendor_support btcx_risc psmouse soundcore ppdev tveeprom pcspkr serio_raw snd_page_alloc e752x_edac shpchp parport_pc parport edac_core nvidia(P) reiserfs usbhid ohci1394 e1000 ieee1394 mptspi mptscsih mptbase scsi_transport_spi floppy raid10 raid456 async_xor async_memcpy async_tx xor raid1 raid0 multipath linear fbcon tileblit font bitblit softcursor
May 31 22:21:23 akira kernel: [ 4237.476914] Pid: 11255, comm: usplash Tainted: P 2.6.28-11-generic #42-Ubuntu
May 31 22:21:23 akira kernel: [ 4237.476957] RIP: 0010:[<ffffffff80239290>] [<ffffffff80239290>] reserve_memtype+0x3f0/0x640
May 31 22:21:23 akira kernel: [ 4237.477040] RSP: 0018:ffff88012c903d18 EFLAGS: 00010286
May 31 22:21:23 akira kernel: [ 4237.477080] RAX: 0000000010000000 RBX: 000ffffffffffff0 RCX: ffff88012c903d78
May 31 22:21:23 akira kernel: [ 4237.477123] RDX: ffffffffffffffff RSI: 000000000fff0000 RDI: ffffffffffff0000
May 31 22:21:23 akira kernel: [ 4237.477171] RBP: ffff88012c903d58 R08: 00000000000000fb R09: 0000000000000000
May 31 22:21:23 akira kernel: [ 4237.477217] R10: ffff8801138fc1f8 R11: 00000000000000a8 R12: ffffffffffff0000
May 31 22:21:23 akira kernel: [ 4237.477265] R13: ffffffffffff0000 R14: ffffffffffffffff R15: ffff88012c903d78
May 31 22:21:23 akira kernel: [ 4237.477313] FS: 00007f4578ff46f0(0000) GS:ffff88012f286180(0000) knlGS:0000000000000000
May 31 22:21:23 akira kernel: [ 4237.477368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 31 22:21:23 akira kernel: [ 4237.477414] CR2: 00007f4578b8c4af CR3: 000000011d508000 CR4: 00000000000006a0
May 31 22:21:23 akira kernel: [ 4237.477462] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 31 22:21:23 akira kernel: [ 4237.477503] DR3: 00000000000...


Revision history for this message
Ben Prescott (ben.prescott) wrote :

PS: I appreciate that I've matched my issue to this one on limited evidence: the system is still up, Nvidia, an attempted power cycle, the same Ubuntu release, and an error with a similar shape to mine. I've not seen one of these traces before and am clueless about interpreting it. I guess the kernel spat it out, but I wouldn't know how to find documentation for it. If you can see I'm in the wrong place, please help me understand why!
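
For anyone else trying to read one of these traces, the three most useful pieces are usually the process name (comm:), the function named on the RIP: line (where the fault happened), and the Call Trace: (how execution got there). Entries prefixed with "?" are addresses found on the stack that the kernel is not sure belong to the call chain. A minimal Python sketch that pulls these out, assuming the line layout seen in the logs in this report; summarize_oops and the regular expressions are illustrative, not part of any kernel tool:

```python
import re

# "[<address>] symbol+0xoffset/0xsize", optionally with "? " before the symbol.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
COMM = re.compile(r"Pid: (\d+), comm: (\S+)")


def summarize_oops(text):
    """Return the process name, faulting function, and call-trace symbols."""
    summary = {"comm": None, "rip": None, "trace": [], "unreliable": []}
    in_trace = False
    for line in text.splitlines():
        comm = COMM.search(line)
        if comm:
            summary["comm"] = comm.group(2)
        frame = FRAME.search(line)
        if "Call Trace:" in line:
            in_trace = True
        elif "RIP" in line and frame and summary["rip"] is None:
            summary["rip"] = frame.group(2)
        elif in_trace and frame:
            key = "unreliable" if frame.group(1) else "trace"
            summary[key].append(frame.group(2))
        elif in_trace:
            in_trace = False  # first non-frame line (e.g. "Code:") ends the trace
    return summary


sample = """\
[26482.238467] Pid: 29831, comm: Xorg Not tainted 2.6.28-7-generic #20-Ubuntu
[26482.238469] RIP: 0010:[<ffffffff802b564e>] [<ffffffff802b564e>] set_page_dirty+0x2e/0xe0
[26482.238507] Call Trace:
[26482.238508] [<ffffffff802bfbf8>] zap_pte_range+0x3a8/0x420
[26482.238512] [<ffffffff802c0cea>] unmap_page_range+0x2da/0x360
[26482.238525] [<ffffffff80681841>] ? _spin_lock_irq+0x11/0x20
[26482.238530] [<ffffffff80251e3c>] do_exit+0x16c/0x3b0
[26482.238564] Code: e5 48 83 ec 20
"""

result = summarize_oops(sample)
print(result["comm"], result["rip"], result["trace"])
```

The sample lines are copied from the first trace in this report. Run on the excerpt in the comment above, the same function would report usplash as the process and reserve_memtype as the faulting function.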

Revision history for this message
Ben Prescott (ben.prescott) wrote :

"usplash", I deduce, refers to the process that triggered the issue, so I think this is the wrong thread for it.

For the benefit of anyone who finds their way here with the same issue as me, the tactical fix appears to be to remove the pretty boot-up sequence (and configure the console for my screen resolution): remove 'quiet' and 'splash' from the kernel options and replace them with, say, vga=0x307.

about /boot/grub/menu.lst changes: http://ubuntuforums.org/showthread.php?t=542082
about vga=: http://tinyurl.com/5ff36

This doesn't fix the underlying issue, whether in usplash or elsewhere.
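
To make the workaround concrete, here is a small Python sketch of the before and after kernel command line. rewrite_cmdline is a hypothetical helper that only manipulates a string; the real change is made by hand in /boot/grub/menu.lst as described in the forum thread linked above. The sample command line is the ProcCmdLine from the original report, used purely for illustration.

```python
def rewrite_cmdline(cmdline, drop=("quiet", "splash"), add=("vga=0x307",)):
    """Remove the `drop` options from a kernel command line and append `add`."""
    params = [p for p in cmdline.split() if p not in drop]
    params += [p for p in add if p not in params]
    return " ".join(params)


before = "root=UUID=305dde78-d20a-4248-aaf4-09447b7c5791 ro quiet splash"
after = rewrite_cmdline(before)
print(after)
# root=UUID=305dde78-d20a-4248-aaf4-09447b7c5791 ro vga=0x307
```

The vga= value depends on the display; 0x307 is just the example given in the comment.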

Revision history for this message
Brad Figg (brad-figg) wrote :

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in supported series, please file a new bug.

Changed in linux (Ubuntu):
assignee: Stefan Bader (stefan-bader-canonical) → nobody
status: Triaged → Won't Fix