kernel oops after resuming from suspend to RAM

Bug #129226 reported by David Weinehall on 2007-07-30
This bug affects 1 person
Affects: linux-source-2.6.22 (Ubuntu)
Importance: High
Assigned to: Amit Kucheria

Bug Description

Using the gutsy kernel linux-image-2.6.22-9-generic_2.6.22-9.19_i386.deb, I got an oops on resume from suspend to RAM; the previous kernel version, linux-image-2.6.22-8-generic_2.6.22-8.18_i386.deb, works fine.

Hopefully the relevant part of the oops-dump (the rest can be had on request):

[ 1575.972000] PM: Finishing wakeup.
[ 1575.972000] Enabling non-boot CPUs ...
[ 1575.984000] SMP alternatives: switching to SMP code
[ 1575.984000] Booting processor 1/1 eip 3000
[ 1575.992000] Initializing CPU#1
[ 1575.992000] BUG: unable to handle kernel paging request at virtual address 80a722f5
[ 1575.992000] printing eip:
[ 1575.992000] 1fffd068
[ 1575.992000] *pde = 00000000
[ 1575.996000] Oops: 0000 [#1]
[ 1575.996000] SMP
[ 1575.996000] Modules linked in: sha256 aes cbc blkcipher cpufreq_stats binfmt_misc uinput ppdev parport_pc lp parport ipv6 microcode af_packet ext2 dm_crypt dm_snapshot dm_mirror dm_mod button ac battery mmc_block dock nvram thinkpad_acpi rfcomm l2cap bluetooth acpi_cpufreq cpufreq_ondemand freq_table i915 drm sbp2 loop pcmcia snd_hda_intel yenta_socket snd_pcm_oss rsrc_nonstatic snd_mixer_oss pcmcia_core snd_pcm sdhci snd_timer iTCO_wdt iTCO_vendor_support mmc_core pcspkr psmouse serio_raw i2c_i801 i2c_core snd ipw3945 intel_agp agpgart ieee80211 ieee80211_crypt soundcore snd_page_alloc evdev ext3 jbd mbcache sd_mod ata_piix ohci1394 ieee1394 ahci ata_generic libata scsi_mod ehci_hcd e1000 uhci_hcd usbcore thermal processor fan capability commoncap
[ 1575.996000] CPU: 1
[ 1575.996000] EIP: 0060:[<1fffd068>] Not tainted VLI
[ 1575.996000] EFLAGS: 00010002 (2.6.22-9-generic #1)
[ 1575.996000] EIP is at 0x1fffd068
[ 1575.996000] eax: c043c388 ebx: 00000005 ecx: fffff000 edx: 1fffd067
[ 1575.996000] esi: c043c388 edi: c0271f7d ebp: 00000001 esp: df9b9f34
[ 1575.996000] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
[ 1575.996000] Process swapper (pid: 0, ti=df9b8000 task=df9c2a40 task.ti=df9b8000)
[ 1575.996000] Stack: 00000001 01000000 00000000 c01018e1 00000280 00000320 c01180df 00000001
[ 1575.996000] 10002073 c0438000 ffffffff c010b60d 00050014 c0371e28 00000001 c10089c0
[ 1575.996000] 017cd000 00000001 01000000 00000000 00000001 c0116e45 00000001 00000001
[ 1575.996000] Call Trace:
[ 1575.996000] [<c01018e1>] calibrate_delay+0x11/0x730
[ 1575.996000] [<c01180df>] setup_local_APIC+0x28f/0x2a0
[ 1575.996000] [<c010b60d>] cpu_init+0x19d/0x250
[ 1575.996000] [<c0116e45>] start_secondary+0xb5/0x380
[ 1575.996000] [<c0116ae9>] cpu_exit_clear+0x19/0x40
[ 1575.996000] [<f884c731>] acpi_processor_idle+0x0/0x41f [processor]
[ 1575.996000] [<c010246c>] cpu_idle+0xac/0xe0
[ 1575.996000] =======================
[ 1575.996000] Code: Bad EIP value.
[ 1575.996000] EIP: [<1fffd068>] 0x1fffd068 SS:ESP 0068:df9b9f34
[ 1575.996000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 1580.968000] Stuck ??
[ 1580.968000] Inquiring remote APIC #1...
[ 1580.968000] ... APIC #1 ID: failed
[ 1580.968000] ... APIC #1 VERSION: failed
[ 1580.972000] ... APIC #1 SPIV: failed
[ 1580.972000] skipping cpu1, didn't come online
[ 1580.972000] Error taking CPU1 up: -5
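
For anyone comparing traces between machines, a small filter can pull just the function names out of a saved log. This is a minimal sketch, assuming the i386 oops layout shown above (a "Call Trace:" line, entries of the form symbol+0xOFF/0xLEN, and a closing "=====" line) and a POSIX awk; the function name oops_trace_symbols is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print the symbol names from the "Call Trace:" part of an oops.
# Reads kernel log text on stdin. oops_trace_symbols is a hypothetical name.
oops_trace_symbols() {
    awk '
        /Call Trace:/ { in_trace = 1; next }   # trace starts after this line
        /=====/       { in_trace = 0 }         # separator line ends the trace
        in_trace {
            for (i = 1; i <= NF; i++)
                # keep fields that look like symbol+0xOFFSET/0xLENGTH
                if ($i ~ /^[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+\/0x[0-9a-f]+$/) {
                    sub(/\+0x.*/, "", $i)      # drop the offset, keep the name
                    print $i
                }
        }
    '
}
```

Usage would be something like `dmesg | oops_trace_symbols`; the resulting lists can then be compared with diff.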

Amit Kucheria (amitk) wrote :

David,

Could you try compiling the kernel with commit 7b312811ded64be5c9be09743ca019caba44d72a reverted?

Amit Kucheria (amitk) wrote :

Never mind that comment. I was mistaken; that commit is -rt only.

Ante Karamatić (ivoks) wrote :

Happens to me too. Additional info: after resume, the Caps Lock LED starts blinking and the dual-core processor becomes single-core; one core dies.

Changed in linux-source-2.6.22:
status: New → Confirmed
Ante Karamatić (ivoks) wrote :

And here is the oops:

[ 122.992000] Initializing CPU#1
[ 122.996000] BUG: unable to handle kernel paging request at virtual address fffff07a
[ 122.996000] printing eip:
[ 122.996000] 1fffd06c
[ 122.996000] *pde = 00004067
[ 122.996000] *pte = 00000000
[ 122.996000] Oops: 0000 [#1]
[ 122.996000] SMP
[ 122.996000] Modules linked in: af_packet binfmt_misc rfcomm l2cap bluetooth ipv6 i915 drm acpi_cpufreq cpufreq_ondemand cpufreq_stats freq_table cpufreq_powersave cpufreq_conservative cpufreq_userspace toshiba_acpi battery container sbs button bay ac video dock fuse loop sbp2 lp parport snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi pcmcia tifm_7xx1 tifm_core snd_seq_midi_event snd_seq joydev sdhci mmc_core snd_timer snd_seq_device yenta_socket rsrc_nonstatic pcmcia_core snd intel_agp agpgart serio_raw soundcore snd_page_alloc psmouse pcspkr iTCO_wdt iTCO_vendor_support shpchp pci_hotplug evdev ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_piix ohci1394 ieee1394 ehci_hcd ata_generic libata scsi_mod uhci_hcd usbcore dm_mirror dm_snapshot dm_mod thermal processor fan capability commoncap
[ 122.996000] CPU: 1
[ 122.996000] EIP: 0060:[phys_startup_32+535810156/-1073741824] Not tainted VLI
[ 122.996000] EFLAGS: 00010086 (2.6.22-9-generic #1)
[ 122.996000] EIP is at 0x1fffd06c
[ 122.996000] eax: c043c388 ebx: 00000005 ecx: fffff000 edx: 1fffd067
[ 122.996000] esi: c043c388 edi: 00000001 ebp: 00000001 esp: df839f30
[ 122.996000] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
[ 122.996000] Process swapper (pid: 0, ti=df838000 task=df830a40 task.ti=df838000)
[ 122.996000] Stack: c0271fdd 00000001 01000000 00000000 c01018e1 00000280 00000320 c01180df
[ 122.996000] 00000001 f0002073 c0438000 ffffffff c010b60d 00050014 c0371e40 00000001
[ 122.996000] c100897f 013cb000 00000001 01000000 00000000 00000001 c0116e45 00000001
[ 122.996000] Call Trace:
[ 122.996000] [dmi_check_system+77/112] dmi_check_system+0x4d/0x70
[ 122.996000] [calibrate_delay+17/1840] calibrate_delay+0x11/0x730
[ 122.996000] [setup_local_APIC+655/672] setup_local_APIC+0x28f/0x2a0
[ 122.996000] [cpu_init+413/592] cpu_init+0x19d/0x250
[ 122.996000] [start_secondary+181/896] start_secondary+0xb5/0x380
[ 122.996000] [cpu_exit_clear+25/64] cpu_exit_clear+0x19/0x40
[ 122.996000] [<f8831731>] acpi_processor_idle+0x0/0x41f [processor]
[ 122.996000] [cpu_idle+213/224] cpu_idle+0xd5/0xe0
[ 122.996000] =======================
[ 122.996000] Code: Bad EIP value.
[ 122.996000] EIP: [phys_startup_32+535810156/-1073741824] 0x1fffd06c SS:ESP 0068:df839f30
[ 122.996000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 128.192000] Stuck ??
[ 128.192000] Inquiring remote APIC #1...
[ 128.192000] ... APIC #1 ID: failed
[ 128.192000] ... APIC #1 VERSION: failed
[ 128.192000] ... APIC #1 SPIV: failed
[ 128.196000] skipping cpu1, didn't come online

Stéphane Graber (stgraber) wrote :

I've got something pretty similar on 2.6.22-9.20 amd64, but the oops isn't exactly the same:

Jul 31 16:14:39 laptop kernel: [ 0.192191] Initializing CPU#1
Jul 31 16:14:39 laptop kernel: [ 0.192964] Unable to handle kernel paging request at ffffffff805ac788 RIP:
Jul 31 16:14:39 laptop kernel: [ 0.192966] [dmi_check_system+7/128] dmi_check_system+0x7/0x80
Jul 31 16:14:39 laptop kernel: [ 0.192974] PGD 203067 PUD 205063 PMD 37be6163 PTE 5ac000
Jul 31 16:14:39 laptop kernel: [ 0.192978] Oops: 0000 [1] SMP
Jul 31 16:14:39 laptop kernel: [ 0.192981] CPU 1
Jul 31 16:14:39 laptop kernel: [ 0.192982] Modules linked in: rfcomm l2cap capability i915 drm ipv6 acpi_cpufreq cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_stats freq_table cpufreq_userspace container button video ac dock sbs battery sbp2 parport_pc lp parport fuse snd_hda_intel snd_pcm_oss snd_mixer_oss pcmcia joydev snd_pcm hci_usb snd_seq_dummy bluetooth snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device iTCO_wdt iTCO_vendor_support af_packet yenta_socket pcspkr snd soundcore rsrc_nonstatic pcmcia_core ieee80211 ieee80211_crypt psmouse snd_page_alloc serio_raw shpchp pci_hotplug intel_agp evdev ext3 jbd mbcache sr_mod cdrom sg sd_mod ata_piix ohci1394 ahci ieee1394 ata_generic libata scsi_mod ehci_hcd uhci_hcd usbcore thermal processor fan apparmor commoncap aamatch_pcre
Jul 31 16:14:39 laptop kernel: [ 0.193024] Pid: 0, comm: swapper Not tainted 2.6.22-9-generic #1
Jul 31 16:14:39 laptop kernel: [ 0.193026] RIP: 0010:[dmi_check_system+7/128] [dmi_check_system+7/128] dmi_check_system+0x7/0x80
Jul 31 16:14:39 laptop kernel: [ 0.193031] RSP: 0018:ffff81007c5b3e48 EFLAGS: 00010246
Jul 31 16:14:39 laptop kernel: [ 0.193033] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000ffffffff
Jul 31 16:14:39 laptop kernel: [ 0.193035] RDX: ffff810080a4e000 RSI: 0000000000000000 RDI: ffffffff805ac780
Jul 31 16:14:39 laptop kernel: [ 0.193037] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff81007c5b0320
Jul 31 16:14:39 laptop kernel: [ 0.193039] R10: 000000000004f6ae R11: 0000000000000001 R12: 0000000000000000
Jul 31 16:14:39 laptop kernel: [ 0.193041] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jul 31 16:14:39 laptop kernel: [ 0.193044] FS: 0000000000000000(0000) GS:ffff810037918280(0000) knlGS:0000000000000000
Jul 31 16:14:39 laptop kernel: [ 0.193046] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Jul 31 16:14:39 laptop kernel: [ 0.193048] CR2: ffffffff805ac788 CR3: 0000000000201000 CR4: 00000000000006a0
Jul 31 16:14:39 laptop kernel: [ 0.193051] Process swapper (pid: 0, threadinfo ffff81007c5b2000, task ffff81007c5b0000)
Jul 31 16:14:39 laptop kernel: [ 0.193052] Stack: 0000000000000001 0000000000000000 0000000000000000 ffffffff802075da
Jul 31 16:14:39 laptop kernel: [ 0.193057] ffff81007f610a10 ffffffff8028b0fd ffff810001019000 ffffffff80210532
Jul 31 16:14:39 laptop kernel: [ 0.193060] 0000000000000000 ffffffff802105a6 0000000000000001 0000000000000096
Jul 31 16:14:39 laptop kernel: [ 0.193064] Call Trace:
Jul 31 16:14:39 laptop ke...

schnee007 (ls-ps-webhosting) wrote :

Hi!

Same here since the kernel upgrade last night (Linux leela 2.6.22-9-generic #1 SMP Mon Jul 30 18:00:27 GMT 2007 i686 GNU/Linux) on Kubuntu Gutsy Tribe 3. Suspend worked before (2.6.22-8); now it resumes with both the Caps Lock and Num Lock LEDs blinking. The system is then usable for a few minutes before it crashes completely.

root@leela:/home/schnee# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU T5500 @ 1.66GHz
stepping : 6
cpu MHz : 1000.000
cache size : 2048 KB

Sony Vaio VGN-C2

[ 1382.328000] PM: Finishing wakeup.
[ 1382.328000] Enabling non-boot CPUs ...
[ 1382.340000] SMP alternatives: switching to SMP code
[ 1382.340000] Booting processor 1/1 eip 3000
[ 1382.348000] Initializing CPU#1
[ 1382.348000] BUG: unable to handle kernel paging request at virtual address 81632dc7
[ 1382.348000] printing eip:
[ 1382.348000] 1fffd06c
[ 1382.348000] *pde = 00000000
[ 1382.348000] Oops: 0000 [#1]
[ 1382.348000] SMP
[ 1382.348000] Modules linked in: i915 drm tun binfmt_misc rfcomm l2cap sonypi ipv6 acpi_cpufreq cpufreq_powersave cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_ondemand freq_table sbs container dock battery video button ac sony_acpi sbp2 parport_pc lp parport af_packet fuse pcmcia snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi hci_usb yenta_socket rsrc_nonstatic joydev bluetooth pcmcia_core snd_rawmidi tifm_7xx1 tifm_core snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore intel_agp psmouse serio_raw snd_page_alloc iTCO_wdt iTCO_vendor_support agpgart shpchp pci_hotplug evdev ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic ohci1394 ieee1394 ehci_hcd ata_piix libata scsi_mod uhci_hcd usbcore thermal processor fan capability commoncap
[ 1382.348000] CPU: 1
[ 1382.348000] EIP: 0060:[phys_startup_32+535810156/-1073741824] Not tainted VLI
[ 1382.348000] EFLAGS: 00010082 (2.6.22-9-generic #1)
[ 1382.348000] EIP is at 0x1fffd06c
[ 1382.348000] eax: c043c388 ebx: 00000005 ecx: fffff000 edx: 1fffd067
[ 1382.348000] esi: c043c388 edi: 00000001 ebp: 00000001 esp: df819f30
[ 1382.348000] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
[ 1382.348000] Process swapper (pid: 0, ti=df818000 task=c18e2a40 task.ti=df818000)
[ 1382.348000] Stack: c0271fdd 00000001 01000000 00000000 c01018e1 00000280 00000320 c01180df
[ 1382.348000] 00000001 f0002073 c0438000 ffffffff c010b60d 00050014 c0371e40 00000001
leela ovpn-client[5127]: event_wait : Interrupted system call (code=4)
leela hcid[5359]: HCI dev 0 down
[ 1382.348000] c100897f 013cb000 00000001 01000000 00000000 00000001 c0116e45 00000001
leela ovpn-client[5127]: write UDPv4 []: Network is unreachable (code=101)
leela hcid[5359]: Stopping security manager 0
[ 1382.348000] Call Trace:
leela hcid[5359]: Device hci0 has been disabled
[ 1382.348000] [dmi_check_system+77/112] dmi_check_system+0x4d/0x70
[ 1382.348000] [calibrate_delay+17/1840] calibrate_delay+0x11/0x730
[ 1382.348000] [setup_local_APIC+655/672] setup_local_APIC+0x28f/0x2a0
[ 1382.348000] [cpu_init+413/592] c...

Mikael Gerdin (mgerdin) wrote :

I've just experienced the same bug upon resuming from hibernate (s2disk) using linux-image-2.6.22-9-generic version 2.6.22-9.20 on i386.
I also saw the Caps Lock and Scroll Lock LEDs blinking.
However, I have failed to reproduce the error: I tried hibernating and resuming several times, but nothing out of the ordinary happened.
I'll attach my kernel log in case it helps.
The machine is a Dell XPS M1210 with a Core 2 Duo T7200 CPU.

schnee007 (ls-ps-webhosting) wrote :

Just noticed, like Ante, that after resuming, the dual-core processor becomes single-core before the system crashes completely.

Amit Kucheria (amitk) on 2007-08-02
Changed in linux-source-2.6.22:
assignee: nobody → amitk
importance: Undecided → High
Brian Rogers (brian-rogers) wrote :

Another way to trigger the bug is to take the second CPU offline, then turn it back on:
sudo bash
cd /sys/devices/system/cpu/cpu1
echo 0 > online
echo 1 > online
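
The steps above can be wrapped in a small function for repeated testing. This is only a sketch, assuming the sysfs CPU hotplug layout used in those steps; the name cycle_cpu and its optional second argument (an alternative base directory, handy for dry runs) are made up for illustration. On a real system it has to run as root.

```shell
#!/bin/sh
# Sketch only: take a CPU offline and bring it back, as in the steps above.
# cycle_cpu and its optional base-directory argument are hypothetical names.
cycle_cpu() {
    cpu="$1"
    base="${2:-/sys/devices/system/cpu}"
    ctl="$base/cpu$cpu/online"

    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (not root, or CPU not hot-pluggable)" >&2
        return 1
    fi

    echo 0 > "$ctl" || return 1   # take the CPU offline
    echo 1 > "$ctl" || return 1   # bring it back; affected kernels oops here
    cat "$ctl"                    # final state: 1 means the CPU is online
}
```

Running `cycle_cpu 1` and then checking dmesg should show whether the oops from this report was triggered.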

Erik Meitner (e.meitner) wrote :

I also have this problem on a Thinkpad T60. Symptom is a blinking caps-lock light. System remains usable on CPU 0. After the oops, doing as Brian Rogers suggests above brings back CPU 1 without problem.

See attached log.

Everyone seems hung up on the blinking caps lock led; that's a red
herring. Caps lock led blinking on oops is normal behaviour. It's the
oops that isn't normal behaviour =)

Regards: David

David: That's why I mentioned that it's a symptom. I thought it significant because I had never had that happen before this bug.

Also, more info on my hardware: https://wiki.ubuntu.com/LaptopTestingTeam/ThinkpadT60-5BU

I am also seeing this problem, and additionally the main keyboard and mouse lock up. Plugging in external USB devices lets me work around that and bring the machine to a clean shutdown. The Caps Lock LED gets switched on but doesn't blink; I think LEDs blinking on oops is a kernel build option anyway.

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
stepping : 10
cpu MHz : 2201.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4393.44
clflush size : 64

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz
stepping : 10
cpu MHz : 2201.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4388.83
clflush size : 64

Snippet of kern.log:
Aug 6 19:21:22 aporo2 kernel: [32311.760000] Disabling non-boot CPUs ...
Aug 6 19:21:22 aporo2 kernel: [32311.876000] CPU 1 is now offline
Aug 6 19:21:22 aporo2 kernel: [32311.876000] SMP alternatives: switching to UP code
Aug 6 19:21:22 aporo2 kernel: [32311.876000] CPU1 is down
Aug 6 19:21:22 aporo2 kernel: [32311.876000] PM: Entering mem sleep
Aug 6 19:21:22 aporo2 kernel: [32311.876000] Back to C!
Aug 6 19:21:22 aporo2 kernel: [32311.876000] PM: Finishing wakeup.
Aug 6 19:21:22 aporo2 kernel: [32311.876000] Enabling non-boot CPUs ...
Aug 6 19:21:22 aporo2 kernel: [32311.888000] SMP alternatives: switching to SMP code
Aug 6 19:21:22 aporo2 kernel: [32311.888000] Booting processor 1/1 eip 3000
Aug 6 19:21:22 aporo2 kernel: [32311.896000] Initializing CPU#1
Aug 6 19:21:22 aporo2 kernel: [32311.896000] BUG: unable to handle kernel paging request at virtual address 6a494846
Aug 6 19:21:22 aporo2 kernel: [32311.896000] printing eip:
Aug 6 19:21:22 aporo2 kernel: [32311.896000] c01fd3cb
Aug 6 19:21:22 aporo2 kernel: [32311.896000] *pde = 00000000
Aug 6 19:21:22 aporo2 kernel: [32311.896000] Oops: 0000 [#1]
Aug 6 19:21:22 aporo2 kernel: [32311.896000] SMP
Aug 6 19:21:22 aporo2 kernel: [32311.896000] Modules linked in: michael_mic arc4 ecb blkcipher ieee80211_crypt_tkip binfmt_misc capability ipv6 acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave cpufreq_stats freq_table container button dock sbs battery ac video sbp2 lp snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq pcmcia snd_timer snd_seq_device ieee80211_crypt snd soundcore xpad af_packe...

Amit Kucheria (amitk) wrote :

Reverted a patch that was responsible for this. David confirmed that this works for him.

Amit Kucheria (amitk) wrote :

Patch reverted

Changed in linux-source-2.6.22:
status: Confirmed → Fix Committed
Martin Pitt (pitti) wrote :

Too late for a full kernel upload now, unless this is an unintrusive one and does not break ABI.

vlowther (victor-lowther) wrote :

For the benefit of those of us who would like to be able to use both cores of their dual-core processor and have (suspend|hibernate) and resume functionality at the same time, could a pointer to the patch which fixes this issue be provided? For me, at least, it is either that or run with the Feisty kernel until Tribe 5.

You could download the kernel tree and compile it yourself.

$ git clone git://kernel.ubuntu.com/ubuntu/ubuntu-gutsy.git
$ cd ubuntu-gutsy
$ fakeroot debian/rules binary-generic
                               ^^^^^^^
                               Change this to the correct architecture;
                               this one is for i386.
$ cd ..; dpkg -i *.deb

On 8/9/07, vlowther <email address hidden> wrote:
> For the benefit of those of us who would like to be able to use both
> cores of their dual-core processor and have (suspend|hibernate) and
> resume functionality at the same time, could a pointer to the patch
> which fixes this issue be provided? for me, at least, it is either that
> or run with the Feisty kernel until Tribe 5.

Peter Clifton (pcjc2) wrote :

Thanks for the pointer to the git repo.

Whilst I can happily track changes to source packages in Ubuntu, rebuild them, check Launchpad, etc., it's often a bit of a mystery where the cutting-edge work goes on behind the scenes. As the developers' bugzilla posts are probably trying to be user-friendly, there never seems to be a lot of linking to the underlying technical stuff.

For the curious like me, could you point me to which commit you reverted to fix the issue? I only see:

http://kernel.ubuntu.com/git?p=zul/ubuntu-gutsy.git;a=commitdiff;h=8444584e11e4c80a1bde56152db8bb615c242bc1

There didn't appear to be an explanation in the log why it was reverted though. Was that the one, just config changes?

If so, doesn't this mean there is still some underlying kernel bug which is just set off by a particular config?

vlowther (victor-lowther) wrote :

It is actually http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-gutsy.git;a=commit;h=98f537ca1bc7218f299831c31224027892093ab6 which is causing the glitch. Based on what the patch reverts (and this is only my WAG), it would come as no surprise if the DMI information is not precisely sane after a suspend/hibernate event in certain BIOSes, so that some odd poking has to be done to get it back and/or you have to wait for the information to stabilize.

I can confirm this behaviour with Asus V6J too.

Amit Kucheria (amitk) wrote :

@Peter - In most cases users don't want to know the 'underlying stuff' - they just want their HW to work. :)

OTOH, interested users always ask, and in most cases I dare say, receive somewhat satisfactory answers. In this case, the post after yours answers your question.

But I agree, that some more details wouldn't do any harm - I will try to do that in the future.

Dandy (dandydagenius) wrote :

Hi! I have run into a pretty similar problem after suspend with Gutsy Tribe 4 AMD64, I have a Dual CPU Xeon system and after the error messages:

-Kernel panic - not syncing: Attempted to kill the idle task
-Oops: 0000 SMP
-CR2:ffffffff805ac78

my system dies. I have to switch it off and on to get it running again.

Michael Plump (plumpy) wrote :

Shouldn't the status of this get changed from "Fix Committed" since the patch was reverted? I just want to make sure this doesn't get forgotten, but I'm not entirely sure how the procedures are supposed to work...

st33med (st33med) wrote :

I have a Dell E1505 and have Ubuntu 7.04 with the Gutsy kernel compiled via script. I can confirm this bug as well.

Amit Kucheria (amitk) wrote :

@Michael - Reverting a patch fixes this bug. So in that sense, the Fix has been committed. Confusing.. I know :)

Is the fix already released? Because I have a fully up-to-date Gutsy and I got an oops just 10 minutes ago when going out of suspend.

Raphael Slinckx (kikidonk) wrote :

It's still happening for me with a fully up-to-date Gutsy. The only kernel update happened 3-4 days ago and didn't contain the fix, it seems.

Matthew Haughton (snafu109) wrote :

A new kernel has been released as source, but there's no binary yet. At the moment I'm still on 2.6.22-9.25 (and everyone else would be too at this point), which doesn't contain the fixes. I believe the kernel with the fix is version 2.6.22-10.26. It'll be available soon.

Keep in mind I've never tracked the Ubuntu kernel before so I may be completely wrong :)

Amit Kucheria (amitk) wrote :

Make sure you are running 2.6.22-10.26. It has just finished building @ https://launchpad.net/ubuntu/+source/linux-source-2.6.22/

uname -a will show your current version.
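
To check this from a script rather than by eye, something like the following could work. It is a sketch under assumptions: it only understands release strings of the form 2.6.22-N-flavour (as printed by uname -r on these kernels) and treats ABI 10 or later as containing the fix, per the version mentioned above. The helper name has_oops_fix is hypothetical.

```shell
#!/bin/sh
# Sketch only: is a 2.6.22-series Ubuntu kernel release at ABI 10 or later?
# has_oops_fix is a hypothetical helper name.
# Returns 0 = ABI >= 10, 1 = older ABI, 2 = release string not understood.
has_oops_fix() {
    rel="$1"
    case "$rel" in
        2.6.22-*) ;;
        *) return 2 ;;            # not a 2.6.22 Ubuntu kernel: no verdict
    esac
    abi="${rel#2.6.22-}"          # e.g. "10-generic"
    abi="${abi%%[!0-9]*}"         # keep the leading digits, e.g. "10"
    [ -n "$abi" ] || return 2     # no ABI number found
    [ "$abi" -ge 10 ]
}
```

For example, `has_oops_fix "$(uname -r)" && echo "fix should be present"`.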

Amit Kucheria (amitk) wrote :

It hasn't finished building yet. Hold on for a day or so.

Matthew Haughton (snafu109) wrote :

New kernel (2.6.22-10-generic) fixed it for me on a Dell Inspiron 1420 (Core 2 Duo). No flashing keyboard lights, cat /proc/cpuinfo shows both cores, and they both also appear in System Monitor.

Brian Murray (brian-murray) wrote :

The new version of the kernel is now available in most repositories. This fixed it for me also, so I am marking this bug as Fix Released. In the event that anyone still has any issues, feel free to reopen the bug.

Changed in linux-source-2.6.22:
status: Fix Committed → Fix Released
pauls (paulatgm) wrote :

This fixed the resume problem of 1 core not resuming, but created a different problem with the sound not resuming. If anyone else now has sound not resuming after upgrading to this kernel, I've started a new bug at 134167.

Dandy (dandydagenius) wrote :

With the new kernel I do not get the oops but it still dies the exact same way as with the oops error message and now I get the sound error too... Should I start a new bug report?

For me it looks like the fix fixed it, I will reopen a bug in case of troubles. Thanks! (ThinkPad X61s)

My Dell XPS M1330 also enables only one core after suspend. I have an up-to-date Gutsy with kernel 2.6.22-12.
Here is my /var/log/messages:

Sep 27 02:40:06 kingslanding kernel: [ 0.458656] Enabling non-boot CPUs ...
Sep 27 02:40:06 kingslanding kernel: [ 0.459285] atkbd.c: Unknown key pressed (translated set 2, code 0x88 on isa0060/serio0).
Sep 27 02:40:06 kingslanding kernel: [ 0.459291] atkbd.c: Use 'setkeycodes e008 <keycode>' to make it known.
Sep 27 02:40:06 kingslanding kernel: [ 0.459634] atkbd.c: Unknown key pressed (translated set 2, code 0x88 on isa0060/serio0).
Sep 27 02:40:06 kingslanding kernel: [ 0.459640] atkbd.c: Use 'setkeycodes e008 <keycode>' to make it known.
Sep 27 02:40:06 kingslanding kernel: [ 0.459934] SMP alternatives: switching to SMP code
Sep 27 02:40:06 kingslanding kernel: [ 0.460182] Booting processor 1/2 APIC 0x1
Sep 27 02:40:06 kingslanding kernel: [ 0.471233] Initializing CPU#1
Sep 27 02:40:06 kingslanding kernel: [ 0.547518] Calibrating delay using timer specific routine.. 4387.96 BogoMIPS (lpj=8775936)
Sep 27 02:40:06 kingslanding kernel: [ 0.547532] CPU: L1 I cache: 32K, L1 D cache: 32K
Sep 27 02:40:06 kingslanding kernel: [ 0.547537] CPU: L2 cache: 4096K
Sep 27 02:40:06 kingslanding kernel: [ 0.547542] CPU 1/1 -> Node 0
Sep 27 02:40:06 kingslanding kernel: [ 0.547546] CPU: Physical Processor ID: 0
Sep 27 02:40:06 kingslanding kernel: [ 0.547549] CPU: Processor Core ID: 1
Sep 27 02:40:06 kingslanding kernel: [ 0.548720] Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping 0a
Sep 27 02:40:06 kingslanding kernel: [ 0.551845] ACPI Error (psloop-0136): Found unknown opcode FE at AML address ffffc20000ab23ca offset 7, ignoring [20070126]
Sep 27 02:40:06 kingslanding kernel: [ 0.551857] ACPI Error (psloop-0136): Found unknown opcode 16 at AML address ffffc20000ab23cc offset 9, ignoring [20070126]
Sep 27 02:40:06 kingslanding kernel: [ 0.551871] ACPI Error (psloop-0136): Found unknown opcode 19 at AML address ffffc20000ab23d0 offset D, ignoring [20070126]
Sep 27 02:40:06 kingslanding kernel: [ 0.551888] ACPI Error (psargs-0355): [C\376\377^I] Namespace lookup failure, AE_NOT_FOUND
Sep 27 02:40:06 kingslanding kernel: [ 0.551899] ACPI Error (psparse-0551): Method parse/execution failed [\_PR_.CPU1._PCT] (Node ffff81007cfa3d60), AE_NOT_FOUND
Sep 27 02:40:06 kingslanding kernel: [ 0.551983] ACPI Exception (processor_perflib-0170): AE_NOT_FOUND, Evaluating _PCT [20070126]
Sep 27 02:40:06 kingslanding kernel: [ 0.551989] CPU1 is up
Sep 27 02:40:06 kingslanding kernel: [ 0.562877] PCI: Enabling device 0000:00:1a.0 (0000 -> 0001)
Sep 27 02:40:06 kingslanding kernel: [ 0.562881] ACPI: PCI Interrupt 0000:00:1a.0[A] -> GSI 20 (level, low) -> IRQ 20

I get a kernel panic on 2.6.22-13-generic when resuming from hibernation.
Unfortunately nothing is logged.

I'm seeing the same issue on my Latitude D830 with a stock 2.6.23 kernel I compiled myself. This may be a bug in the kernel itself?

Also, I don't think it's truly fixed in 2.6.22-14; core 1 still disappeared after resume.

I don't think it's truly fixed.

Changed in linux-source-2.6.22:
status: Fix Released → Confirmed
tk23 (masq-fischlustig) wrote :

It's not really fixed. Speedstepping the second core fails after suspend, but it does not crash.

The log shows:
[ 4754.680000] CPU: Processor Core ID: 1
[ 4754.680000] CPU1: Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz stepping 0a
[ 4754.680000] ACPI Error (psloop-0136): Found unknown opcode FE at AML address f88523ca offset 3, ignoring [20070126]
[ 4754.680000] ACPI Error (psargs-0355): [C\376\377\376.D\376\377] Namespace lookup failure, AE_NOT_FOUND
[ 4754.680000] ACPI Error (psparse-0551): Method parse/execution failed [\_PR_.CPU1._PCT] (Node df816870), AE_NOT_FOUND
[ 4754.680000] ACPI Exception (processor_perflib-0170): AE_NOT_FOUND, Evaluating _PCT [20070126]
[ 4754.680000] CPU1 is up
[ 4754.684000] Switched to high resolution mode on CPU 1
[ 4754.724000] ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16

$ cat /proc/cpuinfo |grep ...
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
cpu MHz : 800.000

cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
cpu MHz : 1995.106

$ sudo cpufreq-info
cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to <email address hidden>, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 0
  hardware limits: 800 MHz - 2.00 GHz
  available frequency steps: 2.00 GHz, 2.00 GHz, 1.60 GHz, 1.20 GHz, 800 MHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 800 MHz and 2.00 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz (asserted by call to hardware).
analyzing CPU 1:
  no or unknown cpufreq driver is active on this CPU

I repeat: The second core does *not* crash - it is available with full speed, only frequency scaling fails.

If you need some more information, please let me know.
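
The same condition that cpufreq-info reports for CPU 1 above can be spotted from sysfs. This is a rough sketch, assuming the usual layout where a CPU with a bound driver has a cpufreq/scaling_driver file under its sysfs directory; the function name cpus_without_cpufreq and its optional base-directory argument are made up for illustration.

```shell
#!/bin/sh
# Sketch only: list CPUs that have no cpufreq driver bound.
# cpus_without_cpufreq and its optional argument are hypothetical names.
cpus_without_cpufreq() {
    base="${1:-/sys/devices/system/cpu}"
    for dir in "$base"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        # A bound driver exposes cpufreq/scaling_driver under the CPU dir.
        if [ ! -r "$dir/cpufreq/scaling_driver" ]; then
            echo "${dir##*/}"
        fi
    done
}
```

Running `cpus_without_cpufreq` after a resume would be expected to print cpu1 on an affected machine and nothing otherwise. An offline CPU may also be listed, so it is worth checking that the core is actually online first.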

On Wednesday 21 November 2007 16:10:56 tk23 wrote:
>
> I repeat: The second core does *not* crash - it is available with full
> speed, only frequency scaling fails.

This seems to be the case in 2.6.23.8 from kernel.org, too.

A quick-and-dirty fix is reloading the kernel module "acpi_cpufreq":
$ modprobe -rv acpi_cpufreq
$ modprobe -v acpi_cpufreq

On Wednesday 21 November 2007 16:52:39 tk23 wrote:
> A quick-and-dirty fix is reloading the kernel module "acpi_cpufreq":
> $ modprobe -rv acpi_cpufreq
> $ modprobe -v acpi_cpufreq

That doesn't seem to help in my case:

root@buznote:~# modprobe -r acpi_cpufreq && modprobe -v acpi_cpufreq
root@buznote:~# cpufreq-info
cpufrequtils 002: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to <email address hidden>, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 0
  hardware limits: 800 MHz - 2.00 GHz
  available frequency steps: 2.00 GHz, 2.00 GHz, 1.60 GHz, 1.20 GHz, 800 MHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 800 MHz and 2.00 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz (asserted by call to hardware).
analyzing CPU 1:
  no or unknown cpufreq driver is active on this CPU

anonym (launch-mailinator) wrote :

Hi all. My notebook, a Dell 1420, produces the same logs as tk23's.

$ modprobe -rv acpi_cpufreq
$ modprobe -v acpi_cpufreq

solves the problem.

The 2nd core doesn't die after resume; it's running without frequency scaling.

Btw, I'm using openSUSE 10.3. Only suspend to RAM produces the problem; suspend to disk does not.

Hi All,

For those of you experiencing new or slightly different issues, please open a new bug report. It is helpful to the kernel team if bug reports target a specific issue against a specific set of hardware. The original kernel Oops issue appears to have been resolved so I am flipping back to "Fix Released". Also, please be sure to test with the latest Hardy Alpha release. The Hardy Heron Alpha series contains an updated version of the kernel. You can download and try the new Hardy Heron Alpha release from http://cdimage.ubuntu.com/releases/hardy/ . You should be able to then test the new kernel via the LiveCD. General information regarding the release can also be found here: http://www.ubuntu.com/testing/ . Thanks.

Changed in linux-source-2.6.22:
status: Confirmed → Fix Released