x201 intermittent crash on resume from suspend, with caps lock flashing

Bug #830355 reported by Martin Pool
This bug affects 5 people
Affects: powernap (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Andres Rodriguez
Milestone: (none)

Bug Description

This looks like a recent oneiric regression, perhaps in -8:

On my x201 thinkpad, about one time in four, the machine reboots when trying to come back from suspend.

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-8-generic 3.0.0-8.11
ProcVersionSignature: Ubuntu 3.0.0-8.11-generic 3.0.1
Uname: Linux 3.0.0-8-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: mbp 5099 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf2520000 irq 44'
   Mixer name : 'Intel IbexPeak HDMI'
   Components : 'HDA:14f15069,17aa2155,00100302 HDA:80862804,17aa21b5,00100000'
   Controls : 12
   Simple ctrls : 6
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 6QHT33WW-1.14'
   Mixer name : 'ThinkPad EC 6QHT33WW-1.14'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Sun Aug 21 13:18:07 2011
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=a37b4b37-bf4a-4554-bce8-23f96cd1cc72
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MachineType: LENOVO 3249CTO
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-8-generic root=UUID=8aff985d-377a-420d-a38e-62ce8bd54504 ro crashkernel=384M-2G:64M,2G-:128M quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-8-generic N/A
 linux-backports-modules-3.0.0-8-generic N/A
 linux-firmware 1.59
SourcePackage: linux
StagingDrivers: mei
UpgradeStatus: Upgraded to oneiric on 2011-08-18 (2 days ago)
dmi.bios.date: 05/31/2011
dmi.bios.vendor: LENOVO
dmi.bios.version: 6QET66WW (1.36 )
dmi.board.name: 3249CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6QET66WW(1.36):bd05/31/2011:svnLENOVO:pn3249CTO:pvrThinkPadX201:rvnLENOVO:rn3249CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 3249CTO
dmi.product.version: ThinkPad X201
dmi.sys.vendor: LENOVO

Martin Pool (mbp) wrote :
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Martin Pool (mbp) wrote :

I think it would suspend once, work, suspend again, crash, repeat. But it seems to be stable in 3.0.0-9.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Martin Pool (mbp) wrote :

This is consistently happening on the second suspend after rebooting in current oneiric.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Brad Figg (brad-figg) wrote : Test with newer development kernel (3.0.0-11.18)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel currently in the release pocket than the one you tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

Thank you for your help.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.0.0-11.18
Martin Pool (mbp) wrote : Re: intermittent crash on resume from suspend

I'm delighted to say this seems to be fixed in 3.0.0-11.18.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Martin Pool (mbp) wrote :

Sadly, no: it's intermittent, but it is still happening in current oneiric 3.0.0-11 #18.

summary: - intermittent crash on resume from suspend
+ x201 intermittent crash on resume from suspend, with caps lock flashing
Changed in linux (Ubuntu):
importance: Undecided → High
status: Fix Released → Confirmed
Brad Figg (brad-figg) wrote : Test with newer development kernel (3.0.0-12.19)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.0.0-12.19
Julian Edwards (julian-edwards) wrote :

Hi - this is happening for me also with the current oneiric kernel (3.0.0-12-generic). It happens about 70% of the time I resume from suspend.

I have a T410 though:
model name : Intel(R) Core(TM) i5 CPU M 540 @ 2.53GHz

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Brad Figg (brad-figg) wrote : Test with newer development kernel (3.0.0-12.20)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: kernel-bot-stop-nagging.

Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.0.0-12.20
Julian Edwards (julian-edwards) wrote :

Happens on 3.0.0-12.20

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Martin Pool (mbp) wrote :

Still happening in 3.0.0-13.22, with the variation that it now sometimes just reboots with no caps lock flashing.

Do you have any advice other than trying to bisect back through previous versions?

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . If possible, please test the latest kernel (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the others). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed by the mainline kernel, please add the tag 'kernel-fixed-upstream-KERNEL-VERSION'. For example, if kernel version 3.1-rc9 fixed the issue, the tag would be: 'kernel-fixed-upstream-v3.1-rc9'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Thanks in advance.
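
As an illustration only, the tagging convention described in this comment can be written down as a small helper. The function name `upstream_tag` is hypothetical and not part of any Launchpad tooling; the two tag formats are taken directly from the text above, including the "v" prefix shown in its example.

```python
def upstream_tag(fixed, version=None):
    """Return the bug tag to add after testing a mainline kernel.

    fixed   -- True if the mainline kernel no longer shows the bug
    version -- mainline version string such as "3.1-rc9" (needed when fixed)
    """
    if not fixed:
        # The mainline kernel still shows the bug.
        return "kernel-bug-exists-upstream"
    if not version:
        raise ValueError("a kernel version is required for a 'fixed' tag")
    # The example in the comment prefixes the version with "v".
    return f"kernel-fixed-upstream-v{version}"
```

For the case reported later in this thread (the hang still occurs on the mainline build), `upstream_tag(False)` gives the tag that was actually applied.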

tags: added: needs-upstream-testing
Martin Pool (mbp) wrote : Re: [Bug 830355] Re: x201 intermittent crash on resume from suspend, with caps lock flashing

The hang with caps lock flashing happens with 3.2.0-030200rc1. In addition
the ipw2200 wifi does not seem to work in that version.

  tags kernel-bug-exists-upstream -needs-upstream-testing

Martin Pool (mbp) wrote :

I don't know if this is related, but I noticed that sometimes this machine reports only one cpu in /proc/cpuinfo, and it has these dmesg entries:

Nov 30 13:57:23 joy kernel: [ 6.139336] CPU 1 is now offline
Nov 30 13:57:23 joy kernel: [ 6.154093] CPU 2 is now offline
Nov 30 13:57:23 joy kernel: [ 6.158264] init: plymouth-stop pre-start process (1941) terminated with status 1
Nov 30 13:57:23 joy kernel: [ 6.163038] CPU 3 MCA banks CMCI:2 CMCI:3 CMCI:5
Nov 30 13:57:23 joy kernel: [ 6.174432] CPU 3 is now offline
Nov 30 13:57:23 joy kernel: [ 6.174435] SMP alternatives: switching to UP code
Nov 30 13:57:23 joy kernel: [ 6.338643] e1000e 0000:00:19.0: eth0: Unsupported Speed/Duplex configuration
Nov 30 13:57:23 joy kernel: [ 6.597691] select_fallback_rq: 63 callbacks suppressed
Nov 30 13:57:23 joy kernel: [ 6.597697] process 1539 (beam.smp) no longer affine to cpu1

and then later

Nov 30 15:43:33 joy kernel: [ 2033.202296] ACPI: Preparing to enter system sleep state S3
Nov 30 15:43:33 joy kernel: [ 2033.393563] PM: Saving platform NVS memory
Nov 30 15:43:33 joy kernel: [ 2033.396880] Disabling non-boot CPUs ...
Nov 30 15:43:33 joy kernel: [ 2033.501243] CPU 1 is now offline
Nov 30 15:43:33 joy kernel: [ 2033.604986] CPU 2 is now offline
Nov 30 15:43:33 joy kernel: [ 2033.606097] Broke affinity for irq 9
Nov 30 15:43:33 joy kernel: [ 2033.606108] Broke affinity for irq 23
Nov 30 15:43:33 joy kernel: [ 2033.708716] CPU 3 is now offline
Nov 30 15:43:33 joy kernel: [ 2033.709079] Extended CMOS year: 2000
Nov 30 15:43:33 joy kernel: [ 2033.709320] ACPI: Low-level resume complete
Nov 30 15:43:33 joy kernel: [ 2033.709385] PM: Restoring platform NVS memory
Nov 30 15:43:33 joy kernel: [ 2033.710095] Extended CMOS year: 2000
Nov 30 15:43:33 joy kernel: [ 2033.710163] Enabling non-boot CPUs ...
Nov 30 15:43:33 joy kernel: [ 2033.710281] Booting Node 0 Processor 1 APIC 0x1
Nov 30 15:43:33 joy kernel: [ 2033.710282] smpboot cpu 1: start_ip = 99000
Nov 30 15:43:33 joy kernel: [ 2033.821442] Switched to NOHz mode on CPU #1
Nov 30 15:43:33 joy kernel: [ 2033.833551] CPU1 is up
Nov 30 15:43:33 joy kernel: [ 2033.833728] Booting Node 0 Processor 2 APIC 0x4
Nov 30 15:43:33 joy kernel: [ 2033.833732] smpboot cpu 2: start_ip = 99000
Nov 30 15:43:33 joy kernel: [ 2033.945082] Switched to NOHz mode on CPU #2
Nov 30 15:43:33 joy kernel: [ 2033.957239] CPU2 is up
Nov 30 15:43:33 joy kernel: [ 2033.957403] Booting Node 0 Processor 3 APIC 0x5
Nov 30 15:43:33 joy kernel: [ 2033.957407] smpboot cpu 3: start_ip = 99000
Nov 30 15:43:33 joy kernel: [ 2034.068806] Switched to NOHz mode on CPU #3
Nov 30 15:43:33 joy kernel: [ 2034.080975] CPU3 is up
Nov 30 15:43:33 joy kernel: [ 2034.083353] ACPI: Waking up from system sleep state S3

and then apparently when it tries to come back up:

Nov 30 15:43:34 joy kernel: [ 2035.922303] SMP alternatives: switching to UP code
Nov 30 15:43:34 joy kernel: [ 2035.984272] e1000e 0000:00:19.0: eth0: Unsupported Speed/Duplex configuration
Nov 30 15:43:34 joy kernel: [ 2036.081101] EXT4-fs (sda5): re-mounted. Opts: errors=remount-ro,commit=600
Nov 30 15:43:34 joy kernel: [ 2036.387203] e1000e 0000:00:19.0: irq 44 f...
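
Log excerpts like the ones in this comment can be checked mechanically for CPUs that went offline and never came back. The following is a minimal sketch, assuming the two message formats seen in these excerpts ("CPU N is now offline" and "CPUN is up"); other kernel versions may word these messages differently, and the function name `offline_cpus` is made up for illustration.

```python
import re

# Message formats as they appear in the dmesg excerpts in this report.
OFFLINE = re.compile(r"CPU\s*(\d+) is now offline")
ONLINE = re.compile(r"CPU\s*(\d+) is up")


def offline_cpus(log_lines):
    """Replay kernel log lines and return the CPUs left offline at the end."""
    offline = set()
    for line in log_lines:
        match = OFFLINE.search(line)
        if match:
            offline.add(int(match.group(1)))
            continue
        match = ONLINE.search(line)
        if match:
            offline.discard(int(match.group(1)))
    return sorted(offline)
```

Feeding it only the boot-time lines from the first excerpt yields CPUs 1, 2 and 3 as offline, which matches the single CPU reported in /proc/cpuinfo.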


Martin Pool (mbp) wrote :

This is still happening on precise from last week, with

Linux joy 3.2.0-5-generic #11-Ubuntu SMP Thu Dec 15 19:06:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Martin Pool (mbp) wrote :

Linux joy 3.2.0-7-generic #13-Ubuntu SMP Sat Dec 24 18:06:57 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

still has this problem, at least to the extent that after the first resume, all but one CPU core is gone.
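
One way to confirm the "all but one core gone" symptom without reading /proc/cpuinfo by eye is to compare the kernel's `present` and `online` CPU lists in sysfs. This is a rough sketch, assuming the standard `/sys/devices/system/cpu` layout; the helper names `parse_cpu_list` and `missing_cpus` are made up for this example.

```python
from pathlib import Path

CPU_SYSFS = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0,2-3" into [0, 2, 3]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs in this set
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def missing_cpus(sysfs=CPU_SYSFS):
    """Return CPUs that are present in the machine but not currently online."""
    sysfs = Path(sysfs)
    present = parse_cpu_list((sysfs / "present").read_text())
    online = set(parse_cpu_list((sysfs / "online").read_text()))
    return [cpu for cpu in present if cpu not in online]
```

On a healthy four-thread machine `missing_cpus()` should return an empty list; in the state described above it would be expected to list every CPU except the boot CPU.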

Martin Pool (mbp) wrote :

OK, thanks to Colin and apw, we localized this to powernap taking the CPUs offline. With powernap-common purged, the problem is fixed.
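
For anyone who cannot purge powernap-common right away, CPUs that were taken offline can usually be switched back on through the CPU hotplug files in sysfs. The sketch below is a workaround under assumptions, not the fix described in this comment: it assumes the standard `cpuN/online` control files, needs root to write to the real sysfs tree, and `bring_online` is a hypothetical helper name.

```python
from pathlib import Path

CPU_SYSFS = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def bring_online(cpu, sysfs=CPU_SYSFS):
    """Write "1" to cpuN/online; return True if the CPU was switched on."""
    control = Path(sysfs) / f"cpu{cpu}" / "online"
    if not control.exists():
        return False  # e.g. the boot CPU often has no hotplug control file
    if control.read_text().strip() == "1":
        return False  # already online, nothing to do
    control.write_text("1")
    return True
```

Run as root, a loop such as `for cpu in (1, 2, 3): bring_online(cpu)` would cover the three CPUs that the logs in this report show going offline.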

affects: linux (Ubuntu) → powernap (Ubuntu)
Julian Edwards (julian-edwards) wrote :

\o/

Thank you thank you thank you.

Changed in powernap (Ubuntu):
assignee: nobody → Andres Rodriguez (andreserl)
Soenke (s0enke) wrote :

I had the same problem (11.11), but with powernap NOT installed. Installing cpufrequtils fixed the problem for me. Now all CPUs are restarted properly on resume. I found this solution here: https://answers.launchpad.net/ubuntu/+source/acpi/+question/155520
