[LENOVO 4313CTO] suspend/resume failure

Bug #1079534 reported by Anders Kaseorg on 2012-11-16
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

On my ThinkPad T510, raring’s kernel 3.7.0-2-generic crashes on suspend, leaving a black screen with the power light blinking. (It works in quantal’s kernel 3.5.)

[Rereporting from kernel 3.7, to try to get more relevant information from apport.]

ProblemType: KernelOops
DistroRelease: Ubuntu 13.04
Package: linux-image-3.7.0-2-generic 3.7.0-2.8
ProcVersionSignature: Ubuntu 3.7.0-2.8-generic 3.7.0-rc5
Uname: Linux 3.7.0-2-generic x86_64
Annotation: This occured during a previous suspend and prevented it from resuming properly.
ApportVersion: 2.6.2-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: anders 2757 F.... pulseaudio
Date: Fri Nov 16 00:00:26 2012
EcryptfsInUse: Yes
ExecutablePath: /usr/share/apport/apportcheckresume
Failure: suspend/resume
HibernationDevice: RESUME=UUID=fd305e7c-c58c-4061-8105-5cda63c38849
InstallationDate: Installed on 2010-12-05 (711 days ago)
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Alpha amd64 (20101202)
InterpreterPath: /usr/bin/python3.3
MachineType: LENOVO 4313CTO
MarkForUpload: True
ProcCmdline: /usr/bin/python3 /usr/share/apport/apportcheckresume
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.7.0-2-generic root=/dev/mapper/fdisk-ubuntu ro nmi_watchdog=0 crashkernel=384M-2G:64M,2G-:128M quiet splash vt.handoff=7
PulseList:
 Error: command ['pacmd', 'list'] failed with exit code 1: Home directory /home/anders not ours.
 No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.7.0-2-generic N/A
 linux-backports-modules-3.7.0-2-generic N/A
 linux-firmware 1.97
SourcePackage: linux
Title: [LENOVO 4313CTO] suspend/resume failure
UpgradeStatus: Upgraded to raring on 2012-11-15 (0 days ago)
UserGroups:

dmi.bios.date: 06/05/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 6MET91WW (1.51 )
dmi.board.name: 4313CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6MET91WW(1.51):bd06/05/2012:svnLENOVO:pn4313CTO:pvrThinkPadT510:rvnLENOVO:rn4313CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 4313CTO
dmi.product.version: ThinkPad T510
dmi.sys.vendor: LENOVO

Anders Kaseorg (andersk) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.7 kernel[0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc5-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
importance: Medium → High
tags: added: needs-bisect
Joseph Salisbury (jsalisbury) wrote :

I'd also like to perform a bisect to figure out what commit caused this regression. It would be very helpful to know the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v3.5.7: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5.7-quantal/
v3.6 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/
v3.7-rc3: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc3-raring/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

tags: added: performing-bisect regression-release
removed: needs-bisect

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Anders Kaseorg (andersk) wrote :

Seems to be an upstream bug introduced somewhere between v3.6 and v3.7-rc3:

linux-image-3.5.0-18-generic_3.5.0-18.29_amd64.deb works
linux-image-3.5.7-030507-generic_3.5.7-030507.201210130556_amd64.deb works
linux-image-3.6.0-030600-generic_3.6.0-030600.201209302035_amd64.deb works
linux-image-3.7.0-2-generic_3.7.0-2.8_amd64.deb FAILS
linux-image-3.7.0-030700rc3-generic_3.7.0-030700rc3.201210310756_amd64.deb FAILS
linux-image-3.7.0-030700rc5-generic_3.7.0-030700rc5.201211110835_amd64.deb FAILS

tags: added: kernel-bug-exists-upstream
Anders Kaseorg (andersk) wrote :

linux-image-3.6.6-030606-generic_3.6.6-030606.201211050512_amd64.deb works
linux-image-3.7.0-030700rc1-generic_3.7.0-030700rc1.201210220602_amd64.deb FAILS
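
Taken together, the mainline results reported so far bracket the regression between v3.6 and v3.7-rc1. A minimal Python sketch of that bookkeeping follows. It assumes the mainline builds can be treated as a simple ordered list (the 3.6.6 stable build and the Ubuntu-packaged kernels are left out for that reason), and the `RESULTS` table and `regression_window` helper are illustrative names, not part of any Ubuntu tooling.

```python
# Mainline test results from this report, oldest first.
# True means suspend worked, False means it failed.
RESULTS = [
    ("v3.5.7", True),
    ("v3.6", True),
    ("v3.7-rc1", False),
    ("v3.7-rc3", False),
    ("v3.7-rc5", False),
]


def regression_window(results):
    """Return (last_good, first_bad) for an ordered list of (version, ok) pairs."""
    last_good = None
    for version, ok in results:
        if not ok:
            # First failing version: the regression lies after last_good.
            return last_good, version
        last_good = version
    # No failing version was found in the list.
    return last_good, None


print(regression_window(RESULTS))
```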

Anders Kaseorg (andersk) wrote :

I assume the “Won’t Fix” status is due to an error in the automated script. Can you please remove it?

Joseph Salisbury (jsalisbury) wrote :

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Won't Fix → Confirmed
status: Confirmed → Won't Fix
Joseph Salisbury (jsalisbury) wrote :

@Anders Kaseorg,

Yes, the Won't Fix status was due to an error.

Changed in linux (Ubuntu):
status: Won't Fix → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between upstream v3.6 and v3.7-rc1. The bisect will require testing of about 10 to 12 kernels. I built the first test kernel, which is up to commit:
24d7b40a60cf19008334bcbcbd98da374d4d9c64

This kernel is available from:
http://people.canonical.com/~jsalisbury/lp1079534

Can you test this latest kernel and report back if it has the bug or not?
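
The estimate of roughly 10 to 12 test kernels follows from how a bisect works: each tested kernel halves the remaining commit range, so the number of steps grows with the base-2 logarithm of the number of commits. A small sketch of that arithmetic is below. The commit counts are hypothetical round numbers chosen for illustration, not the actual number of commits between v3.6 and v3.7-rc1, and a real `git bisect` run may take a step more or less.

```python
import math


def bisect_steps(num_commits):
    """Approximate number of kernels to build and test when bisecting num_commits commits."""
    return math.ceil(math.log2(num_commits))


# Hypothetical range sizes: about 1,000 commits needs 10 steps, about 4,000 needs 12.
for n in (1024, 4096):
    print(n, bisect_steps(n))
```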

Anders Kaseorg (andersk) wrote :

Sure.
linux-image-3.6.0-030600-generic_3.6.0-030600.201211202050_amd64.deb FAILS

Anders Kaseorg (andersk) wrote :

The bug seems config-dependent, since if I compile Ubuntu-3.7.0-3.9 with defconfig, the resulting kernel can successfully suspend from the initramfs, while linux-image-3.7.0-3-generic cannot.

Anders Kaseorg (andersk) wrote :

I started randomly hybridizing the working and failing .configs, and bisected the problem down to this difference:

--- bad4 2012-11-27 07:52:28.535253255 -0500
+++ good6 2012-11-27 07:35:58.376369630 -0500
@@ -3850,6 +3850 @@
-CONFIG_LOCKUP_DETECTOR=y
-CONFIG_HARDLOCKUP_DETECTOR=y
-# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
-CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
-# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
-CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
+# CONFIG_LOCKUP_DETECTOR is not set

That is, if I turn off CONFIG_LOCKUP_DETECTOR, suspend starts working again.
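
The config narrowing described above was done by random hybridizing. A more systematic variant is a halving search over the options that differ between the good and bad .config, sketched here in Python. This is an illustration under strong assumptions, not the procedure actually used: it assumes exactly one differing option is responsible and that options can be switched independently (real Kconfig dependencies often break that). The option list and the `suspend_fails` oracle are made up for the example; in practice the oracle would mean building and booting a hybrid kernel.

```python
def bisect_config(diff_options, is_bad):
    """Find the single option whose bad-side value triggers the failure.

    diff_options: options that differ between the good and bad configs.
    is_bad: callable taking the set of options switched to their bad-side
            value (all others keep their good-side value); returns True
            if that hybrid config fails.
    """
    candidates = list(diff_options)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if is_bad(set(first_half)):
            # The culprit is among the options we switched.
            candidates = first_half
        else:
            # The culprit is among the options we left alone.
            candidates = candidates[middle:]
    return candidates[0]


# Hypothetical list of differing options, for illustration only.
DIFF = [
    "CONFIG_KEXEC",
    "CONFIG_CRASH_DUMP",
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_HIBERNATION",
    "CONFIG_PREEMPT_VOLUNTARY",
]


def suspend_fails(enabled):
    """Simulated test oracle standing in for a real build-and-suspend test."""
    return "CONFIG_LOCKUP_DETECTOR" in enabled


print(bisect_config(DIFF, suspend_fails))
```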

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing and providing this information.

I see that CONFIG_LOCKUP_DETECTOR is enabled in Precise (12.04) and Quantal (12.10) as well as Raring (13.04). Do you also see this bug on those releases?

However, I do see that Raring is the only release that has CONFIG_HARDLOCKUP_DETECTOR enabled.

I'll build a Raring test kernel with CONFIG_HARDLOCKUP_DETECTOR disabled and post a link to it.

Anders Kaseorg (andersk) wrote :

I’m in the middle of a commit bisect now, and one of the suspicious commits is the same one mentioned here:
http://thread.gmane.org/gmane.linux.kernel/1397985
which has a reply with a patch.

Anders Kaseorg (andersk) wrote :

Confirmed that bcd951cf10f24e341defcd002c15a1f4eea13ddb is the first bad commit, that tglx’s patch in the above LKML thread fixes it, and that removing nmi_watchdog=0 from my kernel command line is a workaround (oops).
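
Since the failure depends on nmi_watchdog=0 being present on the kernel command line, a quick way to tell whether a machine is exposed is to look for that parameter. The Python sketch below checks a command-line string; the helper name `nmi_watchdog_disabled` is an illustrative choice, and the sample string is the ProcKernelCmdLine value from this report.

```python
def nmi_watchdog_disabled(cmdline):
    """Return True if the kernel command line contains nmi_watchdog=0."""
    # Parameters are whitespace-separated, so compare whole tokens.
    return "nmi_watchdog=0" in cmdline.split()


# ProcKernelCmdLine as recorded in the bug description.
REPORTED_CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-3.7.0-2-generic root=/dev/mapper/fdisk-ubuntu ro "
    "nmi_watchdog=0 crashkernel=384M-2G:64M,2G-:128M quiet splash vt.handoff=7"
)

print(nmi_watchdog_disabled(REPORTED_CMDLINE))
```

On a running system, the same function can be given the contents of /proc/cmdline instead of the sample string.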

Joseph Salisbury (jsalisbury) wrote :

That's great news, Anders. I responded to the thread you posted in comment #16 to see if this patch will be submitted for v3.8.

Joseph Salisbury (jsalisbury) wrote :

The patch will be included in v3.8 and applied back into v3.7 stable:
http://thread.gmane.org/gmane.linux.kernel/1397985

I'll mark this bug back to triaged for now. It would be great if you can test once this patch lands and report back if it fixes this bug for you.

Thanks again for the great work!

Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
status: In Progress → Fix Committed
tags: added: kernel-bug-fixed-upstream
removed: performing-bisect
Julian Wiedmann (jwiedmann) wrote :

The referenced patch landed in 3.7, so let's assume that this bug is fixed.

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released