Hangs while suspending with iwlagn on Intel Corporation PRO/Wireless 5350 AGN [Echo Peak]

Bug #811214 reported by Loïc Minier
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Leann Ogasawara

Bug Description

Hey

For a few weeks now, suspend/resume hasn't worked on my ThinkPad X301. The suspend light blinks while going into suspend, but the screen doesn't actually turn off. The LED keeps blinking forever, and pressing the power button or Fn doesn't get it out of this state. Pressing Alt-SysRq-B reboots the machine.

I set /sys/power/pm_trace to 1, ran pm-suspend, and saw some HCI-related output, but removing the Bluetooth modules before suspend didn't help. Doing this again and checking the dmesg output after a failed suspend showed:
[ 1.324491] PM: Hibernation image not present or could not be loaded.
[ 1.324505] registered taskstats version 1
[ 1.336174] Magic number: 3:176:696

and only this occurrence of "Magic number". The wifi module is the only thing on PCI bus 3:
03:00.0 Network controller: Intel Corporation PRO/Wireless 5350 AGN [Echo Peak] Network Connection

Sure enough, removing iwlagn before suspend allowed it to work like a charm.
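
The diagnosis and workaround above can be sketched as a few shell helpers. This is only a minimal sketch, not the exact commands that were run: the function names are hypothetical, while the paths and commands (/sys/power/pm_trace, pm-suspend, modprobe, dmesg) are the ones named in this report. The hardware-facing helpers need root and are defined here without being called.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any tool.

# Step 1 (as root): arm PM tracing, then attempt a suspend.
# Note: pm_trace stores its data in the RTC, so the clock may be wrong
# after the next boot.
trace_suspend() {
    echo 1 > /sys/power/pm_trace
    pm-suspend
}

# Step 2 (after the forced reboot): read kernel log text on stdin and
# print the value from any "Magic number:" line.
magic_number() {
    sed -n 's/.*Magic number: \([0-9:]*\).*/\1/p'
}

# Workaround (as root): unload the wireless driver, suspend, and load
# the driver again once the machine has resumed.
suspend_without_iwlagn() {
    modprobe -r iwlagn && pm-suspend
    modprobe iwlagn
}
```

After a failed suspend and a reboot, `dmesg | magic_number` would print the recorded value (3:176:696 for the log lines quoted above).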

I'm filing this against linux, but it might be a linux-firmware regression as I see this driver recently got updated.

Cheers,

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-5-generic 3.0.0-5.6
ProcVersionSignature: Ubuntu 3.0.0-5.6-generic 3.0.0-rc7
Uname: Linux 3.0.0-5-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: lool 2782 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf0620000 irq 46'
   Mixer name : 'Conexant CX20561 (Hermosa)'
   Components : 'HDA:14f15051,17aa211f,00100000'
   Controls : 12
   Simple ctrls : 7
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw 6EHT11WW-1.05'
   Mixer name : 'ThinkPad EC 6EHT11WW-1.05'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Fri Jul 15 20:49:17 2011
EcryptfsInUse: Yes
MachineType: LENOVO 2777CTO
ProcEnviron:
 LANGUAGE=fr_FR:fr:en_GB:en
 PATH=(custom, user)
 LANG=fr_FR.UTF-8
 SHELL=/bin/zsh
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.0.0-5-generic root=/dev/mapper/hostname--vg0-ubuntu--root ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-5-generic N/A
 linux-backports-modules-3.0.0-5-generic N/A
 linux-firmware 1.56
SourcePackage: linux
StagingDrivers: mei
UpgradeStatus: Upgraded to oneiric on 2009-12-07 (585 days ago)
WpaSupplicantLog:

dmi.bios.date: 12/10/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 6EET54WW (3.14 )
dmi.board.name: 2777CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6EET54WW(3.14):bd12/10/2010:svnLENOVO:pn2777CTO:pvrThinkPadX301:rvnLENOVO:rn2777CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 2777CTO
dmi.product.version: ThinkPad X301
dmi.sys.vendor: LENOVO


Loïc Minier (lool) wrote :

I tried downgrading linux-firmware to the natty version (1.52), but dmesg reported the same firmware version after modprobe -r iwlagn + modprobe iwlagn. I then downgraded to the maverick version (1.38) and saw an older firmware version being loaded in dmesg (in fact I got a complaint that this was a v2 API firmware while linux expected v5), but the older firmware didn't allow suspending either, so this looks like a kernel bug. ISTR that early linux-3.0 Ubuntu kernels allowed suspending though, so I guess I've signed up for a bisect.

Loïc Minier (lool) wrote :

Kernel .debs I tested:
linux-image-2.6.38-8-generic 2.6.38-8.42 pass
linux-image-2.6.39-3-generic 2.6.39-3.10 pass
linux-image-3.0-0-generic 3.0-0.1 fail

Loïc Minier (lool) wrote :

I tried rebuilding the Ubuntu way with CONFIG_NL80211_TESTMODE and CONFIG_IWLWIFI_DEVICE_SVTOOL turned off as this showed up in the config diff and seemed related enough, but it didn't help.

Then I tried rebuilding the upstream way, but I didn't manage to build a sensible mainline config that I could use straight away with my current initrd, so I guess I'll have to dig a bit more into this. The config I used was based on Ubuntu's and transformed with localyesconfig to allow carrying a single zImage around, but that resulted in an oops on boot (strncpy in the i2400m module).

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Leann Ogasawara (leannogasawara) wrote :

Hi Loic,

I'd be willing to help you with the bisect (i.e. build some test kernels for you) to really narrow down the bad commit between 2.6.39 and 3.0. Or, if you're up to it, you could follow the Ubuntu dev week talk that John gave about doing a kernel bisect:

https://wiki.ubuntu.com/MeetingLogs/devweek1107/KernelDebugging

Changed in linux (Ubuntu):
assignee: nobody → Leann Ogasawara (leannogasawara)
importance: Undecided → High
status: Confirmed → Triaged
Loïc Minier (lool) wrote :

I'd love if you could help me on this, thanks! I think I know how to run "git bisect", but because the history has been rebased between Ubuntu 2.6.39 and Ubuntu 3.0-rcs, I can't bisect between the two I think.

If you have some test kernels, I'd be happy to try them out

The next thing I would have tried given the time would have been to revert all the commits between the bad and good kernels in drivers/net/wireless/iwlwifi or just bisect these.

Leann Ogasawara (leannogasawara) wrote :

Hi Loic,

So let's start this bisect. I'll apologize in advance for the number of test kernels you'll have to go through. Bisecting across a major version jump is the most painful kind and will take 10+ iterations, I'm guessing.

First, I'd really like to eliminate the possibility that this was magically fixed between 3.0-rc7 and the final 3.0 release. So if you could just update and try the latest 3.0.0-7.8 Ubuntu kernel we just uploaded, that would be great. It was rebased onto 3.0 final. Assuming that fails, let's proceed with the following:

Let's eliminate the set of Ubuntu patches so we can focus on bisecting just the upstream commits. If you could, please test the following 2.6.39 mainline kernel and confirm it works:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.39-oneiric/

If you could then test the 3.0-rc1 mainline kernel and confirm it fails:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.0-rc1-oneiric/

Finally, please test the following kernel, which is a bisect point between the two (i.e. commit c44dead70a841d90ddc01968012f323c33217c9e):

http://people.canonical.com/~ogasawara/lp811214/

Let me know your results and we can continue from there. Thanks.

Loïc Minier (lool) wrote :

Ubuntu vmlinuz-3.0.0-7-generic => FAIL (suspend hangs)
2.6.39 vmlinuz-2.6.39-020639-generic => PASS (suspend works)
3.0-rc1 vmlinuz-3.0.0-0300rc1-generic => FAIL
first bisect iteration vmlinuz-2.6.39-020639gc44dead-generic => FAIL

Leann Ogasawara (leannogasawara) wrote :

Thanks, please test the next bisect point and let me know your results (commit d93515611bbc70c2fe4db232e5feb448ed8e4cc9). Test kernel posted to the same location as before:

http://people.canonical.com/~ogasawara/lp811214/

I've moved the previous test kernel to the bad/ directory.

Loïc Minier (lool) wrote :

On Tue, Jul 26, 2011, Leann Ogasawara wrote:
> Thanks, please test the next bisect point and let me know your results
> (commit d93515611bbc70c2fe4db232e5feb448ed8e4cc9). Test kernel posted
> to the same location as before:
>
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc2gd935156-generic => BAD

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next bisect iteration, commit 9c6a02f41d10dc9fbf5dd42058e8846f38dd2d9a

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Tue, Jul 26, 2011, Leann Ogasawara wrote:
> Thanks. Next bisect iteration, commit
> 9c6a02f41d10dc9fbf5dd42058e8846f38dd2d9a
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc2g9c6a02f-generic => good

 Thanks!
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Next iteration, commit 972a77fbf1bbea6f54b5986b05041a17b607695b

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Wed, Jul 27, 2011, Leann Ogasawara wrote:
> Next iteration, commit 972a77fbf1bbea6f54b5986b05041a17b607695b
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc2g972a77f-generic => bad

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks Loïc, next iteration, commit ab42b4041707f075533845ecb320c7a1c5621f1b

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Wed, Jul 27, 2011, Leann Ogasawara wrote:
> Thanks Loïc, next iteration, commit
> ab42b4041707f075533845ecb320c7a1c5621f1b
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1gab42b40-generic => bad

   Thanks
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit 9eab61c2bff2f769ee771a7a9301fb720cec9b56

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Thu, Jul 28, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit 9eab61c2bff2f769ee771a7a9301fb720cec9b56
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1g9eab61c-generic => bad

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit 534f0e29282a007a589a659d31baa1ef828c22da

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Thu, Jul 28, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit 534f0e29282a007a589a659d31baa1ef828c22da
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1g534f0e2-generic => good

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, next iteration, commit 79d1d2b8a34fd36e63cc7f5267cf79217a44edcc

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Fri, Jul 29, 2011, Leann Ogasawara wrote:
> Thanks, next iteration, commit 79d1d2b8a34fd36e63cc7f5267cf79217a44edcc
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1g79d1d2b-generic => bad

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, next iteration, commit 17869f4fe940407b5b80039110c0257c90e18a99

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Fri, Jul 29, 2011, Leann Ogasawara wrote:
> Thanks, next iteration, commit 17869f4fe940407b5b80039110c0257c90e18a99
> http://people.canonical.com/~ogasawara/lp811214/

 Sorry for the delay, at Linaro Connect this week

 vmlinuz-2.6.39-020639rc1g17869f4-generic => good

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit a969c09184e7cb7d14838598b54c6effbef8b584

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Tue, Aug 02, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit a969c09184e7cb7d14838598b54c6effbef8b584
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1ga969c09-generic => bad

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit 66953d438576938b02e6ff0ade1958f3e90af4a9

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Thu, Aug 04, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit 66953d438576938b02e6ff0ade1958f3e90af4a9
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1g66953d4-generic => good

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit 3594beae705523982823f84bf4997f680b2cf75f

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Mon, Aug 08, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit 3594beae705523982823f84bf4997f680b2cf75f
> http://people.canonical.com/~ogasawara/lp811214/

 vmlinuz-2.6.39-020639rc1g3594bea-generic => good

   Thanks,
--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks Loïc. So it appears the following is the culprit according to the bisect:

a969c09184e7cb7d14838598b54c6effbef8b584 is the first bad commit
commit a969c09184e7cb7d14838598b54c6effbef8b584
Author: Vasanthakumar Thiagarajan <email address hidden>
Date: Tue Apr 19 19:29:13 2011 +0530

    ath9k_hw: Configure tuning capacitance value for AR9340 as well

    Signed-off-by: Vasanthakumar Thiagarajan <email address hidden>
    Signed-off-by: John W. Linville <email address hidden>

:040000 040000 85986a0a3a2b45b6a1b6c989c44d81a5bf315d9e 83360f99395b77ec4326410436b9f3637d7fab6b M drivers

If you could conduct two more tests for me that would be great:

1) Confirm the 3.0.0-8.10 Ubuntu kernel in the repo still fails. It was most recently rebased on the upstream stable v3.0.1 kernel.

2) Next, try the test kernel I've built with the above commit a969c091 reverted. The test kernel is basically the 3.0.0-8.10 Ubuntu kernel minus commit a969c091. I've placed it at the usual location. Version string is 3.0.0-8.10+lp811214v1

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

I have some pretty bad news

First, the bad commit seemed incorrect since it related to ath9k instead of iwlagn (I don't use ath9k).

I tried with both 3.0.0-8.10 and your 3.0.0-8.10+lp811214v1 and both failed.

I went back to the latest good kernel, and I tried suspending twice, and it failed suspending the second time, but with different symptoms: blinking caps lock.

Yesterday, before testing the latest kernel, I had some doubts about whether I had properly removed the workaround before trying the immediately previous one; just to make sure I had done it right, I reinstalled the last 4 iterations of the bisect before the last one and confirmed each of my results, getting the exact same good/bad pattern.

But today before trying your kernel I updated userspace and ran into many weird bugs: race in initrd, xserver-xorg-video-intel breaking the boot, and other weird conditions.

Over the course of the last weeks testing these kernels, I think the testing conditions weren't exactly identical:
* laptop might not be started from a cold boot, but might have been just rebooted
* userspace has been upgraded at various points (running oneiric)
* sometimes, I just fail to boot (hang in initrd) due to races in the boot conditions; I suspect these are LVM related
* another class of differences is whether or not I need to fsck after a boot; I think this changes the raciness of my boot and might cause different issues
* sometimes I can't get a kernel to boot, hanging in initrd on every single boot, in which case I'll use recovery mode which works around the raciness but might give a different result
* I also fear that depending on the userspace I'm running, some things might not be loaded or in the same state, e.g. maybe lightdm brings up network-manager which brings up wifi card, or maybe it doesn't, depending on the version/boot conditions
* I don't even know what role the embedded controller plays here and whether I ought to remove the battery between trials

This is a bit depressing, as I'm probably hitting a dozen different bugs and I can't find reliable conditions to bisect just a single bug without slipping into slightly different symptoms, which would indicate that another bug was hit.

So I'm trying to come up with a much smaller test case than booting + running pm-suspend from tty1 as root, as this is already too much; just running echo mem > /sys/power/state ain't enough:
a) it works because iwlagn isn't loaded yet; I tried modprobing it, but that isn't enough to give me an eth1 interface, and I have no idea why (modprobe itself doesn't complain, but it's busybox's)
b) video output isn't restored, so I can blindly try to run it multiple times, but it's not ideal for interactive testing and might not be representative

I'll try suspending multiple times in a row from the initrd or a minimal Ubuntu install on a USB stick and see if I can reproduce the exact same symptoms of the crash (suspend light blinks but not caps lock, screen remains on, can't wake up but can sysrq-reboot). If I manage to suspend, I will verify I can suspend at least twice in a row (three times already seems to trigger other bugs). *sigh*

Leann Ogasawara (leannogasawara) wrote :

Thanks Loïc. I have to agree that I was getting mighty skeptical as the bisect got narrower and I saw it converging on ath9k.

Anyways, let us know if you're able to determine a reliable reproducer. You might want to also ping Colin King (cking) about his systemtap scripts for debugging suspend/resume as well. Good Luck.

Loïc Minier (lool) wrote :

So I've started from a plain natty debootstrap + grub + linux kernels + pm-utils + wireless-tools off USB.

I ran the same sequence starting with 2.6.39 (good) vs. 3.0-rc1 (bad) and confirmed most of the tests above, counting one failed suspend with the expected blinking suspend light as bad and 3 successive suspends as good, with a cold start for each test.

The first difference was with 79d1d2b which is good, not bad (tested 5 times).

Sorry about that, at least I now have a much less disruptive testing environment and we can continue testing from this commit on. :-/
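
The pass/fail criterion described here (one failure marks a commit bad, several consecutive suspends mark it good) can be sketched as a small loop. This is a hedged sketch rather than the script actually used: suspend_trials is a hypothetical name, and because a hard hang never returns, the loop only catches a suspend command that exits with an error. Hangs like the one in this bug still have to be observed by hand.

```shell
#!/bin/sh
# Sketch only: suspend_trials is a hypothetical helper.
# Usage: suspend_trials COUNT COMMAND [ARGS...]
# Runs COMMAND up to COUNT times and stops at the first failure.
suspend_trials() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        if ! "$@"; then
            echo "bad: attempt $i of $count failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "good: $count successful suspends"
}
```

From a cold boot, and as root, this might be run as `suspend_trials 3 pm-suspend`, waking the machine by hand after each suspend. Using `rtcwake -m mem -s 30` from util-linux as the command would wake the machine automatically, but that is an assumption and not something tried in this thread.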

Leann Ogasawara (leannogasawara) wrote :

Thanks Loïc. So I've reset the bisect:

ogasawara@tyler:~/linux$ git bisect good 79d1d2b8a34fd36e63cc7f5267cf79217a44edcc
ogasawara@tyler:~/linux$ git bisect bad 9eab61c2bff2f769ee771a7a9301fb720cec9b56
Bisecting: 10 revisions left to test after this (roughly 4 steps)
[3a7dbc3b2ac545efac75d4145839eaa7b59d9741] mwl8k: Do not stop tx queues

So our next iteration to test is commit 3a7dbc3b2ac545efac75d4145839eaa7b59d9741. Test kernel at the usual location. Let me know your results and we'll go from there. Thanks.

http://people.canonical.com/~ogasawara/lp811214/
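
For readers unfamiliar with the mechanics, the good/bad narrowing used here can be tried on a throwaway repository. The sketch below is an illustration under stated assumptions, not the kernel bisect itself: bisect_demo, the file names, and the eight toy commits are all hypothetical, and `git bisect run` stands in for the manual boot-and-suspend test performed in this thread.

```shell
#!/bin/sh
# Sketch only: builds a toy repository in which commit 6 introduces the
# "regression", then lets git bisect find it automatically.
bisect_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .
    git config user.email "demo@example.invalid"
    git config user.name "Bisect Demo"
    for n in 1 2 3 4 5 6 7 8; do
        if [ "$n" -ge 6 ]; then echo fail > state; else echo pass > state; fi
        echo "$n" > counter
        git add state counter
        git commit -q -m "commit $n"
    done
    good=$(git rev-list --max-parents=0 HEAD)   # oldest commit, known good
    git bisect start >/dev/null
    git bisect bad HEAD >/dev/null
    git bisect good "$good" >/dev/null
    # The test command exits 0 on a good commit and non-zero on a bad one.
    git bisect run grep -q pass state >/dev/null
    # refs/bisect/bad now points at the first bad commit.
    git log -1 --format=%s refs/bisect/bad
    cd / && rm -rf "$repo"
)
```

Calling `bisect_demo` prints the subject of the first bad commit in the toy history. In the real bisect each step required building, installing and booting a kernel, which is why every iteration in this thread took a round trip between the two participants.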

Loïc Minier (lool) wrote :

On Tue, Aug 09, 2011, Leann Ogasawara wrote:
> So our next iteration to test is commit
> 3a7dbc3b2ac545efac75d4145839eaa7b59d9741. Test kernel at the usual
> location. Let me know your results and we'll go from there. Thanks.
> http://people.canonical.com/~ogasawara/lp811214/

 rc1 3a7dbc3 => good; 5 successful pm-suspend runs

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, next iteration, commit 788f6875fcf5d2bce221fbfd2318ac48df299031

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Wed, Aug 10, 2011, Leann Ogasawara wrote:
> Thanks, next iteration, commit 788f6875fcf5d2bce221fbfd2318ac48df299031
> http://people.canonical.com/~ogasawara/lp811214/

 rc1 788f687 => bad; failed suspending with the usual blinking sleep led

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, next iteration, commit 3a769888797b7117005e9c60d4cd73a2efc92f8d

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Wed, Aug 10, 2011, Leann Ogasawara wrote:
> Thanks, next iteration, commit 3a769888797b7117005e9c60d4cd73a2efc92f8d
> http://people.canonical.com/~ogasawara/lp811214/

 So again, another weird error on this one: I had plugged the USB stick
 to another port for booting, it booted fine, suspend fine one time,
 then failed to suspend: it was hung after pm-suspend, sleep light
 wasn't on, I rebooted with alt-sysrq and then the BIOS wouldn't boot
 the USB key again.

 So I switched the USB port back to the usual place and booted fine, I
 suspended successfully 7 times in a row.

 rc1 3a76988 => good

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks. Next iteration, commit ca45de77ad706e86b135b8564e21aa2c8a63f09b

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Wed, Aug 10, 2011, Leann Ogasawara wrote:
> Thanks. Next iteration, commit ca45de77ad706e86b135b8564e21aa2c8a63f09b
> http://people.canonical.com/~ogasawara/lp811214/

 rc1 ca45de7 => bad; usual symptoms on one failed suspend

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, next iteration, commit 31d291a769b4318cbf7943ca149e04d201e2c931

http://people.canonical.com/~ogasawara/lp811214/

Loïc Minier (lool) wrote :

On Thu, Aug 11, 2011, Leann Ogasawara wrote:
> Thanks, next iteration, commit 31d291a769b4318cbf7943ca149e04d201e2c931
> http://people.canonical.com/~ogasawara/lp811214/

 31d291a => good; 5 successful suspends

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Thanks, this appears to confirm that commit ca45de77 is the culprit:

ogasawara@tyler:~/linux$ git bisect good
ca45de77ad706e86b135b8564e21aa2c8a63f09b is the first bad commit
commit ca45de77ad706e86b135b8564e21aa2c8a63f09b
Author: Johannes Berg <email address hidden>
Date: Thu Apr 21 13:38:00 2011 +0200

    mac80211: tear down BA sessions properly on suspend

    Currently, the code to tear down BA sessions will
    execute after queues are stopped, but attempt to
    send frames, so those frames will just get queued,
    which isn't intended. Move this code to before to
    tear down the sessions properly.

    Additionally, after stopping queues, flush the TX
    queues in the driver driver to make sure all the
    frames went out.

    Signed-off-by: Johannes Berg <email address hidden>
    Signed-off-by: John W. Linville <email address hidden>

:040000 040000 6eb835cc6807975fa496a21c0d06d7f616fd03cd 345434da7cd928ed497b2da866cd4a94c85552bc M net

I've subsequently built an Ubuntu test kernel with the above commit ca45de77 reverted. It's basically the 3.0.0-8.10 Ubuntu kernel minus commit ca45de77. I've placed it at the usual location. Version string is 3.0.0-8.10+lp811214v2. Please test and let me know your results:

http://people.canonical.com/~ogasawara/lp811214/
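
Building a test kernel "minus" one commit amounts to reverting that commit on top of the current tree. A minimal sketch on a throwaway repository follows; revert_demo, the branch name, and the toy commits are hypothetical, and the real test kernel was of course built from the Ubuntu kernel tree rather than from anything like this.

```shell
#!/bin/sh
# Sketch only: reverts a suspect commit that has later work on top of it.
revert_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .
    git config user.email "demo@example.invalid"
    git config user.name "Revert Demo"
    echo pass > state
    git add state
    git commit -q -m "baseline"
    echo fail > state
    git commit -q -am "suspect change"
    echo note > other
    git add other
    git commit -q -m "later work"
    suspect=$(git rev-parse HEAD~1)
    # Keep the experiment on its own branch, then undo only the suspect.
    git checkout -q -b test-revert
    git revert --no-edit "$suspect" >/dev/null
    cat state
    cd / && rm -rf "$repo"
)
```

Calling `revert_demo` prints the content of the state file after the revert, which shows whether the suspect change was undone while the later commit was kept.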

If that test kernel works, I'm also going to have you test the latest v3.1-rc1 mainline kernel which just came out to confirm there hasn't been a subsequent patch applied which resolves this regression. I may have to build the v3.1-rc1 kernel for you as it appears it's not yet available at http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.1-rc1-oneiric/ . Thanks!

Julian Wiedmann (jwiedmann) wrote :
Leann Ogasawara (leannogasawara) wrote :

Thanks Julian. I indeed saw that commit in the v3.1-rc1 set of changes as well.

Loïc, I've built an additional test kernel with the commit Julian has noted in comment #47. It's basically the Ubuntu 3.0.0-8.10 kernel plus commit 94f9b97b. Version string for the test kernel is 3.0.0-8.10+lp811214v3. I've placed it at:

http://people.canonical.com/~ogasawara/lp811214/v3/

Loïc Minier (lool) wrote :

 Both the v2 and v3 kernels (oneiric + revert and oneiric + upstream
 cherry-pick) suspend fine, 5 times in a row

--
Loïc Minier

Leann Ogasawara (leannogasawara) wrote :

Perfect, Thanks Loïc. I'll get the upstream patch sent to the Ubuntu kernel-team mailing list and applied to Oneiric.

Changed in linux (Ubuntu):
status: Triaged → In Progress
Leann Ogasawara (leannogasawara) wrote :

Loïc, I've applied the upstream patch to the Oneiric kernel git repo. I plan to upload by end of day today. In the meantime, just use the v3 test kernel. Thanks.

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.0.0-8.11

---------------
linux (3.0.0-8.11) oneiric; urgency=low

  [ Andy Whitcroft ]

  * [Config] Enable CONFIG_MACVTAP=m
    - LP: #822601

  [ Colin Watson ]

  * Deliver more Atheros, Ralink, and iwlagn NIC drivers to d-i

  [ Stefan Bader ]

  * (config) Package macvlan and macvtap for virtual

  [ Tim Gardner ]

  * [Config] Clean up tools rules
  * [Config] Package x86_energy_perf_policy and turbostat
    - LP: #797556

  [ Upstream Kernel Changes ]

  * dell-wmi: Add keys for Dell XPS L502X
    - LP: #815914
  * hfsplus: ensure bio requests are not smaller than the hardware sectors
    - LP: #734883
  * Ecryptfs: Add mount option to check uid of device being mounted =
    expect uid
    - LP: #732628
    - CVE-2011-1833
  * ideapad: define cfg bits and create sysfs node for cfg
  * ideapad: let camera_power node invisiable if no camera
  * ideapad: add backlight driver
  * ideapad: add missing ideapad_input_exit in ideapad_acpi_add error path
  * eCryptfs: Fix payload_len unitialized variable warning
  * eCryptfs: fix compile error
  * eCryptfs: Return error when lower file pointer is NULL
  * mac80211: be more careful in suspend/resume
    - LP: #811214
 -- Leann Ogasawara <email address hidden> Mon, 08 Aug 2011 06:23:16 -0700

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Loïc Minier (lool) wrote :

On Fri, Aug 12, 2011, Leann Ogasawara wrote:
> Perfect, Thanks Loïc. I'll get the upstream patch sent to the Ubuntu
> kernel-team mailing list and applied to Oneiric.

 (FTR, Ubuntu kernel from oneiric confirmed to work too)

--
Loïc Minier

Erik Nygren (erik+ubuntu) wrote :

I've been experiencing the exact same behavior on a Thinkpad X301 using Ubuntu 10.04 Lucid. (Suspension worked great for multiple years, and sometime a few months ago a regression was introduced which now causes it to hang on suspend.)
It may be worth backporting this fix to Lucid (although I haven't yet confirmed that it resolved the issue there).
