Comment 33 for bug 811214

Loïc Minier (lool) wrote :

I have some pretty bad news.

First, the commit flagged as bad by the bisect seems wrong, since it relates to ath9k instead of iwlagn (I don't use ath9k).

I tried with both 3.0.0-8.10 and your 3.0.0-8.10+lp811214v1 and both failed.

I went back to the latest good kernel and tried suspending twice; it failed to suspend the second time, but with different symptoms: blinking caps lock.

Yesterday, before testing the latest kernel, I had some doubts about whether I had properly removed the workaround before trying the immediately previous one; just to make sure I had done it right, I reinstalled the last 4 iterations of the bisect before the last one, confirmed each of my results, and got the exact same good/bad patterns.

But today before trying your kernel I updated userspace and ran into many weird bugs: race in initrd, xserver-xorg-video-intel breaking the boot, and other weird conditions.

Over the course of the last weeks testing these kernels, I think the testing conditions weren't exactly identical:
* laptop might not be started from a cold boot, but might have been just rebooted
* userspace has been upgraded at various points (running oneiric)
* sometimes, I just fail to boot (hang in initrd) due to races in the boot conditions; I suspect these are LVM related
* another class of differences is whether or not I need to fsck after a boot; I think this changes the raciness of my boot and might cause different issues
* sometimes I can't get a kernel to boot, hanging in initrd on every single boot, in which case I'll use recovery mode which works around the raciness but might give a different result
* I also fear that depending on the userspace I'm running, some things might not be loaded or in the same state, e.g. maybe lightdm brings up network-manager which brings up the wifi card, or maybe it doesn't, depending on the version/boot conditions
* I don't even know what role the embedded controller plays here and whether I ought to remove the battery between trials
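
To keep track of how trials differ, something like the following could record a few of the conditions listed above before each test. This is only a rough sketch, assuming Python is available on the test system; the function names, the `note` field and the log file path are hypothetical, and it only captures what the kernel exposes in /proc (running kernel, command line, whether iwlagn is loaded), not cold vs. warm boot or embedded controller state.

```python
import json
import platform
import time
from pathlib import Path


def loaded_modules(proc_modules="/proc/modules"):
    """Return the set of loaded kernel module names (empty if unreadable)."""
    try:
        text = Path(proc_modules).read_text()
    except OSError:
        return set()
    # Each line of /proc/modules starts with the module name.
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def snapshot(note="", proc_modules="/proc/modules", cmdline_path="/proc/cmdline"):
    """Collect a small description of the current test conditions."""
    try:
        cmdline = Path(cmdline_path).read_text().strip()
    except OSError:
        cmdline = ""
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "kernel": platform.release(),
        # The command line should show e.g. a recovery mode boot (assumption).
        "cmdline": cmdline,
        "iwlagn_loaded": "iwlagn" in loaded_modules(proc_modules),
        # Free-form remark: cold boot, fsck ran, battery removed, ...
        "note": note,
    }


def append_snapshot(log_path, **kwargs):
    """Append one snapshot as a JSON line to log_path and return it."""
    entry = snapshot(**kwargs)
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

For example, `append_snapshot("trials.jsonl", note="cold boot, no fsck")` before each suspend attempt would at least make it possible to compare the conditions of a good and a bad run afterwards.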

This is a bit depressing, as I'm probably hitting a dozen different bugs and I can't find reliable conditions to bisect just a single bug without slipping into slightly different symptoms, which would indicate that another bug was hit.

So I'm trying to come up with a much smaller test case than booting + running pm-suspend from tty1 as root, as this is already too much; just running echo mem > /sys/power/state ain't enough:
a) it works because iwlagn isn't loaded yet; I tried modprobing it, but that doesn't suffice to give me an eth1 interface, and I have no idea why (modprobe itself doesn't complain though, but it's busybox's)
b) video output isn't restored, so I can blindly try to run it multiple times, but it's not ideal for interactive testing and might not be representative

I'll try suspending multiple times in a row from the initrd or a minimal Ubuntu install on a USB stick and see if I can reproduce the exact same symptoms of the crash (suspend light blinks but not caps lock, screen remains on, can't wake up but can sysrq-reboot). If I do manage to suspend, I will verify I can suspend at least twice in a row (three times seems to already trigger other bugs). *sigh*
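
For the minimal install case, the repeated suspend could be scripted roughly as below. This is a sketch under assumptions, not something I have run on the affected machine: it assumes root and a Python interpreter (which the initrd won't have), and the function names, the pause and the log path are made up for illustration. It writes `mem` to /sys/power/state like the echo above and syncs a log line to disk before each attempt, so that after a hang and sysrq-reboot the log shows which cycle never resumed.

```python
import os
import time


def log_line(log_path, message):
    """Append a line to the log and force it to disk before continuing."""
    with open(log_path, "a") as log:
        log.write(message + "\n")
        log.flush()
        os.fsync(log.fileno())


def suspend_once(state_path="/sys/power/state"):
    """Request suspend-to-RAM; on a real system this returns after resume."""
    with open(state_path, "w") as state:
        state.write("mem")


def suspend_cycles(count=2, pause=10.0,
                   state_path="/sys/power/state",
                   log_path="suspend-test.log"):
    """Suspend `count` times in a row, returning the number of completed cycles."""
    completed = 0
    for cycle in range(1, count + 1):
        log_line(log_path, f"cycle {cycle}: suspending")
        suspend_once(state_path)
        log_line(log_path, f"cycle {cycle}: resumed")
        completed += 1
        # Give the system some time to settle before the next attempt.
        time.sleep(pause)
    return completed
```

Running `suspend_cycles(count=2)` as root would cover the "at least twice in a row" check; each suspend still needs a manual wake-up (power button or lid), and a "suspending" line without a matching "resumed" line marks the cycle that hung.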