Comment 263 for bug 1247723

Revision history for this message
In , clancy.kieran+kernel (clancy.kieran+kernel-linux-kernel-bugs) wrote :

Created attachment 157601
dmesg on linux-next without quirk showing EC working before but not after triggering bug

(In reply to Lv Zheng from comment #240)
> Hi, Kieran
>
> Can you:
> 1. use current linux-pm/linux-next branch
> 2. enable EC debugging

Is it enough to just uncomment the #define DEBUG?

> 3. disable quirk

Is it enough to prevent EC_FLAGS_CLEAR_ON_RESUME from being set? Is this what you meant?

> 4. trigger the bug
> 5. capture the dmesg right after resume
> 6. upload the dmesg here

See attached. I annotated the dmesg with a few extra > /dev/kmsg lines starting with 'KC'. They are:

[ 107.787915] KC just booted and logged in
[ 137.240937] KC screen close/open works as intended
[ 203.943414] KC about to suspend, but not trigger bug this time
[ 226.584995] KC back from suspend
[ 265.000194] KC screen close/open works as intended
[ 298.608956] KC AC change detected as intended
[ 318.874279] KC about to suspend and trigger bug
[ 348.834217] KC back from suspend
[ 374.777099] KC screen open/close not detected
[ 398.000040] KC AC change not detected
[ 425.733615] KC about to manually clear EC events with script
[ 442.663539] KC EC events cleared
[ 469.766095] KC AC change detected again
[ 496.219180] KC screen open/close detected again

So between 0 and 318 you can see the EC operating as intended including one suspend cycle.

After 318 I trigger the bug (by unplugging AC a bunch of times), and after that you can see that there is nothing at all logged when I either closed the screen or unplugged AC.

At 425 I manually simulated the quirk by sending 0x84 EC command queries and reading the data until the data is 0x00 using Juan Manuel's userspace program.

After that, EC events are detected properly again.

----

My analysis is below:

After the bug is triggered, SCI_EVT=1 is set just ONE time, immediately after resume:

[ 337.319012] ACPI : EC: EC_SC(R) = 0x28 SCI_EVT=1 BURST=0 CMD=1 IBF=0 OBF=0

It does not seem as though we ever handle this event properly though. Namely, there seems to be no corresponding "EC_SC(W) = 0x84". There is a couple of "EC_DATA(W) = 0x84" but I'm pretty sure these are totally different?

----

Further testing shows that this SCI_EVT=1 happens for the first resume after the bug is triggered, but not for the second or subsequent resumes.

That means that we are still going to need a wakeup quirk because if for some reason we fail to clear the EC state before the next suspend, we will never get another SCI_EVT=1 (even after a power cycle, I believe).