[Samsung NP530U3C-A01] LID close, AC, and battery status events not produced anymore

Bug #1283589 reported by juanmanuel
This bug affects 28 people
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   High
linux (Ubuntu)   Triaged        Medium       Unassigned

Bug Description

On my Samsung Series 5 NP530U3C-A01, the LID close, AC, and battery status events are no longer produced. The situation is the same when tested with trusty-desktop-amd64.iso (2014-02-22). I created a blog post that collects all the information related to this bug: http://zenstep.com.ar/samsung-linux/

In summary, after a suspend during which many events occur (8 plug/unplug actions, or the battery dropping or rising by about 16%, maybe less), the Embedded Controller accumulates those events and stops producing GPE 0x17. Neither Windows nor Linux queries those accumulated events, because the EC no longer produces GPE 0x17. The issue persists across restarts and shutdowns, and even after re-suspending and re-resuming. Hitting the reset button through the hole in the back fixes the problem temporarily, that is, until the next unlucky suspend.

IMPORTANT: if the laptop is never ever suspended, the issue never comes back.

To force the issue to happen, you can either:
1) Suspend the computer in Linux (by closing the lid or any other means).
2) Unplug from the wall, plug, unplug, plug, unplug, plug, unplug, plug (8 actions or more).
3) Resume from sleep. You'll note that the battery icon is fixed, and unplugging or plugging doesn't update battery status anymore. Also note that the ability to suspend by closing the LID is lost.

OR:
1) echo disable > /sys/firmware/acpi/interrupts/gpe17
2) plug, unplug, plug, unplug, plug, unplug, plug, unplug (8 actions)
3) echo enable > /sys/firmware/acpi/interrupts/gpe17
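
One way to check whether GPE 0x17 is still firing is to compare the interrupt counter in the same sysfs file before and after a plug or unplug. The following is only a minimal sketch and not part of the original report. It assumes the file content starts with an integer count followed by status words (the exact words differ between kernel versions), and the helper names `parse_gpe_line` and `gpe_fired` are hypothetical.

```python
# Path taken from the steps above; "gpe17" is GPE 0x17.
GPE_FILE = "/sys/firmware/acpi/interrupts/gpe17"


def parse_gpe_line(text):
    """Split the content of a gpeXX sysfs file into (count, status_words).

    Assumption: the first whitespace-separated field is the interrupt count.
    """
    fields = text.split()
    if not fields:
        raise ValueError("empty GPE status line")
    return int(fields[0]), fields[1:]


def gpe_fired(before, after):
    """Return True if the counter advanced between two readings."""
    return parse_gpe_line(after)[0] > parse_gpe_line(before)[0]
```

To use it, read `GPE_FILE` once, plug or unplug the AC adapter, read it again, and pass both strings to `gpe_fired`. On a stuck machine the counter would be expected to stay the same.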

WORKAROUND: Turn off the computer, unplug it, hit the reset button in the back of the laptop, and then plug it back in.

WORKAROUND (kernel patch 1): https://bugzilla.kernel.org/show_bug.cgi?id=44161#c133

BETTER WORKAROUND (kernel patch 2): https://bugzilla.kernel.org/show_bug.cgi?id=44161#c149

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-11-generic 3.13.0-11.31
ProcVersionSignature: Ubuntu 3.13.0-11.31-generic 3.13.3
Uname: Linux 3.13.0-11-generic x86_64
ApportVersion: 2.13.2-0ubuntu5
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ubuntu 2546 F.... pulseaudio
CasperVersion: 1.337
Date: Sat Feb 22 22:57:34 2014
LiveMediaBuild: Ubuntu 14.04 LTS "Trusty Tahr" - Alpha amd64 (20140222)
MachineType: SAMSUNG ELECTRONICS CO., LTD. 530U3C/530U4C
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: initrd=/casper/initrd.lz file=/cdrom/preseed/hostname.seed boot=casper quiet splash -- BOOT_IMAGE=/casper/vmlinuz.efi
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-11-generic N/A
 linux-backports-modules-3.13.0-11-generic N/A
 linux-firmware 1.125
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/15/2013
dmi.bios.vendor: Phoenix Technologies Ltd.
dmi.bios.version: P14AAJ
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: SAMSUNG_NP1234567890
dmi.board.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.board.version: FAB1
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 9
dmi.chassis.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.chassis.version: 0.1
dmi.modalias: dmi:bvnPhoenixTechnologiesLtd.:bvrP14AAJ:bd04/15/2013:svnSAMSUNGELECTRONICSCO.,LTD.:pn530U3C/530U4C:pvr0.1:rvnSAMSUNGELECTRONICSCO.,LTD.:rnSAMSUNG_NP1234567890:rvrFAB1:cvnSAMSUNGELECTRONICSCO.,LTD.:ct9:cvr0.1:
dmi.product.name: 530U3C/530U4C
dmi.product.version: 0.1
dmi.sys.vendor: SAMSUNG ELECTRONICS CO., LTD.

In , email (email-linux-kernel-bugs) wrote :

While the power supply device (ADP1) seems to always understand when the cord is plugged in and removed, the battery (BAT1) does not seem to know when it is discharging or not.

Samsung Series 9 np900x4b with Fedora 17

$ uname -a
Linux Lethe 3.4.4-3.fc17.x86_64 #1 SMP Tue Jun 26 20:54:56 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

(Results of acpi -i -b -a)
*** AC PLUGGED IN ***
Battery 0: Full, 100%
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: on-line

*** AC REMOVED ***
Battery 0: Charging, 100%, until charged
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: off-line

*** System Suspended and woken back up ***

*** AC STILL REMOVED ***
Battery 0: Discharging, 100%, 06:33:49 remaining
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: off-line

*** AC PLUGGED IN ***
Battery 0: Unknown, 98%
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: on-line

*** SUSPEND AND WAKE ***

*** AC PLUGGED IN ***
Battery 0: Unknown, 98%
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: on-line

*** AC REMOVED ***
Battery 0: Charging, 98%, charging at zero rate - will never fully charge.
Battery 0: design capacity 8400 mAh, last full capacity 8500 mAh = 100%
Adapter 0: off-line

The same information is reflected when looking directly at the sysfs entries.
/sys/class/power_supply/ADP1/online shows the correct status all the time (1 for online, 0 for offline).

/sys/class/power_supply/BAT1/status shows the correct status only after a fresh suspend, matching the results above.
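
The mismatch can be detected from the two sysfs files named above. This is a minimal sketch, assuming the `ADP1` and `BAT1` device names from this machine and flagging only the clearly contradictory combinations; the function names are hypothetical.

```python
from pathlib import Path


def read_power_state(root="/sys/class/power_supply",
                     adapter="ADP1", battery="BAT1"):
    """Return (adapter_online, battery_status) read from sysfs."""
    base = Path(root)
    online = (base / adapter / "online").read_text().strip() == "1"
    status = (base / battery / "status").read_text().strip()
    return online, status


def is_inconsistent(online, status):
    """Flag the contradictions seen in the acpi output above.

    Assumption: "Charging" without AC and "Discharging" with AC are
    treated as wrong; "Full" and "Unknown" are never flagged.
    """
    if online:
        return status == "Discharging"
    return status == "Charging"
```

Calling `is_inconsistent(*read_power_state())` after unplugging the cord would be expected to return True on an affected machine that still reports "Charging".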

In , email (email-linux-kernel-bugs) wrote :

Created attachment 74581
dmesg

In , email (email-linux-kernel-bugs) wrote :

Created attachment 74591
lsmod

In , email (email-linux-kernel-bugs) wrote :

Created attachment 74601
lspci

In , email (email-linux-kernel-bugs) wrote :

Created attachment 74611
upower -d

In , email (email-linux-kernel-bugs) wrote :

Created attachment 74621
ACPI DSDT DSL

In , email (email-linux-kernel-bugs) wrote :

The effect this has on my system is that upowerd does not always know if the laptop is running on AC or Battery. This also means that things like the battery indicator in Gnome 3 and tuned do not end up in the right state.

In , bryan+lk (bryan+lk-linux-kernel-bugs) wrote :

Also affects new Ivy Bridge Samsung Series 9 models (observed on 900X4C). Also tested this using mainline kernels without vendor changes and got the same behavior.

In , mark (mark-linux-kernel-bugs) wrote :

Just in case the NP400x4C (Ivy Bridge version) gives different values, here are the results of acpi -i -b -a on this system.

** AC Off **

Battery 0: Discharging, 70%, 03:46:06 remaining
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: off-line

** AC On **

Battery 0: Discharging, 70%, 01:55:16 remaining
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: on-line

** Suspend and wake **

** AC Still On **

Battery 0: Charging, 70%, 00:50:05 until charged
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: on-line

** AC Off **

Battery 0: Charging, 70%, 02:11:35 until charged
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: off-line

** Suspend and wake in this state **

** AC Off **

Battery 0: Discharging, 70%, 05:06:01 remaining
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: off-line

** AC On **

Battery 0: Discharging, 70%, 406:00:00 remaining
Battery 0: design capacity 8400 mAh, last full capacity 8700 mAh = 100%
Adapter 0: on-line

What I find quite interesting about these is that the times change as the AC state changes and seem to reflect the "correct" time, i.e. time to discharge when disconnected and time until charged when connected. So something is detecting the change; it's just not propagating to all properties.

In , mark (mark-linux-kernel-bugs) wrote :

Should have said, the above came from an Ubuntu 12.04 machine using mainline kernel packages.

Linux XXXXXX 3.4.0-030400-generic #201205210521 SMP Mon May 21 09:22:02 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

In , mail (mail-linux-kernel-bugs) wrote :

Also affected, Samsung 5 Series NP530U3C laptop, same symptoms:

Plugged in:
Battery 0: Charging, 79%, 00:27:20 until charged
Battery 0: design capacity 6100 mAh, last full capacity 5900 mAh = 96%
Adapter 0: on-line

Not plugged in:
Battery 0: Charging, 79%, 01:26:32 until charged
Battery 0: design capacity 6100 mAh, last full capacity 5900 mAh = 96%
Adapter 0: off-line

uname -a
Linux jaytec 3.2.0-27-generic #43-Ubuntu SMP Fri Jul 6 14:25:57 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Also had Ubuntu Quantal 3.5 kernel before wiping and reinstalling 3.2. The bug existed there also.

In , mail (mail-linux-kernel-bugs) wrote :

Created attachment 75961
dmesg

In , mail (mail-linux-kernel-bugs) wrote :

Created attachment 75971
lshw

In , mail (mail-linux-kernel-bugs) wrote :

Created attachment 75981
lsmod

In , mail (mail-linux-kernel-bugs) wrote :

Created attachment 75991
lspci

In , mail (mail-linux-kernel-bugs) wrote :

Created attachment 76001
upower

In , mail (mail-linux-kernel-bugs) wrote :

This seems fixed for me, although my Samsung laptop brightness buttons cause the screen to flicker and thus are useless now :)

uname -a
Linux jaytec 3.2.0-27-generic #43-Ubuntu SMP Fri Jul 6 14:25:57 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

The kernel has not been updated, so I'm not quite sure which update fixed it. Possibly a driver update from the Ubuntu PPA at http://ppa.launchpad.net/ubuntu-x-swat/x-updates/ which offers updated X drivers.

juanmanuel (rockerito99) wrote :
description: updated
juanmanuel (rockerito99) wrote :

Attached DSDT of a Samsung Series 5 NP530U3C ultrabook with the same problem. Bios version is latest: P14AAJ

description: updated
juanmanuel (rockerito99) wrote :

This is the program I made that "unsticks" the computer so that it can send LID, AC, and battery events again. (It queries the embedded controller's queued events, thus unblocking the EC so that it can start sending them again.)

Ideally run after resume from sleep, or at any other time.

juanmanuel (rockerito99) wrote :

Script that calls the program found in the other attachment after resume from sleep.

Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
juanmanuel (rockerito99) wrote : Re: LID close and AC and battery status events not produced anymore on samsung ultrabook.

This is a patch created by Lan Tianyu on the kernel bugzilla that does the same thing my workaround did, but in a more proper way, from within the kernel:

          https://bugzilla.kernel.org/show_bug.cgi?id=44161#c133

I tested it:

          https://bugzilla.kernel.org/show_bug.cgi?id=44161#c135

and it works.
Let's hope more people test this patch so that it can be included in the kernel some day.

description: updated
juanmanuel (rockerito99) wrote :

I posted a more detailed description of this issue, in blog format, here:

           http://www.zenstep.com.ar/samsung-laptop
--
Juan Manuel Cabo

juanmanuel (rockerito99) wrote :
tags: added: patch
penalvch (penalvch)
summary: - LID close and AC and battery status events not produced anymore on
- samsung ultrabook.
+ [Samsung NP530U3C-A01] LID close and AC and battery status events not
+ produced anymore
summary: - [Samsung NP530U3C-A01] LID close and AC and battery status events not
+ [Samsung NP530U3C-A01] LID close, AC, and battery status events not
produced anymore
penalvch (penalvch)
description: updated
juanmanuel (rockerito99) wrote :

Christopher: you changed my description of the bug, and in doing so, confused the two different ways to force the issue to show up. You listed the second way as a fourth item of the first way, when it is a different way.

Take a look at the original description.
--
Juan Manuel Cabo

penalvch (penalvch)
tags: added: cherry-pick
Guillaume LAURENT (laurent-guillaume) wrote :

Also affects Series 9 (NP900X3C in my case)

 Description: Ubuntu 13.10
 Release: 13.10

 Ubuntu 3.11.0-17.31-generic 3.11.10.3

juanmanuel's workaround works.

Waiting for the Ubuntu patch.

mmalmeida (mmalmeida) wrote :

Also affects: Samsung Series 9 NP900X4C

Tim Edwards (tkedwards) wrote :

I have a Samsung NP530U3C-A01 laptop and I can confirm this bug affects my laptop. After applying the workaround script from juanmanuel in #3 and #4 I re-tested and the problem no longer appears.

Thanks juanmanuel for writing the fix and being so patient with Ubuntu's ridiculous bug policies - just don't lose your laptop otherwise we'll have to start this whole bug report process again!

Andrea Lazzarotto (Lazza) (andrea-lazzarotto) wrote :

«Thanks juanmanuel for writing the fix and being so patient with Ubuntu's ridiculous bug policies»

I think it's more of a one-man trolling issue than a policy problem.

juanmanuel (rockerito99) wrote :

UPDATED v2: Safer because it obtains the EC ports automatically from /proc/ioports, and it micro-pauses between queries, so that the EC returns each event only once. This replaces the little program in #3.
________
This is the program I made that "unsticks" the computer so that it can send LID, AC, and battery events again. (It queries the embedded controller's queued events, thus unblocking the EC so that it can start sending them again.)

Ideally run after resume from sleep, or at any other time.
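
The attachment itself is not reproduced in this thread. As a rough illustration of the port lookup mentioned in the v2 note, the sketch below parses text in the format of /proc/ioports. It assumes the kernel labels the two regions `EC data` and `EC cmd`; `find_ec_ports` is a hypothetical name, and this is not juanmanuel's code.

```python
import re

# Matches lines such as "  0062-0062 : EC data" (label names are an assumption).
_PORT_LINE = re.compile(
    r"^\s*([0-9a-fA-F]+)-[0-9a-fA-F]+\s*:\s*(EC data|EC cmd)\s*$"
)


def find_ec_ports(ioports_text):
    """Return (data_port, cmd_port) parsed from /proc/ioports-style text."""
    ports = {}
    for line in ioports_text.splitlines():
        match = _PORT_LINE.match(line)
        if match:
            # Keep the first occurrence of each label.
            ports.setdefault(match.group(2), int(match.group(1), 16))
    if "EC data" not in ports or "EC cmd" not in ports:
        raise LookupError("EC ports not listed")
    return ports["EC data"], ports["EC cmd"]
```

Looking the ports up instead of hard-coding them means the program fails with an error on a machine where the EC regions are not listed, rather than writing to a guessed address.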

Tim Edwards (tkedwards) wrote :

@andrea-lazzarotto
To be fair it does say "One defect, per person, per hardware, per report" at https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue.

The 'One defect, per person' is obviously stupid when there's >100 of us with the same hardware and same problem, opening 100 bugs isn't going to help anyone. But I wouldn't make it personal against the guy if he's just a volunteer enforcing Canonical's rules.

Tomer (tbrisker) wrote :

@tkedwards
Only that rule was made to prevent multiple bugs being cluttered together in one bug report, not to be taken literally as a means of filling the bug system with hundreds of useless duplicates, and I quote:
"Don't amalgamate every issue and hardware you find a problem in after an update... Instead, do a separate report for each distinct issue, on each hardware. This is how different hardware can have similar problematic symptoms (ex. computer won't boot), but different root causes, and patches that fix the different causes."
Obviously this doesn't mean that every affected user should file a separate report for a common problem - otherwise what's the point of having the "this affects me too" on the top.

juanmanuel (rockerito99)
description: updated
Jon Cowell (info-synct) wrote :

I also have a Samsung NP-A530U3C-A01UK laptop that exhibits the same behaviour. Since applying the fix from juanmanuel all is now well.

Many thanks juanmanuel for identifying the cause for this long running issue and for working diligently to find the fix.

juanmanuel (rockerito99) wrote :

This is a NEW and BETTER patch created by Kieran Clancy on the kernel bugzilla that does the same thing my workaround did, but in a more proper way, from within the kernel:

          https://bugzilla.kernel.org/show_bug.cgi?id=44161#c149

I also tested it this weekend, and it works (I forced the issue to happen again, and then saw it being resolved automatically by a kernel compiled with this patch).

The most salient advantage over the previous patch is that it also unsticks the EC when the computer starts (in addition to on resume from sleep). This is more foolproof and suits more use cases.

It also checks the laptop model.

Let's hope more people test this patch so that it can be included in the kernel some day.

juanmanuel (rockerito99) wrote :

> Many thanks juanmanuel for identifying the cause for this long running issue and for working diligently to find the fix.

You're welcome, it fills me with joy reading all this good feedback!!
--
Juan Manuel Cabo
http://zenstep.com.ar/samsung-linux

Joseph Salisbury (jsalisbury) wrote :

@juanmanuel ,

Do you plan on submitting your patch for inclusion in the mainline kernel?

tags: added: kernel-da-key
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
juanmanuel (rockerito99) wrote :

Joseph: The best kernel patch so far was made by Kieran Clancy (see post #18 here). His patch is attached to a kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=44161 in comment 149.

I'm just the author of the userspace workaround (post #3, #4, #14) that started it all (see the other issue here https://bugs.launchpad.net/ubuntu/+source/acpi/+bug/971061 comments 102 onwards) . It is a standalone program that unstucks the EC. On the other hand, Kieran's kernel patch now does the same in a cleaner way from inside the kernel.

In my humble opinion, Kieran Clancy's patch can be included in the kernel. I don't know whether Kieran will submit it.

Cheers!
--
Juan Manuel Cabo

juanmanuel (rockerito99) wrote :

Breaking News!

A fix in the form of a kernel patch has now been posted to the linux-acpi and linux-kernel mailing lists:

        "[PATCH v2] ACPI / EC: Clear stale EC events on Samsung systems"
        http://marc.info/?l=linux-acpi&m=139359680828880&w=2

That kernel patch was made by Kieran Clancy and tested by me and others.

--
Juan Manuel Cabo
http://zenstep.com.ar/samsung-linux

zeeeeee (miesogeno) wrote :

juanmanuel,
No matter who found the latest/better solution, you are truly a hero.
After reading all the associated bug reports, I decided to use your script because I don't know how to compile the kernel, and I presume I would have to redo it every time the kernel got an update.

My laptop model is NP530U4C, finally with a normal behavior.
Gracias!

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

(In reply to Kieran Clancy from comment #238)
> Hi Lv,
>
> Thanks for all your work on this issue. It's certainly not easy, especially
> if you don't have the hardware available to test on yourself.
>
> You are probably right about the racy behaviour. One important point though,
> is that you would need to miss either 8 (or was it 16?) events in a row
> during run time for the buffer to fill and for SCI_EVT to stop being given
> entirely.
>
> Under normal usage, I think there's almost always going to be an event
> regularly enough that this "full" condition never occurs. It was only found
> to be a problem during suspend, where the EC controller seemed to continue
> logging, and after enough battery level or AC adapter changes it would stop
> triggering events even after resume.
>
> My HDD was too full recently to build kernels, but I have just freed some
> space (a bit late, sorry) so if you want me to test anything please let me
> know.

Hi,

Thanks a lot!
We failed to find this platform on sale in the Chinese market.

I had thought this issue did not exist at runtime. That's why I said in comment 236 that it might be a BIOS _WAK issue, or that something in acpi_hw_legacy_wake() had cleared SCI_EVT, and that we might need further investigation. But in comment 237 I found that it occurs not only after resuming but also at runtime, so further investigation no longer matters: according to the facts, this is definitely EC firmware behavior.

In our driver, we queue the 2nd QR_EC after sending the 1st QR_EC and before fetching the returned event. At that time, SCI_EVT should still be 1, so we can always send 2 QR_EC commands for 1 SCI_EVT=1 indication. This somewhat reduces the reproduction rate of the Samsung EC issue, and that's why it can hardly be reproduced at runtime: we drain events faster than they accumulate in the firmware.

But the above code only works by chance. If the host acts a bit slower than the target, this issue might still occur at runtime when more than 3 events are queued up, because the driver will be unable to queue a 3rd QR_EC if SCI_EVT is cleared after fetching the 1st event.

It seems the best approach for Samsung is to always send QR_EC, whatever SCI_EVT is, until 0x00 is returned. This driver behavior is reasonable from the ACPI spec's point of view. The CLEAR_ON_RESUME quirk you provided did this after resuming; we may need to extend this behavior to runtime.

But many platforms act differently:
1. Some platforms never return 0x00 even when SCI_EVT=0; they return a certain event value. Some of them even return an event value for which there is no _Qxx prepared in the namespace. So if we continuously send QR_EC until 0x00 is returned, the process will never end on such platforms. The users of such platforms will surely blame us.
2. Some platforms (Acer) do not return anything if SCI_EVT=0; the EC query transaction will be blocked, and our driver cannot issue a further EC transaction until the previous one is completed. So if we allow QR_EC to be sent without checking SCI_EVT, users of such platforms will complain.
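
The trade-off between draining until 0x00 and platforms that never return 0x00 can be illustrated with a small simulation. This is only a sketch, not the kernel driver: `FakeEC`, `NeverZeroEC`, `drain_events`, the cap of 20 queries, and the event numbers are all hypothetical, chosen to show why an unbounded drain loop is unsafe.

```python
from collections import deque

QR_EC = 0x84  # EC query command, as named in this thread


class FakeEC:
    """Simulated EC holding a queue of pending event numbers."""

    def __init__(self, pending):
        self._queue = deque(pending)

    def command(self, cmd):
        if cmd != QR_EC:
            raise ValueError("only QR_EC is simulated")
        # A well-behaved EC answers 0x00 once its queue is empty.
        return self._queue.popleft() if self._queue else 0x00


class NeverZeroEC:
    """Simulated EC that never answers 0x00 (platform type 1 above)."""

    def command(self, cmd):
        return 0x66  # arbitrary, hypothetical event number


def drain_events(ec, max_queries=20):
    """Send QR_EC until 0x00 comes back or the query cap is reached."""
    drained = []
    for _ in range(max_queries):
        event = ec.command(QR_EC)
        if event == 0x00:
            break
        drained.append(event)
    return drained
```

The cap keeps the loop finite on the first kind of platform. It does nothing for the second kind (Acer), where the query itself blocks, which is why the draining still has to be limited to machines known to need it.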

What I'm going to do is:
1. extending the draining behavior - poll...

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

Hi, Kieran

Can you:
1. use current linux-pm/linux-next branch
2. enable EC debugging
3. disable quirk
4. trigger the bug
5. capture the dmesg right after resume
6. upload the dmesg here
for investigation?

Thanks in advance

In , clancy.kieran+kernel (clancy.kieran+kernel-linux-kernel-bugs) wrote :

Created attachment 157601
dmesg on linux-next without quirk showing EC working before but not after triggering bug

(In reply to Lv Zheng from comment #240)
> Hi, Kieran
>
> Can you:
> 1. use current linux-pm/linux-next branch
> 2. enable EC debugging

Is it enough to just uncomment the #define DEBUG?

> 3. disable quirk

Is it enough to prevent EC_FLAGS_CLEAR_ON_RESUME from being set? Is this what you meant?

> 4. trigger the bug
> 5. capture the dmesg right after resume
> 6. upload the dmesg here

See attached. I annotated the dmesg with a few extra > /dev/kmsg lines starting with 'KC'. They are:

[ 107.787915] KC just booted and logged in
[ 137.240937] KC screen close/open works as intended
[ 203.943414] KC about to suspend, but not trigger bug this time
[ 226.584995] KC back from suspend
[ 265.000194] KC screen close/open works as intended
[ 298.608956] KC AC change detected as intended
[ 318.874279] KC about to suspend and trigger bug
[ 348.834217] KC back from suspend
[ 374.777099] KC screen open/close not detected
[ 398.000040] KC AC change not detected
[ 425.733615] KC about to manually clear EC events with script
[ 442.663539] KC EC events cleared
[ 469.766095] KC AC change detected again
[ 496.219180] KC screen open/close detected again

So between 0 and 318 you can see the EC operating as intended including one suspend cycle.

After 318 I trigger the bug (by unplugging AC a bunch of times), and after that you can see that there is nothing at all logged when I either closed the screen or unplugged AC.

At 425 I manually simulated the quirk by sending 0x84 EC command queries and reading the data until the data is 0x00 using Juan Manuel's userspace program.

After that, EC events are detected properly again.

----

My analysis is below:

After the bug is triggered, SCI_EVT=1 is set just ONE time, immediately after resume:

[ 337.319012] ACPI : EC: EC_SC(R) = 0x28 SCI_EVT=1 BURST=0 CMD=1 IBF=0 OBF=0

It does not seem as though we ever handle this event properly though. Namely, there seems to be no corresponding "EC_SC(W) = 0x84". There is a couple of "EC_DATA(W) = 0x84" but I'm pretty sure these are totally different?
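
The status byte in that log line can be decoded by hand. This is a minimal sketch, assuming the EC status register bit layout from the ACPI specification (OBF is bit 0, IBF bit 1, CMD bit 3, BURST bit 4, SCI_EVT bit 5); `decode_ec_status` is a hypothetical helper, not a kernel function.

```python
# Bit masks of the EC status register (EC_SC), per the ACPI specification.
EC_FLAGS = {
    "OBF": 0x01,      # output buffer full
    "IBF": 0x02,      # input buffer full
    "CMD": 0x08,      # last write was a command
    "BURST": 0x10,    # burst mode
    "SCI_EVT": 0x20,  # an event is pending
}


def decode_ec_status(status):
    """Return a dict mapping each flag name to 0 or 1."""
    return {name: int(bool(status & mask)) for name, mask in EC_FLAGS.items()}
```

For 0x28 this gives SCI_EVT=1 and CMD=1 with the other flags at 0, which matches the decoding printed in the log line.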

----

Further testing shows that this SCI_EVT=1 happens for the first resume after the bug is triggered, but not for the second or subsequent resumes.

That means that we are still going to need a wakeup quirk because if for some reason we fail to clear the EC state before the next suspend, we will never get another SCI_EVT=1 (even after a power cycle, I believe).

In , clancy.kieran+kernel (clancy.kieran+kernel-linux-kernel-bugs) wrote :

Created attachment 157611
dmesg on linux-next without quirk showing bug persisting over several suspend cycles

Of the three suspend/resume cycles shown in this dmesg, I trigger the EC bug during the first suspend time.

It comes back with SCI_EVT=1 set the first time, but the 2nd and 3rd resumes do not have this. In a moment I will confirm if this persists across power cycles.

It seems the right behaviour for affected Samsung machines is to send QR_EC until data is 0x00 not just if we get SCI_EVT=1, but additionally on boot or resume.

In , clancy.kieran+kernel (clancy.kieran+kernel-linux-kernel-bugs) wrote :

Created attachment 157631
dmesg on linux-next without quirk showing bug persisting over power cycle

Confirming that once the bug is initially triggered, we don't get SCI_EVT=1 even on a power cycle, so we still need the boot/resume quirk. At least, that's my interpretation.

I hope the dmesgs were helpful. Unfortunately I'm not going to have internet access for the next week (going to the outback where there is not even any mobile reception), but I'm happy to test more things when I return.

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

(In reply to Kieran Clancy from comment #241)
> Created attachment 157601 [details]
> dmesg on linux-next without quirk showing EC working before but not after
> triggering bug
>
> (In reply to Lv Zheng from comment #240)
> > Hi, Kieran
> >
> > Can you:
> > 1. use current linux-pm/linux-next branch
> > 2. enable EC debugging
>
> Is it enough to just uncomment the #define DEBUG?

Yes.

> > 3. disable quirk
>
> Is it enough to prevent EC_FLAGS_CLEAR_ON_RESUME from being set? Is this
> what you meant?

Yes.

> > 4. trigger the bug
> > 5. capture the dmesg right after resume
> > 6. upload the dmesg here
>
> See attached. I annotated the dmesg with a few extra > /dev/kmsg lines
> starting with 'KC'. They are:
>
> [ 107.787915] KC just booted and logged in
> [ 137.240937] KC screen close/open works as intended
> [ 203.943414] KC about to suspend, but not trigger bug this time
> [ 226.584995] KC back from suspend
> [ 265.000194] KC screen close/open works as intended
> [ 298.608956] KC AC change detected as intended
> [ 318.874279] KC about to suspend and trigger bug
> [ 348.834217] KC back from suspend
> [ 374.777099] KC screen open/close not detected
> [ 398.000040] KC AC change not detected
> [ 425.733615] KC about to manually clear EC events with script
> [ 442.663539] KC EC events cleared
> [ 469.766095] KC AC change detected again
> [ 496.219180] KC screen open/close detected again

Great information!
Thanks.

> So between 0 and 318 you can see the EC operating as intended including one
> suspend cycle.
>
> After 318 I trigger the bug (by unplugging AC a bunch of times), and after
> that you can see that there is nothing at all logged when I either closed
> the screen or unplugged AC.
>
> At 425 I manually simulated the quirk by sending 0x84 EC command queries and
> reading the data until the data is 0x00 using Juan Manuel's userspace
> program.
>
> After that, EC events are detected properly again.

Great test cases!
Thanks.

> My analysis is below:
>
> After the bug is triggered, SCI_EVT=1 is set just ONE time, immediately
> after resume:
>
> [ 337.319012] ACPI : EC: EC_SC(R) = 0x28 SCI_EVT=1 BURST=0 CMD=1 IBF=0 OBF=0
>
> It does not seem as though we ever handle this event properly though.
> Namely, there seems to be no corresponding "EC_SC(W) = 0x84". There is a
> couple of "EC_DATA(W) = 0x84" but I'm pretty sure these are totally
> different?

This is a bug, in ec_poll(), there is no code to check SCI_EVT.
And event will thus be lost.

I think this has been fixed in the ec-flush6.patch.
+out:
+ if (status & ACPI_EC_FLAG_SCI &&
+ (!t || t->flags & ACPI_EC_COMMAND_COMPLETE))
+ __acpi_ec_set_event(ec);

We will have this flag checked in the advance_transaction.
So you won't see this with ec-flush[1-6].patch applied.

> Further testing shows that this SCI_EVT=1 happens for the first resume after
> the bug is triggered, but not for the second or subsequent resumes.
>
> That means that we are still going to need a wakeup quirk because if for
> some reason we fail to clear the EC state before the next suspend, we will
> never get another SCI_EVT=1 (even after a power cycle, I believe).

Yes, I think this is requi...

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

(In reply to Kieran Clancy from comment #242)
> Created attachment 157611 [details]
> dmesg on linux-next without quirk showing bug persisting over several
> suspend cycles
>
> Of the three suspend/resume cycles shown in this dmesg, I trigger the EC bug
> during the first suspend time.
>
> It comes back with SCI_EVT=1 set the first time, but the 2nd and 3rd resumes
> do not have this. In a moment I will confirm if this persists across power
> cycles.
>
> It seems the right behaviour for affected Samsung machines is to send QR_EC
> until data is 0x00 not just if we get SCI_EVT=1, but additionally on boot or
> resume.

Yes.
I was planning to support this in this way.

Thanks

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

(In reply to Kieran Clancy from comment #243)
> Created attachment 157631 [details]
> dmesg on linux-next without quirk showing bug persisting over power cycle
>
> Confirming that once the bug is initially triggered, we don't get SCI_EVT=1
> even on a power cycle, so we still need the boot/resume quirk. At least,
> that's my interpretation.
>
> I hope the dmesgs were helpful. Unfortunately I'm not going to have internet
> access for the next week (going to the outback where there is not even any
> mobile reception), but I'm happy to test more things when I return.

Yes, it's very helpful.
Thanks for the help!

Best regards
-Lv

In , lv.zheng (lv.zheng-linux-kernel-bugs) wrote :

(In reply to Lv Zheng from comment #244)
> (In reply to Kieran Clancy from comment #241)
> > My analysis is below:
> >
> > After the bug is triggered, SCI_EVT=1 is set just ONE time, immediately
> > after resume:
> >
> > [ 337.319012] ACPI : EC: EC_SC(R) = 0x28 SCI_EVT=1 BURST=0 CMD=1 IBF=0 OBF=0
> >
> > It does not seem as though we ever handle this event properly though.
> > Namely, there seems to be no corresponding "EC_SC(W) = 0x84". There is a
> > couple of "EC_DATA(W) = 0x84" but I'm pretty sure these are totally
> > different?
>
> This is a bug, in ec_poll(), there is no code to check SCI_EVT.
> And event will thus be lost.
>
> I think this has been fixed in the ec-flush6.patch.
> +out:
> + if (status & ACPI_EC_FLAG_SCI &&
> + (!t || t->flags & ACPI_EC_COMMAND_COMPLETE))
> + __acpi_ec_set_event(ec);
>
> We will have this flag checked in the advance_transaction.
> So you won't see this with ec-flush[1-6].patch applied.

I should take this back. :-)
This still needs to be improved to indicate SCI_EVT even when a transaction is running.

Thanks
-Lv
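
For anyone reading the dmesg lines quoted in this thread, the EC status byte can be decoded by hand. The sketch below uses the bit layout of the EC status register from the ACPI specification (OBF = bit 0, IBF = bit 1, CMD = bit 3, BURST = bit 4, SCI_EVT = bit 5). The function names are hypothetical and exist only for this illustration.

```python
# Bit masks for the EC status register (EC_SC), per the ACPI specification.
EC_FLAGS = {
    "SCI_EVT": 0x20,  # an event is waiting to be queried
    "BURST": 0x10,    # burst mode enabled
    "CMD": 0x08,      # last write was a command, not data
    "IBF": 0x02,      # input buffer full
    "OBF": 0x01,      # output buffer full
}


def decode_ec_status(status):
    """Map each flag name to 0 or 1 for the given status byte."""
    return {name: int(bool(status & mask)) for name, mask in EC_FLAGS.items()}


def format_ec_status(status):
    """Render the status byte in the same style as the dmesg debug line."""
    fields = " ".join(f"{k}={v}" for k, v in decode_ec_status(status).items())
    return f"EC_SC(R) = 0x{status:02x} {fields}"


if __name__ == "__main__":
    print(format_ec_status(0x28))
    # prints: EC_SC(R) = 0x28 SCI_EVT=1 BURST=0 CMD=1 IBF=0 OBF=0
```

The value 0x28 is 0x20 plus 0x08, so SCI_EVT and CMD are set, which matches the log line quoted above.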

Revision history for this message
In , lenb (lenb-linux-kernel-bugs) wrote :

commit 74443bbed72ab22ee005ecb6ecdc657a8018e1db
Author: Lv Zheng <email address hidden>
Date: Wed Jan 14 19:28:47 2015 +0800

    ACPI / EC: Fix issues related to the SCI_EVT handling

shipped in Linux-4.0-rc1
closed.

Revision history for this message
In , el (el-linux-kernel-bugs) wrote :

commit 4c237371f290d1ed3b2071dd43554362137b1cce
Author: Lv Zheng <email address hidden>
Date: Wed Jan 4 11:17:17 2017 +0800

    ACPI / EC: Remove old CLEAR_ON_RESUME quirk

The above reintroduced this bug on my NP900x4c. Reverting the commit makes things work again.

Revision history for this message
In , balazs4web (balazs4web-linux-kernel-bugs) wrote :

(In reply to Elvis Pranskevichus from comment #249)
> commit 4c237371f290d1ed3b2071dd43554362137b1cce
> Author: Lv Zheng <email address hidden>
> Date: Wed Jan 4 11:17:17 2017 +0800
>
> ACPI / EC: Remove old CLEAR_ON_RESUME quirk
>
> The above reintroduced this bug on my NP900x4c. Reverting the commit makes
> things work again.

Confirmed, the bug is back again on Samsung laptops :-(

https://bbs.archlinux.org/viewtopic.php?pid=1767061#p1767061

Revision history for this message
In , mark (mark-linux-kernel-bugs) wrote :

Not exactly surprising when the change that fixed it then gets reverted out of the kernel. Seems like Intel is trying to force obsolescence of these systems.

Revision history for this message
In , balazs4web (balazs4web-linux-kernel-bugs) wrote :

(In reply to Mark Syms from comment #251)
> Not exactly surprising when the change that fixed it then gets reverted out of
> the kernel. Seems like Intel is trying to force obsolescence of these systems.

What has to be done to bring the fix back into the kernel code?
At first glance the fix applied only to `Samsung hardware`, so I guess it does not affect any other vendors or hardware.
The mentioned commit looks like just a `code cleanup` change, but I'm not familiar with the kernel code.

Shall I file a new issue or could this be opened again?

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

Lv is not working on this now, and I'm new to the EC code, so let's first confirm a quick revert patch.

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

Created attachment 274121
revert patch

please confirm
1. the problem exists with the latest upstream kernel, say, 4.16-rc1
2. the problem is gone after applying the patch attached

Revision history for this message
In , odi (odi-linux-kernel-bugs) wrote :

I can confirm that the regression exists with 4.15.3 and that your patch fixes it. 4.16 not tested.

Revision history for this message
In , cribari (cribari-linux-kernel-bugs) wrote :

I am experiencing the same problem (Arch Linux, kernel 4.15.3, KDE). Hardware: Samsung laptop model 900X3L. Will the patch be included in the Linux kernel?

Revision history for this message
In , balazs4web (balazs4web-linux-kernel-bugs) wrote :

Created attachment 274447
revert-patch works

Confirmed, the patch works on top of `4.15`.

Revision history for this message
In , cribari (cribari-linux-kernel-bugs) wrote :

I am still facing this problem with Arch Linux + kernel 4.16.0-2. Hardware is a Samsung laptop (model: NP900X3L-KW1BR). Suggestions are welcome.

Revision history for this message
In , cribari (cribari-linux-kernel-bugs) wrote :

I compiled kernels 4.16.0 and 4.16.2 (in Arch Linux) using the patch in Comment #254 and I confirm that the patch fixes the problem. (Hardware: Samsung model NP900X3L-KW1BR.)

Revision history for this message
In , yu.c.chen (yu.c.chen-linux-kernel-bugs) wrote :

Hi @Rui,
may I know if the patch in comment #254 will be pushed upstream?

Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
Revision history for this message
In , jonas (jonas-linux-kernel-bugs) wrote :

Hi

Poll on status for this bug. Does anyone know if/when it will be fixed upstream?

Revision history for this message
In , cribari (cribari-linux-kernel-bugs) wrote :

Has this been fixed upstream? Despite what we read at

https://www.systutorials.com/linux-kernels/59801/acpi-ec-fix-regression-related-to-triggering-source-of-ec-event-handling-linux-4-14-3/

some people still face the problem. Thank you.

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

Sorry, I thought I had already sent this patch upstream, but apparently I hadn't.

(In reply to Francisco Cribari from comment #263)
> Has this been fixed upstream? Despite what we read at
>
> https://www.systutorials.com/linux-kernels/59801/acpi-ec-fix-regression-related-to-triggering-source-of-ec-event-handling-linux-4-14-3/
>
I can not access this page.
But if you can confirm the problem still exists in 4.19-rc1, and the revert indeed fixes it, I will send the revert patch out.

Changed in linux:
status: Confirmed → Incomplete
Revision history for this message
Kira (kirasglimmer) wrote :

This happens for me.

Linux 4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I'm running Linux Mint 19 Cinnamon 3.8.9. If I disconnect power, remove the back cover, disconnect both the main battery and the backup battery, I am able to get power status updates (e.g. battery will show as charging/charged when plugged in, or just regular battery when not). Once I sleep, I have to repeat this process if I want the battery status to work again.

`acpi -p` correctly shows the charging state, but `tlp-stat -b` does not.

Revision history for this message
Kira (kirasglimmer) wrote :

Additionally,

`acpitool -ab` shows when unplugged,
  Battery #1 : Charging, 100.0%, 00:00:00
  AC adapter : off-line

and when plugged in:
  Battery #1 : Charging, 100.0%, 00:00:00
  AC adapter : online

Note: Battery charge percentage does work.

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

Created attachment 278797
revert patch on top of 4.19-rc4

Please confirm that this revert patch works for you on top of a 4.19-rc kernel.
I will send it out once I have your confirmation.

Revision history for this message
In , odi (odi-linux-kernel-bugs) wrote :

(In reply to Zhang Rui from comment #265)
> Please confirm that this revert patch works for you on top of a 4.19-rc kernel.
> I will send it out once I have your confirmation.

It does! Also the CPU max frequency is now back to 2.4GHz after unplug/replug cycle.

VANILLA 4.19-rc4

plugged:
bat ~ # cat /sys/class/power_supply/ADP1/online
1
bat ~ # cat /sys/class/power_supply/BAT1/status
Charging

unplugged:
bat ~ # cat /sys/class/power_supply/ADP1/online
0
bat ~ # cat /sys/class/power_supply/BAT1/status
Charging

PATCHED 4.19-rc4

plugged:
bat ~ # cat /sys/class/power_supply/BAT1/status
Charging

unplugged:
bat ~ # cat /sys/class/power_supply/ADP1/online
0
bat ~ # cat /sys/class/power_supply/BAT1/status
Discharging

replugged:
bat ~ # cat /sys/class/power_supply/ADP1/online
1
bat ~ # cat /sys/class/power_supply/BAT1/status
Charging
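
The manual check above can be scripted. The following is a minimal sketch, assuming the adapter and battery are exposed as `ADP1` and `BAT1` as in the transcript (other machines may use different names). The helper names `read_power_state` and `looks_stale` are hypothetical. The check is only a heuristic based on the symptom shown in this comment: the battery still reports "Charging" while the adapter reports offline.

```python
import os


def read_power_state(base="/sys/class/power_supply",
                     adapter="ADP1", battery="BAT1"):
    """Return (online, status) as stripped strings read from sysfs."""
    with open(os.path.join(base, adapter, "online")) as f:
        online = f.read().strip()
    with open(os.path.join(base, battery, "status")) as f:
        status = f.read().strip()
    return online, status


def looks_stale(online, status):
    """Heuristic: 'Charging' while the adapter is offline suggests stuck events."""
    return online == "0" and status == "Charging"


if __name__ == "__main__":
    try:
        online, status = read_power_state()
    except OSError as exc:
        # The device names are assumptions and may not exist on this machine.
        print(f"could not read power supply state: {exc}")
    else:
        verdict = "possibly stale" if looks_stale(online, status) else "consistent"
        print(f"ADP1/online={online} BAT1/status={status} -> {verdict}")
```

Run it once while unplugged: on the vanilla kernel output above it would report a possibly stale state, while on the patched kernel it would report a consistent one.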

Revision history for this message
In , jonas (jonas-linux-kernel-bugs) wrote :

(In reply to Zhang Rui from comment #265)
> Created attachment 278797 [details]
> revert patch on top of 4.19-rc4
>
> Please confirm that this revert patch works for you on top of a 4.19-rc kernel.
> I will send it out once I have your confirmation.

Hi Rui,

Any chance this will hit 4.19 rc before final release?

Revision history for this message
In , odi (odi-linux-kernel-bugs) wrote :

ping... what else is required to get this merged?

Revision history for this message
In , pablo.caron (pablo.caron-linux-kernel-bugs) wrote :

The bug is still present in kernel 4.20. I am using Ubuntu and installed the "standard" kernel downloaded from:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v4.20/

Revision history for this message
In , mjg (mjg-linux-kernel-bugs) wrote :

Same problem on 4.20.11 on Fedora 29 (which has a patched upower, so it's not that other bug).

With the revert patch from comment #265, everything works as expected (and as it used to before the regression).

Please let me know if you need more info.

In case someone wants to test on Fedora:

https://copr.fedorainfracloud.org/coprs/mjg/kernel-book9/

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

The revert patch has been merged by Rafael, and it will show up in v5.1

Revision history for this message
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :
Changed in linux:
status: Incomplete → Fix Released
Revision history for this message
In , mjg (mjg-linux-kernel-bugs) wrote :

(In reply to Zhang Rui from comment #271)
> The revert patch has been merged by Rafael, and it will show up in v5.1

Just for anyone wondering: it's in the stable kernel 5.0.9, also.

Revision history for this message
In , pablo.caron (pablo.caron-linux-kernel-bugs) wrote :

(In reply to Michael J Gruber from comment #273)
> (In reply to Zhang Rui from comment #271)
> > The revert patch has been merged by Rafael, and it will show up in v5.1
>
> Just for anyone wondering: it's in the stable kernel 5.0.9, also.

I can confirm that the bug has been fixed in the stable kernel 5.0.9 downloaded from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.0.9/

Thank you!

Below are some tests I ran to check the correct behavior.

~$ uname -sr
Linux 5.0.9-050009-generic
Plugged
~$ cat /sys/class/power_supply/BAT1/status
Charging

Unplugged
~$ cat /sys/class/power_supply/ADP1/online
0
~$ cat /sys/class/power_supply/BAT1/status
Discharging

Replugged
~$ cat /sys/class/power_supply/ADP1/online
1
~$ cat /sys/class/power_supply/BAT1/status
Charging

Brad Figg (brad-figg)
tags: added: cscc