Battery drains when laptop is off (shutdown)

Bug #1745646 reported by Gopal on 2018-01-26
This bug affects 1 person
Affects           Status                     Importance   Assigned to        Milestone
Linux             Unknown                    Unknown
linux (Ubuntu)    Status tracked in Cosmic
  Artful                                     Medium       Joseph Salisbury
  Bionic                                     Medium       Joseph Salisbury
  Cosmic                                     Medium       Joseph Salisbury

Bug Description

== SRU Justification ==
A regression was introduced in 4.13-rc1 and newer kernels. This
regression caused battery drain during system suspend, hibernation or
shutdown for some PCI devices that are not allowed by user space to wake
up the system from sleep (or power off).

This fix has been submitted upstream and cc'd to stable. However, it
has not landed in linux-next or mainline yet, so it is being sent as
SAUCE.
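
To see which PCI devices user space currently allows to wake the system, the following minimal sketch prints the sysfs power/wakeup attribute of each device. It assumes the standard /sys/bus/pci/devices layout. The function name list_pci_wakeup and its optional directory argument are illustrative and not part of this report.

```shell
#!/bin/sh
# Print "<device> <enabled|disabled>" for every PCI device that exposes a
# power/wakeup attribute. The optional argument overrides the sysfs
# directory (an assumption made here to keep the function easy to test).
list_pci_wakeup() {
    root=${1:-/sys/bus/pci/devices}
    for dev in "$root"/*; do
        f="$dev/power/wakeup"
        # Devices that cannot generate wakeup events may not have this file.
        [ -r "$f" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$f")"
    done
}

# Usage on a running system:
#   list_pci_wakeup
```

A device listed as disabled is one that user space has not allowed to wake the system, which is the case the justification above is concerned with.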

== Fix ==
UBUNTU: SAUCE: PCI / PM: Check device_may_wakeup() in pci_enable_wake()

== Regression Potential ==
Medium. Commit fixes a current regression, but affects PCI power management.
It will also be submitted to upstream stable and have additional review.
The commit is a clean cherry pick, builds successfully and was confirmed
to resolve regression.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

== Original Bug Description ==
I am using an HP AY008tx laptop, and many other HP laptop users are facing the same issue. The problem does not occur when I install Windows 10. Currently only Ubuntu is installed.
I am using the latest BIOS (Insyde20 rev 5).
WOL is disabled and no USB device is connected.
I have checked my battery.
It does not happen when I remove the battery and plug it back in after shutdown.
I have also tried laptop_mode_tools and TLP.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.13.0-31-generic 4.13.0-31.34~16.04.1
ProcVersionSignature: Ubuntu 4.13.0-31.34~16.04.1-generic 4.13.13
Uname: Linux 4.13.0-31-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.20.1-0ubuntu2.15
Architecture: amd64
CurrentDesktop: Unity
Date: Fri Jan 26 22:50:53 2018
InstallationDate: Installed on 2018-01-25 (1 days ago)
InstallationMedia: Ubuntu 16.04.3 LTS "Xenial Xerus" - Release amd64 (20170801)
SourcePackage: linux-hwe
UpgradeStatus: No upgrade log present (probably fresh install)
---
ApportVersion: 2.20.1-0ubuntu2.15
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gopal 2391 F.... pulseaudio
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=0bf03973-049c-480e-9e14-7596bf68d994
InstallationDate: Installed on 2018-01-25 (1 days ago)
InstallationMedia: Ubuntu 16.04.3 LTS "Xenial Xerus" - Release amd64 (20170801)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 04f2:b56c Chicony Electronics Co., Ltd
 Bus 001 Device 002: ID 0a5c:216d Broadcom Corp.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP Notebook
NonfreeKernelModules: wl
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_IN:en
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_IN
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-31-generic.efi.signed root=UUID=3a4ef59e-130a-4ce0-92d3-2fc42f5c9f59 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 4.13.0-31.34~16.04.1-generic 4.13.13
PulseList:
 Error: command ['pacmd', 'list'] failed with exit code 1: Home directory not accessible: Permission denied
 No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-31-generic N/A
 linux-backports-modules-4.13.0-31-generic N/A
 linux-firmware 1.157.15
Tags: xenial
Uname: Linux 4.13.0-31-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 11/01/2017
dmi.bios.vendor: Insyde
dmi.bios.version: F.40
dmi.board.asset.tag: Type2 - Board Asset Tag
dmi.board.name: 81EC
dmi.board.vendor: HP
dmi.board.version: 61.58
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnInsyde:bvrF.40:bd11/01/2017:svnHP:pnHPNotebook:pvrType1ProductConfigId:rvnHP:rn81EC:rvr61.58:cvnHP:ct10:cvrChassisVersion:
dmi.product.family: 103C_5335KV HP Notebook
dmi.product.name: HP Notebook
dmi.product.version: Type1ProductConfigId
dmi.sys.vendor: HP

Gopal (s10gopal) wrote :
Gopal (s10gopal) on 2018-01-26
description: updated
Gopal (s10gopal) on 2018-01-26
Changed in linux-hwe (Ubuntu):
status: New → Confirmed
Gopal (s10gopal) on 2018-01-26
Changed in linux-hwe (Ubuntu):
status: Confirmed → New
Gopal (s10gopal) on 2018-01-26
description: updated
TJ (tj) on 2018-01-26
affects: linux-hwe (Ubuntu) → linux (Ubuntu)
Gopal (s10gopal) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected
description: updated
Gopal (s10gopal) wrote : CRDA.txt

apport information

Gopal (s10gopal) wrote : IwConfig.txt

apport information

Gopal (s10gopal) wrote : Lspci.txt

apport information

Gopal (s10gopal) wrote : RfKill.txt

apport information

Gopal (s10gopal) wrote : UdevDb.txt

apport information

description: updated

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
TJ (tj) wrote :

Attached disassembly of DSDT/SSDTs. iasl reports:

     * iASL Warning: There were 18 external control methods found during
     * disassembly, but only 9 were resolved (9 unresolved).

TJ (tj) wrote :

The TPM (Trusted Platform Module) ACPI device isn't known by the v4.13 kernel. Support for it was introduced with commit 4cb586a18 in v4.14, so it should be supported by the 18.04 kernel once it is rebased to v4.15, and will eventually be available to 16.04 as the package linux-image-lowlatency-hwe-16.04-edge.

git describe --contains 4cb586a18
v4.14-rc2~6^2~34
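
The describe output above names a tag followed by an ancestry path: the commit is reachable from v4.14-rc2 by walking back through the listed parents. A small sketch that extracts just the tag (the helper name describe_tag is made up for illustration):

```shell
#!/bin/sh
# Strip the ancestry suffix (~N and ^N steps) from `git describe --contains`
# output, leaving only the tag from which the commit is reachable.
describe_tag() {
    printf '%s\n' "${1%%[~^]*}"
}

describe_tag 'v4.14-rc2~6^2~34'   # prints v4.14-rc2
```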

It's possible the TPM is active and not using power-saving modes when the system is shut down, since Linux v4.13 doesn't know how to manage it (but it will have been enabled by the firmware at boot time). This could explain why Windows doesn't see the same issue, if its TPM driver puts the device to sleep.

I'll review the disassembled DSDT/SSDTs later and leave remarks if I notice anything else that could be the culprit.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.15 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Gopal (s10gopal) wrote :

kernel-bug-exists-upstream

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Gopal (s10gopal) on 2018-01-30
summary: - Battery drain when laptop off (shutdown) , WOL disabled , no usb device
- connected
+ Battery drains when laptop off (shutdown)
description: updated
summary: - Battery drains when laptop off (shutdown)
+ Battery drains when laptop is off (shutdown)
Andy Whitcroft (apw) on 2018-02-03
tags: added: kernel-bug-exists-upstream
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream Developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Gopal (s10gopal) wrote :

System: Host: gopal-HP-Notebook Kernel: 4.15.0-041500-generic x86_64 (64 bit gcc: 7.2.0)
           Desktop: Unity 7.4.0 (Gtk 3.18.9-1ubuntu3.3) dm: lightdm
           Distro: Ubuntu 16.04 xenial
Machine: System: HP product: HP Notebook v: Type1ProductConfigId
           Mobo: HP model: 81EC v: 61.58 Bios: Insyde v: F.40 date: 11/01/2017
           Chassis: type: 10
CPU: Dual core Intel Core i5-6200U (-HT-MCP-) cache: 3072 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 9600
           clock speeds: min/max: 400/2800 MHz 1: 987 MHz 2: 547 MHz
           3: 567 MHz 4: 541 MHz
Graphics: Card-1: Intel Sky Lake Integrated Graphics
           bus-ID: 00:02.0 chip-ID: 8086:1916
           Card-2: Advanced Micro Devices [AMD/ATI] Sun XT [Radeon HD 8670A/8670M/8690M / R5 M330]
           bus-ID: 01:00.0 chip-ID: 1002:6660
           Display Server: X.Org 1.19.5 drivers: (unloaded: fbdev,vesa)
           Resolution: 1366x768@60.02hz
           GLX Renderer: Mesa DRI Intel HD Graphics 520 (Skylake GT2)
           GLX Version: 3.0 Mesa 17.2.4 Direct Rendering: Yes
Audio: Card Intel Sunrise Point-LP HD Audio
           driver: snd_hda_intel bus-ID: 00:1f.3 chip-ID: 8086:9d70
           Sound: Advanced Linux Sound Architecture v: k4.15.0-041500-generic
Network: Card-1: Realtek RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller
           driver: r8169 v: 2.3LK-NAPI port: 3000
           bus-ID: 02:00.0 chip-ID: 10ec:8136
           IF: enp2s0 state: up speed: 100 Mbps duplex: full mac: <filter>
           Card-2: Broadcom BCM43142 802.11b/g/n
           bus-ID: 03:00.0 chip-ID: 14e4:4365
           IF: N/A state: N/A mac: N/A
Drives: HDD Total Size: 120.0GB (4.7% used)
           ID-1: /dev/sda model: ADATA_SP580 size: 120.0GB serial: H30276K015977 temp: 43C
Partition: ID-1: / size: 110G used: 5.3G (6%) fs: ext4 dev: /dev/sda1
RAID: System: supported: N/A
           No RAID devices: /proc/mdstat, md_mod kernel module present
           Unused Devices: none
Sensors: System Temperatures: cpu: 37.0C mobo: 29.8C
           Fan Speeds (in rpm): cpu: N/A
Repos: Active apt sources in file: /etc/apt/sources.list
           deb http://in.archive.ubuntu.com/ubuntu/ xenial main restricted
           deb http://in.archive.ubuntu.com/ubuntu/ xenial-updates main restricted
           deb http://in.archive.ubuntu.com/ubuntu/ xenial universe
           deb http://in.archive.ubuntu.com/ubuntu/ xenial-updates universe
           deb http://in.archive.ubuntu.com/ubuntu/ xenial multiverse
           deb http://in.archive.ubuntu.com/ubuntu/ xenial-updates multiverse
           deb http://in.archive.ubuntu.com/ubuntu/ xenial-backports main restricted universe multiverse
           deb http://security.ubuntu.com/ubuntu xenial-security main restricted
           deb http://security.ubuntu.com/ubuntu xenial-security universe
           deb http://security.ubuntu.com/ubuntu xenial-security multiverse
Info: Processes: 202 Uptime: 1:42 Memory: 1059.6/11900.5MB
           Init: systemd v: 229 runlevel: 5 default: 2 Gcc sys: 5.4.0
           Client: Shell (bash 4.3.481 running in gnome-terminal-) inxi: 2.2.35

Gopal (s10gopal) wrote :

Before shutdown http://paste.ubuntu.com/26523794/
bad case http://paste.ubuntu.com/26524029/ ( battery in)

again shutting down http://paste.ubuntu.com/26524034/
best case http://paste.ubuntu.com/26524216/ ( battery out )
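
To put a number on the difference between the two cases, one option is to record the battery charge before shutdown and again after the next boot. The sketch below assumes the readings come from sysfs files such as /sys/class/power_supply/BAT0/energy_now and energy_full. The battery name and attribute names vary by machine, and the pastes above may have been produced with a different tool. The helper drain_percent and the sample figures are hypothetical.

```shell
#!/bin/sh
# Percentage of full capacity lost between two readings (integer arithmetic).
# Arguments: reading before shutdown, reading after boot, full capacity,
# all in the same unit (for example microwatt-hours).
drain_percent() {
    echo $(( ($1 - $2) * 100 / $3 ))
}

# Hypothetical figures: 40 Wh when full, 34 Wh left after a night powered off.
drain_percent 40000000 34000000 40000000   # prints 15
```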

Gopal (s10gopal) on 2018-02-07
no longer affects: acpi (Ubuntu)
Gopal (s10gopal) wrote :

affects: acpi (Ubuntu)

Gopal (s10gopal) wrote :

After installing Ubuntu 14.04.5 LTS, I think my problem is solved.
WOL is on.
Before shutdown: https://paste.ubuntu.com/=jyDm4qrdR7/
After 15 hours, the bad case is https://paste.ubuntu.com/=qFTztm4jrJ/.

Andy Whitcroft (apw) wrote :

This report has become very confused. I think you are saying that if you install a current 16.04 your battery drains when powered off. When you install 14.04.5 it does not. Assuming it is a kernel related issue then that implies that the 4.4.0-* kernel from 14.04, and likely the same kernel in 16.04 would be ok, but the linux-hwe kernel in 16.04 is not.

Could you confirm that my interpretation of your bug is correct?

Gopal (s10gopal) wrote :

I'm on 4.4.0-31-generic. I tried installing older kernels on Ubuntu 16.04 (4.8, 4.10 and 4.11), but my laptop became extremely laggy, with a random processor core going to 99% usage.

Gopal (s10gopal) on 2018-03-30
Changed in linux (Ubuntu):
status: Triaged → New
Changed in linux (Ubuntu Artful):
status: New → In Progress
Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v4.12 final and v4.13-rc1. The kernel bisect will require testing of about 13 test kernels.

I built the first test kernel, up to the following commit:
e5f76a2e0e84ca2a215ecbf6feae88780d055c56

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
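
The estimate of about 13 test kernels follows from how bisection works: each tested kernel halves the remaining range of commits. As a rough sketch (the commit count of 8000 is an illustrative assumption, not a figure from this report, and bisect_steps is a made-up helper):

```shell
#!/bin/sh
# Number of halving steps needed to narrow a range of N commits down to one.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 8000   # prints 13

# The bisect itself would be driven with commands along these lines,
# run inside a kernel git tree:
#   git bisect start v4.13-rc1 v4.12
#   git bisect good    # or: git bisect bad, after testing each kernel
```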

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.12.0-041200-generic #201804041240 SMP Wed Apr 4 12:44:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Before shutdown : http://paste.ubuntu.com/p/7MMYKh5V2S/

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.12.0-041200-generic #201804041240 SMP Wed Apr 4 12:44:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux is bad . http://paste.ubuntu.com/p/Mf4y9dTfTv/

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
1849f800fba32cd5a0b647f824f11426b85310d8

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

When you run uname -a this time, the kernel should have this string in the name: 201804041858

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
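
One way to make sure the intended test kernel is the one actually booted is to look for the build stamp in the uname -a output. A minimal sketch, with kernel_has_stamp as a hypothetical helper name:

```shell
#!/bin/sh
# Succeeds when the given `uname -a` text contains "#<stamp> ".
kernel_has_stamp() {
    case "$1" in
        *"#$2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if kernel_has_stamp "$(uname -a)" 201804041858; then
    echo "running the expected test kernel"
else
    echo "a different kernel is running"
fi
```

If the check fails after installing a test kernel, the boot loader most likely started another installed kernel.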

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.12.0-041200-generic #201804041858 SMP Wed Apr 4 19:15:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Before shutdown : http://paste.ubuntu.com/p/tXDwmDDFTT/

Gopal (s10gopal) wrote :

4.12.0-041200-generic #201804041858 is also bad. http://paste.ubuntu.com/p/JB2Zgzhjfr/

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
cbcd4f08aa637b74f575268770da86a00fabde6d

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

When you run uname -a this time, the kernel should have this string in the name: 201804041858

Can you test that kernel and report back if it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

Gopal (s10gopal) wrote :

4.12.0-041200-generic #201804051118 SMP Thu Apr 5 11:34:19 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
 Before shutdown :http://paste.ubuntu.com/p/yHDGwsFxNF/

Gopal (s10gopal) wrote :

4.12.0-041200-generic #201804051118 is good

Gopal (s10gopal) on 2018-04-09
Changed in acpi (Ubuntu):
status: New → Confirmed

On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
<email address hidden> wrote:
> Hi Rafael,
>
> A kernel bug report was opened against Ubuntu [0]. After a kernel
> bisect, it was found that reverting the following two commits resolved
> this bug:
>
> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>
> This is a regression introduced in v4.13-rc1 and still exists in
> mainline. The bug causes the battery to drain when the system is
> powered down and unplugged, which does not happen prior to these two
> commits.

What system and what do you mean by "powered down"? How much time
does it take for the battery to drain now?

> The bisect actually pointed to commit de3ef1e, but reverting
> these two commits fixes the issue.
>
> I was hoping to get your feedback, since you are the patch author. Do
> you think gathering any additional data will help diagnose this issue,
> or would it be best to submit a revert request?

First, reverting these is not an option or you will break systems
relying on them now. 4.13 is three releases back at this point.

Second, your issue appears to be related to the suspend/shutdown path
whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
change in pci_enable_wake() causes the problem to happen. Can you try
to revert this one alone and see if that helps?

Joseph Salisbury (jsalisbury) wrote :

@Gopal, can you confirm the answer to Rafael's question in comment #85? Your system is physically powered off and not suspended, correct?

Gopal (s10gopal) wrote :

Yes

Gopal (s10gopal) wrote :

@Rafael, the laptop is physically powered off.

Joseph Salisbury (jsalisbury) wrote :

On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
> <email address hidden> wrote:
>> Hi Rafael,
>>
>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>> bisect, it was found that reverting the following two commits resolved
>> this bug:
>>
>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>>
>> This is a regression introduced in v4.13-rc1 and still exists in
>> mainline. The bug causes the battery to drain when the system is
>> powered down and unplugged, which does not happen prior to these two
>> commits.
> What system and what do you mean by "powered down"? How much time
> does it take for the battery to drain now?
By powered down, the bug reporter is saying physically powered off and
unplugged.  The system is a HP laptop:

dmi.chassis.vendor: HP
dmi.product.family: 103C_5335KV HP Notebook
dmi.product.name: HP Notebook
vendor_id    : GenuineIntel
cpu family    : 6

>
>> The bisect actually pointed to commit de3ef1e, but reverting
>> these two commits fixes the issue.
>>
>> I was hoping to get your feedback, since you are the patch author. Do
>> you think gathering any additional data will help diagnose this issue,
>> or would it be best to submit a revert request?
> First, reverting these is not an option or you will break systems
> relying on them now. 4.13 is three releases back at this point.
>
> Second, your issue appears to be related to the suspend/shutdown path
> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
> change in pci_enable_wake() causes the problem to happen. Can you try
> to revert this one alone and see if that helps?
A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
tested.  However, the test kernel still exhibited the bug.

Rafael J. Wysocki (rjwysocki) wrote :

On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
<email address hidden> wrote:
> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
>> <email address hidden> wrote:
>>> Hi Rafael,
>>>
>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>>> bisect, it was found that reverting the following two commits resolved
>>> this bug:
>>>
>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>>>
>>> This is a regression introduced in v4.13-rc1 and still exists in
>>> mainline. The bug causes the battery to drain when the system is
>>> powered down and unplugged, which does not happen prior to these two
>>> commits.
>> What system and what do you mean by "powered down"? How much time
>> does it take for the battery to drain now?
> By powered down, the bug reporter is saying physically powered off and
> unplugged. The system is a HP laptop:
>
> dmi.chassis.vendor: HP
> dmi.product.family: 103C_5335KV HP Notebook
> dmi.product.name: HP Notebook
> vendor_id : GenuineIntel
> cpu family : 6
>
>
>>
>>> The bisect actually pointed to commit de3ef1e, but reverting
>>> these two commits fixes the issue.
>>>
>>> I was hoping to get your feedback, since you are the patch author. Do
>>> you think gathering any additional data will help diagnose this issue,
>>> or would it be best to submit a revert request?
>> First, reverting these is not an option or you will break systems
>> relying on them now. 4.13 is three releases back at this point.
>>
>> Second, your issue appears to be related to the suspend/shutdown path
>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
>> change in pci_enable_wake() causes the problem to happen. Can you try
>> to revert this one alone and see if that helps?
> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
> tested. However, the test kernel still exhibited the bug.

So essentially the bisection result cannot be trusted.

Joseph Salisbury (jsalisbury) wrote :

On 04/16/2018 11:58 AM, Rafael J. Wysocki wrote:
> On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
> <email address hidden> wrote:
>> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
>>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
>>> <email address hidden> wrote:
>>>> Hi Rafael,
>>>>
>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>>>> bisect, it was found that reverting the following two commits resolved
>>>> this bug:
>>>>
>>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>>>>
>>>> This is a regression introduced in v4.13-rc1 and still exists in
>>>> mainline. The bug causes the battery to drain when the system is
>>>> powered down and unplugged, which does not happen prior to these two
>>>> commits.
>>> What system and what do you mean by "powered down"? How much time
>>> does it take for the battery to drain now?
>> By powered down, the bug reporter is saying physically powered off and
>> unplugged. The system is a HP laptop:
>>
>> dmi.chassis.vendor: HP
>> dmi.product.family: 103C_5335KV HP Notebook
>> dmi.product.name: HP Notebook
>> vendor_id : GenuineIntel
>> cpu family : 6
>>
>>
>>>> The bisect actually pointed to commit de3ef1e, but reverting
>>>> these two commits fixes the issue.
>>>>
>>>> I was hoping to get your feedback, since you are the patch author. Do
>>>> you think gathering any additional data will help diagnose this issue,
>>>> or would it be best to submit a revert request?
>>> First, reverting these is not an option or you will break systems
>>> relying on them now. 4.13 is three releases back at this point.
>>>
>>> Second, your issue appears to be related to the suspend/shutdown path
>>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
>>> change in pci_enable_wake() causes the problem to happen. Can you try
>>> to revert this one alone and see if that helps?
>> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
>> tested. However, the test kernel still exhibited the bug.
> So essentially the bisection result cannot be trusted.
Yes, the bisect results were different than usual.  The bisect reported
commit de3ef1eb1cd0 as the first bad commit.  I could not revert commit
de3ef1eb1cd0 without either back porting the revert of that commit or
reverting 0ce3fcaff929 first.  However, the bug still happened with
these two reverted.  I needed to revert 0847684cfc5f and 0ce3fcaff929
for the bug to go away.  I reverted 0ce3fcaff929 in this case to also
avoid having to back port the revert of 0847684cfc5f.  I was unsure if
these unexpected results were due to the interaction/dependency between
the commits or due to inaccurate testing by the end user.

I'll build some more test kernels and have the user perform some more
testing to see if the bug can be specifically narrowed down to 0847684cfc5f.

Joseph Salisbury (jsalisbury) wrote :

I built one more test kernel. This one only has a revert for commit 0847684cfc5f.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test this kernel and see if it resolves this bug?

Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!
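
As a sketch of the installation step, the check below confirms that both kinds of package were downloaded before handing them to dpkg. The file name patterns and the download directory are assumptions for illustration; the actual names on the download page may differ, and have_test_debs is a made-up helper.

```shell
#!/bin/sh
# Succeeds only if the directory holds both a linux-image package and a
# linux-image-extra package.
have_test_debs() {
    dir=${1:-.}
    ls "$dir"/linux-image-[0-9]*.deb >/dev/null 2>&1 &&
    ls "$dir"/linux-image-extra-*.deb >/dev/null 2>&1
}

# Usage (hypothetical download directory):
#   have_test_debs ~/lp1745646 && sudo dpkg -i ~/lp1745646/linux-image-*.deb
```

The glob in the usage line matches both packages, so a single dpkg call installs them together.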

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.15.0-15-generic #16~lp1745646 SMP Tue Apr 24 19:39:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Before shutdown: http://paste.ubuntu.com/p/qCt9tcC8ch/

Gopal (s10gopal) wrote :

4.15.0-15-generic #16~lp1745646 is good.
http://paste.ubuntu.com/p/QGC2yWNk9F/

Joseph Salisbury (jsalisbury) wrote :

I built one more test kernel.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test this kernel and see if it resolves this bug?

Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.15.0-15-generic #16~lp1745646v2 SMP Wed Apr 25 19:28:26 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
http://paste.ubuntu.com/p/TCxSpvdH3s/

Gopal (s10gopal) wrote :

4.15.0-15-generic #16~lp1745646v2 is bad .
http://paste.ubuntu.com/p/DySwKdWJ23/

Joseph Salisbury (jsalisbury) wrote :

On 04/16/2018 11:58 AM, Rafael J. Wysocki wrote:
> On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
> <email address hidden> wrote:
>> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
>>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
>>> <email address hidden> wrote:
>>>> Hi Rafael,
>>>>
>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>>>> bisect, it was found that reverting the following two commits resolved
>>>> this bug:
>>>>
>>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>>>>
>>>> This is a regression introduced in v4.13-rc1 and still exists in
>>>> mainline. The bug causes the battery to drain when the system is
>>>> powered down and unplugged, which does not happen prior to these two
>>>> commits.
>>> What system and what do you mean by "powered down"? How much time
>>> does it take for the battery to drain now?
>> By powered down, the bug reporter is saying physically powered off and
>> unplugged. The system is a HP laptop:
>>
>> dmi.chassis.vendor: HP
>> dmi.product.family: 103C_5335KV HP Notebook
>> dmi.product.name: HP Notebook
>> vendor_id : GenuineIntel
>> cpu family : 6
>>
>>
>>>> The bisect actually pointed to commit de3ef1e, but reverting
>>>> these two commits fixes the issue.
>>>>
>>>> I was hoping to get your feedback, since you are the patch author. Do
>>>> you think gathering any additional data will help diagnose this issue,
>>>> or would it be best to submit a revert request?
>>> First, reverting these is not an option or you will break systems
>>> relying on them now. 4.13 is three releases back at this point.
>>>
>>> Second, your issue appears to be related to the suspend/shutdown path
>>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
>>> change in pci_enable_wake() causes the problem to happen. Can you try
>>> to revert this one alone and see if that helps?
>> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
>> tested. However, the test kernel still exhibited the bug.
> So essentially the bisection result cannot be trusted.

We performed some more testing and confirmed just a revert of the
following commit resolves the bug:

0847684cfc5f0 ("PCI / PM: Simplify device wakeup settings code")

Can you think of any suggestions to help debug further?

Rafael J. Wysocki (rjwysocki) wrote :

On Mon, Apr 30, 2018 at 4:22 PM, Joseph Salisbury
<email address hidden> wrote:
> On 04/16/2018 11:58 AM, Rafael J. Wysocki wrote:
>> On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
>> <email address hidden> wrote:
>>> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
>>>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
>>>> <email address hidden> wrote:
>>>>> Hi Rafael,
>>>>>
>>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>>>>> bisect, it was found that reverting the following two commits resolved
>>>>> this bug:
>>>>>
>>>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>>>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>>>>>
>>>>> This is a regression introduced in v4.13-rc1 and still exists in
>>>>> mainline. The bug causes the battery to drain when the system is
>>>>> powered down and unplugged, which does not happen prior to these two
>>>>> commits.
>>>> What system and what do you mean by "powered down"? How much time
>>>> does it take for the battery to drain now?
>>> By powered down, the bug reporter is saying physically powered off and
>>> unplugged. The system is a HP laptop:
>>>
>>> dmi.chassis.vendor: HP
>>> dmi.product.family: 103C_5335KV HP Notebook
>>> dmi.product.name: HP Notebook
>>> vendor_id : GenuineIntel
>>> cpu family : 6
>>>
>>>
>>>>> The bisect actually pointed to commit de3ef1e, but reverting
>>>>> these two commits fixes the issue.
>>>>>
>>>>> I was hoping to get your feedback, since you are the patch author. Do
>>>>> you think gathering any additional data will help diagnose this issue,
>>>>> or would it be best to submit a revert request?
>>>> First, reverting these is not an option or you will break systems
>>>> relying on them now. 4.13 is three releases back at this point.
>>>>
>>>> Second, your issue appears to be related to the suspend/shutdown path
>>>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
>>>> change in pci_enable_wake() causes the problem to happen. Can you try
>>>> to revert this one alone and see if that helps?
>>> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
>>> tested. However, the test kernel still exhibited the bug.
>> So essentially the bisection result cannot be trusted.
>
> We performed some more testing and confirmed just a revert of the
> following commit resolves the bug:
>
> 0847684cfc5f0 ("PCI / PM: Simplify device wakeup settings code")

Thanks for confirming this!

> Can you think of any suggestions to help debug further?

The root cause of the regression is likely the change in
pci_enable_wake() removing the device_may_wakeup() check from it.

Probably, one of the drivers in the platform calls pci_enable_wake()
directly from its ->shutdown() callback and that causes the device to
be set up for system wakeup which in turn causes the power draw while
the system is off to increase.

I would look at the PCI drivers used on that platform to find which of
them call pci_enable_wake() directly from ->shutdown() and I would
make these calls conditional on device_may_wakeup().
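
A first pass at the search described above can be done with grep over a kernel source tree. The sketch below is only a rough heuristic under stated assumptions: it lists files that call pci_enable_wake() but never mention device_may_wakeup(), without checking whether the call sits in a ->shutdown() callback. The function name and the example path are illustrative.

```shell
#!/bin/sh
# List source files under a directory that call pci_enable_wake() but do
# not reference device_may_wakeup() anywhere in the same file.
find_unguarded_wake_callers() {
    grep -rl 'pci_enable_wake(' "$1" 2>/dev/null | while IFS= read -r f; do
        grep -q 'device_may_wakeup(' "$f" || printf '%s\n' "$f"
    done
}

# Usage, from the top of a kernel tree (example directory only):
#   find_unguarded_wake_callers drivers/net/ethernet/realtek
```

Each file reported still needs to be read by hand, since a guard may live in a helper defined elsewhere.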

Rafael J. Wysocki (rjwysocki) wrote :

On Tue, May 1, 2018 at 9:55 PM, Bjorn Helgaas <email address hidden> wrote:
> On Tue, May 01, 2018 at 10:34:29AM +0200, Rafael J. Wysocki wrote:
>> On Mon, Apr 30, 2018 at 4:22 PM, Joseph Salisbury
>> <email address hidden> wrote:
>> > On 04/16/2018 11:58 AM, Rafael J. Wysocki wrote:
>> >> On Mon, Apr 16, 2018 at 5:31 PM, Joseph Salisbury
>> >> <email address hidden> wrote:
>> >>> On 04/13/2018 05:34 PM, Rafael J. Wysocki wrote:
>> >>>> On Fri, Apr 13, 2018 at 7:56 PM, Joseph Salisbury
>> >>>> <email address hidden> wrote:
>> >>>>> Hi Rafael,
>> >>>>>
>> >>>>> A kernel bug report was opened against Ubuntu [0]. After a kernel
>> >>>>> bisect, it was found that reverting the following two commits resolved
>> >>>>> this bug:
>> >>>>>
>> >>>>> 0ce3fcaff929 ("PCI / PM: Restore PME Enable after config space restoration")
>> >>>>> 0847684cfc5f ("PCI / PM: Simplify device wakeup settings code")
>> >>>>>
>> >>>>> This is a regression introduced in v4.13-rc1 and still exists in
>> >>>>> mainline. The bug causes the battery to drain when the system is
>> >>>>> powered down and unplugged, which does not happen prior to these two
>> >>>>> commits.
>> >>>> What system and what do you mean by "powered down"? How much time
>> >>>> does it take for the battery to drain now?
>> >>> By powered down, the bug reporter means physically powered off and
>> >>> unplugged. The system is an HP laptop:
>> >>>
>> >>> dmi.chassis.vendor: HP
>> >>> dmi.product.family: 103C_5335KV HP Notebook
>> >>> dmi.product.name: HP Notebook
>> >>> vendor_id : GenuineIntel
>> >>> cpu family : 6
>> >>>
>> >>>
>> >>>>> The bisect actually pointed to commit de3ef1e, but reverting
>> >>>>> these two commits fixes the issue.
>> >>>>>
>> >>>>> I was hoping to get your feedback, since you are the patch author. Do
>> >>>>> you think gathering any additional data will help diagnose this issue,
>> >>>>> or would it be best to submit a revert request?
>> >>>> First, reverting these is not an option or you will break systems
>> >>>> relying on them now. 4.13 is three releases back at this point.
>> >>>>
>> >>>> Second, your issue appears to be related to the suspend/shutdown path
>> >>>> whereas commit 0ce3fcaff929 is mostly about resume, so presumably the
>> >>>> change in pci_enable_wake() causes the problem to happen. Can you try
>> >>>> to revert this one alone and see if that helps?
>> >>> A test kernel with commits 0ce3fcaff929 and de3ef1eb1cd0 reverted was
>> >>> tested. However, the test kernel still exhibited the bug.
>> >> So essentially the bisection result cannot be trusted.
>> >
>> > We performed some more testing and confirmed just a revert of the
>> > following commit resolves the bug:
>> >
>> > 0847684cfc5f0 ("PCI / PM: Simplify device wakeup settings code")
>>
>> Thanks for confirming this!
>>
>> > Can you think of any suggestions to help debug further?
>>
>> The root cause of the regression is likely the change in
>> pci_enable_wake() removing the device_may_wakeup() check from it.
>>
>> Probably, one of the drivers in the platform calls pci_enable_wake()
>> directly from its ->shutdown() callback and that causes the device to
>> be set ...

Joseph Salisbury (jsalisbury) wrote :

Hi Gopal,

I built a test kernel with the patch requested by upstream.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Gopal (s10gopal) wrote :

Linux gopal-HP-Notebook 4.15.0-20-generic #21~lp1745646PatchFromBjorn SMP Wed May 2 11:20:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

before shutdown http://paste.ubuntu.com/p/Rprhmm44Sv/

Gopal (s10gopal) wrote :

4.15.0-20-generic #21~lp1745646PatchFromBjorn SMP Wed May 2 is good.
http://paste.ubuntu.com/p/7drHsv8jqB/

Joseph Salisbury (jsalisbury) wrote :

Hi Gopal,

I built a test kernel with the v2 version of the patch requested by upstream.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Gopal (s10gopal) wrote :

4.15.0-20-generic #21~lp1745646v2Upstream SMP Fri May 4 17:34:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Before Shutdown : http://paste.ubuntu.com/p/MWJGsStP59/

Gopal (s10gopal) wrote :

4.15.0-20-generic #21~lp1745646v2Upstream is good. http://paste.ubuntu.com/p/CdXSMBSQB5/

Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with the final version of the patch from upstream.

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1745646

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Gopal (s10gopal) wrote :

4.15.0-20-generic #21~lp1745646FinalPatch SMP Wed May 9 12:45:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Before Shutdown : http://paste.ubuntu.com/p/pdyZCvZKpc/

Gopal (s10gopal) wrote :

4.15.0-20-generic #21~lp1745646FinalPatch is good.

On Wed, May 9, 2018 at 12:18 AM, Rafael J. Wysocki <email address hidden> wrote:
> From: Rafael J. Wysocki <email address hidden>
>
> Commit 0847684cfc5f0 (PCI / PM: Simplify device wakeup settings code)
> went too far and dropped the device_may_wakeup() check from
> pci_enable_wake() which causes wakeup to be enabled during system
> suspend, hibernation or shutdown for some PCI devices that are not
> allowed by user space to wake up the system from sleep (or power off).
>
> As a result of this excessive power is drawn by some of the affected
> systems while in sleep states or off.
>
> Restore the device_may_wakeup() check in pci_enable_wake(), but make
> sure that the PCI bus type's runtime suspend callback will not call
> device_may_wakeup() which is about system wakeup from sleep and not
> about device wakeup from runtime suspend.
>
> Fixes: 0847684cfc5f0 (PCI / PM: Simplify device wakeup settings code)
> Reported-by: Joseph Salisbury <email address hidden>
> Signed-off-by: Rafael J. Wysocki <email address hidden>

Bjorn, any concerns here?

> ---
> drivers/pci/pci.c | 29 +++++++++++++++++++++++------
> 1 file changed, 23 insertions(+), 6 deletions(-)
>
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -1910,7 +1910,7 @@ void pci_pme_active(struct pci_dev *dev,
> EXPORT_SYMBOL(pci_pme_active);
>
> /**
> - * pci_enable_wake - enable PCI device as wakeup event source
> + * __pci_enable_wake - enable PCI device as wakeup event source
> * @dev: PCI device affected
> * @state: PCI state from which device will issue wakeup events
> * @enable: True to enable event generation; false to disable
> @@ -1928,7 +1928,7 @@ EXPORT_SYMBOL(pci_pme_active);
> * Error code depending on the platform is returned if both the platform and
> * the native mechanism fail to enable the generation of wake-up events
> */
> -int pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable)
> +static int __pci_enable_wake(struct pci_dev *dev, pci_power_t state, bool enable)
> {
> int ret = 0;
>
> @@ -1969,6 +1969,23 @@ int pci_enable_wake(struct pci_dev *dev,
>
> return ret;
> }
> +
> +/**
> + * pci_enable_wake - change wakeup settings for a PCI device
> + * @pci_dev: Target device
> + * @state: PCI state from which device will issue wakeup events
> + * @enable: Whether or not to enable event generation
> + *
> + * If @enable is set, check device_may_wakeup() for the device before calling
> + * __pci_enable_wake() for it.
> + */
> +int pci_enable_wake(struct pci_dev *pci_dev, pci_power_t state, bool enable)
> +{
> +	if (enable && !device_may_wakeup(&pci_dev->dev))
> +		return -EINVAL;
> +
> +	return __pci_enable_wake(pci_dev, state, enable);
> +}
> EXPORT_SYMBOL(pci_enable_wake);
>
> /**
> @@ -1981,9 +1998,9 @@ EXPORT_SYMBOL(pci_enable_wake);
> * should not be called twice in a row to enable wake-up due to PCI PM vs ACPI
> * ordering constraints.
> *
> - * This function only returns error code if the device is not capable of
> - * generating PME# ...
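
The "allowed by user space" condition in the commit message corresponds to the per-device `power/wakeup` sysfs attribute, which reads `enabled` or `disabled`. A minimal user-space sketch for checking it follows. The names `wakeup_setting`, `parse_wakeup_setting()`, and `read_wakeup_setting()` are hypothetical helpers, not an existing API. Treating a missing or empty attribute as "not wakeup-capable" is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

enum wakeup_setting { WAKEUP_UNSUPPORTED, WAKEUP_DISABLED, WAKEUP_ENABLED };

/* Interpret the text read from a power/wakeup attribute. */
enum wakeup_setting parse_wakeup_setting(const char *text)
{
	if (strncmp(text, "enabled", 7) == 0)
		return WAKEUP_ENABLED;
	if (strncmp(text, "disabled", 8) == 0)
		return WAKEUP_DISABLED;
	/* Anything else (including empty) is treated as unsupported. */
	return WAKEUP_UNSUPPORTED;
}

/*
 * Read <sysfs_dev_dir>/power/wakeup for one device directory, for example
 * a directory under /sys/bus/pci/devices/ (the caller supplies the path).
 */
enum wakeup_setting read_wakeup_setting(const char *sysfs_dev_dir)
{
	char path[512];
	char buf[32] = "";
	FILE *f;

	snprintf(path, sizeof(path), "%s/power/wakeup", sysfs_dev_dir);
	f = fopen(path, "r");
	if (!f)
		return WAKEUP_UNSUPPORTED; /* attribute missing or unreadable */
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_wakeup_setting(buf);
}
```

A device reporting `WAKEUP_DISABLED` here is one for which pci_enable_wake() should refuse to enable wakeup once the check is restored.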


Rafael J. Wysocki (rjwysocki) wrote :

On Thu, May 10, 2018 at 3:03 PM, Bjorn Helgaas <email address hidden> wrote:
> On Wed, May 09, 2018 at 12:18:32AM +0200, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <email address hidden>
>>
>> Commit 0847684cfc5f0 (PCI / PM: Simplify device wakeup settings code)
>> went too far and dropped the device_may_wakeup() check from
>> pci_enable_wake() which causes wakeup to be enabled during system
>> suspend, hibernation or shutdown for some PCI devices that are not
>> allowed by user space to wake up the system from sleep (or power off).
>>
>> As a result of this excessive power is drawn by some of the affected
>> systems while in sleep states or off.
>>
>> Restore the device_may_wakeup() check in pci_enable_wake(), but make
>> sure that the PCI bus type's runtime suspend callback will not call
>> device_may_wakeup() which is about system wakeup from sleep and not
>> about device wakeup from runtime suspend.
>>
>> Fixes: 0847684cfc5f0 (PCI / PM: Simplify device wakeup settings code)
>> Reported-by: Joseph Salisbury <email address hidden>
>> Signed-off-by: Rafael J. Wysocki <email address hidden>
>
> Acked-by: Bjorn Helgaas <email address hidden>
>
> 0847684cfc5f0 appeared in v4.13, which raises the question of whether
> this problem is important enough for a stable backport. Up to you :)

Yes, it is IMO, thank you!

Joseph Salisbury (jsalisbury) wrote :
description: updated
no longer affects: acpi (Ubuntu Artful)
no longer affects: acpi (Ubuntu Bionic)
no longer affects: acpi (Ubuntu Cosmic)
no longer affects: acpi (Ubuntu)
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-artful
Gopal (s10gopal) on 2018-05-25
description: updated
tags: added: verification-done-bionic
removed: verification-needed-bionic
Gopal (s10gopal) on 2018-05-26
tags: added: verification-done-artful
removed: verification-needed-artful
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-23.25

---------------
linux (4.15.0-23.25) bionic; urgency=medium

  * linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)

  * arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
    - arm64: mmu: add the entry trampolines start/end section markers into
      sections.h
    - arm64: sdei: Add trampoline code for remapping the kernel

  * Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
    - ACPI: APEI: handle PCIe AER errors in separate function
    - ACPI: APEI: call into AER handling regardless of severity

  * qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
    - scsi: qla2xxx: Fix session cleanup for N2N
    - scsi: qla2xxx: Remove unused argument from qlt_schedule_sess_for_deletion()
    - scsi: qla2xxx: Serialize session deletion by using work_lock
    - scsi: qla2xxx: Serialize session free in qlt_free_session_done
    - scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
    - scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
    - scsi: qla2xxx: Prevent relogin trigger from sending too many commands
    - scsi: qla2xxx: Fix double free bug after firmware timeout
    - scsi: qla2xxx: Fixup locking for session deletion

  * Several hisi_sas bug fixes (LP: #1768974)
    - scsi: hisi_sas: dt-bindings: add an property of signal attenuation
    - scsi: hisi_sas: support the property of signal attenuation for v2 hw
    - scsi: hisi_sas: fix the issue of link rate inconsistency
    - scsi: hisi_sas: fix the issue of setting linkrate register
    - scsi: hisi_sas: increase timer expire of internal abort task
    - scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
    - scsi: hisi_sas: fix return value of hisi_sas_task_prep()
    - scsi: hisi_sas: Code cleanup and minor bug fixes

  * [bionic] machine stuck and bonding not working well when nvmet_rdma module
    is loaded (LP: #1764982)
    - nvmet-rdma: Don't flush system_wq by default during remove_one
    - nvme-rdma: Don't flush delete_wq by default during remove_one

  * Warnings/hang during error handling of SATA disks on SAS controller
    (LP: #1768971)
    - scsi: libsas: defer ata device eh commands to libata

  * Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
    - ata: do not schedule hot plug if it is a sas host

  * ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU
    ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
    - powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
    - powerpc/64s: return more carefully from sreset NMI
    - powerpc/64s: sreset panic if there is no debugger or crash dump handlers

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Hang on network interface removal in Xen virtual machine (LP: #1771620)
    - xen-netfront: Fix hang on device removal

  * HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
    - net: hns: Avoid action name truncation

  * Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
    - SAUCE: powerpc/perf: Fix memory allocation for...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.13.0-45.50

---------------
linux (4.13.0-45.50) artful; urgency=medium

  * linux: 4.13.0-45.50 -proposed tracker (LP: #1774124)

  * CVE-2018-3639 (x86)
    - SAUCE: Set generic SSBD feature for Intel cpus

linux (4.13.0-44.49) artful; urgency=medium

  * linux: 4.13.0-44.49 -proposed tracker (LP: #1772951)

  * CVE-2018-3639 (x86)
    - x86/cpu: Make alternative_msr_write work for 32-bit code
    - x86/cpu/AMD: Fix erratum 1076 (CPB bit)
    - x86/bugs: Fix the parameters alignment and missing void
    - KVM: SVM: Move spec control call after restore of GS
    - x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
    - x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
    - x86/cpufeatures: Disentangle SSBD enumeration
    - x86/cpufeatures: Add FEATURE_ZEN
    - x86/speculation: Handle HT correctly on AMD
    - x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
    - x86/speculation: Add virtualized speculative store bypass disable support
    - x86/speculation: Rework speculative_store_bypass_update()
    - x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
    - x86/bugs: Expose x86_spec_ctrl_base directly
    - x86/bugs: Remove x86_spec_ctrl_set()
    - x86/bugs: Rework spec_ctrl base and mask logic
    - x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
    - KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
    - x86/bugs: Rename SSBD_NO to SSB_NO
    - KVM: VMX: Expose SSBD properly to guests.

  * [Ubuntu 16.04] kernel: fix rwlock implementation (LP: #1761674)
    - SAUCE: (no-up) s390: fix rwlock implementation

  * CVE-2018-7492
    - rds: Fix NULL pointer dereference in __rds_rdma_map

  * CVE-2018-8781
    - drm: udl: Properly check framebuffer mmap offsets

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
    - x86/xen: Reset VCPU0 info pointer after shared_info remap

  * Suspend to idle: Open lid didn't resume (LP: #1771542)
    - ACPI / PM: Do not reconfigure GPEs for suspend-to-idle

  * CVE-2018-1092
    - ext4: fail ext4_iget for root directory if unallocated

  * [SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices
    results in a kernel panic. (LP: #1770254)
    - PCI: Avoid bus reset if bridge itself is broken
    - PCI: Mark Cavium CN8xxx to avoid bus reset
    - PCI: Avoid slot reset if bridge itself is broken

  * Battery drains when laptop is off (shutdown) (LP: #1745646)
    - PCI / PM: Check device_may_wakeup() in pci_enable_wake()

  * perf record crash: refcount_inc assertion failed (LP: #1769027)
    - perf cgroup: Fix refcount usage
    - perf xyarray: Fix wrong processing when closing evsel fd

  * Dell Latitude 5490/5590 BIOS update 1.1.9 causes black screen at boot
    (LP: #1764194)
    - drm/i915/bios: filter out invalid DDC pins from VBT child devices

  * Fix an issue that some PCI devices get incorrectly suspended (LP: #1764684)
    - PCI / PM: Always check PME wakeup capability for runtime wakeup support

  * [SRU][Bionic/Artful] fix false positives in W...

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Released