Power consumption higher after suspend - Ubuntu 20.10 on Asus UX425JA

Bug #1912057 reported by Henrik Juul Hansen
This bug affects 1 person
Affects         Status      Importance  Assigned to  Milestone
linux (Ubuntu)  Incomplete  Undecided   koba

Bug Description

After boot, idle power consumption is around 2.3 W, but after suspend/resume it is around 4.7 W.
Measurements were taken 5-10 minutes after boot or resume, with "top" confirming that the system was idle.
The result is consistent: I have tried more than 10 times.
The BIOS is updated to the latest version.
I also checked power consumption with an external meter, which confirmed that it is higher.

I have attached output of:
sudo powertop -t 60 -r --html=before.html (and after)

I have noticed that Package C-state 6 is reached before suspend but not after.

I expect that idle power consumption after suspend should be the same as before.

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: linux-image-5.8.0-38-generic 5.8.0-38.43
ProcVersionSignature: Ubuntu 5.8.0-38.43-generic 5.8.18
Uname: Linux 5.8.0-38-generic x86_64
ApportVersion: 2.20.11-0ubuntu50.3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: hjh 1448 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Sat Jan 16 15:49:32 2021
EcryptfsInUse: Yes
InstallationDate: Installed on 2021-01-03 (13 days ago)
InstallationMedia: Ubuntu 20.10 "Groovy Gorilla" - Release amd64 (20201022)
MachineType: ASUSTeK COMPUTER INC. ZenBook UX425JA_UX425JA
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=da_DK.UTF-8
 SHELL=/bin/bash
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.8.0-38-generic root=UUID=6296ce1c-eec8-4302-beb1-b4a228780075 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-5.8.0-38-generic N/A
 linux-backports-modules-5.8.0-38-generic N/A
 linux-firmware 1.190.2
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/28/2020
dmi.bios.release: 5.14
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: UX425JA.304
dmi.board.asset.tag: ATN12345678901234567
dmi.board.name: UX425JA
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: 1.0
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK COMPUTER INC.
dmi.chassis.version: 1.0
dmi.ec.firmware.release: 3.4
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrUX425JA.304:bd10/28/2020:br5.14:efr3.4:svnASUSTeKCOMPUTERINC.:pnZenBookUX425JA_UX425JA:pvr1.0:rvnASUSTeKCOMPUTERINC.:rnUX425JA:rvr1.0:cvnASUSTeKCOMPUTERINC.:ct10:cvr1.0:
dmi.product.family: ZenBook
dmi.product.name: ZenBook UX425JA_UX425JA
dmi.product.version: 1.0
dmi.sys.vendor: ASUSTeK COMPUTER INC.

Henrik Juul Hansen (juuligen) wrote :
Henrik Juul Hansen (juuligen) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Can you please run the following and attach "active" here:
$ find /sys/devices 2>/dev/null | grep runtime_status | while read i; do grep -H . $i; done | egrep -v 'suspended|unsupported' > active
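For readers unfamiliar with the one-liner above: it lists every device under /sys/devices whose runtime power-management status is neither "suspended" nor "unsupported", that is, devices that are still awake. A minimal sketch of an equivalent helper follows. The function name list_active_devices and its optional root-directory argument are assumptions added for illustration and are not part of the original request.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<path>:<status>" for every runtime_status
# file whose value is neither "suspended" nor "unsupported".
# $1 is the directory to scan (defaults to /sys/devices).
list_active_devices() {
    local root=${1:-/sys/devices}
    # grep -H prefixes each status with its file name; the second grep
    # drops entries whose status is "suspended" or "unsupported".
    find "$root" -name runtime_status -exec grep -H . {} + 2>/dev/null |
        grep -Ev ':(suspended|unsupported)$'
}

# Usage on a real system (produces the same kind of "active" file):
#   list_active_devices > active
```

Anchoring the filter on the end of the line avoids dropping a device merely because its path happens to contain one of the two words.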

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Henrik Juul Hansen (juuligen) wrote :

I have run the command and attached the output. I did it both just after boot and after suspend/resume - same result both times, so I have attached just one output "active".

I have been poking around to see any differences in drivers/system state between after boot and after suspend/resume. I have found one difference - using the following command:
sudo lspci -vv | grep -E '(^\S|\s+LnkCtl:)'
For the NVMe device, LnkCtl changes from "ASPM L1 Enabled" to "ASPM Disabled". I have attached the output of the command before and after suspend/resume. If it is not relevant, please just ignore it.
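The before/after comparison can be made less error-prone by reducing the lspci output to one line per device and diffing two snapshots. The following is a minimal sketch, assuming the usual `lspci -vv` layout in which each device header starts at column 0 and the link control register appears on an indented "LnkCtl:" line. The function name aspm_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lspci -vv` text on stdin and print
# "<slot><TAB><ASPM state>" for every device that has a LnkCtl line.
aspm_summary() {
    awk '
        /^[0-9a-f]/ { slot = $1 }          # device header, e.g. "58:00.0 Non-Volatile ..."
        /LnkCtl:/ {
            sub(/^.*LnkCtl:[ \t]*/, "")    # drop everything up to the field name
            sub(/;.*$/, "")                # keep only the ASPM part before ";"
            print slot "\t" $0
        }
    '
}

# Usage on a real system:
#   sudo lspci -vv | aspm_summary > aspm.before
#   (suspend and resume)
#   sudo lspci -vv | aspm_summary > aspm.after
#   diff aspm.before aspm.after
```

The pattern matches "LnkCtl:" only, so the "LnkCtl2:" and "LnkCtl3:" lines that lspci also prints are ignored.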

Henrik Juul Hansen (juuligen) wrote :
Henrik Juul Hansen (juuligen) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

The NVMe device didn't restore the LTR value.
Can you please boot with kernel parameter "pci.dyndbg log_buf_len=16M", reproduce the issue and attach dmesg?

Henrik Juul Hansen (juuligen) wrote :

I did the following: turned off the PC, pressed "e" at the boot menu, and added "pci.dyndbg log_buf_len=16M" to the kernel parameters after vt.handoff. I then logged in, suspended and resumed, and attached the output of "sudo dmesg".
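Parameters added at the boot menu apply to that boot only, so it may be worth confirming that they actually reached the kernel before reproducing the issue. A small sketch, assuming the running kernel's command line is read from /proc/cmdline; has_kernel_param is a hypothetical helper name and the optional second argument exists only to make the function testable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the exact parameter $1 appears on the
# kernel command line stored in $2 (defaults to /proc/cmdline).
has_kernel_param() {
    local param=$1 cmdline_file=${2:-/proc/cmdline}
    # Split the command line into one parameter per line, then look for
    # an exact, literal, whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Usage on a real system:
#   has_kernel_param pci.dyndbg && has_kernel_param log_buf_len=16M \
#       && echo "debug parameters are active"
```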

Kai-Heng Feng (kaihengfeng) wrote :

Thanks for the log. The NVMe drive dropped off the PCI bus after resume.

Thanks, please test latest mainline kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11-rc4/amd64/

Henrik Juul Hansen (juuligen) wrote :

Thanks a lot for the help!
I have tried kernel 5.11-rc4.

It does not solve the issue. I have tried 2 test/reboots.
Power consumption seems better with the new kernel just after boot, around 1.5 W, but after suspend/resume it is around 4.6 W, which is much the same as before. The ASPM state behaves as with the old kernel: ASPM L1 Enabled before suspend/resume and ASPM Disabled after.

I will be off for tonight.

Kai-Heng Feng (kaihengfeng) wrote :

Can you please attach "acpidump"?

Also, what does "cat /sys/bus/pci/devices/0000:00:1d.0/firmware_node/path" say?

Henrik Juul Hansen (juuligen) wrote :

Output of "sudo acpidump > acpidump" is attached.
"cat /sys/bus/pci/devices/0000:00:1d.0/firmware_node/path" - outputs "\_SB_.PCI0.RP09"

Kai-Heng Feng (kaihengfeng) wrote :

Thanks, please test this kernel:
https://people.canonical.com/~khfeng/lp1912057/

Henrik Juul Hansen (juuligen) wrote :

I have tested the rc4+ kernel (two reboots with similar results).

I will try to summarize the results:

Rc4+: ASPM L1 enabled both before and after suspend/resume!

Rc4+ before suspend/resume:
Power consumption around 4.3 W
Idle stats: pc2-pc3 is reached
ASPM L1 enabled for NVMe (and all others)

Rc4+ after suspend/resume:
Power consumption around 4.3 W
Idle stats: pc2-pc3 is reached
ASPM L1 enabled for NVMe (and all others)

Rc4+ breaks some Fn-keys and touchpad. In powertop I get 600 wakeups/s and 10% CPU usage. The Webcam is active 100%. I have attached output from powertop.

Rc4 before suspend/resume:
Power consumption around 1.4W
Idle stats: pc2-pc3-pc7-pc8-pc10 is reached
ASPM L1 enabled for NVMe (and all others)

Rc4 after suspend/resume:
Power consumption around 4.3W
Idle stats: pc2-pc3 is reached
ASPM Disabled for NVMe (and L1 enabled for all others)

Rc4 - In powertop I get 70 wakeups/s and 1% CPU usage.

Henrik Juul Hansen (juuligen) wrote :
Kai-Heng Feng (kaihengfeng) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

Can you please also attach dmesg under test kernel?

Henrik Juul Hansen (juuligen) wrote :

Thanks for the new kernel. I have tried it.

Before suspend/resume:
Power consumption around 2.4 W
Idle stats: pc2-pc3-pc6 is reached
ASPM L1 enabled for NVMe

After suspend/resume:
Power consumption around 4.9 W
Idle stats: pc2-pc3 is reached
ASPM Disabled for NVMe

Fn-keys and touchpad are working.

dmesg is attached.

Kai-Heng Feng (kaihengfeng) wrote :

Can you please boot the vanilla 5.11-rc4 kernel, run `# echo 'file drivers/pci/* +p' > /sys/kernel/debug/dynamic_debug/control`, suspend/resume, and attach dmesg?

Henrik Juul Hansen (juuligen) wrote :

I have not used dynamic_debug before - hope I did it right.
As regular user: "sudo -i"
Then as root: "echo 'file drivers/pci/* +p' > /sys/kernel/debug/dynamic_debug/control"
followed by "dmesg > dmesg.rc4".
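The requested sequence was: enable PCI debug messages, suspend and resume, then capture dmesg. A hedged sketch of that order follows. It assumes debugfs is mounted at /sys/kernel/debug and that the commands run as root; enable_pci_dyndbg is a hypothetical helper name, and its optional argument exists only so the function can be exercised against an ordinary file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn on dynamic debug output for all PCI core
# source files by writing the query to the dynamic_debug control file.
# $1 is the control file (defaults to the debugfs location).
enable_pci_dyndbg() {
    local control=${1:-/sys/kernel/debug/dynamic_debug/control}
    # Single quotes stop the shell from expanding the "*" glob.
    echo 'file drivers/pci/* +p' > "$control"
}

# Usage on a real system, as root:
#   enable_pci_dyndbg
#   systemctl suspend        # then wake the laptop again
#   dmesg > dmesg.rc4
```

Capturing dmesg only after the resume matters, because the messages of interest are emitted by the PCI core during the suspend and resume transitions.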

Kai-Heng Feng (kaihengfeng) wrote :

Thanks! Can you please also do the same for the kernel in comment #14?

Henrik Juul Hansen (juuligen) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

That's really helpful!
Can you please test this kernel, which solves another similar issue I am working on:
https://people.canonical.com/~khfeng/acs-hack-take-2/

Henrik Juul Hansen (juuligen) wrote :

I did a similar test as you asked in comment #20.

Before suspend/resume:
Powertop: around 1.5W - pc2,3,7,8,10
ASPM L1 enabled for all controllers

After suspend/resume:
Powertop: around 4.8W - pc2,3
ASPM Disabled for NVMe

Fn-keys and touchpad working.

Kai-Heng Feng (kaihengfeng) wrote :

The test kernel (accidentally) disabled the pcieport driver, which made the issue go away.
Two things happened:
1) The rootport stays at D0
2) PME service isn't enabled

Let's try 1) first. Will upload a kernel soon.

Kai-Heng Feng (kaihengfeng) wrote :
Henrik Juul Hansen (juuligen) wrote :

Before suspend/resume:
Powertop: around 1.4W - pc2,3,7,8,10
ASPM L1 enabled for all controllers

After suspend/resume:
Powertop: around 4.6W - pc2,3
ASPM Disabled for NVMe

Fn-keys and touchpad working.

Kai-Heng Feng (kaihengfeng) wrote :

Thanks, apparently the root port doesn't stay at D0.
Please give this one a try:
https://people.canonical.com/~khfeng/lp1912057-bridge-d0-2/

Henrik Juul Hansen (juuligen) wrote :

I tried the ~khfeng/lp1912057-bridge-d0-2.

Before suspend/resume:
Powertop: around 1.4W - pc2,3,7,8,10
ASPM L1 enabled for all controllers

After suspend/resume:
Powertop: around 4.6W - pc2,3
ASPM Disabled for NVMe

Attached dmesg after "echo 'file drivers/pci/* +p' > /sys/kernel/debug/dynamic_debug/control" and suspend/resume.

Kai-Heng Feng (kaihengfeng) wrote :

So, saving the parent device state in the child's suspend routine doesn't keep the parent device at D0.

Please test this one instead; it prevents an NVMe with HMB from shutting down:
https://people.canonical.com/~khfeng/lp1912057-keep-hmb-nvme-d0/

Henrik Juul Hansen (juuligen) wrote :

Good news!

Before suspend/resume:
Powertop: around 1.5W - pc2,3,7,8,10
ASPM L1 enabled for all controllers

After suspend/resume:
Powertop: around 1.5W - pc2,3,7,8,10
ASPM L1 enabled for all controllers

Fn-keys and touchpad working.

Dmesg is attached.

Henrik Juul Hansen (juuligen) wrote :

After using the kernel from #31 for 3 days and having many suspend/resumes the system has resumed with ASPM L1 enabled and same power as before suspend every time.

Henrik Juul Hansen (juuligen) wrote :

The issue is resolved in latest mainline kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11/amd64/
I will change the status to resolved.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Kai-Heng Feng (kaihengfeng) wrote :

Sorry for the belated response, I was on vacation.

It's interesting that it's fixed by 5.11, because I am not seeing any change that can help the issue.

Anyway, glad it's still fixed.

Changed in linux (Ubuntu):
status: Fix Released → Incomplete
Henrik Juul Hansen (juuligen) wrote :

Hi again,

I have reopened this issue. It is not fixed in mainline 5.11.0, 5.11.7 or 5.11.10.

I think I was using your kernel and mixed up the version numbers when I sent the message in #34. Since then I have updated the BIOS from V304 to V306. For the last month I have been using the laptop with a USB hub, so I have not paid much attention to the power consumption; this is why I did not see this before now.
Let me know if you need me to send more information.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
status: Fix Released → Incomplete
Changed in linux (Ubuntu):
assignee: nobody → Kai-Heng Feng (kaihengfeng)
koba (kobako) wrote :

@Henrik,
When you try the kernel from #31, is the issue gone?

Henrik Juul Hansen (juuligen) wrote :

@koba
Using kernel from #31 - the power consumption is the same before and after suspend/resume and ASPM L1 is enabled for the NVMe after resume. Yes the issue is gone.

koba (kobako) wrote :

@Henrik,
Would you please try this kernel?
https://drive.google.com/drive/folders/1ixfPnW4N-P_UGS-YkbluCVBuxDnPzXZv?usp=sharing
Please also attach the dmesg after you reproduce it.

Changed in linux (Ubuntu):
assignee: Kai-Heng Feng (kaihengfeng) → koba (kobako)
Henrik Juul Hansen (juuligen) wrote :

@Koba
I have tested the kernel from #39. It fixes the issue. I have attached dmesg. (Installing the headers failed).

koba (kobako) wrote :

@Henrik
Thanks for your help.
Would you please try another kernel that disables HMB prior to s2idle,
and also collect dmesg? Thanks.
https://drive.google.com/drive/folders/1xMArRlMkED20kSu1pTKuRPvBT8kh__L2?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba
I have tested kernel in #41 (5.10.0-1025lpv1912057b) and attached dmesg. This kernel does not fix the issue. ASPM is Disabled for NVMe and higher power consumption after suspend/resume.

koba (kobako) wrote :

@Henrik,
Thanks. Would you please try this kernel? I want to collect the power
state of the NVMe during suspend and resume.
Please also collect dmesg, thanks.
https://drive.google.com/drive/folders/1hU4OflO8nE4ShC0rOahpGsu8LhcF_UC-?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba
No problem, I will.
I have tested kernel in #43 (5.10.0-1025lpv1912057c) and attached dmesg. This kernel does not fix the issue. ASPM is Disabled for NVMe and higher power consumption after suspend/resume.

koba (kobako) wrote :

@Henrik,
Thanks for your hard work. Would you please try another kernel that
prevents the NVMe from entering D3cold?
Please also collect dmesg, thanks.
https://drive.google.com/drive/folders/1Du-A8EJYxUwS7zkYkfox7JFeU6u8puEZ?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba
I can only see the buildinfo file in the last folder.

koba (kobako) wrote :

@Henrik, please check it again. Something went wrong during the upload;
it has been fixed.

Henrik Juul Hansen (juuligen) wrote :

@Koba, I have tried kernel from #45 (5.10.0-1025lpv1912057d). After suspend, it fails when resuming. The laptop wakes up and I am able to log in, but then it freezes. So no dmesg.

koba (kobako) wrote :

@Henrik, I really appreciate your help in figuring out this issue,
thanks.
Sorry for the crash. Would you please try this one and collect dmesg?
https://drive.google.com/drive/folders/10GpLDqM6OXQ11tDmX5zq6ppA5wsCKeIQ?usp=sharing
Thanks again.

Henrik Juul Hansen (juuligen) wrote :

Hi Koba,
This time the laptop did not resume at all. It booted and seemed to function normally, but I did not test it further and suspended it quite quickly, after 30-60 seconds. Do you want a dmesg from just after boot?

koba (kobako) wrote :

@Henrik,
Please run this command to gather dmesg after you load the last test
kernel (5.10.0-1025lpv1912057e) and reboot for the first time:
# journalctl -k -b -1

After that, please boot the official 5.10-oem-1025 kernel and disable d3cold for the NVMe:
# echo 0 | sudo tee /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed
Then try s2idle and resume, and measure the power consumption.
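The d3cold_allowed write can be wrapped so that the value is read back and a missing device is reported rather than silently ignored. This is only a sketch: set_d3cold_allowed is a hypothetical helper name, the address 0000:58:00.0 is the one used in this thread and will differ on other machines, and the third argument exists only to point the function at a test directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write $2 (0 or 1) to the d3cold_allowed attribute
# of PCI device $1 and print the value read back afterwards.
# $3 is the PCI devices directory (defaults to /sys/bus/pci/devices).
set_d3cold_allowed() {
    local bdf=$1 value=$2 root=${3:-/sys/bus/pci/devices}
    local attr="$root/$bdf/d3cold_allowed"
    # Fail early if the device or attribute is missing or not writable.
    [ -w "$attr" ] || { echo "cannot write $attr" >&2; return 1; }
    echo "$value" > "$attr"
    cat "$attr"
}

# Usage on a real system, as root:
#   set_d3cold_allowed 0000:58:00.0 0
#   systemctl suspend        # then wake the laptop and measure power
```

The setting does not persist across reboots, so it has to be applied again after each boot.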

Henrik Juul Hansen (juuligen) wrote :

@Koba,
Here is the dmesg from kernel(5.10.0-1025lpv1912057e).
For the official 5.10-oem-1025 it seems like disabling d3cold for nvme fixes the issue. ASPM L1 Enabled for nvme and power consumption after suspend/resume is as before. I have attached output from powertop.

Henrik Juul Hansen (juuligen) wrote :
Henrik Juul Hansen (juuligen) wrote :
koba (kobako) wrote :

@Henrik,
Thanks.
Here's another test kernel that disables d3cold_allowed for NVMe with HMB.
https://drive.google.com/drive/folders/1gK67OYm6-jeKLj7GIQNZLCaHuaOwVTr1?usp=sharing

Please check that d3cold_allowed is 0 before s2idle and resume:
sudo cat /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed

Henrik Juul Hansen (juuligen) wrote :

@Koba,
Thanks for your help.
I have tried the kernel from #55.
sudo cat /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed showed "0".
The suspend & resume did end up in a freeze. I have attached dmesg (journalctl -k -b -1 on next boot).
Just as a note I saw the power consumption during suspend was around 1.0W compared to 0.5W using the 5.10-oem-1025 and "echo 0 | sudo tee...." from #51. This was using an external AC power meter - so it includes the power loss in the AC adapter.

koba (kobako) wrote :

@Henrik,
Sorry for the inconvenience.
There is no hurry; if you have free time, please try this test kernel,
which removes irrelevant code.
https://drive.google.com/drive/folders/1wRGJNPU0LmmyJMkVLZvG7xLKqinsOZU0?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba,
I have tried kernel 5.10.0-1025lpv1912057g. After boot cat /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed showed "0".
After suspend/resume the power consumption is higher. ASPM is disabled for Nvme. I have attached dmesg.

koba (kobako) wrote :

@Henrik,
Thanks. Would you please try this one, which follows the original
d3cold_allowed path to disable NVMe D3cold?
https://drive.google.com/drive/folders/19-dxtVTFNq5WtN3KOxjOrAXQ_1kYrnbh?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba,
I have tried kernel 5.10.0-1025lpv1912057h. After boot cat /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed showed "0".
After suspend/resume power consumption is the same and ASPM L1 is enabled for NVMe. I have attached dmesg. I will test the kernel for a longer time period and see if the power consumption changes.

koba (kobako) wrote :

@Henrik,
Thanks a lot for your efforts. Can I have your Tested-by signature?

Henrik Juul Hansen (juuligen) wrote :

@Koba,
Sorry, but I do not know what "test-by signature" is.

koba (kobako) wrote :

@Henrik,
For example, when someone helps test a solution and the developer produces
a patch, the developer credits the tester with a Tested-by line in the patch,
as in the commit below.
If you would like this, please give your full name and email address.

commit 4514d991d99211f225d83b7e640285f29f0755d0
Author: Rafael J. Wysocki <email address hidden>
Date: Tue Mar 16 16:51:40 2021 +0100

    PCI: PM: Do not read power state in pci_enable_device_flags()
...
    Link: https://<email address hidden>/
    Reported-by: Maximilian Luz <email address hidden>
    Tested-by: Maximilian Luz <email address hidden>
    Signed-off-by: Rafael J. Wysocki <email address hidden>
    Reviewed-by: Mika Westerberg <email address hidden>

koba (kobako)
Changed in linux (Ubuntu):
status: Incomplete → In Progress
koba (kobako) wrote :

@Henrik,
Would you mind verifying this kernel, which wakes the NVMe from D3?
https://drive.google.com/drive/folders/1O30CIxm4YZVexBUc919yI5D3Jmw8M0V4?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba,
I have tested kernel 5.10.0-1025lpv1912057i.
After boot cat /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed showed "0".
After suspend/resume power consumption is the same as before and ASPM L1 is enabled for NVMe. I have attached dmesg.

My full name is Henrik Juul Hansen and email is <email address hidden>. You are welcome to put my name on the things I have tested. I have updated my Launchpad info so it should be visible to others.

koba (kobako) wrote :

@Henrik,
I forgot to revert the patch that disables d3cold_allowed, so please try
this one again. Thanks.
d3cold_allowed should be 1.
https://drive.google.com/drive/folders/17qeSGLKz8hR8TLsr35qS_4RSDUZcpaLu?usp=sharing

Henrik Juul Hansen (juuligen) wrote :

@Koba
I tried kernel 5.13.0-051300rc2-generic. After boot d3cold_allowed was "1", ASPM L1 enabled for nvme (and bluetooth did not work). After suspend/resume ASPM L1 was disabled for nvme. Power consumption higher. dmesg is attached.

koba (kobako) wrote :

@Henrik, would you please collect the output of lspci -vvv?
# sudo lspci -vvv

Henrik Juul Hansen (juuligen) wrote :

@Koba, I have attached output from "sudo lspci -vvv" before and after suspend/resume for kernel 5.13.0-051300rc2-generic.

Henrik Juul Hansen (juuligen) wrote :
Henrik Juul Hansen (juuligen) wrote :

Just a follow-up for anyone else using this hardware.
I have been using kernel 5.11.0-17-generic and running "echo 0 | sudo tee /sys/bus/pci/devices/0000\:58\:00.0/d3cold_allowed" just after boot. Uptime is now 4 days 23 hours with many suspend/resumes, and power consumption remains as it is just after boot.
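The PCI address 0000:58:00.0 is specific to this laptop, so on other machines the NVMe controller has to be located first. One possible approach, sketched below, is to match on the PCI class code: NVMe controllers report class 0x010802 in the sysfs "class" attribute. The function name disable_nvme_d3cold is hypothetical, and the sketch assumes that applying the workaround to every NVMe controller in the system is acceptable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clear d3cold_allowed for every PCI device whose
# class code is 0x010802 (mass storage, NVM, NVMe programming interface).
# $1 is the PCI devices directory (defaults to /sys/bus/pci/devices).
disable_nvme_d3cold() {
    local root=${1:-/sys/bus/pci/devices} dev
    for dev in "$root"/*; do
        [ -r "$dev/class" ] || continue
        if [ "$(cat "$dev/class")" = "0x010802" ]; then
            # Report the device address only if the write succeeded.
            echo 0 > "$dev/d3cold_allowed" && echo "${dev##*/}: d3cold disabled"
        fi
    done
}

# Usage on a real system, as root, once after every boot:
#   disable_nvme_d3cold
```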

koba (kobako)
Changed in linux (Ubuntu):
status: In Progress → Incomplete
Kai-Heng Feng (kaihengfeng) wrote :
Kai-Heng Feng (kaihengfeng) wrote :
Henrik Juul Hansen (juuligen) wrote :

Kai-Heng,
I have tried kernel 5.13.0+ from #73.
Power consumption is the same and NVMe has ASPM L1 enabled both before and after suspend/resume. I have been running the kernel for a few hours now and also done some suspend/resume and it seems to work fine.
I have attached dmesg and powertop output.

Henrik Juul Hansen (juuligen) wrote :