Using an NVMe drive causes huge power drain

Bug #1664602 reported by Francois Thirioux on 2017-02-14
This bug affects 7 people
Affects          Importance   Assigned to
linux (Ubuntu)   Medium       Kai-Heng Feng
Xenial           Undecided    Unassigned
Yakkety          Undecided    Unassigned
Zesty            Medium       Kai-Heng Feng

Bug Description

PCIe NVMe drives are very common in new laptops. Zesty's kernels (including the latest 4.10-rc8 from the CKT PPA) do not support APST (Autonomous Power State Transitions). A patch does exist:
http://lists.infradead.org/pipermail/linux-nvme/2017-February/008051.html
https://github.com/damige/linux-nvme

It seems that we cannot expect this before kernel 4.11.

Additionally, according to powertop, my laptop's CPU never goes deeper than the PC3 power-saving state. I don't know if it's related.
---
ApportVersion: 2.20.4-0ubuntu2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: ft 1900 F.... pulseaudio
CurrentDesktop: GNOME
DistroRelease: Ubuntu 17.04
HibernationDevice: RESUME=UUID=a04b55bf-b1b6-464c-88ce-44b6adbbbc10
InstallationDate: Installed on 2016-12-12 (63 days ago)
InstallationMedia: Ubuntu-GNOME 17.04 "Zesty Zapus" - Alpha amd64 (20161211)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Inc. Precision 7510
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
ProcFB:
 0 nouveaufb
 1 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-8-generic root=UUID=0d1f213e-73a9-4097-ae64-9cd963cfba23 ro quiet
ProcVersionSignature: Ubuntu 4.10.0-8.10-generic 4.10.0-rc8
RelatedPackageVersions:
 linux-restricted-modules-4.10.0-8-generic N/A
 linux-backports-modules-4.10.0-8-generic N/A
 linux-firmware 1.163
Tags: zesty
Uname: Linux 4.10.0-8-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 12/22/2016
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.9.5
dmi.board.name: 0YH43H
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.9.5:bd12/22/2016:svnDellInc.:pnPrecision7510:pvr:rvnDellInc.:rn0YH43H:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: Precision 7510
dmi.sys.vendor: Dell Inc.

Francois Thirioux (fthx) on 2017-02-14
tags: added: apport-collected
description: updated

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
tags: added: kernel-da-key
Francois Thirioux (fthx) wrote :

Solved upstream with kernel 4.11-rc1 from the Ubuntu kernel archive.
I'm back to 4.23 W instead of 8.26 W in the same power conditions.

I now reach the PC8 power state (per powertop) instead of PC3.

I hope Andy's APST patch (I guess it's the key?) will be backported soon to Zesty's kernel.
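
As a rough check on the figures in the comment above, the drop from 8.26 W to 4.23 W can be turned into a relative saving and a battery-runtime multiplier. The sketch below is only arithmetic on those two readings; the function name is made up for illustration, and it assumes the draw stays constant over a whole discharge, which real workloads will not guarantee.

```python
def power_saving(before_w, after_w):
    """Return (fractional reduction in draw, runtime multiplier).

    Assumes a constant draw, so runtime scales as 1 / power.
    """
    if before_w <= 0 or after_w <= 0:
        raise ValueError("power readings must be positive")
    reduction = (before_w - after_w) / before_w
    runtime_gain = before_w / after_w
    return reduction, runtime_gain


# Readings quoted in the comment: 8.26 W before, 4.23 W after.
reduction, runtime_gain = power_saving(8.26, 4.23)
print(f"{reduction:.1%} lower draw, {runtime_gain:.2f}x runtime")
# prints "48.8% lower draw, 1.95x runtime"
```

In other words, if those readings are representative, the same battery would last a little under twice as long.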

tags: added: solved-upstream
Tim Gardner (timg-tpi) wrote :

Francois - I doubt if that is going to happen for 17.04. It appears that single patch requires a substantial amount of infrastructure and prerequisite patches.

Changed in linux (Ubuntu Zesty):
assignee: nobody → Tim Gardner (timg-tpi)
status: Triaged → Won't Fix
Changed in linux (Ubuntu):
status: Triaged → Fix Released
Francois Thirioux (fthx) wrote :

I know that my bugs seem more important to me than other people's bugs because they're MY bugs :-).

But *IMHO* this lack of NVMe power management (and consequently of Skylake CPU low power state usage) is critical since:
1) it negates much of the power management work done in Ubuntu/Linux in recent years;
2) low power states are (AFAIK) required for Skylake CPU (and NVMe drive?) long-term durability.

I was hoping for this in Zesty, as the following bug is not marked Won't Fix and seems to be useful only for APST:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1666401

Francois Thirioux (fthx) on 2017-03-14
tags: added: kernel-fixed-upstream
removed: solved-upstream
Dave Chiluk (chiluk) wrote :

I can confirm Francois's experience with my 950 Pro, although I saved one or two watts; it's a bit hard to tell. However, my PC states were already reaching PC8 before this patch.

Kai-Heng Feng (kaihengfeng) wrote :

I found that the APST patches can be cleanly cherry-picked into the Zesty tree. Please try it:
http://people.canonical.com/~khfeng/lp1664602/

Francois Thirioux (fthx) wrote :

I've tested it now and it seems to work flawlessly.
Power usage is very similar to 4.11 kernel.

It would be great to see your patches in Zesty.

Tim Gardner (timg-tpi) wrote :

Kai-Heng Feng: Please post a git branch from which I can pull your patches (or perhaps just attach them to this bug). Make sure the upstream commit SHA1 is included by using the '-x' option of 'git cherry-pick'.

mahmoh (mahmoh) wrote :

I can confirm that Kai-Heng's patched kernel (linux-image-4.10.0-14-generic_4.10.0-14.16_amd64.deb) reaches the PC8 state with a Samsung 960 EVO, where the stock kernel only attains PC3 at best.

Dave Chiluk (chiluk) wrote :

@Tim, when I checked the backport it was not a clean cherry-pick. As you mentioned earlier, there were some prerequisite patches that git pulled in. I did not continue investigating, but I guess there should only be a few. Either way the patches seem nvme

@Kai-Heng Feng, please make sure you appropriately document and separate out the patches that cleanly cherry-pick. Also, please be careful about context patches that git pulls in to make the patch apply.

Kai-Heng Feng (kaihengfeng) wrote :

@Dave

I was wrong and you are right: it's not a clean cherry-pick.

I picked 7ac8b7abdb8d903c83fa15cfd952316e824fe6f3 from the nvme tree (which can be cleanly applied) instead of bd4da3abaabffdd2472fb7085fcadd5d1d8c2153 from master.

The latter's diff contextually depends on 8a9ae523282f324989850fcf41312b42a2fb9296, which is not a functional dependency of APST, so the conflict can be easily resolved.
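
Tim's earlier request for the '-x' option of 'git cherry-pick' matters here because that option appends a "(cherry picked from commit <sha>)" line to the commit message, which lets a reviewer trace a backport to the commit it came from. A minimal sketch of checking for that line, assuming a full 40-character SHA-1; the function name and the sample commit message are made up for illustration, and only the SHA is taken from the comment above:

```python
import re
from typing import Optional

# Line that 'git cherry-pick -x' appends to the commit message.
CHERRY_PICK_RE = re.compile(
    r"^\(cherry picked from commit ([0-9a-f]{40})\)$", re.MULTILINE
)


def upstream_sha(commit_message: str) -> Optional[str]:
    """Return the SHA recorded by 'git cherry-pick -x', or None if absent."""
    match = CHERRY_PICK_RE.search(commit_message)
    return match.group(1) if match else None


# Hypothetical commit message; subject and sign-off are placeholders.
sample = (
    "example: subject line (hypothetical)\n"
    "\n"
    "Body text.\n"
    "(cherry picked from commit 7ac8b7abdb8d903c83fa15cfd952316e824fe6f3)\n"
    "Signed-off-by: Example Person <person@example.com>\n"
)

print(upstream_sha(sample))
```

A message without that line returns None, which is the case a reviewer would want flagged before pulling the branch.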

Tim Gardner (timg-tpi) on 2017-03-16
Changed in linux (Ubuntu Zesty):
assignee: Tim Gardner (timg-tpi) → Kai-Heng Feng (kaihengfeng)
status: Won't Fix → In Progress
status: In Progress → Fix Committed
Dave Chiluk (chiluk) wrote :

@Kai-Heng have you been able to make any progress on this? I think Tim is simply waiting on a git branch or git send-email submission to the kernel-team mailing list.

Kai-Heng Feng (kaihengfeng) wrote :

@Dave,

For Zesty, it's already in the tree.
For Yakkety, I already sent patches to kernel-team ML.
For Xenial, I'd like to see if there's any regression after some real-world usage in Yakkety and Zesty; after all, the patch author made a blacklist table for a reason. If everything goes smoothly, I'll backport it to Xenial.
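
To illustrate the idea behind such a blacklist table, here is a hedged sketch, not the kernel's code. It assumes the driver matches the vendor ID, model number and firmware revision reported by the NVMe Identify Controller command against a table and skips APST when an entry matches. The flag value, the table entry and all function names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

QUIRK_NO_APST = 1 << 0  # hypothetical flag: do not enable APST


@dataclass(frozen=True)
class QuirkEntry:
    vid: int                  # PCI vendor ID from Identify Controller
    mn: Optional[str] = None  # model number; None matches any model
    fr: Optional[str] = None  # firmware revision; None matches any
    quirks: int = 0


# Hypothetical entry; not a real device.
QUIRK_TABLE = [
    QuirkEntry(vid=0xABCD, mn="EXAMPLE SSD 256GB", quirks=QUIRK_NO_APST),
]


def quirks_for(vid: int, mn: str, fr: str) -> int:
    """OR together the quirks of every table entry matching this controller."""
    result = 0
    for entry in QUIRK_TABLE:
        if entry.vid != vid:
            continue
        # Identify Controller strings are space-padded, so strip them first.
        if entry.mn is not None and entry.mn != mn.strip():
            continue
        if entry.fr is not None and entry.fr != fr.strip():
            continue
        result |= entry.quirks
    return result


def apst_allowed(vid: int, mn: str, fr: str) -> bool:
    """True unless a matching entry carries the no-APST quirk."""
    return not (quirks_for(vid, mn, fr) & QUIRK_NO_APST)


print(apst_allowed(0xABCD, "EXAMPLE SSD 256GB   ", "1.0"))
print(apst_allowed(0x1234, "OTHER DRIVE", "2.0"))
```

The weakness of this design is the one Kai-Heng is wary of: a drive that misbehaves but is not yet in the table gets APST enabled anyway.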

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-14.16

---------------
linux (4.10.0-14.16) zesty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1673805

  * msleep() bug causes Nuvoton I2C TPM device driver delays (LP: #1667567)
    - tpm: msleep() delays - replace with usleep_range() in i2c nuvoton driver
    - SAUCE: tpm: add sleep only for retry in i2c_nuvoton_write_status()

  * C++ demangling support missing from perf (LP: #1396654)
    - [Config] added binutils-dev to Build-deps

  * dm-queue-length module is not included in installer/initramfs (LP: #1673350)
    - [Config] d-i: Also add dm-queue-length to multipath modules

  * move aufs.ko from -extra to linux-image package (LP: #1673498)
    - [config] aufs.ko moved to linux-image package

  * Using an NVMe drive causes huge power drain (LP: #1664602)
    - nvme: Add a quirk mechanism that uses identify_ctrl
    - nvme: Enable autonomous power state transitions

  * Broadcom bluetooth modules sometimes fail to initialize (LP: #1483101)
    - Bluetooth: btbcm: Add a delay for module reset

  * Need support of Broadcom bluetooth device [413c:8143] (LP: #1166113)
    - Bluetooth: btusb: Add support for 413c:8143

  * Zesty update to v4.10.3 stable release (LP: #1673118)
    - serial: 8250_pci: Add MKS Tenta SCOM-0800 and SCOM-0801 cards
    - KVM: s390: Disable dirty log retrieval for UCONTROL guests
    - KVM: VMX: use correct vmcs_read/write for guest segment selector/base
    - Bluetooth: Add another AR3012 04ca:3018 device
    - phy: qcom-ufs: Don't kfree devres resource
    - phy: qcom-ufs: Fix misplaced jump label
    - s390/qdio: clear DSCI prior to scanning multiple input queues
    - s390/dcssblk: fix device size calculation in dcssblk_direct_access()
    - s390/kdump: Use "LINUX" ELF note name instead of "CORE"
    - s390/chsc: Add exception handler for CHSC instruction
    - s390: TASK_SIZE for kernel threads
    - s390/topology: correct allocation of topology information
    - s390: make setup_randomness work
    - s390: use correct input data address for setup_randomness
    - net: mvpp2: fix DMA address calculation in mvpp2_txq_inc_put()
    - cxl: Prevent read/write to AFU config space while AFU not configured
    - cxl: fix nested locking hang during EEH hotplug
    - brcmfmac: fix incorrect event channel deduction
    - mnt: Tuck mounts under others instead of creating shadow/side mounts.
    - IB/ipoib: Fix deadlock between rmmod and set_mode
    - IB/IPoIB: Add destination address when re-queue packet
    - IB/mlx5: Fix out-of-bound access
    - IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
    - IB/srp: Avoid that duplicate responses trigger a kernel bug
    - IB/srp: Fix race conditions related to task management
    - Btrfs: fix data loss after truncate when using the no-holes feature
    - orangefs: Use RCU for destroy_inode
    - memory/atmel-ebi: Fix ns <-> cycles conversions
    - tracing: Fix return value check in trace_benchmark_reg()
    - ktest: Fix child exit code processing
    - ceph: remove req from unsafe list when unregistering it
    - target: Fix NULL dereference during LUN lookup + active I/O shutdown
    - drivers/pci/hotplug: Han...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
Francois Thirioux (fthx) wrote :

After a reasonably long period of use, I have seen no errors and no hangs (as far as I could tell from the user interface and system journals). I did not notice any slowdown either.

Thanks a lot, my battery is smiling again.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Xenial):
status: New → Confirmed
Changed in linux (Ubuntu Yakkety):
status: New → Confirmed
Changed in linux (Ubuntu Yakkety):
status: Confirmed → Fix Committed
tags: removed: kernel-da-key

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-49.52

---------------
linux (4.8.0-49.52) yakkety; urgency=low

  * linux: 4.8.0-49.52 -proposed tracker (LP: #1684427)

  * [Hyper-V] hv: util: move waiting for release to hv_utils_transport itself
    (LP: #1682561)
    - Drivers: hv: util: move waiting for release to hv_utils_transport itself

linux (4.8.0-48.51) yakkety; urgency=low

  * linux: 4.8.0-48.51 -proposed tracker (LP: #1682034)

  * [Hyper-V] hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
    (LP: #1681893)
    - Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()

linux (4.8.0-47.50) yakkety; urgency=low

  * linux: 4.8.0-47.50 -proposed tracker (LP: #1679678)

  * CVE-2017-6353
    - sctp: deny peeloff operation on asocs with threads sleeping on it

  * CVE-2017-5986
    - sctp: avoid BUG_ON on sctp_wait_for_sndbuf

  * vfat: missing iso8859-1 charset (LP: #1677230)
    - [Config] NLS_ISO8859_1=y

  * [Hyper-V] pci-hyperv: Use device serial number as PCI domain (LP: #1667527)
    - net/mlx4_core: Use cq quota in SRIOV when creating completion EQs

  * Regression: KVM modules should be on main kernel package (LP: #1678099)
    - [Config] powerpc: Add kvm-hv and kvm-pr to the generic inclusion list

  * linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
    4.4.0-63.84~14.04.2 (LP: #1664912)
    - SAUCE: apparmor: fix link auditing failure due to, uninitialized var

  * regession tests failing after stackprofile test is run (LP: #1661030)
    - SAUCE: fix regression with domain change in complain mode

  * Permission denied and inconsistent behavior in complain mode with 'ip netns
    list' command (LP: #1648903)
    - SAUCE: fix regression with domain change in complain mode

  * unexpected errno=13 and disconnected path when trying to open /proc/1/ns/mnt
    from a unshared mount namespace (LP: #1656121)
    - SAUCE: apparmor: null profiles should inherit parent control flags

  * apparmor refcount leak of profile namespace when removing profiles
    (LP: #1660849)
    - SAUCE: apparmor: fix ns ref count link when removing profiles from policy

  * tor in lxd: apparmor="DENIED" operation="change_onexec"
    namespace="root//CONTAINERNAME_<var-lib-lxd>" profile="unconfined"
    name="system_tor" (LP: #1648143)
    - SAUCE: apparmor: Fix no_new_privs blocking change_onexec when using stacked
      namespaces

  * apparmor oops in bind_mnt when dev_path lookup fails (LP: #1660840)
    - SAUCE: apparmor: fix oops in bind_mnt when dev_path lookup fails

  * apparmor auditing denied access of special apparmor .null file
    (LP: #1660836)
    - SAUCE: apparmor: Don't audit denied access of special apparmor .null file

  * apparmor label leak when new label is unused (LP: #1660834)
    - SAUCE: apparmor: fix label leak when new label is unused

  * apparmor reference count bug in label_merge_insert() (LP: #1660833)
    - SAUCE: apparmor: fix reference count bug in label_merge_insert()

  * apparmor's raw_data file in securityfs is sometimes truncated (LP: #1638996)
    - SAUCE: apparmor: fix replacement race in reading rawdata

  * unix domain socket cross permission check failing with n...

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released

Unfortunately, this fix breaks Dell XPS 9550 with Samsung PM951 drives. I now get a hung system due to the Samsung controller getting reset. Please see https://bugzilla.kernel.org/show_bug.cgi?id=195039 for a discussion of this issue.

The quirk detection is not complete, I fear.

Dmitry Gutov (dgutov) wrote :

I'm pretty sure this is not the first time a power-saving feature breaks some subset of hardware. Why even add it in a bugfix update?

After applying the workaround from LP#1678184, I'm happy about the somewhat improved battery life, but I'll never get back the several hours it took to figure out what's going on, and considering whether I should contact Dell support about my failing storage device (which wouldn't have helped, but would probably have wasted several weeks on top of that).

And I'm still on 16.10. Why was this even released in Yakkety, with LP#1678184 having been reported on March 31?
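
The comment above does not spell out the workaround from LP#1678184. One commonly cited way to limit or disable APST on kernels that carry this patch is the nvme_core.default_ps_max_latency_us parameter, where 0 turns APST off. Assuming that parameter is what is meant, and that it is exposed under /sys/module as sketched below, the current setting can be inspected like this (the function names are made up for illustration):

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the nvme_core module parameter.
PARAM_PATH = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")


def read_max_latency_us(path: Path = PARAM_PATH) -> Optional[int]:
    """Return the configured maximum latency in microseconds, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except OSError:
        return None


def describe(latency_us: Optional[int]) -> str:
    """Summarize what the setting implies for APST."""
    if latency_us is None:
        return "parameter not present (kernel may lack APST support)"
    if latency_us == 0:
        return "APST disabled"
    return f"APST allowed for states with latency up to {latency_us} us"


if __name__ == "__main__":
    print(describe(read_max_latency_us()))
```

On a kernel without the APST patch the file should simply be missing, which the sketch reports rather than treating as an error.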
