[lunar] Nvidia GPU unused but still powered

Bug #2009687 reported by Francois Thirioux
This bug affects 1 person
Affects          Status   Importance  Assigned to  Milestone
Mutter           New      Unknown
mutter (Ubuntu)  Invalid  Medium      Unassigned

Bug Description

Hi,

Lenovo Legion 5 16"
i7 12700H
RTX 3070 Ti
Nvidia 525 proprietary driver (official Ubuntu packages)
Nvidia on-demand mode
Ubuntu Wayland or Xorg sessions
Ubuntu Lunar up to date

I noticed high power usage in powertop (or simply from the remaining battery time shown in gnome-control-center).
The Nvidia driver seems offloaded, but GPU power usage still shows ~9 W to 23 W, varying at random.

uname -a
Linux ****** 6.1.0-16-generic #16-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 24 14:37:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

nvidia-smi
Wed Mar 8 10:59:21 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   34C    P8     9W /  50W |      4MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      5445      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+

glxinfo -B
name of display: :1
display: :1 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Intel (0x8086)
    Device: Mesa Intel(R) Graphics (ADL GT2) (0x46a6)
    Version: 22.3.6
    Accelerated: yes
    Video memory: 31821MB
    Unified memory: yes
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 22.3.6
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6 (Compatibility Profile) Mesa 22.3.6
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 22.3.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

lspci
00:00.0 Host bridge: Intel Corporation 12th Gen Core Processor Host Bridge/DRAM Registers (rev 02)
00:01.0 PCI bridge: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 (rev 02)
00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)
00:04.0 Signal processing controller: Intel Corporation Alder Lake Innovation Platform Framework Processor Participant (rev 02)
00:06.0 PCI bridge: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 (rev 02)
00:06.2 PCI bridge: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #2 (rev 02)
00:07.0 PCI bridge: Intel Corporation Alder Lake-P Thunderbolt 4 PCI Express Root Port #0 (rev 02)
00:0a.0 Signal processing controller: Intel Corporation Platform Monitoring Technology (rev 01)
00:0d.0 USB controller: Intel Corporation Alder Lake-P Thunderbolt 4 USB Controller (rev 02)
00:0d.2 USB controller: Intel Corporation Alder Lake-P Thunderbolt 4 NHI #0 (rev 02)
00:14.0 USB controller: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller (rev 01)
00:14.2 RAM memory: Intel Corporation Alder Lake PCH Shared SRAM (rev 01)
00:14.3 Network controller: Intel Corporation Alder Lake-P PCH CNVi WiFi (rev 01)
00:15.0 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 (rev 01)
00:15.1 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 (rev 01)
00:16.0 Communication controller: Intel Corporation Alder Lake PCH HECI Controller (rev 01)
00:1d.0 PCI bridge: Intel Corporation Device 51b0 (rev 01)
00:1d.1 PCI bridge: Intel Corporation Alder Lake PCI Express x1 Root Port #10 (rev 01)
00:1f.0 ISA bridge: Intel Corporation Alder Lake PCH eSPI Controller (rev 01)
00:1f.3 Audio device: Intel Corporation Alder Lake PCH-P High Definition Audio Controller (rev 01)
00:1f.4 SMBus: Intel Corporation Alder Lake PCH-P SMBus Host Controller (rev 01)
00:1f.5 Serial bus controller: Intel Corporation Alder Lake-P PCH SPI Controller (rev 01)
01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [Geforce RTX 3070 Ti Laptop GPU] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
05:00.0 Non-Volatile memory controller: SK hynix Gold P31/PC711 NVMe Solid State Drive
06:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
34:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

Daniel van Vugt (vanvugt) wrote :

In Nvidia terms, "offloaded" means using more GPUs and more GPU power, not less.

What exactly do you mean by "offloaded" though?

Changed in nvidia-graphics-drivers-525 (Ubuntu):
status: New → Incomplete
tags: added: lunar nvidia
Francois Thirioux (fthx) wrote :

Ok, ehm. :-)

Well, I mean my GPU is supposed to be unused, since I use hybrid graphics in Nvidia on-demand mode. But I see that power usage is quite high (in Jammy I had ~8-9 W, here ~19-20 W at idle) and nvidia-smi shows some power draw from the Nvidia GPU.
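
One way to check whether the discrete GPU is actually powered down is to read the kernel's runtime power management status for its PCI device, rather than relying on nvidia-smi (querying the GPU may itself wake it). The following is only a minimal sketch: the PCI address is the one shown in the lspci output above, while the function name and the `sysfs_root` parameter are made up for illustration. A result of "suspended" would suggest the GPU is runtime-suspended, and "active" that it is still powered.

```python
from pathlib import Path

# PCI address of the Nvidia GPU, taken from the lspci output in this report.
GPU_PCI_ADDR = "0000:01:00.0"


def runtime_status(pci_addr: str = GPU_PCI_ADDR, sysfs_root: str = "/sys") -> str:
    """Return the kernel's runtime PM status for a PCI device.

    Falls back to 'unknown' if the sysfs file cannot be read
    (for example, if the device does not exist on this machine).
    """
    path = Path(sysfs_root) / "bus/pci/devices" / pci_addr / "power/runtime_status"
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"


if __name__ == "__main__":
    print(f"{GPU_PCI_ADDR}: {runtime_status()}")
```

The `sysfs_root` parameter exists only so the function can be tried against a fake directory tree; on a real system the default `/sys` is what you want.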

Francois Thirioux (fthx) wrote :

In logs for "nvidia" (using here a wayland Ubuntu session):

10:19:35 Xorg: (II) NVIDIA(GPU-0): Deleting GPU-0
10:19:35 Xorg: (II) NVIDIA(GPU-0): Deleting GPU-0
10:19:35 Xorg: (WW) NVIDIA(G0): - Setting a mode on head 3 failed: Insufficient permissions
10:19:35 Xorg: (WW) NVIDIA(G0): - Setting a mode on head 2 failed: Insufficient permissions
10:19:35 Xorg: (WW) NVIDIA(G0): - Setting a mode on head 1 failed: Insufficient permissions
10:19:35 Xorg: (WW) NVIDIA(G0): - Setting a mode on head 0 failed: Insufficient permissions
10:19:35 Xorg: (WW) NVIDIA(G0): Failed to set the display configuration
10:19:34 systemd: Started app-gnome-nvidia\x2dsettings\x2dautostart-3363.scope - Application launched by gnome-session-binary.
10:19:32 gnome-shell: Added device '/dev/dri/card1' (nvidia-drm) using atomic mode setting.
10:19:23 Xorg: (--) NVIDIA(GPU-0):
10:19:20 systemd: Finished systemd-backlight@backlight:nvidia_0.service - Load/Save Screen Backlight Brightness of backlight:nvidia_0.
10:17:49 kernel: nvidia-uvm: Loaded the UVM driver, major device number 507.
10:17:49 kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
10:17:49 systemd: Starting systemd-backlight@backlight:nvidia_0.service - Load/Save Screen Backlight Brightness of backlight:nvidia_0...
10:17:49 kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
10:17:49 kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DP-4
10:17:48 kernel: audit: type=1400 audit(1678353468.692:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1429 comm="apparmor_parser"
10:17:48 apparmor_parser: AVC apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1429 comm="apparmor_parser"
10:17:48 kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
10:17:48 kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input18
10:17:48 kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 525.85.05 Sat Jan 14 00:40:03 UTC 2023
10:17:48 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 525.85.05 Sat Jan 14 00:49:50 UTC 2023
10:17:47 kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
10:17:47 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 509
10:17:47 kernel: nvidia: module license 'NVIDIA' taints kernel.

summary: - [lunar] Nvidia GPU offloaded but still powered
+ [lunar] Nvidia GPU unused but still powered
Changed in nvidia-graphics-drivers-525 (Ubuntu):
status: Incomplete → New
tags: added: hybrid multigpu
Daniel van Vugt (vanvugt) wrote :
Changed in gnome-shell (Ubuntu):
status: New → Confirmed
Changed in ubuntu-power-consumption:
status: New → Confirmed
importance: Undecided → Medium
Changed in gnome-shell (Ubuntu):
importance: Undecided → Medium
no longer affects: nvidia-graphics-drivers-525 (Ubuntu)
Changed in ubuntu-power-consumption:
status: Confirmed → Triaged
Changed in gnome-shell (Ubuntu):
status: Confirmed → Triaged
Francois Thirioux (fthx) wrote :

Using the new 6.2 kernel (from the bootstrap ckt PPA) and Nouveau, I get a perfectly normal ~8 W power usage.

Daniel van Vugt (vanvugt) wrote :

This might be related to a triple buffering issue:
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1441#note_1679803

If it is caused by triple buffering then you will find the issue is temporarily fixed in mutter 44~rc (which lacks triple buffering), and will return later in 44.0.

Francois Thirioux (fthx) wrote :

I no longer experience this issue (GS 44rc, 6.2 kernel, 525 driver from Ubuntu main).
powertop does not seem to be reliable, but judging by GNOME's energy panel, the estimated remaining time seems OK. It does often vary by a factor of 0.5, though, for no visible reason.

Changed in gnome-shell:
status: Unknown → New
Daniel van Vugt (vanvugt) wrote :

We've narrowed a related (hopefully the same) issue down to being introduced in mutter!1968, which is the precursor to triple buffering. Not triple buffering itself, but part of the same patch.

https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1441#note_1757415

Changed in gnome-shell (Ubuntu):
assignee: nobody → Daniel van Vugt (vanvugt)
Changed in gnome-shell:
status: New → Fix Released
Changed in mutter:
status: Unknown → New
no longer affects: gnome-shell
affects: gnome-shell (Ubuntu) → mutter (Ubuntu)
Changed in mutter (Ubuntu):
assignee: Daniel van Vugt (vanvugt) → nobody
Daniel van Vugt (vanvugt) wrote :

Closing per comment #7. But if anyone experiences similar issues then it's still open in:

https://gitlab.gnome.org/GNOME/mutter/-/issues/2969
and
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1968#note_1757553

no longer affects: ubuntu-power-consumption
Changed in mutter (Ubuntu):
status: Triaged → Won't Fix
status: Won't Fix → Invalid