[nvidia] Shell hangs up at random times when using nvidia-drm.modeset=1

Bug #1823301 reported by Didier Roche-Tolomelli
This bug affects 3 people
Affects: mutter (Ubuntu)
Status: Won't Fix
Importance: High
Assigned to: Unassigned

Bug Description

Mutter hangs for several seconds (5 or so) at random times, especially right after the user session starts and when opening new applications from the launcher:
- the screen (apps & Shell) freezes completely
- a second cursor appears, which you can still move and which keeps refreshing during the freeze

Disabling nvidia drm support seems to prevent the issue so far. I'll post an update if that stops being the case.
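
To confirm whether nvidia drm modesetting is actually active on a running system, the module parameter can be read back from sysfs. This is a minimal sketch, assuming the nvidia_drm module is loaded and exposes its parameter at the path below; `modeset_state` and its optional path argument are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Report whether nvidia-drm kernel modesetting is active.
# Assumption: the sysfs file only exists while the nvidia_drm module is loaded.
# The optional first argument (an alternative file path) is a hypothetical
# convenience so the function can be tried without the driver installed.
modeset_state() {
    param_file="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    if [ -r "$param_file" ]; then
        case "$(cat "$param_file")" in
            Y|1) echo "enabled" ;;
            *)   echo "disabled" ;;
        esac
    else
        # Module not loaded, or parameter not exposed on this system.
        echo "unknown"
    fi
}

modeset_state
```

A result of "unknown" only means the file could not be read; it says nothing about how the driver was configured.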

ProblemType: Bug
DistroRelease: Ubuntu 19.04
Package: mutter 3.32.0-1ubuntu1
ProcVersionSignature: Ubuntu 5.0.0-8.9-generic 5.0.1
Uname: Linux 5.0.0-8-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.10-0ubuntu23
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Fri Apr 5 10:21:09 2019
InstallationDate: Installed on 2018-05-24 (315 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
SourcePackage: mutter
UpgradeStatus: Upgraded to disco on 2019-01-08 (86 days ago)

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :
summary: - Shell hangs up at random times
+ [nvidia] Shell hangs up at random times when using nvidia-drm.modeset=1
tags: added: nvidia performance
Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 630 (Mobile)
01:00.0 VGA compatible controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] (rev a1)

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Didier:

 * Are you using multiple monitors?

 * Which GPUs have a monitor plugged in?

 * (from an SSH login) Do you see high CPU or memory usage of the gnome-shell process?

 * Is the problem only Xorg sessions, only Wayland sessions, or both?
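
For the Xorg-versus-Wayland question, one way to check from an SSH login is to ask logind for the type of the graphical session. A minimal sketch, assuming systemd's `loginctl` is available; `session_kind` is a hypothetical helper, and the session ID has to be looked up with `loginctl list-sessions` first:

```shell
#!/bin/sh
# Translate the "Type=..." line printed by
#   loginctl show-session <ID> -p Type
# into a short answer. session_kind is a hypothetical helper name.
session_kind() {
    case "$1" in
        Type=x11)     echo "Xorg" ;;
        Type=wayland) echo "Wayland" ;;
        *)            echo "other" ;;
    esac
}

# Example with a literal value; on a real machine, pass the loginctl output instead.
session_kind "Type=x11"
```

The SSH login itself is a separate session, so the ID of the desktop session must be used, not the ID of the SSH one.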

tags: added: nvidia-drm.modeset
Changed in mutter (Ubuntu):
status: New → Incomplete
Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

-> I'm using 2 monitors (1 internal and 1 external).
-> The nvidia GPU is enabled, and the external monitor is plugged into it through HDMI.
-> The CPU usage of GNOME Shell spikes. There are no disk writes at the same time (the LED doesn't blink) and no other process seems to account for it. Memory usage seems stable though. See the attached log, which is a snapshot of CPU and MEM usage every second (at the end you can see a freeze of multiple seconds corresponding to a GNOME Shell CPU spike). The CPU spike can go up to 200%.
-> I ran a Wayland session for a while and didn't notice any issues, so it seems Xorg-specific.

Note that this is a regression: I've been running Xorg + drm modeset for a year or so without experiencing this.
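
A per-second CPU and MEM log like the one described can be collected from an SSH login. The sketch below is one possible way to do it and is not necessarily how the attached log was produced. It assumes procps `top` and `pgrep` with top's default column layout; `extract_usage` and `sample_shell` are hypothetical names:

```shell
#!/bin/sh
# Pull "%CPU %MEM" for one PID out of `top -b` output read on stdin.
# Assumption: default top columns, where %CPU is field 9 and %MEM is field 10.
extract_usage() {
    awk -v pid="$1" '$1 == pid { print $9, $10 }'
}

# Sample the first gnome-shell process once per second.
# The argument is the number of samples (default 10).
sample_shell() {
    pid=$(pgrep -x gnome-shell | head -n 1)
    [ -n "$pid" ] || { echo "gnome-shell not running" >&2; return 1; }
    top -b -d 1 -n "${1:-10}" -p "$pid" | extract_usage "$pid"
}
```

Running `sample_shell 60 > shell-usage.log` while reproducing the freeze would give one "%CPU %MEM" line per second. `top` is used here because `ps` reports CPU usage averaged over the lifetime of the process, which hides short spikes.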

Changed in mutter (Ubuntu):
status: Incomplete → New
Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :
Revision history for this message
Dylan Borg (borgdylan) wrote :

Would I encounter this bug if I upgrade to disco?

I have an nvidia GPU and the nvidia_modeset module shows up in lsmod, but I have not set "nvidia-drm.modeset=1" on the kernel command line.
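
The two things mentioned here (modules listed by lsmod, and the kernel command line) can be checked with a few lines of shell. A minimal sketch; `has_modeset_param` and `nvidia_modules` are hypothetical helper names, and the check covers only the kernel command line, not a setting made through modprobe configuration:

```shell
#!/bin/sh
# Succeed if the given kernel command line contains nvidia-drm.modeset=1.
# The kernel treats '-' and '_' in module parameter names as equivalent,
# so both spellings are accepted.
has_modeset_param() {
    case " $1 " in
        *" nvidia-drm.modeset=1 "*|*" nvidia_drm.modeset=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the names of loaded nvidia modules from lsmod-style text on stdin.
nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }'
}

if has_modeset_param "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "nvidia-drm.modeset=1 is on the kernel command line"
else
    echo "nvidia-drm.modeset=1 is not on the kernel command line"
fi
```

On a live system, `lsmod | nvidia_modules` lists the loaded nvidia modules. Seeing nvidia_modeset there does not by itself mean that drm modesetting is enabled.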

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

didrocks,

While experiencing the bug, what output do you get from:

1. glxinfo | grep OpenGL

2. xrandr

?

Changed in mutter (Ubuntu):
status: New → Incomplete
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Dylan,

It's unlikely you will encounter this bug since so far it only affects one person. But also no one can say for sure. You would have to try it.

Changed in mutter (Ubuntu):
importance: Undecided → High
Revision history for this message
Dylan Borg (borgdylan) wrote :

So after upgrading, I cannot see any abnormal rise in memory consumption. I'm glad I risked it and upgraded. I do not use zsh and am using the latest stable kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/ and the latest available nvidia driver from https://launchpad.net/~graphics-drivers.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Dylan,

Please don't comment on this bug for now. If you ever have any problems then please open a new bug by running:

   ubuntu-bug mutter

Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

Here is the output (I re-enabled drm to get it back):
$ glxinfo | grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GTX 1050/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 390.116
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6.0 NVIDIA 390.116
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 390.116
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:
$ xrandr
Screen 0: minimum 8 x 8, current 3840 x 1080, maximum 32767 x 32767
eDP-1-1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 194mm
   1920x1080 60.02*+ 60.01 59.97 59.96 59.93
   1680x1050 59.95 59.88
   1600x1024 60.17
   1400x1050 59.98
   1600x900 59.99 59.94 59.95 59.82
   1280x1024 60.02
   1440x900 59.89
   1400x900 59.96 59.88
   1280x960 60.00
   1440x810 60.00 59.97
   1368x768 59.88 59.85
   1360x768 59.80 59.96
   1280x800 59.99 59.97 59.81 59.91
   1152x864 60.00
   1280x720 60.00 59.99 59.86 59.74
   1024x768 60.04 60.00
   960x720 60.00
   928x696 60.05
   896x672 60.01
   1024x576 59.95 59.96 59.90 59.82
   960x600 59.93 60.00
   960x540 59.96 59.99 59.63 59.82
   800x600 60.00 60.32 56.25
   840x525 60.01 59.88
   864x486 59.92 59.57
   800x512 60.17
   700x525 59.98
   800x450 59.95 59.82
   640x512 60.02
   720x450 59.89
   700x450 59.96 59.88
   640x480 60.00 59.94
   720x405 59.51 58.99
   684x384 59.88 59.85
   680x384 59.80 59.96
   640x400 59.88 59.98
   576x432 60.06
   640x360 59.86 59.83 59.84 59.32
   512x384 60.00
   512x288 60.00 59.92
   480x270 59.63 59.82
   400x300 60.32 56.34
   432x243 59.92 59.57
   320x240 60.05
   360x202 59.51 59.13
   320x180 59.84 59.32
DP-1-1 disconnected (normal left inverted right x axis y axis)
HDMI-1-1 disconnected (normal left inverted right x axis y axis)
HDMI-1-2 connected primary 1920x1080+1920+0 (normal left inverted right x axis y axis) 160mm x 90mm
   1920x1080 60.00*+ 60.00 50.00 59.94
   1920x1080i 60.00 60.00 50.00 50.00 59.94
   1600x1200 60.00
   1680x1050 59.88
   1280x1024 75.02 60.02
   1440x900 74.98 59.90
   1280x960 60.00
   1280x800 59.91
   1152x864 75.00
   1280x720 60.00 60.00 50.00 50.00 59.94
   1024x768 75.03 70.07 60.00
   832x624 74.55
   800x600 72.19 75.00 60.32 56.25
   720...
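
The long mode lists make it easy to miss which outputs are actually in use. A minimal sketch that reduces xrandr output to the connected outputs and their geometry; `connected_outputs` is a hypothetical helper and assumes the line layout shown above:

```shell
#!/bin/sh
# Print "output geometry" for each connected output in xrandr text on stdin.
# Assumption: the geometry is the third field, or the fourth when the output
# is marked "primary", as in the listing above.
connected_outputs() {
    awk '$2 == "connected" {
        geom = ($3 == "primary") ? $4 : $3
        print $1, geom
    }'
}
```

Usage on a live session would be `xrandr | connected_outputs`.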


Changed in mutter (Ubuntu):
status: Incomplete → New
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Thanks. Driver version 390 is one thing I haven't tried this week.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sorry, I still can't reproduce this bug. I am using Nvidia 390.116 on disco with nvidia-drm.modeset=1. Actually, that combination instead just makes Xorg crash for me (bug 1822616).

If I apply the workaround for bug 1822616 and keep drm.modeset=1, I can log into Xorg sessions but experience no hangs at all. So there's something about our machines that is different.

I have some theories:

  1. The problem might be unique to PRIME, since you are using Intel and Nvidia GPUs simultaneously. I am only using Nvidia (otherwise my machine can't start up).

  2. The problem might be unique to your newer GPU. I don't have anything like a 1050 that fits in my desktop.

  3. Maybe you are hitting bug 1823516 (you should be). And for some reason it causes freezes on your machine, but not mine. I would expect bug 1823516 to overwhelm many machines :(
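
For the first theory, a quick way to tell whether a machine has GPUs from more than one vendor (the hybrid Intel plus Nvidia case) is to look at the display controllers reported by lspci. A minimal sketch; `gpu_vendors` is a hypothetical helper, and it only shows which GPUs are present, not which one is rendering:

```shell
#!/bin/sh
# Print the distinct vendor names of VGA/3D controllers found in lspci
# text on stdin. The vendor is taken as the first word after ": ".
gpu_vendors() {
    awk -F': ' '/VGA compatible controller|3D controller/ {
        split($2, words, " ")
        print words[1]
    }' | sort -u
}
```

Usage on a live system would be `lspci | gpu_vendors`. More than one line of output suggests a hybrid setup like the reporter's.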

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Actually, your comment that:

> - you have a second cursor that you can move and refresh during that period

is definitely something I don't see. That sounds like a PRIME or multimonitor bug in mutter.

tags: added: multimonitor
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mutter (Ubuntu):
status: New → Confirmed
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Thank you for reporting this bug to Ubuntu.
Ubuntu 19.04 (disco) reached end-of-life on January 23, 2020.

See this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

We appreciate that this bug may be old and you might not be interested in discussing it any more. But if you are then please upgrade to the latest Ubuntu version and re-test. If you then find the bug is still present in the newer Ubuntu version, please add a comment here telling us which new version it is in and change the bug status to Confirmed.

Changed in mutter (Ubuntu):
status: Confirmed → Won't Fix