[nvidia] Fail to launch gnome-shell (Wayland) on Ubuntu with EGLDevice backend

Bug #1805444 reported by Yogish Kulkarni on 2018-11-27
This bug affects 2 people
Affects           Importance   Assigned to
mutter (Ubuntu)   Medium       Unassigned
  Bionic          Medium       Marco Trevisan (Treviño)

Bug Description

Wayland sessions don't work with the Nvidia driver in 18.04, even after enabling them with the kernel command-line parameter 'nvidia-drm.modeset=1'.
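Before digging into mutter itself, it can help to confirm that the parameter actually reached the kernel. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, and treating the last occurrence as the effective one is an assumption of this sketch.

```python
def modeset_enabled(cmdline: str) -> bool:
    """Return True if the given kernel command line sets nvidia-drm.modeset=1."""
    enabled = False
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent,
        # so accept both nvidia-drm.modeset and nvidia_drm.modeset.
        if sep and key.replace("-", "_") == "nvidia_drm.modeset":
            # Assumption of this sketch: the last occurrence wins.
            enabled = value == "1"
    return enabled


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="utf-8") as f:
        return f.read()
```

On a running system, `modeset_enabled(read_cmdline())` reports whether the setting is present. A True result only shows that the parameter was passed; as this bug describes, the Wayland session can still fail to start.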

[Original description]
gnome-shell (Wayland) fails to launch on Ubuntu 18.04 with the EGLDevice backend because the following commits are missing from mutter 3.28.3-2~ubuntu18.04.2:

commit 1bf2eb95b502ed0419b0fe8979c022cacaf79e84
Author: Miguel A. Vico <email address hidden>
Date: Thu Jun 7 16:29:44 2018 -0700

renderer/native: Choose first EGL config for non-GBM backends

Commit 712ec30cd9be1f180c3789e7e6a042c5f7b5781d added the logic to only
choose EGL configs that match the GBM_FORMAT_XRGB8888 pixel format.
However, there won't be any EGL config satisfying such criteria for
non-GBM backends, such as EGLDevice.

This change will let us choose the first EGL config for the EGLDevice
backend, while still forcing GBM_FORMAT_XRGB8888 configs for the GBM
one.

Related to: https://gitlab.gnome.org/GNOME/mutter/issues/2
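The selection rule described in this commit message can be modelled roughly as follows. This is an illustrative Python sketch, not mutter's C code: `EglConfig`, `choose_config`, and the backend strings are hypothetical names, and `native_visual_id` stands in for the value a real implementation would query from EGL.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# DRM/GBM fourcc code for XRGB8888 ('X', 'R', '2', '4').
GBM_FORMAT_XRGB8888 = 0x34325258


@dataclass(frozen=True)
class EglConfig:
    config_id: int
    native_visual_id: int  # hypothetical stand-in for the config's pixel format


def choose_config(configs: Sequence[EglConfig], backend: str) -> Optional[EglConfig]:
    """Pick a config: format-matched for GBM, first available otherwise."""
    if backend == "gbm":
        # GBM: only accept a config whose format is XRGB8888.
        for config in configs:
            if config.native_visual_id == GBM_FORMAT_XRGB8888:
                return config
        return None
    # Non-GBM backends such as EGLDevice: no config matches the GBM format,
    # so take the first one offered instead of failing.
    return configs[0] if configs else None
```

Applying the GBM filter to the EGLDevice backend is the failure mode the commit describes: every config is rejected and nothing can be chosen.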

commit 8ee14a7cb7e8f072d2731d59c7dc735f83a9bb0b
Author: Jonas Ådahl <email address hidden>
Date: Tue Nov 14 16:08:52 2017 +0800

renderer/native: Also wrap flip closures for EGLStreams

When using the EGLStream backend, the MetaRendererNative passed a
GClosure to KMS when using EGLStreams, but KMS flip callback event
handler in meta-gpu-kms.c expected a closure wrapped in a closure
container, meaning it'd instead crash when using EGLStreams. Make the
flip handler get what it expects also when using EGLStreams by wrapping
the flip closure in the container before handing it over to EGL.

https://bugzilla.gnome.org/show_bug.cgi?id=790316
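The mismatch this commit fixes can be illustrated with a small Python model. All names here are hypothetical and do not mirror mutter's actual C types; the point is only that the flip handler unwraps a container, so handing it a bare closure fails.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FlipClosureContainer:
    """Hypothetical wrapper that the flip handler expects to receive."""
    flip_closure: Callable[[], None]
    gpu: str


def wrap_flip_closure(gpu: str, flip_closure: Callable[[], None]) -> FlipClosureContainer:
    """Wrap a bare closure before handing it to the flip machinery."""
    return FlipClosureContainer(flip_closure=flip_closure, gpu=gpu)


def page_flip_handler(user_data) -> None:
    """Model of the event handler: it always unwraps a container."""
    container = user_data
    container.flip_closure()  # fails if user_data is a bare closure
```

In this model, passing an unwrapped callable raises `AttributeError`, which plays the role of the crash seen with EGLStreams; wrapping the closure first, as the commit does, lets the handler run normally.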

apt-cache policy mutter
mutter:
  Installed: 3.28.3-2~ubuntu18.04.2
  Candidate: 3.28.3-2~ubuntu18.04.2
  Version table:
 *** 3.28.3-2~ubuntu18.04.2 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-updates/main arm64 Packages
        100 /var/lib/dpkg/status
     3.28.1-1ubuntu1 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic/main arm64 Packages

Changed in mutter (Ubuntu):
status: New → Incomplete
status: Incomplete → New
Daniel van Vugt (vanvugt) wrote :

The bigger problem (than any missing commits) is that Ubuntu builds of mutter don't support EGLStreams:

mutter-3.30.1

 prefix: /usr/local
 source code location: .
 compiler: gcc

 Startup notification: yes
 libcanberra: yes
 libwacom: yes
 gudev yes
 Introspection: yes
 Session management: yes
 Wayland: yes
 Wayland EGLStream: no <----------
 Native (KMS) backend: yes
 EGLDevice: yes
 Remote desktop: no

I think the missing build-dep isn't even in main. I can only find it in https://github.com/NVIDIA/egl-wayland but would love to be proven wrong.

This is a problem on Ubuntu even when building upstream mutter with the "missing" commits. The required file wayland-eglstream-protocols.pc and friends are still missing from the Ubuntu distro.
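One way to check whether that file is present on a build machine is to look for it in the pkg-config search directories. This is a minimal sketch with hypothetical helper names; the default directory list is an assumption and does not cover every multiarch path, so `pkg-config --exists wayland-eglstream-protocols` remains the authoritative check.

```python
import os
from typing import Iterable, List, Optional


def find_pc_file(module: str, search_dirs: Iterable[str]) -> Optional[str]:
    """Return the path of <module>.pc if it exists in any search directory."""
    filename = module + ".pc"
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


def default_search_dirs() -> List[str]:
    """PKG_CONFIG_PATH entries plus a few common locations (an assumption)."""
    dirs = [d for d in os.environ.get("PKG_CONFIG_PATH", "").split(os.pathsep) if d]
    dirs += ["/usr/lib/pkgconfig", "/usr/share/pkgconfig", "/usr/local/lib/pkgconfig"]
    return dirs
```

For example, `find_pc_file("wayland-eglstream-protocols", default_search_dirs())` returns None when the build dependency is not installed in any of the listed directories.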

Daniel van Vugt (vanvugt) wrote :

... so even when you do get EGLDevice working, you don't have EGLStreams and so will be stuck with software rendering.

tags: added: nvidia
summary: - Fail to launch gnome-wayland-shell on Ubuntu-18.04 with EGLDevice
- backend
+ [nvidia] Fail to launch gnome-wayland-shell on Ubuntu-18.04 with
+ EGLDevice backend

I think mutter is compiled with the EGLDevice backend enabled on aarch64. I opened this bug for an NVIDIA Tegra SoC.

Daniel van Vugt (vanvugt) wrote :

Yes, Ubuntu builds with EGLDevice support always enabled. But you can't explicitly select the "Wayland EGLStream" option - that's automatically detected.

Yogish Kulkarni (yogishk) wrote :

Okay. It looks like that is the case for mutter-3.30.1. In the current version in bionic, mutter-3.28.3, I don't see the "Wayland EGLStream: ${have_wayland_eglstream}" option in configure.

summary: - [nvidia] Fail to launch gnome-wayland-shell on Ubuntu-18.04 with
- EGLDevice backend
+ Wayland sessions don't work with the Nvidia driver yet (even with
+ nvidia-drm.modeset=1)

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mutter (Ubuntu):
status: New → Confirmed
Changed in mutter (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged

Sorry, it seems my concerns about EGLStreams are quite different and not blocking your concerns about EGLDevice. Now restoring the original bug title.

Since this bug is fixed in Ubuntu 18.10 onward I'm going to mark it as fixed and we can nominate 18.04 to get the fix later.

summary: - Wayland sessions don't work with the Nvidia driver yet (even with
- nvidia-drm.modeset=1)
+ [nvidia] Fail to launch gnome-shell (Wayland) on Ubuntu-18.04 with
+ EGLDevice backend
summary: - [nvidia] Fail to launch gnome-shell (Wayland) on Ubuntu-18.04 with
- EGLDevice backend
+ [nvidia] Fail to launch gnome-shell (Wayland) on Ubuntu with EGLDevice
+ backend
Changed in mutter (Ubuntu):
status: Triaged → Fix Released
Will Cooke (willcooke) on 2018-12-10
tags: added: rls-bb-incomin
tags: added: rls-bb-incoming
removed: rls-bb-incomin
Daniel van Vugt (vanvugt) wrote :

As I understand it, we presently have:

mutter in 18.04:
  * EGLDevice compiled but not working (this bug)
  * EGLStreams not compiled

mutter in 18.10:
  * EGLDevice compiled and working
  * EGLStreams not compiled

mutter in 19.04:
  * EGLDevice compiled and working
  * EGLStreams not compiled (work in progress outside of this bug, starting with https://launchpad.net/ubuntu/disco/+queue?queue_state=0&queue_text=)

mutter from upstream git that I built locally:
  * EGLDevice compiled and working
  * EGLStreams compiled and apparently still not working (will investigate this further as the other issues get resolved).

description: updated
Daniel van Vugt (vanvugt) wrote :

It appears the backport has already been done upstream. Both the missing commits are already in the gnome-3-28 branch of mutter, scheduled for inclusion in mutter version 3.28.4 (whenever that may be).

I have also verified that mutter version 3.28.3 fails, and the gnome-3-28 branch (721de281) works (with Nvidia driver 410).

This all means we should now wait for an official tag 3.28.4 from upstream and get that into bionic eventually.
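To tell whether a given machine already has a mutter new enough to carry the backport, the upstream version can be read from `apt-cache policy mutter` output. This is a minimal sketch with hypothetical helper names. It compares only the upstream part of the version string against 3.28.4, so it would miss a package that carries the two commits as distro patches on top of an older upstream version.

```python
import re
from typing import Optional, Tuple


def installed_upstream_version(policy_output: str) -> Optional[Tuple[int, ...]]:
    """Extract the upstream version from the 'Installed:' line, if any."""
    match = re.search(
        r"^\s*Installed:\s*(?:\d+:)?(\d+(?:\.\d+)*)",  # optional epoch, then x.y.z
        policy_output,
        re.MULTILINE,
    )
    if not match:
        return None  # e.g. 'Installed: (none)'
    return tuple(int(part) for part in match.group(1).split("."))


def has_egldevice_fix(policy_output: str, fixed: Tuple[int, ...] = (3, 28, 4)) -> bool:
    """True if the installed upstream version is at least the fixed one."""
    version = installed_upstream_version(policy_output)
    return version is not None and version >= fixed
```

Fed the output quoted in the bug description (Installed: 3.28.3-2~ubuntu18.04.2), `has_egldevice_fix` returns False, matching the failure reported here.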

Daniel van Vugt (vanvugt) wrote :

Fact is stranger than fiction. I failed to consider that maybe EGLStreams was working in mutter 3.28 and it got broken later on. Actually there were TWO separate regressions (separate to this bug) this year. So here are corrections to comment #8 after a more detailed analysis:

mutter 3.28.3 in 18.04:
  * EGLDevice configured, compiled, but not working (this bug)
  * EGLStreams not configurable, always built, but also not working (this bug)

mutter 3.28.4 (prerelease):
  * EGLDevice compiled and working
  * EGLStreams not configurable, always built, and working**

mutter 3.30.2 in 18.10 & 19.04:
  * EGLDevice compiled and working
  * EGLStreams' new build deps are missing so not configured, but compiled in anyway by accident because the new build deps were weak, and it's working**

mutter 3.32 (prerelease) from upstream git:
  * EGLDevice compiled and working
  * EGLStreams' new build deps are missing so not configured, and not compiled any more thanks to a recent "fix" 417c00b8fa. We can get the missing build deps either with libnvidia-egl-wayland-dev (soon-to-be in 19.04), or do some kind of revert of 417c00b8fa (because the new build deps are pretty weak and probably shouldn't have been implemented that way in mutter 3.30+).

** EGLStreams working means Wayland clients get full Nvidia acceleration but I thought it was broken for a while because that also includes 100% CPU use by [kworker/u16:3+events_unbound]. Possible driver bug?

Sebastien Bacher (seb128) wrote :

Tagging rls-bb-notfixing after team meeting discussion: that's a big change and Wayland is not the default desktop in bionic. That can change or be reconsidered if it's customer-escalated or similar, though.

tags: added: rls-bb-notfixing
removed: rls-bb-incoming
Daniel van Vugt (vanvugt) wrote :

More notes about EGLStreams support:

- mutter 3.28.4 works, but not perfectly. Some clients (e.g. glmark2-wayland) don't resize properly. That bug appears to be fixed in mutter 3.30+. I don't yet know what commit fixed it.

- mutter 3.32 in future Ubuntu 19.04 won't work at all till we get this package into main: https://launchpad.net/ubuntu/+source/egl-wayland and then mutter rebuilt.

- mutter 3.30 in Ubuntu 18.10 and 19.04 works well enough right now except for the caveats below...

- GLX clients are still software rendered in all releases, which should be fixed by releasing egl-wayland into 19.04 and then rebuilding Xwayland with the new build dependency.

- There is very high CPU usage in the kernel when using EGLStreams and nvidia-drm.modeset=1 -> bug 1808108 (NVIDIA please investigate).

Doug McMahon (mc3man) wrote :

In regard to the caveats above:
There is no mention of a specific Nvidia series. Is it presumed that XXXm devices should now work in 18.10 & 19.04?
(I've never seen any Nvidia m series work in Wayland.)

Doug McMahon (mc3man) wrote :

Additionally, what about the newer 10XX mobile devices?
(I have none to try.)

Daniel van Vugt (vanvugt) wrote :

Doug, hardware model numbers don't matter here. Only software version numbers.

Doug McMahon (mc3man) wrote :

Maybe I'm missing the point of this bug; if so, sorry about that.
The bug seems to be basically about being able to log into a Wayland session while using Nvidia drivers, though the description has changed a time or two.

So the question could be rephrased simply: does the released fix imply that Nvidia should be working on 19.04 on all Nvidia devices?

It's stated above that -
"mutter 3.30.2 in 18.10 & 19.04:
  * EGLDevice compiled and working
  * EGLStreams' new build deps are missing so not configured, but compiled in anyway by accident because the new build deps were weak, and it's working**"

 "** EGLStreams working means Wayland clients get full Nvidia acceleration but I thought it was broken for a while because that also includes 100% CPU use by [kworker/u16:3+events_unbound]. Possible driver bug? "

What I mentioned is that no m series Nvidia device I've tested gets Nvidia support in a Wayland session.
(When logging into Wayland, one is automatically switched to Intel.)

Daniel van Vugt (vanvugt) wrote :

Doug,

This bug was a request to get two missing commits into Ubuntu. They are already in some versions of Ubuntu so:

> Since this bug is fixed in Ubuntu 18.10 onward I'm going to mark it as fixed and we can nominate 18.04 to get the fix later.

This bug is Fix Released in 18.10 onward, but not 18.04 yet. "Fix Released" is still the correct status since that tracks the current Ubuntu release.

All that said, there are still problems with Wayland support in the Nvidia driver, as outlined above, so we don't yet recommend it for anyone, even in 19.04, and even though this bug is fixed.

Please log your own new bug to describe any problems you have rather than adding comments here. This bug is for the original reporter only.

Iain Lane (laney) on 2019-01-25
Changed in mutter (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Daniel van Vugt (vanvugt)
Daniel van Vugt (vanvugt) wrote :

Not really "in progress". Generally though, all that bionic needs here is mutter version 3.28.4.

Changed in mutter (Ubuntu Bionic):
importance: Undecided → Medium
assignee: Daniel van Vugt (vanvugt) → nobody
status: In Progress → Triaged
Daniel van Vugt (vanvugt) wrote :

Although bug 1811900 suggests it is in progress.

Changed in mutter (Ubuntu Bionic):
assignee: nobody → Marco Trevisan (Treviño) (3v1n0)
status: Triaged → In Progress

Thanks for uploading the fix for this bug report to -proposed. However, when reviewing the package in -proposed and the details of this bug report I noticed that the bug description is missing information required for the SRU process. You can find full details at http://wiki.ubuntu.com/StableReleaseUpdates#Procedure but essentially this bug is missing some of the following: a statement of impact, a test case and details regarding the regression potential. Thanks in advance!
