GNOME apps crash with "Protocol error" in NVIDIA Wayland sessions

Bug #1965563 reported by Gunnar Hjalmarsson
This bug affects 28 people
Affects                 Status         Importance  Assigned to        Milestone
GTK+                    Fix Released   Unknown
NVIDIA / egl-wayland    Fix Released   Unknown
egl-wayland (Ubuntu)    Fix Released   High        Unassigned
  Jammy                 Fix Committed  High        Alessandro Astone

Bug Description

[ Impact ]

 * GTK applications fail to start on a hybrid graphics machine with an NVIDIA discrete GPU.

 * The NVIDIA egl-wayland extension 1.1.9 only supports rendering to the NVIDIA GPU if it is the primary GPU. On a hybrid system the primary GPU would be the integrated graphics, while the NVIDIA GPU should only be used for offloading of specific applications. On such a setup, the egl-wayland extension still incorrectly advertised the driver as compatible and attempted to use the NVIDIA GPU for all EGL applications.

 * To fix the issue, backport a commit from a newer version of the NVIDIA egl-wayland extension that reports the driver as incompatible when the NVIDIA GPU is not the primary GPU. This effectively ensures that all EGL applications run on integrated graphics by default.
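To tell whether a machine falls into the hybrid case described above, one rough check is to see which GPU the firmware booted on. The sketch below is a heuristic, not what egl-wayland does internally: it assumes the kernel's `boot_vga` sysfs attribute identifies the primary GPU, and `primary_gpu_vendor` is a hypothetical helper name. PCI vendor ID 0x10de is NVIDIA and 0x8086 is Intel.

```shell
# Print the PCI vendor ID of the boot (primary) GPU, e.g. 0x8086 or 0x10de.
# $1 is the sysfs root; it defaults to /sys and is a parameter only to ease testing.
primary_gpu_vendor() {
    root=${1:-/sys}
    for dev in "$root"/class/drm/card*/device; do
        # Entries without a boot_vga attribute (for example connector nodes) are skipped.
        [ -r "$dev/boot_vga" ] || continue
        if [ "$(cat "$dev/boot_vga")" = "1" ]; then
            cat "$dev/vendor"
            return 0
        fi
    done
    return 1
}
```

If `primary_gpu_vendor` prints something other than 0x10de on a machine that also has an NVIDIA GPU, the system is most likely in the hybrid configuration this bug is about.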

[ Test Plan ]

 * Set up a hybrid graphics machine with Jammy 22.04 LTS and the proprietary NVIDIA drivers version 535 or 550.

 * Install `libnvidia-egl-wayland1` from the update.

 * Log in to a Wayland desktop session.

 * Verify that you can start `gnome-text-editor`.
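The preconditions of this test plan can be checked from a terminal before launching the application. This is only a sketch: the helper names are hypothetical, and it assumes the binary package `libnvidia-egl-wayland1` carries the same version as the egl-wayland source upload (1:1.1.9-1.1ubuntu0.1 in the SRU acceptance message in this report).

```shell
# Succeeds when the current login session is a Wayland session.
is_wayland_session() {
    [ "${XDG_SESSION_TYPE:-}" = "wayland" ]
}

# Succeeds when the installed libnvidia-egl-wayland1 is at least version $1.
has_fixed_egl_wayland() {
    installed=$(dpkg-query -W -f='${Version}' libnvidia-egl-wayland1 2>/dev/null) || return 1
    # A known but uninstalled package reports an empty version.
    [ -n "$installed" ] || return 1
    dpkg --compare-versions "$installed" ge "$1"
}
```

For example, `is_wayland_session && has_fixed_egl_wayland 1:1.1.9-1.1ubuntu0.1 && gnome-text-editor` only starts the editor when both preconditions hold.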

[ Test Plan - Regression ]

 * Set up a desktop machine with a single NVIDIA GPU, Jammy 22.04 LTS and the proprietary NVIDIA drivers version 470, 535 or 550.

 * Install `libnvidia-egl-wayland1` from the update.

 * Log in to GNOME on Wayland (in Ubuntu Desktop this means changing the session type on the login screen).

 * Verify that `eglinfo | grep -A2 "Wayland platform"` reports "EGL vendor string: NVIDIA"

 * Also verify that GNOME applications run smoothly with GPU acceleration.
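The vendor check in the regression plan can be scripted. The following is a minimal sketch that assumes `eglinfo` prints the "EGL vendor string:" line within two lines of the "Wayland platform" heading, as the `grep -A2` in the test plan implies; `wayland_egl_vendor` is a hypothetical helper name.

```shell
# Print the EGL vendor reported for the Wayland platform.
# Reads eglinfo output on stdin so it can be tested without a GPU.
wayland_egl_vendor() {
    grep -A2 "Wayland platform" \
        | sed -n 's/^[[:space:]]*EGL vendor string:[[:space:]]*//p' \
        | head -n 1
}
```

On a single-GPU NVIDIA system, `eglinfo | wayland_egl_vendor` is expected to print NVIDIA according to the test plan above.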

[ Where problems could occur ]

 * The scope of the change is limited to the NVIDIA proprietary drivers.

 * A possible regression would be that the driver starts reporting as incompatible on NVIDIA single-GPU systems too. One would notice by all wayland-native applications suddenly being very slow. Note that such systems would not default to Wayland in Jammy.

[ Original Report ]

$ dpkg-query -W gnome-shell-extension-prefs
gnome-shell-extension-prefs 42~beta-1ubuntu3
$ gnome-extensions-app
Gdk-Message: 17:54:19.697: Error reading events from display: Protocol error

Caveat: I currently have a mix of packages from jammy-release and jammy-proposed.

Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

I couldn't reproduce this in a more clean jammy installation in a VM, so it may be "just me". Please feel free to close it if you can't easily reproduce.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Seems to be working in both Wayland and Xorg sessions here.

Although I must say it wasn't working for me for a couple of days last week, around the time this bug was created.

Changed in gnome-shell (Ubuntu):
status: New → Invalid
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I am seeing this bug again when I build gnome-shell manually.

Changed in gnome-shell (Ubuntu):
status: Invalid → Confirmed
summary: - Can't open extensions app
+ gnome-extensions-app fails to start (Protocol error)
tags: added: wayland wayland-session
Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote : Re: gnome-extensions-app fails to start (Protocol error)

FWIW: On the installation where I have the issue, this is what it looks like in an Xorg session:

$ gnome-extensions-app
Segmentation fault (core dumped)

Same as bug #1966221 in other words. (The command shown in the bug description was run in a Wayland session.)

Timo Aaltonen (tjaalton)
affects: gnome-shell (Ubuntu) → mesa (Ubuntu)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Timo, was it bug 1966221 that you meant to assign to mesa?

Regardless, I think we should keep a gnome-shell task here just to make the bug easier to find, and avoid duplicates.

Changed in gnome-shell (Ubuntu):
status: New → Confirmed
Revision history for this message
Daniel van Vugt (vanvugt) wrote :
Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

I installed version 42.0-1ubuntu1 of the mutter and gnome-shell packages. Didn't help, neither on Wayland nor Xorg.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

We should probably only discuss Wayland here. Xorg has bug 1966221.

Revision history for this message
Gunnar Hjalmarsson (gunnarhj) wrote :

I see a similar error message when trying to start yelp:

$ yelp
Gdk-Message: 01:23:31.232: Error flushing display: Protocol error
$ gnome-extensions-app
Gdk-Message: 01:23:48.075: Error reading events from display: Protocol error

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

So this is GTK only? GTK4 only?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :
summary: - gnome-extensions-app fails to start (Protocol error)
+ gnome-extensions-app fails to start [Error reading events from display:
+ Protocol error]
Revision history for this message
hackel (hackel) wrote : Re: gnome-extensions-app fails to start [Error reading events from display: Protocol error]

I don't know if this is helpful, but I had the org.gnome.Extensions flatpak installed, and it starts just fine under Wayland/nvidia. Only when I tried to switch to the deb did I get this error.

Revision history for this message
ManOnTheMoon (manonthemoon) wrote (last edit ):

Having the same error message, "Error reading events from display: Protocol error", when running Wayland. However, Extension Manager opens without issue in X11. Ubuntu 22.04 with NVIDIA driver 510.

Revision history for this message
Islam (islam) wrote :

This is also happening for me when I launch the extension-manager or cheese apps:

[3254489.563] wl_display@1.delete_id(42)
[3254489.568] wl_display@1.error(nil, 7, "failed to import supplied dmabufs: Arguments are inconsistent (for example, a valid context requires buffers not supplied by a ")
Gdk-Message: 12:22:58.093: Error flushing display: Protocol error
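Traces like the one above come from libwayland's client-side protocol logging, which is enabled by setting WAYLAND_DEBUG=1 and written to stderr. As a sketch for anyone who wants to capture the same information, the hypothetical helper below keeps only the fatal `wl_display` error events from such a trace.

```shell
# Keep only fatal wl_display error events from WAYLAND_DEBUG output read on stdin.
wl_display_errors() {
    grep -E 'wl_display@[0-9]+\.error\('
}
```

A possible invocation is `WAYLAND_DEBUG=1 gnome-extensions-app 2>&1 | wl_display_errors`; the error string it prints (here "failed to import supplied dmabufs") names the request the compositor rejected.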

Revision history for this message
DuckDuckWhale (duckduckwhale) wrote (last edit ):

Also affected (bug 1975647). Notably, this happened after I attempted to dual boot Debian 11 (stable, then testing). The workspace indicator extension worked before then, but disappeared after I switched back.

user@host:~$ gnome-extensions-app
Gdk-Message: 20:03:19.701: Error flushing display: Protocol error
user@host:~$ cheese

(cheese:32950): Gdk-WARNING **: 20:04:45.134: Native Windows taller than 65535 pixels are not supported
Gdk-Message: 20:04:46.246: Error 71 (Protocol error) dispatching to Wayland display.
user@host:~$

In syslog:

gnome-shell[2051]: WL: error in client communication (pid 32877)
cheese[32877]: Error reading events from display: Protocol error

Revision history for this message
pakaoraki (pakaoraki) wrote :

I am also reporting this bug, which seems to be related to NVIDIA on Wayland.

I ran a few tests on my hybrid graphics laptop (Dell XPS with GTX 1050 Ti):

- Fresh Ubuntu install on Wayland (no nvidia driver): OK.
- With Nvidia 510 driver, on-demand profile set, login Xorg: OK.
- With Nvidia 510 driver, on-demand profile set, login with Wayland: FAILED. I got this error:

Gdk-Message: 12:20:26.270: Error 71 (Protocol error) dispatching to Wayland display.
I got a similar result with the NVIDIA profile set instead of on-demand.

- With Nvidia 510 driver, INTEL profile set, login with Wayland: OK.

With Flatpak apps (NVIDIA/Wayland):
- Installed GNOME Extensions from Flatpak and ran it with flatpak run org.gnome.Extensions:

The app starts, but I can't access the extensions' settings. The sub-windows don't open, and I get this in the logs when I press the "Settings" button of a listed extension:

Sender gnome-shell, WL: error in client communication (pid 16887)
Sender: gjs ,Error reading events from display: Protocol error
Sender gnome-shell, Window manager warning: Ping serial 2322697 was reused for window W118, previous use was for window W111.

- I also tested another similar Flatpak app, com.mattjakeman.ExtensionManager:
The app starts, but when I try to open an extension's settings, the sub-window does not open and I get these logs too:

Sender: gjs, Error flushing display: Protocol error
Sender gnome-shell, WL: error in client communication (pid 17125)

Revision history for this message
DuckDuckWhale (duckduckwhale) wrote :

OBS is also affected. Log:

$ obs
...
qt.qpa.wayland: Wayland does not support QWindow::requestActivate()
...
info: [pipewire] screencast session created
The Wayland connection experienced a fatal error: Protocol error
Aborted (core dumped)
$

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

If the OBS issue is related to this then it will be in some common component like a library or the shell. We don't need to add a task for each app. This problem is too low level for apps to have any control over it.

no longer affects: obs-studio (Ubuntu)
Revision history for this message
Benjamin Birchman (bbirchman) wrote :

Yes, same issue. I can confirm pakaoraki's findings.
Razer Blade 15 with RTX 2060, NVIDIA 510 drivers.

no longer affects: mesa (Ubuntu)
Changed in gtk+3.0 (Ubuntu):
status: New → Confirmed
tags: added: protocol-error
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Looks like Nvidia's egl-wayland is the main suspect:
https://github.com/NVIDIA/egl-wayland/issues/41

Changed in egl-wayland (Ubuntu):
status: New → Confirmed
Changed in egl-wayland:
status: Unknown → New
Changed in gtk:
status: Unknown → Fix Released
Revision history for this message
DuckDuckWhale (duckduckwhale) wrote :

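Prepending `__EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json` to affected commands (for example, `__EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json gnome-extensions-app`) is a workaround, as mentioned in the upstream issue.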
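For repeated use, the workaround can be wrapped in a small shell function. This is a sketch only: `run_with_mesa_egl` is a hypothetical name, and the JSON path is the one quoted in this comment, which may differ on other distributions.

```shell
# Run a command with libglvnd restricted to the Mesa EGL vendor library,
# so the NVIDIA EGL implementation is not picked for this one process.
run_with_mesa_egl() {
    env __EGL_VENDOR_LIBRARY_FILENAMES=/usr/share/glvnd/egl_vendor.d/50_mesa.json "$@"
}
```

For example, `run_with_mesa_egl gnome-extensions-app` or `run_with_mesa_egl yelp`. The variable only affects the launched process and its children, so the rest of the session is unchanged.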

tags: added: nvidia-wayland
tags: added: nvidia
Revision history for this message
DuckDuckWhale (duckduckwhale) wrote :

Looks like the bug is fixed in 1.1.10 and what's needed is just an update.

no longer affects: gnome-shell (Ubuntu)
summary: - gnome-extensions-app fails to start [Error reading events from display:
- Protocol error]
+ GNOME apps crash with "Protocol error" in NVIDIA Wayland sessions
no longer affects: gtk+3.0 (Ubuntu)
Changed in egl-wayland (Ubuntu):
status: Confirmed → Fix Released
tags: added: fixed-in-1.1.10 fixed-upstream
Changed in egl-wayland (Ubuntu):
importance: Undecided → High
tags: added: rls-jj-incoming
tags: added: fixed-in-egl-wayland-1.1.10
removed: fixed-in-1.1.10
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Added a jammy task because most people will understand that better than rls-jj-incoming. This is something I'd like to do in bugs more often, but if it's wrong (@vorlon?) then let me know.

Changed in egl-wayland (Ubuntu Jammy):
status: New → Triaged
importance: Undecided → High
Changed in egl-wayland:
status: New → Fix Released
Changed in egl-wayland (Ubuntu Jammy):
assignee: nobody → Alessandro Astone (aleasto)
Revision history for this message
Alessandro Astone (aleasto) wrote :

With bug 2062082 fixed, applications are now allowed to use the NVIDIA GPU on hybrid systems on Wayland.
Together with bug 2080282, this means all hybrid systems running Jammy will hit this bug, making it a high priority for a Jammy SRU.

Changed in egl-wayland (Ubuntu Jammy):
importance: High → Critical
Revision history for this message
Alessandro Astone (aleasto) wrote :

@azorin, could you please verify if this build solves your issue? https://launchpad.net/~aleasto/+archive/ubuntu/tests/+build/29088125

Revision history for this message
Artyom Zorin (azorin) wrote :

Yes, your new package appears to resolve the issue. Thanks for your work on this!

For reference, I tested it by installing a new copy of the recently-released Ubuntu 22.04.5 to my hybrid graphics laptop (ASUS Zenbook UX303LB with NVIDIA 940M) with the NVIDIA 550.107.02 drivers. After the initial stock installation, it failed to launch libadwaita apps, as I reported in bug 2080282.

However, after installing your new libnvidia-egl-wayland1 package (https://launchpad.net/~aleasto/+archive/ubuntu/tests/+build/29088125) and restarting my computer, it now launches libadwaita apps (like gnome-text-editor and gnome-sound-recorder) normally again and sets the laptop's integrated graphics (Mesa Intel HD Graphics) as the default GPU: https://i.imgur.com/51eZTZC.png

Revision history for this message
Alessandro Astone (aleasto) wrote :

Yea the libnvidia-egl-wayland1 "fix" upstream was to basically disable it for hybrid systems.

Revision history for this message
Alessandro Astone (aleasto) wrote :
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Next steps:

1. Add SRU documentation (https://canonical-sru-docs.readthedocs-hosted.com/en/latest/reference/bug-template/)

2. Subscribe ubuntu-sponsors

Changed in egl-wayland (Ubuntu Jammy):
importance: Critical → High
status: Triaged → In Progress
milestone: none → jammy-updates
description: updated
Revision history for this message
Robie Basak (racb) wrote :

Alessandro informs me that this was caused by some combination of kernel 6.8.0 and a nvidia-driver-550 update, so tagging regression-update, and I'll review the proposed fix next.

tags: added: regression-update
Revision history for this message
Robie Basak (racb) wrote :

SRU review:

1) debian/patches/egl-wayland-retrieve-DRM-device-name-before-acquiring-.patch is being modified for no good reason (at least, it isn't documented) and the dep3 headers are being dropped. Please explain or fix.

2) Not essential, but the expected package version string would be 1:1.1.9-1.1ubuntu0.1. What you have provided will work but is not our convention. If re-uploading to fix the above, please use that opportunity to fix, but it's not necessary to re-upload just for this.

The fix itself looks fine, but I have one query on testing:

> A possible regression would be that the driver starts reporting as incompatible on NVIDIA single-GPU systems too. One would notice by all wayland-native applications suddenly being very slow. Note that such systems would not default to Wayland in Jammy.

Please add to the Test Plan to verify this scenario. In general I'm concerned about regressing non-NVIDIA systems:

> * The scope of the change is limited to the NVIDIA proprietary drivers.

How can we be sure that this is the case? For example, does this package have limited scope? Is the code being changed protected from affecting non-NVIDIA systems somehow?

Revision history for this message
Alessandro Astone (aleasto) wrote (last edit ):

Both patches come from upstream. egl-wayland-retrieve-DRM-device-name-before-acquiring-.patch originally had to be manually rebased to apply cleanly, but with both patches in this order they both apply cleanly, so I just took both from upstream.
I realize I forgot to re-apply the Debian headers.

> > * The scope of the change is limited to the NVIDIA proprietary drivers.

> How can we be sure that this is the case?

This library is exclusively loaded by the userspace libraries of the proprietary nvidia driver, or by display servers like Mir and Mutter for supporting nvidia version 470.

Revision history for this message
Alessandro Astone (aleasto) wrote :

> > A possible regression would be that the driver starts reporting as incompatible on NVIDIA single-GPU systems too. One would notice by all wayland-native applications suddenly being very slow. Note that such systems would not default to Wayland in Jammy.

> Please add to the Test Plan to verify this scenario.

I didn't do so originally because NVIDIA on Wayland is neither a suggested nor a well-supported path in Jammy. Verifying this would mean running a non-default configuration. But the SRU template instructs you to think about all possible side effects, and this is the only one that came to mind.

description: updated
Revision history for this message
Alessandro Astone (aleasto) wrote :

New debdiff addressing all comments

Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Gunnar, or anyone else affected,

Accepted egl-wayland into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/egl-wayland/1:1.1.9-1.1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in egl-wayland (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Robie Basak (racb) wrote :

Unsubscribing ~ubuntu-sponsors as there is nothing left to sponsor.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

> Alessandro informs me that this was caused by some combination of kernel 6.8.0 and a
> nvidia-driver-550 update, so tagging regression-update, and I'll review the proposed fix next.

The regression-update is probably better tracked in bug 2080498.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Dropped the 'regression-update' tag because this bug really has existed forever. The only regression-update is bug 2080498.

tags: removed: regression-update