Primary GPU displays nothing if the secondary GPU is using nouveau in a Wayland session

Bug #2066931 reported by Steve Langasek
This bug affects 1 person
Affects            Status                        Importance  Assigned to      Milestone
Mutter             Fix Released                  Unknown
mutter (Ubuntu)    Status tracked in Oracular
  Noble            Fix Committed                 High        Daniel van Vugt
  Oracular         Fix Released                  High        Unassigned

Bug Description

[ Impact ]

Wayland sessions log in to a black screen if a secondary GPU exists that is using the nouveau driver and has a secondary monitor plugged into it.

[ Test Plan ]

0. Find a machine with hybrid graphics: Intel integrated, Nvidia discrete.
1. Plug a second monitor into one of the ports wired to the discrete GPU.
2. Stick to the default nouveau graphics driver for Nvidia. No proprietary drivers.
3. Reboot.

Verify the login screen appears and you can successfully log into a Wayland session.
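
The preconditions in steps 0 to 2 can be checked from sysfs before rebooting. The following is a minimal Python sketch, not part of the original test plan. It assumes the usual Linux DRM sysfs layout (a `device/driver` symlink under each `/sys/class/drm/cardN`, and a `status` file under each `cardN-CONNECTOR` entry); the function names `gpu_drivers` and `connected_outputs` are hypothetical.

```python
import os
import re


def gpu_drivers(drm_root="/sys/class/drm"):
    """Map each DRM card (e.g. 'card1') to its kernel driver name."""
    drivers = {}
    for entry in sorted(os.listdir(drm_root)):
        if not re.fullmatch(r"card\d+", entry):
            continue  # skip connectors such as card1-HDMI-A-1
        link = os.path.join(drm_root, entry, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends in the driver name, e.g. .../drivers/nouveau
            drivers[entry] = os.path.basename(os.readlink(link))
    return drivers


def connected_outputs(drm_root="/sys/class/drm"):
    """Map each card to the connectors whose status file reads 'connected'."""
    outputs = {}
    for entry in sorted(os.listdir(drm_root)):
        match = re.fullmatch(r"(card\d+)-(.+)", entry)
        if not match:
            continue
        try:
            with open(os.path.join(drm_root, entry, "status")) as f:
                status = f.read().strip()
        except OSError:
            continue  # entry without a readable status file
        if status == "connected":
            outputs.setdefault(match.group(1), []).append(match.group(2))
    return outputs
```

On a machine matching the test plan, `gpu_drivers()` would be expected to report one card bound to `i915` and another bound to `nouveau`, and `connected_outputs()` should list the external monitor under the nouveau card.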

[ Where problems could occur ]

Although the bug affects dual GPU multi-monitor systems, what is failing here is the primary monitor on the primary GPU. So the potential risk is that the primary display starts failing on more types of machines in Wayland sessions.

[ Original description ]

After upgrading from mantic to noble, my laptop booted to the firmware splash screen with the Ubuntu logo, plus a mouse cursor. The gdm login screen did not appear.

Disabling Wayland (as in the attached modified config file) fixed this, and the login screen now appears.
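
The attached config file is not reproduced here; a later comment in this report identifies the change as a 'WaylandEnable=false' line in /etc/gdm3/custom.conf. As a rough sketch, the Python below checks whether that override is active in a given config text. It assumes the key sits under the `[daemon]` section, which is where GDM's stock custom.conf places it; the function name `wayland_disabled` is hypothetical.

```python
import configparser


def wayland_disabled(conf_text):
    """Return True if the config text sets WaylandEnable=false under [daemon]."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case as written (GDM keys are CamelCase)
    parser.read_string(conf_text)
    # A missing section or key falls back to "true", i.e. no override present.
    value = parser.get("daemon", "WaylandEnable", fallback="true")
    return value.strip().lower() == "false"
```

For example, `wayland_disabled(open("/etc/gdm3/custom.conf").read())` would tell you whether the workaround is still in place after installing a fixed mutter. A commented-out `#WaylandEnable=false` line is treated as not set.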

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: gdm3 46.0-2ubuntu1
ProcVersionSignature: Ubuntu 6.8.0-31.31-generic 6.8.1
Uname: Linux 6.8.0-31-generic x86_64
NonfreeKernelModules: zfs
ApportVersion: 2.28.1-0ubuntu3
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: ubuntu:GNOME
Date: Thu May 23 08:06:45 2024
InstallationDate: Installed on 2019-12-23 (1613 days ago)
InstallationMedia: Ubuntu 19.10 "Eoan Ermine" - Release amd64 (20191017)
SourcePackage: gdm3
UpgradeStatus: Upgraded to noble on 2024-05-22 (1 days ago)
mtime.conffile..etc.gdm3.custom.conf: 2024-05-22T16:33:18.326823

Daniel van Vugt (vanvugt) wrote :

Can you provide a journal from one of the failed boot attempts?

Changed in gdm3 (Ubuntu):
status: New → Incomplete
Steve Langasek (vorlon) wrote :

Sorry, I missed your question before; apparently my Launchpad mail delivery is unreliable.

Are you looking for the kernel logs, the gdm service logs, both, or something else?

Steve Langasek (vorlon) wrote :

Attached the kernel logs (journalctl -b0 -k --until 'May 22 16:31:05', which is when my user login under X began), plus logs for the gdm service as well as the gdm session (which seems to be where most of the actual log activity happens).

Daniel van Vugt (vanvugt) wrote :

It's failing to display on the integrated Intel GPU because it couldn't negotiate a pixel format that's also compatible with the discrete GPU (nouveau):

May 22 16:25:39 homer gnome-shell[4809]: Failed to lock front buffer on /dev/dri/card1: drmModeAddFB2 failed (Invalid argument) and drmModeAddFB cannot be used as a fallback because format=0x30335241 (AR30).
May 22 16:25:39 homer gnome-shell[4809]: (../clutter/clutter/clutter-frame-clock.c:495):clutter_frame_clock_notify_ready: code should not be reached

That's upstream bug https://gitlab.gnome.org/GNOME/mutter/-/issues/3389, fixed in mutter 46.1.
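
To check a journal for the same failure, the log line above can be matched and its format code decoded. This is a minimal Python sketch, not code from mutter: the regular expression is written against the exact wording of the message quoted above, and the names `fourcc_to_str` and `find_addfb2_failures` are hypothetical. A DRM fourcc packs four ASCII characters into a little-endian 32-bit value, which is why 0x30335241 reads as "AR30" (DRM_FORMAT_ARGB2101010 in drm_fourcc.h, a 10-bit-per-channel format).

```python
import re
import struct

# Matches the gnome-shell message quoted in this comment.
ADDFB2_FAILURE = re.compile(
    r"Failed to lock front buffer on (?P<device>\S+): drmModeAddFB2 failed"
    r".*format=(?P<format>0x[0-9a-fA-F]+)"
)


def fourcc_to_str(code):
    """Decode a 32-bit DRM fourcc into its four ASCII characters."""
    return struct.pack("<I", code).decode("ascii")


def find_addfb2_failures(journal_text):
    """Return (device, fourcc string) pairs for each matching log line."""
    results = []
    for line in journal_text.splitlines():
        match = ADDFB2_FAILURE.search(line)
        if match:
            code = int(match.group("format"), 16)
            results.append((match.group("device"), fourcc_to_str(code)))
    return results
```

An empty result on a journal from a boot with the fixed package would be consistent with the fix, although it does not by itself prove the display came up.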

affects: gdm3 (Ubuntu) → mutter (Ubuntu)
Changed in mutter (Ubuntu):
status: Incomplete → Fix Released
summary: - gdm login screen in noble does not display on wayland
+ Primary GPU displays nothing if the secondary GPU is using nouveau in a
+ Wayland session
Changed in mutter (Ubuntu Oracular):
importance: Undecided → High
Changed in mutter (Ubuntu Noble):
importance: Undecided → High
assignee: nobody → Daniel van Vugt (vanvugt)
status: New → Triaged
Changed in mutter (Ubuntu Oracular):
milestone: none → ubuntu-24.10
tags: added: fixed-in-mutter-46.1 fixed-upstream
tags: added: hybrid multigpu nouveau wayland-session
Changed in mutter:
status: Unknown → Fix Released
Daniel van Vugt (vanvugt) wrote :

AIUI this will only happen if there's an external monitor plugged into the discrete GPU using nouveau. That's common, but hopefully less common than hybrid laptops without an external monitor connected.

So as a workaround you should be able to:

1. Unplug the monitor.
2. Log in using the laptop display.
3. Use the 'Additional Drivers' app to install one of the proprietary Nvidia drivers.

or just use Xorg.

tags: added: multimonitor
Changed in mutter (Ubuntu Noble):
milestone: none → ubuntu-24.04.1
description: updated
Changed in mutter (Ubuntu Noble):
status: Triaged → In Progress
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Steve, or anyone else affected,

Accepted mutter into noble-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mutter/46.2-1ubuntu0.24.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-noble to verification-done-noble. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-noble. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in mutter (Ubuntu Noble):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-noble
Daniel van Vugt (vanvugt) wrote :

I don't know whether to call this a failure of the same bug or a new bug. I am seeing desktop freezes when a monitor is plugged into the secondary GPU running nouveau. But I can't see the original error message in the log, so it seems like a new bug to me. Also, the original bug here was a failure to paint the screen, which is clearly no longer happening.

Steve, do you see similar?

Daniel van Vugt (vanvugt) wrote :

False alarm?

My freezes with nouveau are just the typical nouveau repeatedly crashing in the kernel:

kernel: nouveau 0000:65:00.0: timer: stalled at ffffffffffffffff
kernel: ------------[ cut here ]------------
kernel: nouveau 0000:65:00.0: timeout
kernel: WARNING: CPU: 7 PID: 3132 at drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.c:45 tu102_vmm_flush+0x176/0x180 [nouveau]

While I get the same crash with both old and new Nvidia cards, other kernel log messages are suggesting it's an AMD-specific issue. So I'll change to an Intel system and retest the same cards...

Daniel van Vugt (vanvugt) wrote :

Using an Intel machine is no better. At least it became clear I get the same freezes even without the SRU installed. So I think my issues are caused by trying to use an eGPU over Thunderbolt. We should verify this fix on a more self-contained system with fewer variables...

Daniel van Vugt (vanvugt) wrote :

I even set up Noble on my desktop to remove all the eGPU/Thunderbolt problems. And there, this bug doesn't occur at all (even without the fix). So I guess the issue outlined in comment #8 is somewhat hardware specific.

Daniel van Vugt (vanvugt) wrote :

Steve says he will get to verifying this soon.

Steve Langasek (vorlon) wrote :

Have confirmed that with the mutter from noble-proposed, I'm able to drop the 'WaylandEnable=false' line from /etc/gdm3/custom.conf and do get a login screen (shown on the internal display only).

tags: added: verification-done-noble
removed: verification-needed verification-needed-noble