Migrate from fbdev drivers to simpledrm and DRM fbdev emulation layer

Bug #1965303 reported by Timo Aaltonen
This bug affects 3 people
Affects                                 Status        Importance  Assigned to    Milestone
gdm                                     Fix Released  Unknown
linux (Fedora)                          Fix Released  Undecided
linux (Ubuntu)                          Fix Released  Undecided   Timo Aaltonen
  Jammy                                 Won't Fix     Undecided   Unassigned
  Lunar                                 Won't Fix     Undecided   Unassigned
  Mantic                                Won't Fix     Undecided   Timo Aaltonen
  Noble                                 Fix Released  Undecided   Timo Aaltonen
nvidia-graphics-drivers-470 (Ubuntu)    Fix Released  Undecided   Unassigned
  Jammy                                 Fix Released  Undecided   Unassigned
  Lunar                                 Fix Released  Undecided   Unassigned
  Mantic                                Fix Released  Undecided   Unassigned
  Noble                                 Fix Released  Undecided   Unassigned

Bug Description

[ Impact ]

The fbdev subsystem has been deprecated for a long time. We should drop it in favour of using simpledrm with fbdev emulation layer.

This requires Kernel config changes:

FB_EFI=n
FB_VESA=n

fbcon will still require FB to be available, but will use the fbdev emulation layer
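
The config requirements above could be checked against a built kernel's config with something like the following minimal sketch. It assumes the options appear with the usual CONFIG_ prefix in a .config-style file (on Ubuntu, typically /boot/config-<release>); the helper names parse_kconfig and check_kconfig are made up for illustration and are not part of any kernel tooling.

```python
import re

# Expected states taken from the bug description. The CONFIG_ prefix is how
# these symbols are spelled in a kernel .config file.
EXPECTED = {
    "CONFIG_FB_EFI": "n",
    "CONFIG_FB_VESA": "n",
    "CONFIG_FB": "y",  # fbcon still needs the FB core to be available
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map each CONFIG_ symbol in a .config-style text to its value."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            values[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            # "# CONFIG_X is not set" is how a disabled symbol is recorded.
            values[m.group(1)] = "n"
    return values


def check_kconfig(text, expected=EXPECTED):
    """Return {symbol: (expected, actual)} for every mismatch."""
    values = parse_kconfig(text)
    # A symbol that is absent from the file is treated as disabled.
    return {
        sym: (want, values.get(sym, "n"))
        for sym, want in expected.items()
        if values.get(sym, "n") != want
    }
```

An empty result from check_kconfig() means the config text matches the expectations listed in EXPECTED; any other result names the symbols that differ.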

[ Test Plan ]

* Ensure that systems currently relying on fbdev (and therefore only allowing Xorg logins), such as some virtual machine types, boot with working Wayland support.

[ Where Problems could occur ]

* When this stack is enabled, it changes boot timing such that some drivers may take a longer time to boot and GDM may hang in a black screen (bug 2039757).
* Race conditions could be exposed in desktop environments.
* Software that expects to find a DRM device at /dev/dri/card0 may have a problem.
* Some older versions of the NVIDIA driver might not work properly.
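
To illustrate the /dev/dri/card0 point: if simpledrm registers first it may take minor 0, so the native GPU driver can end up on a higher minor. Software could enumerate the cards instead of hard-coding card0. The following is only a sketch under assumptions: it reads the driver symlink under /sys/class/drm, and it assumes that a simpledrm device shows up there bound to a platform driver named "simple-framebuffer". That name, and the helper names list_drm_cards and pick_card, are illustrative and should be verified on a real system.

```python
import os
import re

_CARD = re.compile(r"^card(\d+)$")  # skips connector entries such as card1-eDP-1


def list_drm_cards(sysfs_root="/sys/class/drm"):
    """Return {minor: driver_name_or_None} for each DRM card under sysfs_root."""
    cards = {}
    if not os.path.isdir(sysfs_root):
        return cards
    for entry in os.listdir(sysfs_root):
        m = _CARD.match(entry)
        if not m:
            continue
        link = os.path.join(sysfs_root, entry, "device", "driver")
        driver = None
        if os.path.islink(link):
            # The symlink target's last path component is the driver name.
            driver = os.path.basename(os.readlink(link))
        cards[int(m.group(1))] = driver
    return cards


def pick_card(cards, firmware_drivers=("simple-framebuffer",)):
    """Prefer the lowest-numbered card not bound to a firmware-framebuffer driver.

    The default firmware_drivers value is an assumption about how simpledrm
    appears in sysfs. Cards with an unknown driver (None) count as native.
    Falls back to the lowest-numbered card of any kind, or None if there are
    no cards at all.
    """
    native = sorted(n for n, drv in cards.items() if drv not in firmware_drivers)
    if native:
        return "/dev/dri/card%d" % native[0]
    if cards:
        return "/dev/dri/card%d" % min(cards)
    return None
```

Calling pick_card(list_drm_cards()) returns a device path or None. Real applications would more likely go through libdrm or udev than read sysfs directly; this only shows why the card number should not be assumed.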

[ Other Info ]

* Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=2022385

Revision history for this message
bcotton (bcotton-redhat-bugs) wrote :

This is a tracking bug for Change: Replace the fbdev drivers with simpledrm and the DRM fbdev emulation layer
For more details, see: https://fedoraproject.org/wiki/Changes/ReplaceFbdevDrivers

This change replaces the legacy Linux frame buffer device (fbdev) drivers that are still used in Fedora, with the latest simpledrm driver and the DRM fbdev emulation layer.

If you encounter a bug related to this Change, please do not comment here. Instead create a new bug and set it to block this bug.

Revision history for this message
fmartine (fmartine-redhat-bugs) wrote :

*** Bug 1986222 has been marked as a duplicate of this bug. ***

Revision history for this message
bcotton (bcotton-redhat-bugs) wrote :

This bug appears to have been reported against 'rawhide' during the Fedora Linux 36 development cycle.
Changing version to 36.

Revision history for this message
bcotton (bcotton-redhat-bugs) wrote :

We have reached the 'Change complete (100% complete)' deadline in the Fedora Linux 36 release schedule.

At this time, all Changes should be fully complete. Indicate this by setting this tracking bug to ON_QA.

If you need to defer this Change to a subsequent release, please needinfo me.

Timo Aaltonen (tjaalton)
Changed in linux (Ubuntu):
status: New → Confirmed
Timo Aaltonen (tjaalton)
Changed in linux (Ubuntu):
assignee: nobody → Timo Aaltonen (tjaalton)
Revision history for this message
bcotton (bcotton-redhat-bugs) wrote :

F36 was released today. If this Change did not land in the release, please notify bcotton as soon as possible.

Revision history for this message
Mario Limonciello (superm1) wrote :

I'm adding a task for gdm to this bug. The reason is that if simpledrm is used, we've found race conditions that occur where the transition from simpledrm to the proper KMS driver may race with gdm starting up the login screen.

When gdm loses the race you end up with a black screen. So please make sure that, for any kernel that adopts simpledrm, the matching GDM in every OS release that runs that kernel (think about HWE!) picks up this GDM fix as well.

https://gitlab.gnome.org/GNOME/gdm/-/commit/04e119c8332a564cfad05dfb9cb6ec547d5ba954

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

thanks for the heads-up, yeah that'd need to be fixed in jammy as well (for gdm) when a hwe kernel supports simpledrm

I'm wondering whether mantic 6.5 would be the first candidate

Changed in nvidia-graphics-drivers-470 (Ubuntu):
status: New → Fix Released
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

all current nvidia releases have supported simpledrm since last March, except 390 which has been EOL for some time now

Changed in linux (Ubuntu Jammy):
status: New → Won't Fix
Changed in nvidia-graphics-drivers-470 (Ubuntu Jammy):
status: New → Fix Released
Changed in gdm (Ubuntu Mantic):
status: New → Triaged
Changed in gdm (Ubuntu Jammy):
status: New → Triaged
affects: gdm (Ubuntu Jammy) → gdm3 (Ubuntu Jammy)
Timo Aaltonen (tjaalton)
description: updated
Revision history for this message
Mario Limonciello (superm1) wrote :

Mantic has 45~beta-1ubuntu1 which picks up the fix for this. Jammy is still open.

Changed in gdm3 (Ubuntu Mantic):
status: Triaged → Fix Released
description: updated
Changed in linux (Fedora):
importance: Unknown → Undecided
status: Unknown → Fix Released
Changed in gdm:
status: Unknown → Fix Released
Revision history for this message
Mario Limonciello (superm1) wrote :

I've uploaded the jammy backport of this patch into the queue, so I'm assigning it to myself. As I have readily seen this problem on my own test kernels, I'll validate it with a local kernel with the matching kconfig options when it's accepted. This can then be done in advance, without waiting for matching changes to land in OEM-6.5 or HWE-6.5.

Changed in gdm3 (Ubuntu Jammy):
assignee: nobody → Mario Limonciello (superm1)
status: Triaged → In Progress
description: updated
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

This seems like a significant enhancement that will affect the graphics backends chosen by Plymouth and some desktop sessions, particularly virtual machines. Surely this needs a feature freeze exception?

Revision history for this message
Mario Limonciello (superm1) wrote :

The kernel part yes. The bug fix in GDM to prepare for whenever the kernel part happens I would think not.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Yes I was referring to the linux task.

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Right, I don't think we'll get this in mantic yet, I'll send it for linux-unstable first (and then it'll get enabled by mainline builds).

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

can't target NN yet, so it'll stay open for mantic for now

Revision history for this message
Steve Langasek (vorlon) wrote :

This has been marked as fixed in mantic, but this bug does not say what the plan is for lunar, which is still the supported upgrade path from jammy.

Changed in gdm3 (Ubuntu Jammy):
status: In Progress → Incomplete
Revision history for this message
Mario Limonciello (superm1) wrote :

The issue won't occur in Lunar. It's only specifically a problem with gdm when a kernel has been configured this way.
It's not a problem in Jammy yet, but will be a problem when such a kernel gets backported as HWE.

Changed in nvidia-graphics-drivers-470 (Ubuntu Lunar):
status: New → Fix Released
Changed in linux (Ubuntu Lunar):
status: New → Won't Fix
Changed in gdm3 (Ubuntu Lunar):
status: New → Invalid
Changed in gdm3 (Ubuntu Jammy):
status: Incomplete → In Progress
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I don't quite follow why we're calling this an issue. It's an enhancement that should target 24.04 only.

Revision history for this message
Mario Limonciello (superm1) wrote :

The problem is specifically that when 24.04's kernel is backported to 22.04, it will expose the race condition to GDM. If that kernel is never backported, then it won't be a problem.

Revision history for this message
Robie Basak (racb) wrote :

> it will expose the race condition to GDM

What race condition, please? If this is the justification for the proposed SRU, then I'd expect to see that explained, but I don't see any explanation.

Revision history for this message
Robie Basak (racb) wrote :

Oh, this?

> When this stack is enabled, it changes boot timing such that some drivers may take a longer time to boot and GDM may hang in a black screen.

In that case, why can you not revert the kernel config change in the HWE backport to Jammy?

Revision history for this message
Robie Basak (racb) wrote :

Oh, wait. You are just fixing the race condition?

Please update the SRU documentation then. Right now I read "We should drop it in favour of using simpledrm with fbdev emulation layer" which is very misleading as to what you're actually proposing to do.

Revision history for this message
Mario Limonciello (superm1) wrote :

> Oh, wait. You are just fixing the race condition?

Yeah; I see the confusion. This issue got caught up in making sure the GDM race condition was fixed at the same time as this feature is enabled.

Perhaps it's better to split the GDM part of it off to its own bug instead then.

> In that case, why can you not revert the kernel config change in the HWE backport to Jammy?

That's certainly a possible solution, but what happens when people run mainline kernels for testing another issue and are exposed to this? That's how I discovered it, reported it upstream and got it fixed in GDM. It's a trivial fix.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

The GDM issue sounds like something we do want fixed, but not here. It should get its own bug.

This bug should involve no SRUs.

Revision history for this message
Mario Limonciello (superm1) wrote :

I've split up the GDM for Jammy change into https://bugs.launchpad.net/ubuntu/+source/gdm3/+bug/2039757

no longer affects: gdm3 (Ubuntu Jammy)
no longer affects: gdm3 (Ubuntu)
no longer affects: gdm3 (Ubuntu Lunar)
no longer affects: gdm3 (Ubuntu Mantic)
description: updated
description: updated
Revision history for this message
Mario Limonciello (superm1) wrote :

Is this going to happen for noble? It should just be kernel config changes at this point.

Changed in linux (Ubuntu Mantic):
status: Confirmed → Won't Fix
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

> This requires Kernel config changes:
>
> FB_EFI=n
> FB_VESA=n

I'm testing this and it's a step backwards. The screen just turns black for the first 6 seconds or so while Plymouth waits for i915drmfb to start instead (might be Plymouth's fault for being too picky?). FB_EFI was better because it at least put the BIOS logo on the screen.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

OK I've got simpledrm working now. It needed:

  FB_EFI=n
  FB_VESA=n
  CONFIG_SYSFB_SIMPLEFB=y
  CONFIG_DRM_SIMPLEDRM=y

So now I have working graphics very early:

[ 0.588237] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
...
[ 7.153876] [drm] Initialized i915 1.6.0 20230929 for 0000:00:02.0 on minor 1

But this doesn't seem to solve bug 1970069, just reintroduced it on simpledrm instead of efifb:

[ 1.959016] fbcon: Taking over console
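
For anyone comparing boots, the "[drm] Initialized" lines quoted above can be pulled out of a kernel log and the gap between simpledrm and the native driver computed. This is a rough sketch: the regular expression is written against the two log lines in this comment (with the date field treated as optional), and the function names are invented for illustration.

```python
import re

# Matches lines such as:
# [ 0.588237] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
_DRM_INIT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] Initialized (?P<driver>\S+) "
    r"(?P<version>\S+)(?: (?P<date>\d{8}))? for (?P<device>\S+) "
    r"on minor (?P<minor>\d+)"
)


def drm_init_events(log_text):
    """Return one dict per '[drm] Initialized' line, in log order."""
    events = []
    for line in log_text.splitlines():
        m = _DRM_INIT.search(line)
        if m:
            events.append({
                "time": float(m.group("ts")),
                "driver": m.group("driver"),
                "device": m.group("device"),
                "minor": int(m.group("minor")),
            })
    return events


def handover_gap(events, firmware_driver="simpledrm"):
    """Seconds from the firmware driver's init to the next other DRM driver.

    Returns None if either side is missing from the log.
    """
    start = next((e["time"] for e in events if e["driver"] == firmware_driver), None)
    if start is None:
        return None
    native = next(
        (e["time"] for e in events
         if e["driver"] != firmware_driver and e["time"] >= start),
        None,
    )
    if native is None:
        return None
    return native - start
```

Fed the two lines above, this gives a gap of roughly 6.57 seconds between simpledrm and i915.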

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Also to prevent Grub from blanking the screen you need to uncomment:

  GRUB_TERMINAL=console

Then you get a smooth transition from BIOS to kernel graphics (simpledrm). But bug 1970069 still pops up after that.

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Thanks for testing, so DRM_SIMPLEDRM needs to be built-in for it to work properly? I guess that's fine then..

Revision history for this message
Mario Limonciello (superm1) wrote :

> Also to prevent Grub from blanking the screen you need to uncomment:

Can you open another bug against GRUB to fix this default for Noble?

> Thanks for testing, so DRM_SIMPLEDRM needs to be built-in for it to work properly

Yeah it should need to be built in.

> But bug 1970069 still pops up after that.

Too bad; but it sounds like between simpledrm, the GRUB policy change and your fbcon patch it would be a great experience!

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

False alarm. The screen blanking that GRUB_TERMINAL=console "fixes" was actually caused by me using GRUB_TIMEOUT=3. Changing it back to the default GRUB_TIMEOUT=0 also prevents the blanking mentioned in comment #30.
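
When reproducing this, it may help to confirm which GRUB settings are actually active, since a commented-out line has no effect. A minimal sketch, assuming the settings live in a shell-style file such as /etc/default/grub; the function name is made up, and this is a simplification rather than a real shell parser (for example, it does not handle trailing comments or variable expansion):

```python
import re

_ASSIGN = re.compile(r"^\s*(GRUB_\w+)=(.*)$")


def parse_grub_defaults(text):
    """Return uncommented GRUB_* assignments from /etc/default/grub-style text."""
    settings = {}
    for line in text.splitlines():
        # Lines starting with '#' do not match, so commented settings are skipped.
        m = _ASSIGN.match(line)
        if m:
            # A later assignment overrides an earlier one, as in a shell file.
            settings[m.group(1)] = m.group(2).strip().strip("\"'")
    return settings
```

For example, parse_grub_defaults(text).get("GRUB_TIMEOUT") shows whether a non-default timeout such as 3 is in effect, and "GRUB_TERMINAL" in the result shows whether that line has been uncommented.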

tags: added: flickerfreeboot
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I'm told this is in 6.8, and see 6.8 is in proposed.

Changed in linux (Ubuntu Noble):
status: Confirmed → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 6.8.0-11.11

---------------
linux (6.8.0-11.11) noble; urgency=medium

  * noble/linux: 6.8.0-11.11 -proposed tracker (LP: #2053094)

  * Miscellaneous Ubuntu changes
    - [Packaging] riscv64: disable building unnecessary binary debs

 -- Paolo Pisati <email address hidden> Wed, 14 Feb 2024 00:04:31 +0100

Changed in linux (Ubuntu Noble):
status: Fix Committed → Fix Released
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Turns out we didn't even need to modify the kernel for this. By choosing simplefb and the 'tiny' DRM drivers for initrd, we get simpledrm.ko as it existed in our older kernels...

https://code.launchpad.net/~vanvugt/ubuntu/+source/initramfs-tools/+git/initramfs-tools/+merge/462482
