[Dell XPS 15 9575] Flickering white/black screen once KMS comes up

Bug #1958620 reported by Chris Halse Rogers
This bug affects 5 people
Affects         Status        Importance  Assigned to  Milestone
Linux           New           Unknown
linux (Ubuntu)  Fix Released  High        Unassigned

Bug Description

This looks like a regression from the 5.13.0-20-generic kernel to the 5.15.0-17-generic kernel. The display works fine under efifb, but as soon as the DRM drivers are loaded and KMS kicks in, the display goes black with white flickering (apparently whenever the content that should be on screen changes).

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-17-generic 5.15.0-17.17
ProcVersionSignature: Ubuntu 5.13.0-20.20-generic 5.13.14
Uname: Linux 5.13.0-20-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu75
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: chris 7134 F.... pipewire-media-
 /dev/snd/seq: chris 7133 F.... pipewire
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Fri Jan 21 17:07:08 2022
InstallationDate: Installed on 2021-06-26 (208 days ago)
InstallationMedia: Ubuntu 21.10.0 2021.05.28 amd64 "bcachefs" (20210622)
MachineType: Dell Inc. XPS 15 9575
ProcFB: 0 i915drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.13.0-20-generic root=UUID=ff90803a-eedd-429b-bb78-b713f7c661d6 ro quiet splash
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.13.0-20-generic N/A
 linux-backports-modules-5.13.0-20-generic N/A
 linux-firmware 1.204
SourcePackage: linux
UpgradeStatus: Upgraded to jammy on 2021-10-28 (84 days ago)
dmi.bios.date: 07/07/2019
dmi.bios.release: 1.7
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.7.1
dmi.board.name: 0C32VW
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 31
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.7.1:bd07/07/2019:br1.7:svnDellInc.:pnXPS159575:pvr:sku080D:rvnDellInc.:rn0C32VW:rvrA00:cvnDellInc.:ct31:cvr:
dmi.product.family: XPS
dmi.product.name: XPS 15 9575
dmi.product.sku: 080D
dmi.sys.vendor: Dell Inc.

Chris Halse Rogers (raof) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: impish
Daniel van Vugt (vanvugt) wrote : Re: Flickering white/black screen once KMS comes up

Interesting dual GPU setup. In theory GNOME should be using the primary GPU (Intel) only.

One exception to that would be if you have a monitor plugged into one of the AMD GPU ports, in which case scanout would be via AMD but the rendering still on Intel.

Can we be sure amdgpu is not involved at all here?

Chris Halse Rogers (raof) wrote :

There is no scanout hardware connected to the AMD GPU, so if there is any involvement by amdgpu it is purely by making i915 misbehave.

This is also not a GNOME bug - the display corrupts as soon as the KMS driver is loaded, which, since I've got an encrypted rootfs, is early in the initramfs. The same behaviour is observed regardless of the kernel commandline options, too, so it's not plymouth doing something weird, or vthandoff. It also happens with break=mount.

Chris Halse Rogers (raof) wrote :

Further investigation:
1) 5.13.19 does not suffer from this bug
2) The Ubuntu 5.15 kernels, and mainline 5.16 and 5.17 kernels *do* suffer from this bug.

Also of note: this only occurs the *first* time KMS tries to come up on the panel, and *only* if there is not a second monitor plugged in. Having a second monitor plugged in at boot results in a successful boot, *plugging in* a second monitor after triggering this bug results in *both* displays working fine, and *removing* a second monitor leaves the laptop panel working.

It looks like maybe a bug has been introduced in framebuffer allocation?

Daniel van Vugt (vanvugt) wrote :

Ah! I was looking for this bug yesterday when duplicate bug 1965968 was reported.

summary: - Flickering white/black screen once KMS comes up
+ [Dell XPS 15 9575] Flickering white/black screen once KMS comes up
Daniel van Vugt (vanvugt) wrote :

The other bug mentions that the flickering seems unrelated to the blackness.

Fixing the flickering needed i915.enable_psr=0

Fixing the blackness needed kernel <= 5.13.19 (or generally < 5.14)

tags: added: i915 regression regression-release
Changed in linux (Ubuntu):
importance: Undecided → High
Daniel van Vugt (vanvugt) wrote :

I feel like the OEM team would have encountered and fixed this already. Can you try an OEM kernel?

Daniel van Vugt (vanvugt) wrote :

Turns out that Ubuntu never certified the XPS 9575, so no, we don't have a kernel fix prepared and hiding anywhere.

Joe Barnett (thejoe) wrote :

Was the Precision 5530 2-in-1 version of this machine ever certified? I don't know if it would have the exact same issue, but it's the same chipset.

In the meantime I'm attempting to bisect the mainline kernel to find the commit that broke this.

Daniel van Vugt (vanvugt) wrote :

Some time ago, yes: https://ubuntu.com/certified/201807-26342
but that system had a different secondary GPU type.

Joe Barnett (thejoe) wrote :

after bisection:

6d7a793aabf31d7ba2b16fc13a94ccf0b90e4be0 is the first bad commit
commit 6d7a793aabf31d7ba2b16fc13a94ccf0b90e4be0
Author: José Roberto de Souza <email address hidden>
Date: Fri May 14 16:22:45 2021 -0700

    drm/i915/display: Allow fastsets when DP_SDP_VSC infoframe do not match with PSR enabled

    When PSR is enabled it handles DP_SDP_VSC, changing revision and all
    the other fields as necessary.
    It can also enabled and disable this SDP as needed without a full
    modeset.

    So here masking DP_SDP_VSC bit when previous and future state PSR
    enabled, it will still be checked when comparing the asked state
    to what was programmed to hardware.

    Cc: Gwan-gyeong Mun <email address hidden>
    Cc: Radhakrishna Sripada <email address hidden>
    Reported-by: Ville Syrjälä <email address hidden>
    Fixes: 78b772e1a01f ("drm/i915/display: Fill PSR state during hardware configuration read out")
    Signed-off-by: José Roberto de Souza <email address hidden>
    Reviewed-by: Gwan-gyeong Mun <email address hidden>
    Link: https://patchwork<email address hidden>

 drivers/gpu/drm/i915/display/intel_display.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Daniel van Vugt (vanvugt) wrote :

Nice work. That suggests the problem started in v5.13-rc6 though, which is a whole major kernel version earlier than we thought.

Chris, can you confirm?

Daniel van Vugt (vanvugt) wrote :

Or maybe it is indeed two bugs and 6d7a793aabf3 was just the first commit to exhibit either.

Joe Barnett (thejoe) wrote :

Yeah I was confused that the built kernel versions were in the v5.13-rc range, but according to `git tag --contains 6d7a793aabf31d7ba2b16fc13a94ccf0b90e4be0` (and the list of tags in https://github.com/torvalds/linux/commit/6d7a793aabf31d7ba2b16fc13a94ccf0b90e4be0) v5.14-rc1 is the first tag that has this commit in it?
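To illustrate why a commit's date says little about which release first contains it, here is a minimal sketch on a toy commit graph. The commit and tag names are hypothetical, and `tags_containing` is only a rough analogue of what `git tag --contains` reports, not git's implementation.

```python
# Toy commit graph: each commit maps to its parents (all names hypothetical).
PARENTS = {
    "base": [],
    "feature": ["base"],              # authored early, on a side branch
    "rc6": ["base"],                  # tagged later on the main line
    "release": ["rc6"],               # tagged release; "feature" not merged yet
    "merge": ["release", "feature"],  # side branch merged after the release
    "next_rc1": ["merge"],
}
TAGS = {"v-rc6": "rc6", "v-release": "release", "v-next-rc1": "next_rc1"}


def is_ancestor(commit, tip):
    """Return True if `commit` is reachable from `tip` by following parents."""
    stack, seen = [tip], set()
    while stack:
        node = stack.pop()
        if node == commit:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(PARENTS[node])
    return False


def tags_containing(commit):
    """Rough analogue of `git tag --contains <commit>` on the toy graph."""
    return sorted(tag for tag, tip in TAGS.items() if is_ancestor(commit, tip))


print(tags_containing("feature"))  # ['v-next-rc1']
```

Even though "feature" was written before "rc6" and "release" were tagged, neither tag can reach it; only the tag made after the merge does. That mirrors the situation described above, where a commit dated May 2021 first appears in v5.14-rc1.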

Daniel van Vugt (vanvugt) wrote :

Well I just learned something new, thanks. I usually do chronological searches from the offending commit but clearly that's error-prone in a non-linear graph.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Daniel van Vugt (vanvugt) wrote :

Next please verify the bug is present in drm-tip:

  https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/

and if so then report it upstream:

  https://gitlab.freedesktop.org/drm/intel/-/issues

Chris Halse Rogers (raof) wrote :

Confirmed, 6d7a793aabf31d7ba2b16fc13a94cc is bad, its predecessor is good.

Joe Barnett (thejoe) wrote :

appears to be fixed in drm-tip 2022-03-29 packages

Chris Halse Rogers (raof) wrote :

I suspect this may have been fixed by 9ce5884e5139037445d0efcf37aeba21008011ad; I'm building an Ubuntu kernel with that cherry-picked to test.

Hm. Although that seems to have been included in 5.16, and I *thought* I'd verified that 5.16 exhibited the bug?

Chris Halse Rogers (raof) wrote :

Nope. It looked like a good candidate, but isn't the fix.

5.17.1 fails, drm-tip succeeds, so the fix is somewhere between v5.17 and drm-tip. I'm (slowly) bisecting to find the commit that fixes it.
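Bisecting for a fix is the usual search with the roles swapped: the older end is broken and the newer end is fixed. A minimal sketch of that idea on a linear toy history follows; the commit names `c1` to `c4` and the `FIX` marker are made up for illustration, and real kernel history is not linear. `git bisect start` accepts `--term-old` and `--term-new` to rename the states, which can help avoid mixing up good and bad in this direction.

```python
def first_fixed(commits, is_fixed):
    """Binary search for the first commit where is_fixed() returns True.

    Assumes commits are ordered oldest to newest, the first one is broken,
    the last one is fixed, and the fix is never undone in between.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is broken, commits[hi] is fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid   # fix is at mid or earlier
        else:
            lo = mid   # fix is after mid
    return commits[hi]


# Hypothetical linear history between a broken and a fixed endpoint.
history = ["v5.17", "c1", "c2", "c3", "c4", "drm-tip"]
FIX = "c3"  # stand-in for "boot the kernel and look at the panel"
print(first_fixed(history, lambda c: history.index(c) >= history.index(FIX)))  # c3
```

Each step halves the remaining range, so even a few thousand commits need only about a dozen test boots.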

Joe Barnett (thejoe) wrote :

If I didn't get my good/bad confused on the reverse bisection, this appears to be the commit that fixes it:

5ac860cc52540df8bca27e0bb25b6744df67e8f0 is the first bad commit
commit 5ac860cc52540df8bca27e0bb25b6744df67e8f0
Author: Ville Syrjälä <email address hidden>
Date: Thu Mar 3 21:12:06 2022 +0200

    drm/i915: Fix DBUF bandwidth vs. cdclk handling

    Make the dbuf bandwidth min cdclk calculations match the spec
    more closely. Supposedly the arbiter can only guarantee an equal
    share of the total bandwidth of the slice to each active plane
    on that slice. So we take the max bandwidth of any of the planes
    on each slice and multiply that by the number of active planes
    on the slice to get a worst case estimate on how much bandwidth
    we require.

    Signed-off-by: Ville Syrjälä <email address hidden>
    Link: https://patchwork.<email address hidden>
    Reviewed-by: Stanislav Lisovskiy <email address hidden>

 drivers/gpu/drm/i915/display/intel_bw.c | 157 ++++++++++++++++++++---------
 drivers/gpu/drm/i915/display/intel_bw.h | 10 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c | 67 +++++-------
 drivers/gpu/drm/i915/display/intel_cdclk.h | 2 +
 4 files changed, 147 insertions(+), 89 deletions(-)
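The worst-case estimate described in that commit message can be sketched as a small calculation. This is not the driver's code: the function name, the slice labels, and the bandwidth numbers are assumptions for illustration, and the conversion from bandwidth to a minimum cdclk is left out.

```python
def worst_case_slice_bandwidth(planes_by_slice):
    """Per slice: the largest plane bandwidth times the number of active planes.

    `planes_by_slice` maps a slice label to the bandwidths of the planes
    active on it (arbitrary units). A slice with no planes needs nothing.
    """
    return {
        slice_id: (max(bws) * len(bws) if bws else 0)
        for slice_id, bws in planes_by_slice.items()
    }


# Hypothetical example: two planes share slice S1, one plane uses slice S2.
print(worst_case_slice_bandwidth({"S1": [400, 100], "S2": [250]}))
# {'S1': 800, 'S2': 250}
```

For S1 the plain sum would be 500, but if the arbiter only guarantees each plane an equal share, the 400-unit plane needs the slice to deliver 2 x 400 = 800 in total to be sure of getting its share.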

Chris Halse Rogers (raof) wrote :

Thanks for that bisect.

Annoyingly, that commit doesn't cherry-pick cleanly onto 5.15, but I'll see if I can munge it up to test.

Chris Halse Rogers (raof) wrote :

Hm. That commit has too many dependencies to cherry-pick (or even cherry-pick a small series) without significant conflicts. It might be still possible to *understand* the patch and write an equivalent for 5.15.

Joe Barnett (thejoe) wrote :

found some other reports that claim i915.fastboot=0 is a workaround, and can confirm it does work around the issue on jammy's current 5.15 kernel.

https://www.reddit.com/r/pop_os/comments/scqr4n/dell_xps_15_9575_screen_flickering_upon_boot/
https://bugs.archlinux.org/task/72134
https://gitlab.freedesktop.org/drm/intel/-/issues/4952
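For anyone wanting to confirm that the workaround is actually in effect on a booted system, a minimal sketch of checking a kernel command line for a parameter follows. The helper name `kernel_param` is made up, and the parsing is deliberately simple: it splits on whitespace and does not handle quoted values containing spaces.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line, or None if absent.

    A bare flag without "=value" yields an empty string. If the parameter
    appears more than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        key, _sep, val = token.partition("=")
        if key == name:
            value = val
    return value


# Command line from this report, with the workaround appended.
cmdline = (
    "BOOT_IMAGE=/vmlinuz-5.13.0-20-generic "
    "root=UUID=ff90803a-eedd-429b-bb78-b713f7c661d6 ro quiet splash "
    "i915.fastboot=0"
)
print(kernel_param(cmdline, "i915.fastboot"))  # 0
```

On a running system the same helper could be fed the contents of /proc/cmdline, for example `kernel_param(open("/proc/cmdline").read(), "i915.fastboot")`.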

Joe Barnett (thejoe) wrote :

also fwiw, confirmed that starting from tag Ubuntu-5.15.0-25.25 (commit f4a9abe17854fc753c84a0ba4ac275e715a008f3), reverting commit 6d7a793aabf31d7ba2b16fc13a94ccf0b90e4be0 applies cleanly and fixes the bug.

Joe Barnett (thejoe) wrote :

any plans to address this in jammy? haven't had any movement on the upstream bug so I assume reverting that one commit is probably the best option?

tags: added: rls-jj-incoming
Changed in linux:
status: Unknown → New
Daniel van Vugt (vanvugt) wrote :

Is kernel 5.15.0-40-generic still showing the bug?

Joe Barnett (thejoe) wrote :

I won't have access to that machine for a couple of weeks but will test then.

Kevin Lopez (kevin-lopez-91) wrote :

@Daniel van Vugt: I have the same issue as described in this bug. I upgraded my machine to said kernel on a fresh install, no dice. I was able to work around it by adding the flag posted by thejoe earlier in this thread (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1958620/comments/25)

I had also reported this bug (or a bug that is very similar/related) here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1979561

Joe Barnett (thejoe) wrote :

This bug is not fixed by 5.15.0-40-generic, and still occurs with that kernel

Joe Barnett (thejoe) wrote :

fwiw, this is fixed in kinetic with its 5.19 kernel

Changed in linux (Ubuntu):
status: Triaged → Fix Released