System hangs on purple screen

Bug #1949026 reported by jeremyszu
This bug affects 2 people
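Affects | Importance | Assigned to
OEM Priority Project | Critical | jeremyszu
nvidia-graphics-drivers-470 (Ubuntu) | Undecided | Unassigned
nvidia-graphics-drivers-470 (Ubuntu Bionic) | Undecided | Unassigned
nvidia-graphics-drivers-470 (Ubuntu Focal) | Undecided | Unassigned
nvidia-graphics-drivers-470 (Ubuntu Hirsute) | Undecided | Unassigned
nvidia-graphics-drivers-470 (Ubuntu Impish) | Undecided | Unassigned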

Bug Description

[Impact]

 * Due to a race condition, GDM can start before the gdm udev rule has been applied.
 * This happens because the NVIDIA driver takes a long time to runtime resume. The workaround is to postpone enabling RTD3 until the driver is bound.

[Test Plan]

 1. On an affected machine, warm or cold boot the system.
 2.1 Without the patch, the system hangs while starting the GDM login screen.
 2.2 With the patch applied, the system reaches the login screen.
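
When running the test plan, it can help to confirm whether the gdm udev rule actually took effect before GDM started. The following is a minimal sketch in Python, not part of the patch. It assumes that gdm-disable-wayland records its result as `WaylandEnable=false` in the `[daemon]` section of a GDM runtime configuration file; the default path `/run/gdm3/custom.conf` is an assumption about the local setup and may differ on your system.

```python
import configparser


def wayland_disabled(conf_path: str = "/run/gdm3/custom.conf") -> bool:
    """Return True if the given GDM config file sets WaylandEnable=false.

    The default path is an assumption for illustration; pass the file
    that your GDM build actually reads.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    # read() silently skips files that do not exist.
    parser.read(conf_path)
    # Option names are matched case-insensitively by configparser.
    value = parser.get("daemon", "WaylandEnable", fallback="true")
    return value.strip().lower() == "false"


if __name__ == "__main__":
    print("Wayland disabled for GDM:", wayland_disabled())
```

If this reports that Wayland is not disabled on an NVIDIA machine after boot, the rule most likely ran too late or not at all.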

[Where problems could occur]

 * Postponing RTD3 enablement until the driver is bound should not affect functionality, since power consumption is usually not a concern during the boot stage.

---

We are facing an issue where GDM starts with Wayland instead of Xorg,
despite the gdm-disable-wayland action in 61-gdm.rules for NVIDIA graphics.

The issue happens because gdm-disable-wayland is executed after GDM has
started. The udev rule takes so long because the
runtime-suspended NVIDIA GFX takes more than 1 second to runtime resume,
hence the driver starts its probing routine rather late.

The proper solution is to impose a barrier like
systemd-udev-settle.service before GDM, but limited to the GFX device
only, to avoid waiting for all udev rules to finish.
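
To illustrate what a device-scoped barrier would mean, here is a rough sketch in Python. It is not an existing mechanism and not what the patch does. It polls a single PCI device directory in sysfs until the `driver` link appears (which the kernel creates once a driver is bound) or a timeout expires. The function name, the timeout, and the polling interval are all hypothetical.

```python
import time
from pathlib import Path


def wait_for_driver_bound(device_dir: str,
                          timeout: float = 5.0,
                          interval: float = 0.05) -> bool:
    """Block until <device_dir>/driver exists, or until the timeout expires.

    device_dir is a sysfs PCI device directory, for example
    /sys/bus/pci/devices/0000:01:00.0 (example address only).
    Returns True if a driver was bound in time, False otherwise.
    """
    driver_link = Path(device_dir) / "driver"
    deadline = time.monotonic() + timeout
    while True:
        if driver_link.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A display manager start-up step could call something like this for the GFX device only, instead of waiting for every udev rule to settle.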

Since such a mechanism isn't available right now, work around the issue by
enabling runtime PM after the driver is bound to avoid the runtime resume
delay, and hope GDM always starts after the probing is done.
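
To see whether runtime PM is enabled for the NVIDIA device and whether a driver is bound, the kernel's sysfs attributes can be inspected. The following is a minimal sketch in Python, assuming the standard `power/control` and `power/runtime_status` attributes under `/sys/bus/pci/devices`; the function name and the shape of the returned dictionary are illustrative only.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor ID assigned to NVIDIA


def _read(path: Path) -> str:
    """Return the stripped contents of a sysfs attribute, or "" if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return ""


def nvidia_runtime_pm(sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Map each NVIDIA PCI address to its runtime PM state."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        if _read(dev / "vendor") != NVIDIA_VENDOR_ID:
            continue
        result[dev.name] = {
            # "auto" means runtime PM is enabled, "on" means it is disabled.
            "control": _read(dev / "power" / "control"),
            # For example "active" or "suspended".
            "runtime_status": _read(dev / "power" / "runtime_status"),
            # The "driver" link exists only while a driver is bound.
            "driver_bound": (dev / "driver").exists(),
        }
    return result


if __name__ == "__main__":
    for addr, state in nvidia_runtime_pm().items():
        print(addr, state)
```

With the workaround in place, one would expect `control` to switch to `auto` only once `driver_bound` is true, rather than as soon as the device is added.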

Please backport the patch below to the supported NVIDIA drivers:
https://github.com/tseliot/nvidia-graphics-drivers/pull/41/commits/7e9e4d4a827dc9da0e27058871034245ae4b7508

jeremyszu (os369510)
tags: added: oem-priority originate-from-1942281 stella
tags: added: originate-from-1943787
Changed in oem-priority:
assignee: nobody → jeremyszu (os369510)
importance: Undecided → Critical
status: New → Triaged
jeremyszu (os369510)
tags: removed: originate-from-1943787
jeremyszu (os369510)
description: updated
Andy Whitcroft (apw) wrote : Please test proposed package

Hello jeremyszu, or anyone else affected,

Accepted nvidia-graphics-drivers-470 into impish-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/470.82.00-0ubuntu0.21.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-impish to verification-done-impish. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-impish. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in nvidia-graphics-drivers-470 (Ubuntu Impish):
status: New → Fix Committed
tags: added: verification-needed verification-needed-impish
Changed in nvidia-graphics-drivers-470 (Ubuntu Hirsute):
status: New → Fix Committed
tags: added: verification-needed-hirsute
Andy Whitcroft (apw) wrote :

Hello jeremyszu, or anyone else affected,

Accepted nvidia-graphics-drivers-470 into hirsute-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/470.82.00-0ubuntu0.21.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-hirsute to verification-done-hirsute. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-hirsute. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in nvidia-graphics-drivers-470 (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed-focal
Andy Whitcroft (apw) wrote :

Hello jeremyszu, or anyone else affected,

Accepted nvidia-graphics-drivers-470 into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/470.82.00-0ubuntu0.20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Andy Whitcroft (apw) wrote :

Hello jeremyszu, or anyone else affected,

Accepted nvidia-graphics-drivers-470 into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-470/470.82.00-0ubuntu0.18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in nvidia-graphics-drivers-470 (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed-bionic
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (nvidia-graphics-drivers-470/470.82.00-0ubuntu0.21.04.1)

All autopkgtests for the newly accepted nvidia-graphics-drivers-470 (470.82.00-0ubuntu0.21.04.1) for hirsute have finished running.
The following regressions have been reported in tests triggered by the package:

pyopencl/2021.1.2-1build1 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/hirsute/update_excuses.html#nvidia-graphics-drivers-470

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

jeremyszu (os369510)
tags: added: originate-from-1943787
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.82.00-0ubuntu0.21.10.1

---------------
nvidia-graphics-drivers-470 (470.82.00-0ubuntu0.21.10.1) impish; urgency=medium

  * New upstream release (LP: #1948025):
    - Fixed a bug that can cause a kernel crash in SLI Mosaic
      configurations.
    - Added support for the EGL_NV_robustness_video_memory_purge.

  [ Kai-Heng Feng ]
  * debian/71-nvidia.rules:
    - Enable runtime PM after driver is bound (LP: #1949026).

 -- Alberto Milone <email address hidden> Tue, 26 Oct 2021 12:39:25 +0200

Changed in nvidia-graphics-drivers-470 (Ubuntu Impish):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.82.00-0ubuntu0.21.04.1

---------------
nvidia-graphics-drivers-470 (470.82.00-0ubuntu0.21.04.1) hirsute; urgency=medium

  * New upstream release (LP: #1948025):
    - Fixed a bug that can cause a kernel crash in SLI Mosaic
      configurations.
    - Added support for the EGL_NV_robustness_video_memory_purge.

  [ Kai-Heng Feng ]
  * debian/71-nvidia.rules:
    - Enable runtime PM after driver is bound (LP: #1949026).

 -- Alberto Milone <email address hidden> Tue, 26 Oct 2021 15:42:15 +0200

Changed in nvidia-graphics-drivers-470 (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.82.00-0ubuntu0.18.04.1

---------------
nvidia-graphics-drivers-470 (470.82.00-0ubuntu0.18.04.1) bionic; urgency=medium

  * New upstream release (LP: #1948025):
    - Fixed a bug that can cause a kernel crash in SLI Mosaic
      configurations.
    - Added support for the EGL_NV_robustness_video_memory_purge.

  [ Kai-Heng Feng ]
  * debian/71-nvidia.rules:
    - Enable runtime PM after driver is bound (LP: #1949026).

 -- Alberto Milone <email address hidden> Tue, 26 Oct 2021 15:44:51 +0200

Changed in nvidia-graphics-drivers-470 (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.82.00-0ubuntu0.20.04.1

---------------
nvidia-graphics-drivers-470 (470.82.00-0ubuntu0.20.04.1) focal; urgency=medium

  * New upstream release (LP: #1948025):
    - Fixed a bug that can cause a kernel crash in SLI Mosaic
      configurations.
    - Added support for the EGL_NV_robustness_video_memory_purge.

  [ Kai-Heng Feng ]
  * debian/71-nvidia.rules:
    - Enable runtime PM after driver is bound (LP: #1949026).

 -- Alberto Milone <email address hidden> Tue, 26 Oct 2021 15:44:04 +0200

Changed in nvidia-graphics-drivers-470 (Ubuntu Focal):
status: Fix Committed → Fix Released
jeremyszu (os369510)
Changed in oem-priority:
status: Triaged → Fix Released
Tremo Lovier (tremolo4) wrote :

Thanks a lot for this fix! At least I think it was this fix?

I had this issue on my Optimus laptop (Acer Predator G3-572). I think it only happened with kernel modesetting turned on. And it only happened with GDM, not LightDM.

What's interesting to me is that my GPU (1060 mobile) does not even have RTD3 capability.

I had spent a lot of time googling potential fixes until I gave up and switched to LightDM. After reading the changelog for this update and installing it, I tried out GDM again and it always works now.

Thanks again!

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-470 (Ubuntu):
status: New → Confirmed
Tremo Lovier (tremolo4) wrote :

Out of curiosity, I tried reverting this fix in /lib/udev/rules.d/71-nvidia.rules -- and it still works, no freeze on the login screen.

So I guess my problem was fixed by something else after all, maybe the driver update itself. Or I was lucky to avoid the race condition... who knows.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-470 - 470.86-0ubuntu1

---------------
nvidia-graphics-drivers-470 (470.86-0ubuntu1) jammy; urgency=medium

  * New upstream release (LP: #1950665):
    - Fixed a regression which prevented DisplayPort and HDMI 2.1
      variable refresh rate (VRR) G-SYNC Compatible monitors from
      functioning correctly in variable refresh rate mode, resulting
      in issues such as flickering.

 -- Alberto Milone <email address hidden> Thu, 11 Nov 2021 16:47:27 +0100

Changed in nvidia-graphics-drivers-470 (Ubuntu):
status: Confirmed → Fix Released