systemd-logind crashes on suspend when nvidia-suspend.service is masked, bringing the session down with it

Bug #1933880 reported by Ratchanan Srirattanamet
This bug affects 11 people
Affects                               Status     Importance  Assigned to  Milestone
nvidia-graphics-drivers-465 (Ubuntu)  Confirmed  Undecided   Unassigned
systemd (Ubuntu)                      Invalid    Undecided   Unassigned

Bug Description

Steps to reproduce:
1. Run `sudo apt install nvidia-driver-465`. Then, without restarting, run `sudo apt autoremove nvidia-driver-465`.
2. Leaving the terminal open, suspend the machine.

Expected behavior: the machine suspends successfully and can be woken up successfully.
Actual behavior: the machine doesn't suspend. It either:
- hangs and responds to nothing other than SysRq+REISUB (or maybe the network, but I didn't test), or
- returns you to the login screen. Upon logging in, you'll notice that the terminal you opened is gone.

Upon further inspection (on a session that didn't hang), I found that the X server died with:

Fatal server error:
[ 66.422] (EE) systemd-logind disappeared (stopped/restarted?)

Checking the journal for systemd-logind, its last message was:

Error during inhibitor-delayed operation (already returned success to client): Unit nvidia-suspend.service is masked.

before a new systemd-logind process took its place.
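
To check whether a machine is hitting the same failure, the logind messages of the current boot can be filtered for the "is masked" error quoted above. This is only a sketch: the function name `masked_unit_errors` is made up for illustration, and the pattern assumes the message wording shown in this report.

```shell
# Print only the journal lines in which a unit is reported as masked.
# Reads journal text on stdin, so it also works on a saved log file.
masked_unit_errors() {
    grep -E 'Unit [^ ]+\.service is masked'
}

# Usage on the affected machine (current boot, logind messages only):
#   journalctl -b -u systemd-logind | masked_unit_errors
```

If this prints nothing after a failed suspend, the cause is probably a different one than the masked unit described here.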

The system is Ubuntu 20.04, X.org session.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: systemd 245.4-4ubuntu3.7
ProcVersionSignature: Ubuntu 5.11.0-22.23~20.04.1-generic 5.11.21
Uname: Linux 5.11.0-22-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Tue Jun 29 03:34:00 2021
InstallationDate: Installed on 2021-03-15 (105 days ago)
InstallationMedia: Ubuntu 20.04.2.0 LTS "Focal Fossa" - Release amd64 (20210209.1)
MachineType: LENOVO 82B5
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=th_TH.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.11.0-22-generic root=UUID=06f2a676-a62c-443a-8bc8-4e0eda4600f4 ro log_buf_len=2M quiet splash vt.handoff=7
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/01/2021
dmi.bios.release: 1.31
dmi.bios.vendor: LENOVO
dmi.bios.version: EUCN31WW
dmi.board.asset.tag: NO Asset Tag
dmi.board.name: LNVNB161216
dmi.board.vendor: LENOVO
dmi.board.version: SDK0Q55756 WIN
dmi.chassis.asset.tag: NO Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Lenovo Legion 5 15ARH05
dmi.ec.firmware.release: 1.31
dmi.modalias: dmi:bvnLENOVO:bvrEUCN31WW:bd01/01/2021:br1.31:efr1.31:svnLENOVO:pn82B5:pvrLenovoLegion515ARH05:rvnLENOVO:rnLNVNB161216:rvrSDK0Q55756WIN:cvnLENOVO:ct10:cvrLenovoLegion515ARH05:
dmi.product.family: Legion 5 15ARH05
dmi.product.name: 82B5
dmi.product.sku: LENOVO_MT_82B5_BU_idea_FM_Legion 5 15ARH05
dmi.product.version: Lenovo Legion 5 15ARH05
dmi.sys.vendor: LENOVO

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Alex Boisvert (alex-boisvert) wrote :

Not sure if this will help anybody else, but I fixed my problem by repairing the broken nvidia-suspend.service symlinks. They pointed to a missing file (/lib/systemd/system/nvidia-suspend.service), so I simply created empty files at the locations the links point to.
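
A read-only way to find such broken links is to list every symlink whose target no longer exists. The sketch below assumes the links live somewhere under /etc/systemd/system; `list_dangling_links` is a hypothetical helper name, not an existing tool.

```shell
# List symlinks below a directory whose target does not exist.
# "test -e" follows symlinks, so it fails exactly for dangling links.
list_dangling_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}

# Usage (read-only, no root needed):
#   list_dangling_links /etc/systemd/system
```

Links that show up under a `*.requires` directory and point at nvidia-*.service files are the ones discussed in this report.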

apienk (andrzej-pienkowski) wrote :

Alex, thanks a lot. Your workaround fixed my suspend problems.

soccer193 (soccer193) wrote :

A workaround of my own was to correct the nvidia driver's behaviour.

I was running Ubuntu 21.04 with nvidia-driver-460 and upgraded to nvidia-driver-470 (which I had installed once before). Afterwards no services were masked, and the unit paths pointed to real .service files.

Before:
❯ systemctl status nvidia-suspend nvidia-hibernate nvidia-resume
● nvidia-suspend.service
     Loaded: masked (Reason: Unit nvidia-suspend.service is masked.)
     Active: inactive (dead)

● nvidia-hibernate.service
     Loaded: masked (Reason: Unit nvidia-hibernate.service is masked.)
     Active: inactive (dead)

● nvidia-resume.service
     Loaded: masked (Reason: Unit nvidia-resume.service is masked.)
     Active: inactive (dead)

After:
systemctl status nvidia-suspend nvidia-hibernate nvidia-resume
● nvidia-suspend.service - NVIDIA system suspend actions
     Loaded: loaded (/lib/systemd/system/nvidia-suspend.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

● nvidia-hibernate.service - NVIDIA system hibernate actions
     Loaded: loaded (/lib/systemd/system/nvidia-hibernate.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

● nvidia-resume.service - NVIDIA system resume actions
     Loaded: loaded (/lib/systemd/system/nvidia-resume.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
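
The "masked" state in the first listing can also be checked without parsing `systemctl status`. The sketch below assumes the usual way a unit gets masked, namely a symlink in /etc/systemd/system that resolves to /dev/null; `is_masked_path` is a hypothetical helper name.

```shell
# Succeed if the given unit file path is a symlink resolving to /dev/null,
# which is how a masked unit normally appears on disk.
is_masked_path() {
    [ -L "$1" ] && [ "$(readlink -f "$1")" = "/dev/null" ]
}

# Usage:
#   for u in nvidia-suspend nvidia-hibernate nvidia-resume; do
#       is_masked_path "/etc/systemd/system/$u.service" && echo "$u: masked"
#   done
#
# systemd's own answer, which prints "masked" for a masked unit:
#   systemctl is-enabled nvidia-suspend.service
```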

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-465 (Ubuntu):
status: New → Confirmed
Markus (markonetri) wrote :

Confirming this problem.
The work-around does not work for me.

apienk (andrzej-pienkowski) wrote :

For me, with nvidia-driver-495, the simple solution was to remove the dangling symlinks from the systemd configuration. You most likely have them if you upgraded from nvidia-driver-470 or nvidia-driver-465, because 470 still included the .service files in /lib/systemd/system/. The files are no longer included in 495, but the postinst script does not remove the symlinks. So, remove them with:

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

This works instantly; there is no need to reboot or log off.

According to user punyidea at https://gist.github.com/bmcbm/375f14eaa17f88756b4bdbbebbcfd029, the services are not needed on 470 either and may be safely removed.

Hope that will fix your problems as well.
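
A more cautious variant of the four `rm` commands above is sketched here: it only touches nvidia-*.service links in the two `.requires` directories, skips any link whose target still exists, and prints what it would remove unless `--apply` is given. The function name `prune_stale_nvidia_links` and the `--apply` argument are made up for this example.

```shell
# Remove nvidia-*.service symlinks from the suspend/hibernate *.requires
# directories below a systemd configuration root, but only if the link
# target is missing.
#   $1: root directory, e.g. /etc/systemd/system
#   $2: "--apply" to delete; otherwise the candidates are only printed
prune_stale_nvidia_links() {
    for link in "$1"/systemd-suspend.service.requires/nvidia-*.service \
                "$1"/systemd-hibernate.service.requires/nvidia-*.service; do
        # Skip unmatched glob patterns and anything that is not a symlink.
        [ -L "$link" ] || continue
        # Skip links whose target still exists (a working installation).
        [ -e "$link" ] && continue
        if [ "${2:-}" = "--apply" ]; then
            rm -- "$link"
        else
            printf 'would remove %s\n' "$link"
        fi
    done
}

# Usage:
#   prune_stale_nvidia_links /etc/systemd/system           # dry run
#   prune_stale_nvidia_links /etc/systemd/system --apply   # needs a root shell
```

Because `sudo` cannot call a shell function directly, the `--apply` step has to run from a root shell in which the function has been defined.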

Shuhao (shuhao) wrote :

Can confirm that removing the .service files in /etc/systemd/system/systemd-suspend.service.requires works.

This happened to me because I switched my GPU from NVIDIA to AMD. Some programs then assumed I had nvenc (because the library was installed) when I didn't, so I had to remove the nvidia-driver packages.

This seems like an issue with the postrm script of one of the nvidia Debian packages rather than with systemd, though.

Luigi Calligaris (luigicalligaris) wrote :

I confirm that removing the .service files described above solves the issue for me as well.

Rocio Platini (ubuntu-lover) wrote :

I can confirm that apienk's solution worked for me. My PC no longer freezes on suspend.

Nick Rosbrook (enr0n) wrote :

Based on the workaround, it seems that the problem is with nvidia systemd units, and not systemd itself.

Changed in systemd (Ubuntu):
status: Confirmed → Invalid