systemd-udevd appears to be attempting to load Nvidia kernel modules in a tight loop

Bug #1855747 reported by Dan Watkins
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| nvidia-graphics-drivers-440 (Ubuntu) | Expired | Undecided | Unassigned | |
| systemd (Ubuntu) | Expired | Undecided | Unassigned | |

Bug Description

I'm seeing these lines repeated:

```
Dec 09 12:01:45 surprise systemd-udevd[731]: nvidia: Process '/usr/bin/nvidia-smi' failed with exit code 9.
Dec 09 12:01:46 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-modeset' failed with exit code 1.
Dec 09 12:01:46 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-drm' failed with exit code 1.
Dec 09 12:01:47 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-uvm' failed with exit code 1.
Dec 09 12:01:48 surprise systemd-udevd[731]: nvidia: Process '/usr/bin/nvidia-smi' failed with exit code 9.
Dec 09 12:01:49 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-modeset' failed with exit code 1.
Dec 09 12:01:49 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-drm' failed with exit code 1.
Dec 09 12:01:50 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-uvm' failed with exit code 1.
```

My system has been up ~an hour:

$ uptime
 12:02:28 up 1:00, 1 user, load average: 4.52, 4.80, 5.50

and I've seen a lot of these:

$ journalctl -u systemd-udevd.service -b0 | grep -c "exit code 9"
1333
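
A single count hides which commands are failing. The sketch below tallies failures per command and exit code, assuming the log line format shown in the excerpt above; the function name `summarize_udev_failures` is made up for illustration.

```shell
# Tally udev "Process '...' failed with exit code N." lines.
# Reads journal text on stdin and prints "<count> <exit code> <command>",
# most frequent first.
summarize_udev_failures() {
    sed -n "s/.*Process '\([^']*\)' failed with exit code \([0-9]*\)\..*/\2 \1/p" \
        | sort | uniq -c | sort -rn
}

# Usage (assumes journalctl is available):
# journalctl -u systemd-udevd.service -b0 | summarize_udev_failures
```

If the loop described here is the cause, the four commands from the excerpt should show roughly equal counts.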

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: udev 243-3ubuntu1
ProcVersionSignature: Ubuntu 5.3.0-25.27-generic 5.3.13
Uname: Linux 5.3.0-25-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.20.11-0ubuntu13
Architecture: amd64
CurrentDesktop: i3
CustomUdevRuleFiles: 70-snap.core.rules 70-snap.spotify.rules
Date: Mon Dec 9 11:59:10 2019
InstallationDate: Installed on 2019-05-07 (215 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
MachineType: Gigabyte Technology Co., Ltd. B450M DS3H
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-25-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash resume=UUID=73909634-a75d-42c9-8f66-a69138690756 vt.handoff=7
SourcePackage: systemd
UpgradeStatus: Upgraded to focal on 2019-11-15 (23 days ago)
dmi.bios.date: 01/25/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F4
dmi.board.asset.tag: Default string
dmi.board.name: B450M DS3H-CF
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF4:bd01/25/2019:svnGigabyteTechnologyCo.,Ltd.:pnB450MDS3H:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnB450MDS3H-CF:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: B450M DS3H
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
modified.conffile..etc.apport.crashdb.conf: [modified]
modified.conffile..etc.udev.udev.conf: [modified]
mtime.conffile..etc.apport.crashdb.conf: 2019-11-25T16:40:26.261317
mtime.conffile..etc.udev.udev.conf: 2019-11-28T09:22:42.096686

Dan Watkins (oddbloke) wrote :

(The system booted up using nouveau, which is why the Nvidia drivers won't load.)
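
One way to confirm which driver is bound to the GPU is to read the "Kernel driver in use" line from `lspci -k`. This is a rough sketch, not something run on the reporter's machine; `gpu_drivers_in_use` is a hypothetical helper name.

```shell
# Print the kernel driver bound to each VGA/3D controller.
# Reads `lspci -k` text on stdin.
gpu_drivers_in_use() {
    awk '
        # Device header line for a graphics device: start tracking it.
        /VGA compatible controller|3D controller/ { gpu = 1; next }
        # Any other unindented line starts a different device.
        /^[^ \t]/                                 { gpu = 0 }
        gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}

# Usage (assumes pciutils is installed):
# lspci -k | gpu_drivers_in_use
```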

Dan Watkins (oddbloke) wrote :

/lib/udev/rules.d/71-nvidia.rules is definitely involved in this somehow; I'm seeing nvidia-persistenced started/stopped repeatedly too, and that's driven by that rules file.
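
To see which commands a rules file asks udevd to run, its `RUN+=` assignments can be listed. The helper below is an illustrative sketch (`list_udev_run_commands` is a made-up name). It only handles the plain `RUN+="..."` form, and it will also pick up matches inside commented-out lines.

```shell
# List the commands named in RUN+="..." assignments of a udev rules file.
list_udev_run_commands() {
    grep -o 'RUN+="[^"]*"' "$1" | sed 's/^RUN+="//; s/"$//'
}

# Usage:
# list_udev_run_commands /lib/udev/rules.d/71-nvidia.rules
```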

Kai-Heng Feng (kaihengfeng) wrote :

Can you please run `sudo update-initramfs -u -k all` and see if the issue still happens?

Dan Watkins (oddbloke) wrote :

I fixed this by getting my system to boot with the correct drivers, so I can't easily do any further testing, I'm afraid.

I'll mark this as Incomplete.

Changed in nvidia-graphics-drivers-440 (Ubuntu):
status: New → Incomplete
Changed in systemd (Ubuntu):
status: New → Incomplete
Kai-Heng Feng (kaihengfeng) wrote :

Somehow nouveau is still in the initramfs and I think that's the cause for this issue.

Dan Watkins (oddbloke) wrote :

Yes, at the time I had an upstream kernel installed. The Nvidia drivers were installed for _that_ kernel, but I booted into the most recent Ubuntu archive kernel instead. DKMS didn't build/install the drivers for that kernel for some reason, so they weren't available to load, but that didn't stop the system from attempting to load them repeatedly (which led to a noticeable amount of load on the system).
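
A quick way to spot this situation is to check whether a module file exists for the kernel that is actually running. The sketch below assumes modules live under `/lib/modules/<release>` as `<name>.ko`, possibly compressed. The function name and its optional third argument are hypothetical, and it does not handle the hyphen/underscore variation in module file names.

```shell
# Succeed if a file named <module>.ko (optionally compressed) exists
# anywhere under <modules_root>/<kernel_release>.
module_available_for_kernel() {
    mafk_module=$1
    mafk_release=${2:-$(uname -r)}
    mafk_root=${3:-/lib/modules}   # third argument is a testing hook
    [ -n "$(find "$mafk_root/$mafk_release" -name "$mafk_module.ko*" 2>/dev/null)" ]
}

# Usage:
# module_available_for_kernel nvidia || echo "no nvidia module for $(uname -r)"
```

`dkms status` should also list the kernels each DKMS module has been built for.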

Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for nvidia-graphics-drivers-440 (Ubuntu) because there has been no activity for 60 days.]

Changed in nvidia-graphics-drivers-440 (Ubuntu):
status: Incomplete → Expired
costinel (costinel) wrote :

> Kai-Heng Feng (kaihengfeng) wrote on 2019-12-11: #4
>
> Can you please run `sudo update-initramfs -u -k all` and see if the issue still happens?

Yes, this persists even after that.

I have an Nvidia card, but I want to blacklist it. Even with it blacklisted, systemd-udevd keeps running the rules in /lib/udev/rules.d/71-nvidia.rules forever.

Moving /lib/udev/rules.d/71-nvidia.rules out of the udev rules search path fixes that, but that's not the correct solution. Why does systemd-udevd insist on repeating instead of failing after the first try?

This is on 18.04 LTS with nvidia-driver-390.
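
A less invasive alternative to moving the packaged file is to mask it. udev gives a same-named file in /etc/udev/rules.d precedence over the one in /lib/udev/rules.d, and a symlink to /dev/null there disables the rules file entirely. The sketch below was not tested against this bug, and `mask_udev_rule` is a hypothetical helper name.

```shell
# Mask a packaged udev rules file by creating a same-named symlink to
# /dev/null in the override directory (default: /etc/udev/rules.d).
mask_udev_rule() {
    mur_name=$1
    mur_dir=${2:-/etc/udev/rules.d}   # second argument is a testing hook
    ln -s /dev/null "$mur_dir/$mur_name"
}

# Usage (as root), followed by a rules reload:
# mask_udev_rule 71-nvidia.rules && udevadm control --reload
```

Like moving the file, this only suppresses the rules; it does not explain why the rules keep retriggering.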
