systemd-udevd appears to be attempting to load Nvidia kernel modules in a tight loop

Bug #1855747 reported by Dan Watkins on 2019-12-09
This bug affects 2 people
Affects: nvidia-graphics-drivers-440 (Ubuntu), Importance: Undecided, Assigned to: Unassigned
Affects: systemd (Ubuntu), Importance: Undecided, Assigned to: Unassigned

Bug Description

I'm seeing these lines repeated:

```
Dec 09 12:01:45 surprise systemd-udevd[731]: nvidia: Process '/usr/bin/nvidia-smi' failed with exit code 9.
Dec 09 12:01:46 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-modeset' failed with exit code 1.
Dec 09 12:01:46 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-drm' failed with exit code 1.
Dec 09 12:01:47 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-uvm' failed with exit code 1.
Dec 09 12:01:48 surprise systemd-udevd[731]: nvidia: Process '/usr/bin/nvidia-smi' failed with exit code 9.
Dec 09 12:01:49 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-modeset' failed with exit code 1.
Dec 09 12:01:49 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-drm' failed with exit code 1.
Dec 09 12:01:50 surprise systemd-udevd[731]: nvidia: Process '/sbin/modprobe nvidia-uvm' failed with exit code 1.
```

My system has been up ~an hour:

$ uptime
 12:02:28 up 1:00, 1 user, load average: 4.52, 4.80, 5.50

and I've seen a lot of these:

$ journalctl -u systemd-udevd.service -b0 | grep -c "exit code 9"
1333
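
The count above covers only the `nvidia-smi` failures; 1333 of them in about an hour is roughly one every 2.7 seconds, which matches the three-second cycle visible in the log. For a per-command breakdown, a small filter over the same journal output may help. This is a minimal sketch: the function name `summarise_udev_failures` is made up for illustration, and it assumes the log lines keep the `Process '...' failed with exit code N.` shape shown above.

```shell
# Summarise udev helper failures from journal text on stdin.
# Prints "<count> <command> (exit <code>)", one line per distinct failure.
# Lines that do not match the "Process '...' failed with exit code N." shape
# are ignored.
summarise_udev_failures() {
    sed -n "s/.*Process '\([^']*\)' failed with exit code \([0-9][0-9]*\)\..*/\1 (exit \2)/p" |
        sort | uniq -c |
        awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n, $0 }'
}
```

It could be fed from the same command used above, for example `journalctl -u systemd-udevd.service -b0 | summarise_udev_failures`.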

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: udev 243-3ubuntu1
ProcVersionSignature: Ubuntu 5.3.0-25.27-generic 5.3.13
Uname: Linux 5.3.0-25-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.20.11-0ubuntu13
Architecture: amd64
CurrentDesktop: i3
CustomUdevRuleFiles: 70-snap.core.rules 70-snap.spotify.rules
Date: Mon Dec 9 11:59:10 2019
InstallationDate: Installed on 2019-05-07 (215 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
MachineType: Gigabyte Technology Co., Ltd. B450M DS3H
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-25-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash resume=UUID=73909634-a75d-42c9-8f66-a69138690756 vt.handoff=7
SourcePackage: systemd
UpgradeStatus: Upgraded to focal on 2019-11-15 (23 days ago)
dmi.bios.date: 01/25/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F4
dmi.board.asset.tag: Default string
dmi.board.name: B450M DS3H-CF
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF4:bd01/25/2019:svnGigabyteTechnologyCo.,Ltd.:pnB450MDS3H:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnB450MDS3H-CF:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: B450M DS3H
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
modified.conffile..etc.apport.crashdb.conf: [modified]
modified.conffile..etc.udev.udev.conf: [modified]
mtime.conffile..etc.apport.crashdb.conf: 2019-11-25T16:40:26.261317
mtime.conffile..etc.udev.udev.conf: 2019-11-28T09:22:42.096686

Dan Watkins (oddbloke) wrote :

(The system booted up using nouveau, which is why the Nvidia drivers won't load.)
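
One way to confirm which kernel driver actually claimed the GPU is to look at the `Kernel driver in use:` lines that `lspci -k` prints. The helper below is only a sketch: `gpu_driver_in_use` is a hypothetical name, and it assumes the GPU is listed with the class `VGA compatible controller` or `3D controller`.

```shell
# Print the kernel driver bound to each GPU, given `lspci -k` output on stdin.
# Device lines start with a hexadecimal bus address; the detail lines that
# follow each device are indented.
gpu_driver_in_use() {
    awk '
        /^[0-9a-f]/ { gpu = ($0 ~ /VGA compatible controller|3D controller/) }
        gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}
```

Running `lspci -k | gpu_driver_in_use` on a system in the state described here would be expected to print `nouveau` rather than `nvidia`.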

Dan Watkins (oddbloke) wrote :

/lib/udev/rules.d/71-nvidia.rules is definitely involved in this somehow; I'm seeing nvidia-persistenced started/stopped repeatedly too, and that's driven by that rules file.

Kai-Heng Feng (kaihengfeng) wrote :

Can you please run `sudo update-initramfs -u -k all` and see if the issue still happens?

Dan Watkins (oddbloke) wrote :

I fixed this by getting my system to boot with the correct drivers, so I can't easily do any further testing, I'm afraid.

I'll mark this as Incomplete.

Changed in nvidia-graphics-drivers-440 (Ubuntu):
status: New → Incomplete
Changed in systemd (Ubuntu):
status: New → Incomplete
Kai-Heng Feng (kaihengfeng) wrote :

Somehow nouveau is still in the initramfs and I think that's the cause for this issue.
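
Whether nouveau really is in the initramfs can be checked from the listing that `lsinitramfs` (from initramfs-tools) prints. A minimal sketch follows; `initramfs_has_module` is a hypothetical helper name, and the `/boot/initrd.img-<release>` path in the usage note is the usual Ubuntu location, assumed here rather than taken from the report.

```shell
# Succeed if an initramfs file listing on stdin contains the named kernel
# module, e.g. ".../drm/nouveau/nouveau.ko" (optionally compressed, such as
# nouveau.ko.xz).
# $1: module name without the .ko suffix
initramfs_has_module() {
    grep -Eq "(^|/)$1\.ko([.][a-z0-9]+)?\$"
}
```

For example: `lsinitramfs "/boot/initrd.img-$(uname -r)" | initramfs_has_module nouveau && echo "nouveau is in the initramfs"`.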

Dan Watkins (oddbloke) wrote :

Yes, at the time I had an upstream kernel installed. The Nvidia drivers were installed for _that_ kernel, but I booted into the most recent Ubuntu archive kernel instead. DKMS didn't build/install the drivers for that kernel for some reason, so they weren't available to load, but that didn't stop the system from attempting to load them repeatedly (which led to a noticeable amount of load on the system).
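
A quick way to spot this situation is to check whether an `nvidia.ko` exists under the module tree of the kernel that is actually running. The sketch below makes some assumptions: the function names are hypothetical, modules are assumed to live under `/lib/modules/<release>`, and the core module is assumed to be named `nvidia.ko` (possibly with a compression suffix).

```shell
# List nvidia.ko files installed for a given kernel release.
# $1: kernel release (defaults to the running kernel)
# $2: module root (defaults to /lib/modules; a parameter only so the
#     function can be tried against a scratch directory)
nvidia_modules_for_kernel() {
    find "${2:-/lib/modules}/${1:-$(uname -r)}" -type f -name 'nvidia.ko*' 2>/dev/null
}

# Succeed if at least one nvidia.ko was found for that kernel.
has_nvidia_modules() {
    [ -n "$(nvidia_modules_for_kernel "$@")" ]
}
```

For example, `has_nvidia_modules || echo "no nvidia.ko for $(uname -r)"`. The exact file name is matched on purpose, so that the in-tree `nvidiafb.ko` framebuffer module does not count as a hit.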

Launchpad Janitor (janitor) wrote :

[Expired for systemd (Ubuntu) because there has been no activity for 60 days.]

Changed in systemd (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for nvidia-graphics-drivers-440 (Ubuntu) because there has been no activity for 60 days.]

Changed in nvidia-graphics-drivers-440 (Ubuntu):
status: Incomplete → Expired
costinel (costinel) wrote :

>Kai-Heng Feng (kaihengfeng) wrote on 2019-12-11: #4

> Can you please run `sudo update-initramfs -u -k all` and see if the issue still happens?

Yes, this persists even after that.

I have an Nvidia card but I want to blacklist it. Even with it blacklisted, systemd-udevd keeps trying to run the rules in /lib/udev/rules.d/71-nvidia.rules forever.

Moving /lib/udev/rules.d/71-nvidia.rules out of the udev rules search path fixes that, but that's not the correct solution. Why does systemd-udevd insist on retrying instead of failing after the first try?

18.04 LTS here, with nvidia-driver-390.
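
A less invasive alternative to moving the packaged file is the override mechanism described in udev(7): a file in /etc/udev/rules.d with the same name takes precedence over the one in /lib/udev/rules.d, and a symlink to /dev/null with that name disables the rules file entirely. The sketch below only illustrates that mechanism and does not address why the rules are retried; `mask_udev_rule` is a hypothetical helper name.

```shell
# Mask a vendor udev rules file by shadowing it with a /dev/null symlink.
# $1: rules file name, e.g. 71-nvidia.rules
# $2: override directory (defaults to /etc/udev/rules.d; a parameter only so
#     the function can be tried in a scratch directory)
mask_udev_rule() {
    ln -sf /dev/null "${2:-/etc/udev/rules.d}/$1"
}
```

Run as root, this would look like `mask_udev_rule 71-nvidia.rules && udevadm control --reload`. Note that it disables everything in that rules file, so the symlink would need to be removed again if the Nvidia driver is wanted later.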
