Installing nvidia-cudnn or nvidia-cuda-toolkit removes nvidia-driver-515

Bug #1983790 reported by Mark Jones
This bug affects 13 people
Affects                       Status     Importance  Assigned to  Milestone
nvidia-cuda-toolkit (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

I'm running a fresh install of Ubuntu 22.04.1 LTS with the Proprietary/Tested drivers installed for my NVIDIA GPU, nvidia-driver-515.

I attempted to install the nvidia-cuda-toolkit and nvidia-cudnn packages, and I expected the packages to be installed normally.

However, both packages attempt to uninstall the NVIDIA driver, leaving my system unusable. It does this without installing a supported driver or recommending a supported driver.

I have found that if I manually install nvidia-driver-510 first, I can install the toolkit and cudnn without the NVIDIA driver being removed.

Further information:

$ lsb_release -rd
Description: Ubuntu 22.04.1 LTS
Release: 22.04

NVIDIA Driver Version: 515.65.01
Attempted nvidia-cuda-toolkit version: 11.5.1-1ubuntu1
Attempted nvidia-cudnn version: 8.2.4.15~cuda11.4

Output when installing nvidia-cuda-toolkit and nvidia-cudnn:
The following packages will be REMOVED:
  libnvidia-compute-515 libnvidia-compute-515:i386 libnvidia-decode-515 libnvidia-decode-515:i386
  libnvidia-encode-515 libnvidia-encode-515:i386 nvidia-compute-utils-515 nvidia-driver-515
  nvidia-utils-515
The following NEW packages will be installed:
  ca-certificates-java fonts-dejavu-extra java-common libaccinj64-11.5 libatk-wrapper-java
  libatk-wrapper-java-jni libcub-dev libcublas11 libcublaslt11 libcudart11.0 libcufft10 libcufftw10
  libcuinj64-11.5 libcupti-dev libcupti-doc libcupti11.5 libcurand10 libcusolver11 libcusolvermg11
  libcusparse11 libegl-dev libgl-dev libgl1-mesa-dev libgles-dev libgles1 libglvnd-core-dev
  libglvnd-dev libglx-dev libjs-sphinxdoc libjs-underscore libnppc11 libnppial11 libnppicc11
  libnppidei11 libnppif11 libnppig11 libnppim11 libnppist11 libnppisu11 libnppitc11 libnpps11
  libnvblas11 libnvidia-compute-495 libnvidia-compute-510 libnvidia-ml-dev libnvjpeg11
  libnvrtc-builtins11.5 libnvrtc11.2 libnvtoolsext1 libnvvm4 libopengl-dev libpthread-stubs0-dev
  libtbb-dev libtbb12 libtbbmalloc2 libthrust-dev libvdpau-dev libx11-dev libxau-dev libxcb1-dev
  libxdmcp-dev node-html5shiv nsight-compute nsight-compute-target nsight-systems
  nsight-systems-target nvidia-cuda-dev nvidia-cuda-gdb nvidia-cuda-toolkit nvidia-cuda-toolkit-doc
  nvidia-cudnn nvidia-opencl-dev nvidia-profiler nvidia-visual-profiler ocl-icd-opencl-dev
  opencl-c-headers opencl-clhpp-headers openjdk-8-jre openjdk-8-jre-headless x11proto-dev
  xorg-sgml-doctools xtrans-dev
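
The removal shown above can be detected before it is confirmed. The following is a minimal sketch, not part of the original report, that scans apt's transaction summary for driver packages under the "will be REMOVED" heading. The function names are hypothetical, and the sample text is an abbreviated copy of the output above.

```python
import re
from typing import List


def removed_packages(apt_output: str) -> List[str]:
    """Return package names listed under apt's 'will be REMOVED' heading."""
    removed = []
    in_section = False
    for line in apt_output.splitlines():
        if line.startswith("The following packages will be REMOVED:"):
            in_section = True
            continue
        if in_section:
            # apt indents package names; the first unindented line ends the list.
            if line.startswith(" "):
                removed.extend(line.split())
            else:
                break
    return removed


def driver_removals(apt_output: str) -> List[str]:
    """Pick out removals that look like NVIDIA driver metapackages."""
    return [p for p in removed_packages(apt_output)
            if re.fullmatch(r"nvidia-driver-\d+", p)]


# Abbreviated copy of the output quoted in this report.
sample = """The following packages will be REMOVED:
  libnvidia-compute-515 libnvidia-compute-515:i386 libnvidia-decode-515 libnvidia-decode-515:i386
  libnvidia-encode-515 libnvidia-encode-515:i386 nvidia-compute-utils-515 nvidia-driver-515
  nvidia-utils-515
The following NEW packages will be installed:
  ca-certificates-java fonts-dejavu-extra java-common
"""

print(driver_removals(sample))  # ['nvidia-driver-515']
```

The same text can be obtained without changing the system by running apt-get with its -s (simulate) option, for example `apt-get -s install nvidia-cuda-toolkit nvidia-cudnn`, and passing the output to `driver_removals`.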
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 22.04
InstallationDate: Installed on 2022-08-06 (2 days ago)
InstallationMedia: Ubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 (20220419)
NonfreeKernelModules: nvidia_modeset nvidia
Package: nvidia-cuda-toolkit 11.5.1-1ubuntu1
PackageArchitecture: amd64
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
ProcVersionSignature: Ubuntu 5.15.0-43.46-generic 5.15.39
Tags: jammy
Uname: Linux 5.15.0-43-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: docker
_MarkForUpload: True

Aaron Rainbolt (arraybolt3) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Please run the following command in a terminal, once only; it will automatically gather debugging information:
apport-collect 1983790

When reporting bugs in the future, please use apport by running 'ubuntu-bug' followed by the name of the affected package. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

Mark Jones (ihateyoursystem) wrote : Dependencies.txt

apport information

tags: added: apport-collected jammy
description: updated

Mark Jones (ihateyoursystem) wrote : ProcCpuinfoMinimal.txt

apport information

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-cuda-toolkit (Ubuntu):
status: New → Confirmed

Marco van Zwetselaar (zwets) wrote :

It looks like the cause is nvidia-cuda-dev's dependency on libnvidia-compute-495, a transitional package that depends on libnvidia-compute-510 rather than libnvidia-compute-515.

Other packages (all built from the nvidia-cuda-toolkit source package) also depend on libnvidia-compute-495 and are transitive dependencies of nvidia-cuda-dev. The solution would therefore appear to be either to update all of these dependencies and drop the transitional package, or to keep the transitional package but make it depend on libnvidia-compute-515 | libnvidia-compute-510.
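
To illustrate why the second option would help, here is a small sketch of how an alternative dependency ("a | b" in Debian control syntax) is satisfied. It is a simplified model, not apt's resolver: version constraints and Conflicts/Breaks fields are ignored, and the `satisfied` helper is hypothetical.

```python
from typing import Set


def satisfied(dependency: str, installed: Set[str]) -> bool:
    """True if any alternative in an 'a | b' style dependency is installed."""
    alternatives = [alt.strip() for alt in dependency.split("|")]
    return any(alt in installed for alt in alternatives)


# State on the reporter's machine: only the 515 compute library is present.
installed = {"libnvidia-compute-515"}

# What the transitional package depends on today, per the analysis above.
current = "libnvidia-compute-510"
# The relaxed dependency proposed above.
proposed = "libnvidia-compute-515 | libnvidia-compute-510"

print(satisfied(current, installed))   # False
print(satisfied(proposed, installed))  # True
```

With the current dependency unsatisfied, apt has to bring in libnvidia-compute-510. Assuming the 510 and 515 compute libraries cannot be installed side by side (which the removal list in the description suggests), that forces out the 515 packages. With the proposed alternative, the installed 515 library would already satisfy the dependency.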

Hadmut Danisch (hadmut) wrote :

Any progress here?

The current kernel uses

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.76 Mon Sep 12 19:21:56 UTC 2022

while nvidia-cuda-toolkit is incompatible with the 515 lib version and thus with the kernel itself, causing

% nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

Thus, nvidia-cuda-toolkit is incompatible with the current kernel.
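
The mismatch can be checked mechanically. This sketch parses the /proc/driver/nvidia/version line quoted above and compares its branch number with that of the user-space libraries. The value "510" for the library branch is an assumption based on the libnvidia-compute-510 package the toolkit pulls in. Comparing branch numbers only is a simplification, and the function names are hypothetical.

```python
import re
from typing import Optional


def kernel_module_version(proc_text: str) -> Optional[str]:
    """Extract the driver version from /proc/driver/nvidia/version text."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", proc_text)
    return match.group(1) if match else None


def same_branch(module_version: str, library_branch: str) -> bool:
    """Compare only the leading branch number, e.g. '515' in '515.76'."""
    return module_version.split(".")[0] == library_branch


# Line quoted in the comment above.
proc_text = ("NVRM version: NVIDIA UNIX x86_64 Kernel Module 515.76 "
             "Mon Sep 12 19:21:56 UTC 2022")

# Assumption: the user-space libraries come from the 510 branch.
library_branch = "510"

version = kernel_module_version(proc_text)
print(version)                               # 515.76
print(same_branch(version, library_branch))  # False
```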

But I currently don't understand why both of these directories exist:

/lib/modules/5.15.0-50-generic/kernel/nvidia-510
/lib/modules/5.15.0-50-generic/kernel/nvidia-515

Each has files in it.

Is it possible that one of the problems is that Ubuntu takes the nvidia-cuda-toolkit package from Debian but uses newer library versions than Debian for the kernel, and thus gets a version mismatch between tools and kernel?

Daniel Greeley (danielgreeley) wrote :

I agree with the notion that it's a version mismatch between the tools and the kernel. What puzzles me is why Ubuntu is taking the drivers from Debian in the first place, when both the driver and the CUDA toolkit are available directly from Nvidia at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/

If you look, their repository has the drivers and CUDA toolkit not only for 22.04 but for all releases, and for most common distributions. That is not especially relevant to this discussion, except that it shows there is no need to use Debian's drivers at all.
