installing nvidia-cuda-toolkit removes nvidia-driver-455

Bug #1900627 reported by Lishai Eitan
This bug affects 1 person
Affects                        Status         Importance   Assigned to   Milestone
nvidia-cuda-toolkit (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

There seems to be a conflict between (the dependencies of) nvidia-cuda-toolkit and nvidia-driver-455.
At least one dependency of nvidia-cuda-toolkit has the following dependencies:
Package: libnvidia-ml-dev
...
Depends: libnvidia-compute-450 (>= 450) | libnvidia-compute-450-server (>= 450) | libnvidia-ml.so.1

However, nvidia-driver-455 ultimately depends on libnvidia-compute-455, which does not satisfy libnvidia-compute-450 (>= 450).
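
The mismatch described above can be illustrated with a small sketch. It is not how apt resolves dependencies; it only checks one Depends alternative group against a set of available package names. The helper names (parse_depends, is_satisfied), the version 455.28 assigned to libnvidia-compute-455, and the purely numeric version comparison are assumptions made for illustration.

```python
import operator
import re

# Comparison operators used in Debian relationship fields.
OPS = {">=": operator.ge, "<=": operator.le, "=": operator.eq,
       ">>": operator.gt, "<<": operator.lt}

def parse_depends(group):
    """Split 'a (>= 1) | b | c' into a list of (name, op, version) tuples."""
    alternatives = []
    for alt in group.split("|"):
        m = re.fullmatch(
            r"\s*([^\s()]+)\s*(?:\(\s*(>=|<=|=|<<|>>)\s*([^)\s]+)\s*\))?\s*", alt)
        if m is None:
            raise ValueError(f"cannot parse alternative: {alt!r}")
        alternatives.append((m.group(1), m.group(2), m.group(3)))
    return alternatives

def version_key(version):
    # Simplification: dotted numeric versions only, NOT dpkg's real ordering.
    return tuple(int(part) for part in version.split("."))

def is_satisfied(alternatives, available):
    """available maps a package (or provided) name to a version, or None
    for an unversioned Provides, which cannot satisfy a versioned dependency."""
    for name, op, wanted in alternatives:
        if name not in available:
            continue
        have = available[name]
        if op is None:
            return True
        if have is not None and OPS[op](version_key(have), version_key(wanted)):
            return True
    return False

# Depends line of libnvidia-ml-dev, as quoted in the report.
DEPENDS = ("libnvidia-compute-450 (>= 450) | "
           "libnvidia-compute-450-server (>= 450) | libnvidia-ml.so.1")

# Assumed state with nvidia-driver-455 installed (version is illustrative).
with_driver_455 = {"libnvidia-compute-455": "455.28"}

print(is_satisfied(parse_depends(DEPENDS), with_driver_455))  # prints False
```

No alternative in the group names libnvidia-compute-455, so the group stays unsatisfied regardless of the version installed.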

ProblemType: Bug
DistroRelease: Ubuntu 20.10
Package: nvidia-cuda-toolkit (not installed)
ProcVersionSignature: Ubuntu 5.8.0-25.26-generic 5.8.14
Uname: Linux 5.8.0-25-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu50
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: KDE
Date: Tue Oct 20 01:25:35 2020
InstallationDate: Installed on 2020-10-19 (0 days ago)
InstallationMedia: Kubuntu 20.10 "Groovy Gorilla" - Beta amd64 (20200930)
SourcePackage: nvidia-cuda-toolkit
UpgradeStatus: No upgrade log present (probably fresh install)

Lishai Eitan (lishai-eitan) wrote :
Graham Inggs (ginggs)
Changed in nvidia-cuda-toolkit (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-cuda-toolkit - 11.0.3-1ubuntu1

---------------
nvidia-cuda-toolkit (11.0.3-1ubuntu1) groovy; urgency=medium

  * Update libcuda1 availability for groovy (LP: #1900627)

 -- Graham Inggs <email address hidden> Tue, 20 Oct 2020 12:30:55 +0000

Changed in nvidia-cuda-toolkit (Ubuntu):
status: Confirmed → Fix Released
Lishai Eitan (lishai-eitan) wrote :

Comment Summary:
----------------
suggested fix: make libnvidia-ml-dev depend on:
libnvidia-compute-450 (>= 450) | libnvidia-compute-450-server (>= 450) | libnvidia-ml.so.1 (>= 450) | libnvidia-ml1 (>= 450)
instead of:
libnvidia-compute-450 (>= 450) | libnvidia-compute-450-server (>= 450) | libnvidia-ml.so.1 (>= 450)

Longer version :)
-----------------
I noticed now that libnvidia-ml.so.1 (>= 450) would satisfy libnvidia-ml-dev's dependencies (and in turn, would allow the installation of nvidia-cuda-toolkit).

Both libnvidia-compute-450 and libnvidia-compute-455 provide libnvidia-ml1 and contain the file libnvidia-ml.so.1, but neither "provides" libnvidia-ml.so.1. This suggests the problem is only in the dependency *declarations*, and that nothing "material" prevents installing these packages together (i.e. if we forced their installation, they would work).

If this is correct, I believe that changing libnvidia-ml-dev to depend on libnvidia-ml1 (>= 450) as an alternative to the dependency on libnvidia-ml.so.1 would fix the problem.

To test this theory, I created a dummy package (using equivs) that contains no files, depends on libnvidia-ml1 (>= 455.28) and "provides" libnvidia-ml.so.1 (= 455.28). After installing this package, I successfully installed nvidia-cuda-toolkit along with nvidia-driver-455.
This setup works for me: I was able to install PyTorch and train some models on the GPU (verified using nvidia-smi).
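
As a rough sketch of the dummy package described above, the snippet below renders a control file of the kind equivs consumes. The package name libnvidia-ml-so1-dummy, the Section, Priority and Description values, and the helper render_equivs_control are hypothetical; only the Depends and Provides values come from this comment.

```python
def render_equivs_control(package, version, depends, provides, description):
    """Return the text of a minimal equivs-style control file."""
    fields = [
        ("Section", "misc"),        # assumed value
        ("Priority", "optional"),   # assumed value
        ("Package", package),
        ("Version", version),
        ("Depends", depends),
        ("Provides", provides),
        ("Description", description),
    ]
    return "".join(f"{key}: {value}\n" for key, value in fields)

control = render_equivs_control(
    package="libnvidia-ml-so1-dummy",            # hypothetical name
    version="455.28",
    depends="libnvidia-ml1 (>= 455.28)",         # from the comment above
    provides="libnvidia-ml.so.1 (= 455.28)",     # from the comment above
    description="dummy package providing libnvidia-ml.so.1",
)
print(control)
```

Saving the output to a file and passing it to equivs-build should yield an empty .deb that can be installed with dpkg -i. This was only ever a workaround; the packaged fix in 11.0.3-1ubuntu1 makes it unnecessary.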

Lishai Eitan (lishai-eitan) wrote :

Sorry, must have missed the fix released...
Thank you for the fix :)
