Nvidia fabric-manager-535 version incompatible with Nvidia Driver

Bug #2065014 reported by Jean-Noël Bazin
This bug affects 1 person
Affects: fabric-manager-535 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

(On an NVIDIA HGX server with 8 A100 GPUs, running Ubuntu 22.04.4 LTS)

The current candidate version of cuda-drivers-fabricmanager-535 is 535.161.08, while the candidate version of nvidia-driver-535 is 535.171.04.

```
# apt-cache policy cuda-drivers-fabricmanager-535
  Candidate: 535.161.08-0ubuntu3.22.04.1
  Version table:
     535.161.08-0ubuntu3.22.04.1 500
        500 http://fr.archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages
        500 http://fr.archive.ubuntu.com/ubuntu jammy-security/multiverse amd64 Packages

# apt-cache policy nvidia-driver-535
nvidia-driver-535:
  Installed: (none)
  Candidate: 535.171.04-0ubuntu0.22.04.1
  Version table:
     535.171.04-0ubuntu0.22.04.1 500
        500 http://fr.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages
        500 http://fr.archive.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages
```

The two are incompatible: systemctl status on the failed nvidia-fabricmanager.service reports:

```
nv-fabricmanager[2664]: fabric manager NVIDIA GPU driver interface version 535.161.08 don't match with driver version 535.171.04. Please update with matching NVIDIA driver package.
```
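When checking several nodes, it can help to pull the two version numbers out of that log line. The following is a minimal sketch, assuming the message keeps the wording shown above; `fm_versions` is a hypothetical helper name, not part of any NVIDIA tool.

```shell
# fm_versions: hypothetical helper (not an NVIDIA tool).
# Reads log text on stdin and, for each fabric manager mismatch line,
# prints "<fabric manager interface version> <driver version>".
# Assumes the message wording quoted in this report.
fm_versions() {
    sed -nE 's/.*interface version ([0-9.]+) .*driver version ([0-9]+(\.[0-9]+)*).*/\1 \2/p'
}

# Example (the unit name is the one from this report):
#   journalctl -u nvidia-fabricmanager.service | fm_versions
```

Lines that do not contain the mismatch message produce no output, so an empty result means no mismatch was logged in the text that was fed in.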

Patater (m-launchpad-patater-com) wrote :

I also ran into this. To get nvidia-fabricmanager and nvidia-driver packages with a matching GPU driver interface version, I had to install nvidia-driver-550-server (550.54.15), which matches nvidia-fabricmanager-550 (550.54.15). nvidia-driver-550 was at 550.67, which didn't match.

For your case, driver version 535, the following two packages match:
 - nvidia-driver-535-server (535.161.08-0ubuntu2.22.04.1)
 - nvidia-fabricmanager-535 (535.161.08-0ubuntu3.22.04.1)

Here are my matching packages as installed.

```
$ apt-cache policy nvidia-driver-550-server
nvidia-driver-550-server:
  Installed: 550.54.15-0ubuntu0.22.04.2
  Candidate: 550.54.15-0ubuntu0.22.04.2
  Version table:
 *** 550.54.15-0ubuntu0.22.04.2 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages
        500 http://archive.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages
        100 /var/lib/dpkg/status

$ apt-cache policy nvidia-fabricmanager-550
nvidia-fabricmanager-550:
  Installed: 550.54.15-0ubuntu0.22.04.1
  Candidate: 550.54.15-0ubuntu0.22.04.1
  Version table:
 *** 550.54.15-0ubuntu0.22.04.1 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages
        500 http://archive.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages
        100 /var/lib/dpkg/status
```
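The comparison done by eye above (same version on both packages, ignoring the Ubuntu revision suffix) can be sketched in shell. This is only a sketch: `upstream_version`, `versions_match`, and `candidate_version` are hypothetical helper names, and it assumes that only the part before the Debian revision has to match, which is what the package pairs in this report suggest.

```shell
# upstream_version: strip an optional epoch ("N:") and the Debian revision
# (everything from the last "-") from a package version string.
upstream_version() {
    v=${1#*:}
    printf '%s\n' "${v%-*}"
}

# versions_match: succeed if two package versions share the same upstream
# version (assumption: this is what the fabric manager requires).
versions_match() {
    [ "$(upstream_version "$1")" = "$(upstream_version "$2")" ]
}

# candidate_version: print the candidate version apt would install.
# LC_ALL=C forces the English "Candidate:" label regardless of locale.
candidate_version() {
    LC_ALL=C apt-cache policy "$1" | awk '$1 == "Candidate:" { print $2 }'
}

# Example, using the package names from this report:
#   if versions_match "$(candidate_version nvidia-driver-535-server)" \
#                     "$(candidate_version nvidia-fabricmanager-535)"; then
#       echo "driver and fabric manager candidates match"
#   fi
```

Running the check before an upgrade, rather than after the service has failed, would catch a mismatch like the 535.161.08 / 535.171.04 pair in the original report.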
