4.15.0-1037 does not see all PCI devices on GPU VMs

Bug #1816106 reported by David Coronel on 2019-02-15
This bug affects 1 person
Affects                Status         Importance   Assigned to     Milestone
linux-azure (Ubuntu)   Fix Released   Undecided    Unassigned
Xenial                 Fix Released   Undecided    Marcelo Cerri
Cosmic                 Fix Released   Undecided    Marcelo Cerri

Bug Description

Host changes have altered how the PCI GUID is presented to the guest, and the PCI ID patches in 4.15.0-1037 do not handle the new condition properly.
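One way to see how the guest ends up numbering its PCI buses is to group the device addresses under /sys/bus/pci/devices by their domain part. The following is a minimal sketch, not taken from the bug report: the helper name group_by_domain is hypothetical, and it assumes the usual DOMAIN:BUS:DEVICE.FUNCTION address format.

```python
from collections import defaultdict


def group_by_domain(addresses):
    """Group PCI addresses (DOMAIN:BUS:DEV.FN) by their domain part.

    Hypothetical helper for illustration; it only parses strings and
    does not touch the system.
    """
    domains = defaultdict(list)
    for addr in addresses:
        # Everything before the first ':' is the PCI domain (segment).
        domain, _, rest = addr.partition(":")
        domains[domain].append(rest)
    return dict(domains)
```

On a running guest it could be fed with os.listdir("/sys/bus/pci/devices"). Comparing the result between 4.15.0-1036 and 4.15.0-1037 should show whether passthrough devices that had separate domains are missing.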

Impact:
Instances with multiple GPUs see only one of them.
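To confirm the impact on a given instance, the GPUs visible to the kernel can be counted from sysfs. This is a rough sketch under stated assumptions: it assumes the GPUs are NVIDIA devices (PCI vendor ID 0x10de) exposed with a display-controller class code (base class 0x03), and the function name list_gpus is hypothetical.

```python
import os

NVIDIA_VENDOR_ID = "0x10de"  # assumption: the GPUs are NVIDIA devices


def _read(path):
    with open(path) as f:
        return f.read().strip()


def list_gpus(sysfs_root="/sys/bus/pci/devices", vendor=NVIDIA_VENDOR_ID):
    """Return the PCI addresses of display-class devices from `vendor`."""
    gpus = []
    for addr in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, addr)
        try:
            dev_vendor = _read(os.path.join(dev, "vendor"))
            dev_class = _read(os.path.join(dev, "class"))
        except OSError:
            continue  # entry without readable vendor/class files
        # PCI base class 0x03 covers display controllers (VGA, 3D, ...).
        if dev_vendor.lower() == vendor and dev_class.lower().startswith("0x03"):
            gpus.append(addr)
    return gpus
```

If len(list_gpus()) is lower than the number of GPUs the VM size advertises, the instance is likely hitting this bug.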

Workaround:
4.15.0-1036 does not have this behavior.
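A quick check for whether an instance is running the kernel named in this report could look like the sketch below. The set of affected releases contains only the version this report identifies, in the form uname -r prints it for linux-azure kernels; the helper name is_known_affected is hypothetical, and other affected versions (for example on 4.18) are not covered.

```python
import platform

# Only the release named in this bug report; not an exhaustive list.
AFFECTED_RELEASES = {"4.15.0-1037-azure"}


def is_known_affected(release=None):
    """Return True if `release` (default: the running kernel) is listed."""
    if release is None:
        release = platform.release()  # same string as `uname -r`
    return release in AFFECTED_RELEASES
```

When this returns True, booting 4.15.0-1036 is the workaround described above.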

Additional info:

The responsible commit in 4.15.0-1037 is "PCI: hv: Make sure the bus domain is really unique".

The immediate action requested is to back out this patch on 4.15.0 (azure):
https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/commit/?h=master-next&id=b9ae54076a78d01659c4d0f0a558cdb4056f0d13

The same patch on 4.18:
https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/commit/?h=azure-edge-next&id=29927dfb7f69bcf2ae7fd1cda10997e646a5189c

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure - 4.18.0-1011.11

---------------
linux-azure (4.18.0-1011.11) cosmic; urgency=medium

  * linux-azure: 4.18.0-1011.11 -proposed tracker (LP: #1816081)

  * 4.15.0-1037 does not see all PCI devices on GPU VMs (LP: #1816106)
    - Revert "PCI: hv: Make sure the bus domain is really unique"

linux-azure (4.18.0-1009.9) cosmic; urgency=medium

  * Allow I/O schedulers to be loaded with modprobe in linux-azure
    (LP: #1813211)
    - [Config] linux-azure: Enable all IO schedulers as modules

  * [Hyper-V] srcu: Lock srcu_data structure in srcu_gp_start() (LP: #1802021)
    - srcu: Lock srcu_data structure in srcu_gp_start()

  * CONFIG_SECURITY_SELINUX_DISABLE should be disabled on 4.15/4.18 Azure
    (LP: #1813866)
    - [Config]: disable CONFIG_SECURITY_SELINUX_DISABLE

  [ Ubuntu: 4.18.0-15.16 ]

  * Ubuntu boot failure. 4.18.0-14 boot stalls. (does not boot) (LP: #1814555)
    - Revert "drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5"
  * Userspace break as a result of missing patch backport (LP: #1813873)
    - tty: Don't hold ldisc lock in tty_reopen() if ldisc present

 -- Stefan Bader <email address hidden> Fri, 15 Feb 2019 17:16:24 +0100

Changed in linux-azure (Ubuntu):
status: New → Fix Released
Marcelo Cerri (mhcerri) on 2019-02-27
no longer affects: linux-azure (Ubuntu Bionic)
Changed in linux-azure (Ubuntu Xenial):
status: New → Fix Released
Changed in linux-azure (Ubuntu Cosmic):
status: New → Fix Released
Changed in linux-azure (Ubuntu Xenial):
assignee: nobody → Marcelo Cerri (mhcerri)
Changed in linux-azure (Ubuntu Cosmic):
assignee: nobody → Marcelo Cerri (mhcerri)
Marcelo Cerri (mhcerri) wrote :

4.18 also needs the fix for http://bugs.launchpad.net/bugs/1684971
