Add basic support to NVLink2 passthrough

Bug #1819989 reported by Jose Ricardo Ziviani on 2019-03-14
Affects                            Status         Importance   Assigned to
The Ubuntu-power-systems project   Fix Released   High         Canonical Kernel Team
linux (Ubuntu)                     Invalid        Undecided    Unassigned
linux (Ubuntu Bionic)              Fix Released   Undecided    Unassigned

Bug Description

This bug tracks basic support for NVLink2 passthrough on Ubuntu 18.04, for the guest side only. There is a relatively small patchset that I'm going to send to the Canonical Kernel Team using this BugLink.

On the host side we'll be running a custom version of Ubuntu 18.04 (kernel + qemu). On the guest side, however, it is *very important* that clients can simply download Ubuntu 18.04 from Canonical's website and have NVLink2 working out of the box.

To that end, we have prepared a small patchset that uses only upstream patches and does not touch code outside our area.

As soon as I send the patchset to the mailing list I'll update this bug with a link to that message.

Thank you very much,

Jose R. Ziviani

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1819989

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Jose Ricardo Ziviani (joserz) wrote :

The patchset is on the mailing list for review:
https://lists.ubuntu.com/archives/kernel-team/2019-March/099243.html

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Jose Ricardo Ziviani (joserz) wrote :

SRU:

[Impact]

 * An important feature was developed for PowerPC upstream and backported to a custom version of Ubuntu Bionic 18.04.1. The feature, known as NVLink2[1] passthrough, allows physical GPUs to be accessed from any QEMU/KVM virtual machine. The problem arises when clients want to use that feature within their virtual machines: they need a custom guest version rather than the standard Ubuntu Bionic for PowerPC, which everyone knows where to find and how to install.

We understand that this has a huge impact on the user experience: besides the extra difficulty of finding and installing the correct version, users who do not realize that a custom version is needed will think that the feature is simply broken.

Because the guest part (the code that runs in the virtual machine) is much simpler than the host part, we decided to send the patches as an SRU, fixing the user-experience problem without impacting existing use cases.

[1] https://wccftech.com/nvidia-volta-gv100-gpu-fast-pascal-gp100/

[Test Case]

 * To reproduce the issue, a Power9 system with NVLink2 and an NVIDIA GPU is required, with the customized Ubuntu Bionic (kernel + qemu) installed.

 * Then, create a virtual machine like:

Create a disk image:
$ qemu-img create -f qcow2 sda.qcow2 100G

Find the devices to be attached:
$ lspci | grep NVIDIA
...
0004:04:00.0 3D controller: NVIDIA Corporation GV100 [Tesla V100 SXM2] (rev a1)
...

Detach all devices to be passed to the virtual machine, including devices that belong to the same IOMMU group (script detach.sh attached):
$ sudo ./detach.sh 0004:04:00.0
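
The attached detach.sh is not reproduced in this report. As a rough sketch of what such a helper typically does (this is not the attached script), the functions below walk the IOMMU group of a device through sysfs, unbind each member from its current driver, and hand it to vfio-pci through driver_override. The function names and the SYSFS override variable are illustrative assumptions; the sketch also assumes that the vfio-pci module is already loaded and that it runs as root.

```shell
#!/bin/sh
# Hypothetical detach helper; NOT the detach.sh attached to this bug.
# SYSFS can be pointed at a fake tree for a dry run (assumption for testing).
SYSFS="${SYSFS:-/sys}"

# Print the PCI address of every device in the IOMMU group of device $1.
iommu_group_members() {
    for path in "$SYSFS/bus/pci/devices/$1/iommu_group/devices/"*; do
        if [ -e "$path" ]; then
            basename "$path"
        fi
    done
}

# Unbind device $1 from its current driver (if any) and offer it to vfio-pci.
bind_to_vfio() {
    dev="$SYSFS/bus/pci/devices/$1"
    if [ -e "$dev/driver" ]; then
        echo "$1" > "$dev/driver/unbind"
    fi
    # driver_override restricts the next probe of this device to vfio-pci.
    echo vfio-pci > "$dev/driver_override"
    echo "$1" > "$SYSFS/bus/pci/drivers_probe"
}

# Detach device $1 together with every other member of its IOMMU group.
detach_group() {
    for member in $(iommu_group_members "$1"); do
        bind_to_vfio "$member"
    done
}
```

After sourcing these functions as root, `detach_group 0004:04:00.0` would be the rough equivalent of the command above. Handling the whole group matters because VFIO only hands a group to a guest once none of its devices is still bound to a host driver.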

Run the virtual machine:
$ sudo qemu-system-ppc64 -nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000010,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic -vga none -enable-kvm \
-device nec-usb-xhci,id=nec-usb-xhci0 \
-m 16384M \
-chardev socket,id=SOCKET0,server,nowait,host=localhost,port=40000 \
-mon chardev=SOCKET0,mode=control \
-smp 16,threads=4 \
-netdev "user,id=USER0,hostfwd=tcp::2222-:22" \
-device "virtio-net-pci,id=vnet0,mac=C0:41:49:4b:00:00,netdev=USER0" \
-drive file=sda.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device "vfio-pci,id=vfio0004_04_00_0,host=0004:04:00.0" \
-device "vfio-pci,id=vfio0006_00_00_0,host=0006:00:00.0" \
-device "vfio-pci,id=vfio0006_00_00_1,host=0006:00:00.1" \
-device "vfio-pci,id=vfio0006_00_00_2,host=0006:00:00.2" \
-global spapr-pci-host-bridge.pgsz=0x10011000 \
-global spapr-pci-vfio-host-bridge.pgsz=0x10011000 \
-cdrom ubuntu-18.04.1-server-ppc64el.iso \
-machine pseries
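
The four vfio-pci lines in the command above follow a regular pattern: the id is the host address with ':' and '.' replaced by '_' and prefixed with "vfio". When the set of detached devices differs from machine to machine, a small helper can generate those arguments. This is only a sketch; the function names are made up for illustration and are not part of QEMU or of the attached script.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the id/host naming used in the command above.

# Turn one host PCI address into the value of a "-device" option.
vfio_device_arg() {
    id=$(printf '%s' "$1" | tr ':.' '__')
    printf 'vfio-pci,id=vfio%s,host=%s\n' "$id" "$1"
}

# Print "-device ..." pairs for every address given, on a single line.
vfio_args() {
    for addr in "$@"; do
        printf '%s %s ' -device "$(vfio_device_arg "$addr")"
    done
    printf '\n'
}
```

For example, `vfio_args 0004:04:00.0 0006:00:00.0` prints two `-device vfio-pci,...` pairs that can be pasted into the command line in place of the hand-written ones.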

Install the system in the virtual machine and reboot. After booting into the installed system, download and install the drivers from cuda-repo-ubuntu1804-10-1-local-10.1.91-418.29_1.0-1_ppc64el.deb (from the NVIDIA website).

With all NVIDIA drivers installed, check the result of the following commands:

$ nvidia-smi
Mon Nov 5 21:11:33 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72 Driver Version: 410.72 C...


This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Jose Ricardo Ziviani (joserz) wrote :

Hello, I tested the kernel with the changes and it works nicely!

Thank you

root@ubuntu:~# numactl -H
available: 3 nodes (0,251-252)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
node 0 size: 392745 MB
node 0 free: 390074 MB
node 251 cpus:
node 251 size: 32256 MB
node 251 free: 32253 MB
node 252 cpus:
node 252 size: 32256 MB
node 252 free: 32252 MB
node distances:
node 0 251 252
  0: 10 40 40
 251: 40 10 40
 252: 40 40 10
root@ubuntu:~# nvidia-smi
Fri Apr 12 13:53:42 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72       Driver Version: 410.72       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000001:00:02.0 Off |                    0 |
| N/A   34C    P0    41W / 300W |      3MiB / 32256MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000001:00:08.0 Off |                    0 |
| N/A   38C    P0    43W / 300W |      3MiB / 32256MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
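
In the output above, nodes 251 and 252 have no CPUs and are 32256 MB each, which matches the memory of the two V100s reported by nvidia-smi. Assuming that CPU-less NUMA nodes are how the GPU memory shows up in the guest (an inference from this output, not something stated by the patches), a quick scripted check could be sketched as follows. The function name is illustrative.

```shell
#!/bin/sh
# Hypothetical check: list NUMA nodes that have memory but no CPUs,
# reading `numactl -H` output on stdin.
cpuless_nodes() {
    # A line such as "node 251 cpus:" has exactly three fields when no CPU
    # numbers follow the colon.
    awk '/^node [0-9]+ cpus:/ { if (NF == 3) print $2 }'
}
```

Running `numactl -H | cpuless_nodes` in the guest shown here would list 251 and 252, one per line; an empty result would suggest the GPU memory was not exposed as NUMA nodes.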

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-48.51

---------------
linux (4.15.0-48.51) bionic; urgency=medium

  * linux: 4.15.0-48.51 -proposed tracker (LP: #1822820)

  * Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts
    - [Packaging] resync retpoline extraction

  * 3b080b2564287be91605bfd1d5ee985696e61d3c in ubuntu_btrfs_kernel_fixes
    triggers system hang on i386 (LP: #1812845)
    - btrfs: raid56: properly unmap parity page in finish_parity_scrub()

  * [P9][LTCTest][Opal][FW910] cpupower monitor shows multiple stop Idle_Stats
    (LP: #1719545)
    - cpupower : Fix header name to read idle state name

  * [amdgpu] screen corruption when using touchpad (LP: #1818617)
    - drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over (v3)
    - drm/amdgpu: Free VGA stolen memory as soon as possible.

  * [SRU][B/C/OEM]IOMMU: add kernel dma protection (LP: #1820153)
    - ACPICA: AML parser: attempt to continue loading table after error
    - ACPI / property: Allow multiple property compatible _DSD entries
    - PCI / ACPI: Identify untrusted PCI devices
    - iommu/vt-d: Force IOMMU on for platform opt in hint
    - iommu/vt-d: Do not enable ATS for untrusted devices
    - thunderbolt: Export IOMMU based DMA protection support to userspace
    - iommu/vt-d: Disable ATS support on untrusted devices

  * Add basic support to NVLink2 passthrough (LP: #1819989)
    - powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is
      enabled
    - powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET
    - powerpc/powernv: Export opal_check_token symbol
    - powerpc/powernv: Make possible for user to force a full ipl cec reboot
    - powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
    - powerpc/powernv: Move npu struct from pnv_phb to pci_controller
    - powerpc/powernv/npu: Move OPAL calls away from context manipulation
    - powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation
    - powerpc/pseries/npu: Enable platform support
    - powerpc/pseries: Remove IOMMU API support for non-LPAR systems
    - powerpc/powernv/npu: Check mmio_atsd array bounds when populating
    - powerpc/powernv/npu: Fault user page into the hypervisor's pagetable

  * Huawei Hi1822 NIC has poor performance (LP: #1820187)
    - net-next: hinic: fix a problem in free_tx_poll()
    - hinic: remove ndo_poll_controller
    - net-next/hinic: add checksum offload and TSO support
    - hinic: Fix l4_type parameter in hinic_task_set_tunnel_l4
    - net-next/hinic:replace multiply and division operators
    - net-next/hinic:add rx checksum offload for HiNIC
    - net-next/hinic:fix a bug in set mac address
    - net-next/hinic: fix a bug in rx data flow
    - net: hinic: fix null pointer dereference on pointer hwdev
    - hinic: optmize rx refill buffer mechanism
    - net-next/hinic:add shutdown callback
    - net-next/hinic: replace disable_irq_nosync/enable_irq

  * [CONFIG] please enable highdpi font FONT_TER16x32 (LP: #1819881)
    - Fonts: New Terminus large console font
    - [Config]: enable highdpi Terminus 16x32 font support

  * [19.04 FEAT] qeth: Enhanced link...
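
To confirm that a guest is running a kernel package at or above 4.15.0-48.51, the installed version can be compared against that number. The helper below is a minimal sketch with an illustrative name; it relies on `sort -V` from GNU coreutils, whose ordering is close to but not identical to dpkg's version ordering, so `dpkg --compare-versions` is the more exact choice on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when
    # that smallest entry is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use is `version_ge "$installed" 4.15.0-48.51 && echo "fix present"`, where `$installed` holds the version string of the installed linux-image package (for example as reported by `dpkg-query -W`).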

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released