Add support for NVIDIA GPU passthrough

Bug #1800649 reported by Jose Ricardo Ziviani
This bug affects 1 person
Affects          Status     Importance  Assigned to           Milestone
linux (Ubuntu)   Won't Fix  Undecided   Jose Ricardo Ziviani
  Bionic         Won't Fix  Undecided   Unassigned
qemu (Ubuntu)    Won't Fix  Wishlist    Unassigned
  Bionic         Won't Fix  Undecided   Unassigned

Bug Description

This bug will keep track of an important feature that is being developed upstream and needs to be backported to Ubuntu 18.04 PPC64. The feature allows QEMU/KVM guests to have NVIDIA GPUs passed through.

affects: launchpad → qemu-kvm
Changed in qemu-kvm:
assignee: nobody → Jose Ricardo Ziviani (w-jose)
Juerg Haefliger (juergh) wrote :
affects: qemu-kvm → linux
affects: linux → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1800649

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: patch
Jose Ricardo Ziviani (joserz) wrote :

The file passthrough-patches.tar.gz contains the following patches:

0001-kvm-no-need-to-check-return-value-of-debugfs_create-.patch
0002-powerpc-powernv-idoa-Remove-unnecessary-pcidev-from-.patch
0003-powerpc-Use-sizeof-foo-rather-than-sizeof-struct-foo.patch
0004-powerpc-powernv-npu-Do-not-try-invalidating-32bit-ta.patch
0005-powerpc-ioda-Use-ibm-supported-tce-sizes-for-IOMMU-p.patch
0006-powerpc-io-Add-__raw_writeq_be-__raw_rm_writeq_be.patch
0007-powerpc-powernv-Use-__raw_-rm_-writeq_be-in-pci-ioda.patch
0008-powerpc-powernv-ioda2-Remove-redundant-free-of-TCE-p.patch
0009-powerpc-powernv-ioda2-Reduce-upper-limit-for-DMA-win.patch
0010-Revert-cxl-Add-kernel-API-to-allow-a-context-to-oper.patch
0011-Revert-cxl-Add-support-for-interrupts-on-the-Mellano.patch
0012-Revert-cxl-Add-cxl_check_and_switch_mode-API-to-swit.patch
0013-Revert-cxl-Add-support-for-using-the-kernel-API-with.patch
0014-Revert-powerpc-powernv-Add-support-for-the-cxl-kerne.patch
0015-Revert-cxl-Add-cxl_slot_is_supported-API.patch
0016-cxl-Remove-abandonned-capi-support-for-the-Mellanox-.patch
0017-powerpc-powernv-ioda2-Add-256M-IOMMU-page-size-to-th.patch
0018-powerpc-powernv-Remove-useless-wrapper.patch
0019-powerpc-powernv-Move-TCE-manupulation-code-to-its-ow.patch
0020-KVM-PPC-Make-iommu_table-it_userspace-big-endian.patch
0021-powerpc-powernv-Add-indirect-levels-to-it_userspace.patch
0022-powerpc-powernv-Rework-TCE-level-allocation.patch
0023-powerpc-powernv-ioda-Allocate-indirect-TCE-levels-on.patch
0024-KVM-PPC-Validate-all-tces-before-updating-tables.patch
0025-KVM-PPC-Inform-the-userspace-about-TCE-update-failur.patch
0026-KVM-PPC-Validate-TCEs-against-preregistered-memory-p.patch
0027-KVM-PPC-Avoid-marking-DMA-mapped-pages-dirty-in-real.patch
0028-KVM-PPC-Propagate-errors-to-the-guest-when-failed-in.patch
0029-KVM-PPC-Remove-redundand-permission-bits-removal.patch
0030-vfio-pci-Quiet-broken-INTx-whining-when-INTx-is-unsu.patch
0031-KVM-PPC-Book3S-HV-Add-a-debugfs-file-to-dump-radix-m.patch
0032-cxl-Remove-unused-include.patch
0033-powerpc-powernv-ioda2-Reduce-upper-limit-for-DMA-win.patch
0034-powerpc-powernv-ioda-Allocate-indirect-TCE-levels-of.patch
0035-powerpc-pseries-iommu-Allow-dynamic-window-to-start-.patch
0036-KVM-PPC-Optimize-clearing-TCEs-for-sparse-tables.patch
0037-powerpc-powernv-npu-Add-a-debugfs-setting-to-change-.patch
0038-powerpc-powernv-npu-Remove-unused-headers-and-a-macr.patch
0039-KVM-PPC-Expose-userspace-mm-context-id-via-debugfs.patch
0040-powerpc-ioda-npu2-Call-hot-reset-skiboot-hook-when-d.patch
0041-vfio-spapr_tce-Get-rid-of-possible-infinite-loop.patch
0042-vfio-spapr_tce-Simplify-page-contained-test.patch
0043-powerpc-iommu_context-Change-referencing-in-API.patch
0044-powerpc-iommu-Do-not-pin-memory-of-a-memory-device.patch
0045-vfio_pci-Allow-mapping-extra-regions.patch
0046-vfio_pci-Allow-regions-to-add-own-capabilities.patch
0047-powerpc-powernv-npu-Simplify-nestMMU-flush-flag-copy.patch
0048-powerpc-npu-dma-Add-helper-to-access-struct-npu-for-.patch
0049-powerpc-powernv-npu-Collect-all-static-symbols-under.patch
0050-FIXME-powerpc-powernv-Detach-npu-struct-from-pnv_phb.patch
0051-powerpc-pseries-iommu-Force-default-DMA-window-remov...


Jose Ricardo Ziviani (joserz) wrote :

The patch set attached here contains what is necessary to build QEMU with support for nvlink2 passthrough:

0001-qdev-Use-string-for-QOM-string-properties.patch
0002-ppc-spapr-Receive-and-store-device-tree-blob-from-SL.patch
0003-DBG-store-fdt.patch
0004-vfio-spapr-Fix-indirect-levels-calculation.patch
0005-headers-update.patch
0006-pci-Move-NVIDIA-vendor-id-to-the-rest-of-ids.patch
0007-RFC-vfio-nvidia-v100-Disable-VBIOS-update.patch
0008-spapr-iommu-Always-advertise-the-maximum-possible-DM.patch
0009-FIXME-vfio-Do-not-replay-IOMMU-mappings.patch
0010-vfio-Make-vfio_get_region_info_cap-public.patch
0011-vfio-spapr-Try-allocating-less-levels-if-failed-with.patch
0012-spapr-Add-NVLink2-memory-to-PHB-placement.patch
0013-spapr-Create-ibm-gpu-and-ibm-npu-cross-links-in-the-.patch
0014-spapr-vfio-Map-GPU-RAM-and-advertise-to-the-guest.patch
0015-debug-Disables-KVM-TCE-acceleration-and-direct-ATSD-.patch

https://github.com/aik/qemu/tree/nv2

Juerg Haefliger (juergh)
summary: - Add support to NVIDIA GPU passthrough
+ Add support for NVIDIA GPU passthrough
Juerg Haefliger (juergh)
Changed in linux (Ubuntu Bionic):
status: New → Fix Released
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
status: Fix Released → Won't Fix
Changed in linux (Ubuntu Bionic):
status: Fix Released → Won't Fix
Juerg Haefliger (juergh) wrote :

The kernel patches were delivered via a custom kernel.

Christian Ehrhardt  (paelzer) wrote :

Hi,
we can certainly help by making a test PPA available for your testing, backporting what you provided here.

I wonder about the git tree at [1]. It is already further along than what was reported a few days before. It is based on qemu v3.1.0-rc0 at the moment and has 11 patches: some from the series listed here, some new ones. In any case it does not seem complete or fully upstreamed yet.
But eventually these changes should be upstream in qemu before being picked into the current Ubuntu development release (19.04), with SRUs planned from there.

Since I don't know about the interim state of the repo I'll pick the listed changes for now.

I was told that your target is Ubuntu 18.04, which means qemu 2.11 has to be the target, unless you are fine with e.g. Ubuntu Cloud Archive Stein [3], which would eventually (if the current plan holds true) be on at least qemu 3.1, a much closer base for the backports.
Or, if it will be a PPA-based solution anyway, something like qemu 3.1 plus changes in that PPA.

Here is the status of backporting the series provided here to qemu 2.11, which would be needed for a real SRU release into Ubuntu 18.04:

OK 0001-qdev-Use-string-for-QOM-string-properties.patch

NOISE in 0002-ppc-spapr-Receive-and-store-device-tree-blob-from-SL.patch
  spapr_machine_reset was still ppc_spapr_reset
  Also, this changes vmstate, which I think would create a migration issue, so we'd need an extra
  machine type.
  That is always a big no-no for real SRUs, which makes this even more of a PPA-only solution IMHO.

OK 0003-DBG-store-fdt.patch

Minor Noise 0004-vfio-spapr-Fix-indirect-levels-calculation.patch

Noise 0005-headers-update.patch
Noise 0006-pci-Move-NVIDIA-vendor-id-to-the-rest-of-ids.patch
Noise 0007-RFC-vfio-nvidia-v100-Disable-VBIOS-update.patch
Noise 0008-spapr-iommu-Always-advertise-the-maximum-possible-DM.patch
  The noise for all these is not too much from a code level; the context can be found.
  But I get the feeling this will show up as a compile failure because it is missing
  some other changes it depends on.

OK 0009-FIXME-vfio-Do-not-replay-IOMMU-mappings.patch

Minor Noise 0010-vfio-Make-vfio_get_region_info_cap-public.patch

OK - 0011-vfio-spapr-Try-allocating-less-levels-if-failed-with.patch

Noise - 0012-spapr-Add-NVLink2-memory-to-PHB-placement.patch

OK - 0013-spapr-Create-ibm-gpu-and-ibm-npu-cross-links-in-the-.patch

Noise - 0014-spapr-vfio-Map-GPU-RAM-and-advertise-to-the-guest.patch

Noise - 0015-debug-Disables-KVM-TCE-acceleration-and-direct-ATSD-.patch
  In general this is still WIP, because this does not make much sense:
    cap_spapr_vfio = kvm_vm_check_extension(s, KVM_CAP_SPAPR_TCE_VFIO);
  + cap_spapr_vfio = false;
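
The concern about patch 0015 can be illustrated with a small standalone C sketch. It is not QEMU code: `probe_result` stands in for whatever kvm_vm_check_extension() would return, and `debug_disable` is a hypothetical debug switch that does not exist in the posted patch. The first function mirrors the quoted hunk, where the probed value is overwritten right away; the second shows one possible way to keep the probe meaningful while still allowing the acceleration to be turned off for debugging.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the quoted hunk: the probed capability is assigned and then
 * unconditionally overwritten, so the probe has no effect on the result. */
static bool cap_forced_off(bool probe_result)
{
    bool cap_spapr_vfio = probe_result;
    cap_spapr_vfio = false;
    return cap_spapr_vfio;
}

/* Hypothetical alternative: honour the probe unless a debug switch
 * (an assumption, not part of the patch) asks for it to be disabled. */
static bool cap_with_debug_switch(bool probe_result, bool debug_disable)
{
    bool cap_spapr_vfio = probe_result;
    if (debug_disable) {
        cap_spapr_vfio = false;
    }
    return cap_spapr_vfio;
}

/* Small self-check covering both variants. */
static void check_variants(void)
{
    assert(cap_forced_off(true) == false);
    assert(cap_forced_off(false) == false);
    assert(cap_with_debug_switch(true, false) == true);
    assert(cap_with_debug_switch(true, true) == false);
    assert(cap_with_debug_switch(false, false) == false);
}
```

With the first variant the result is false no matter what the kernel reports, which is why the hunk reads as a temporary debugging aid rather than something that could be shipped.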

With the related kernel being a special kernel I can't SRU it generally anyway.
And all the problems outlined above emphasize that.
So I wonder what is the actual delivery mechanism and content expected here?
Would a qemu 3.1 in a PPA alongside the kernel do it? There the backport distance would certainly be much smaller.
That could eventually be replaced in the PPA with the more polished final qemu 3.1 we will create for Ubuntu 19.04.

I have pushed a packaged and backported version of this at [4] but I'm 98% sure...


Changed in qemu (Ubuntu):
status: New → Incomplete
Jose Ricardo Ziviani (joserz) wrote :

@Juerg Hello,

I have a new patchset. It has all the necessary fixes to stabilize nvlink2 passthrough, which required backporting more patches.

I'm attaching the tar.gz here. You can also find it at https://github.com/jrziviani/linux-devel/tree/nv2_scratch_181203_bionic

It's built on top of ubuntu-bionic/master-next (Ubuntu-4.15.0-43.46).

Patches 1 to 91 are upstream; the others can be found at https://github.com/aik/linux/tree/nv2.

Thank you very much!

Jose Ricardo Ziviani (joserz) wrote :

@Christian, hello!

I completed the backport to QEMU v2.11.2, but more patches were required in order to make it work.

The patchset is attached here, but it can also be found at https://github.com/jrziviani/qemu-devel/tree/nv2_scratch_181204_bionic.

Patches from 1 to 34 are already upstream; those from 34 to 45 are under review.

Thank you very much!

Christian Ehrhardt  (paelzer) wrote :

Hi Jose, I'm puzzled that you updated that series here. AFAIK, per discussions, this project will run its own custom qemu, so for now I'm not considering the backports for the main Archive (for everybody).

Changed in qemu (Ubuntu Bionic):
status: New → Won't Fix
Changed in qemu (Ubuntu):
importance: Undecided → Wishlist
Jose Ricardo Ziviani (joserz) wrote :

Hello Christian!

That's a good point. :-) As far as I know the project should be based on top of v2.11.2 (at least, this is what we've been testing).

I'll try to confirm it here.

Thank you

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

Add support for NVIDIA GPU passthrough
Is this just a duplicate of LP 1800649? (Now coming in via the ubuntu-power-system project.)
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1800649
or is it meant to be the libvirt part of LP 1800649,
or is this a request to move this feature over from the custom to the generic package(s)?

Please provide some more details - thx.

------- Comment From <email address hidden> 2018-12-10 10:15 EDT-------
This is a dup of 1800649; it was meant to create an IBM-side mirror of it instead of a new bug. I'm not sure why it's that way, but I asked if we can get it fixed.

tags: added: architecture-ppc64le bugnameltc-173957 severity-medium targetmilestone-inin1804
Joshua Powers (powersj)
Changed in qemu (Ubuntu):
status: Incomplete → Won't Fix
Jose Ricardo Ziviani (joserz) wrote :

Hi Christian,

You're right, we're going to use our customized QEMU.

Thank you!!

Jose

Jose Ricardo Ziviani (joserz) wrote :

Hello Juerg,

Thanks for the packages provided. I made an initial setup and it looks good. However, a bug was identified and a new patch should be applied on top of my previous patchset. Would you mind including it?

The patch is: https://github.com/jrziviani/linux-devel/commit/b466f257ada3b408be3ca475ce6253b80efc252c
(also attached).

Thank you very much,

Jose

Juerg Haefliger (juergh) wrote :

Jose, I'm confused. I never provided a kernel with the patches from comment #8, yet you did some testing and need a follow-on patch? Which kernel did you test?

Also, this ticket is closed as "won't fix", so you should not add new requests to it. And since the patches are for the ibm-gt kernel (which is a private IBM kernel), all communication should go through Salesforce, not Launchpad.

Jose Ricardo Ziviani (joserz) wrote :

Juerg, sorry, my bad. I tested my own build. I was sure I had 'apt-get' that IBM-cloud, which didn't happen.

Anyway, I was not aware of that Salesforce process. I saw this bug marked as "won't fix" without any further explanation, so how could I have known? How do I access Salesforce to make that request?

Thanks
