Make vfio-pci built-in or xhci_hcd optional

Bug #1770845 reported by Nazar Mokrynskyi
This bug affects 3 people
Affects             Status        Importance  Assigned to                    Milestone
linux (Ubuntu)      Fix Released  Medium      Thadeu Lima de Souza Cascardo
  Cosmic            Invalid       Medium      Unassigned
  Focal             Fix Released  Medium      Thadeu Lima de Souza Cascardo
linux-kvm (Ubuntu)  New           Undecided   Unassigned
  Cosmic            Invalid       Undecided   Unassigned
  Focal             Fix Released  Medium      Unassigned

Bug Description

[Impact]
This allows vfio-pci to be bound to certain devices during boot, preventing other drivers from binding them.

In particular, USB host drivers, like xhci_hcd, are hard to unbind, as USB devices may end up being used by applications.

[Test case]
Boot the system with vfio-pci.ids=8086:1e31 or another device ID.

[Regression potential]
Check that VFIO does not bind to any other device.
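
One way to run the test case and the regression check is to read the driver binding of each device straight from sysfs. The sketch below is only illustrative: the function name pci_drivers_for_id and its optional second argument (a sysfs root, present only so the function can be tried against a fake tree) are assumptions and not part of the fix. The vendor and device files and the driver symlink are the standard PCI sysfs attributes. IDs are expected in lowercase hex, as sysfs prints them.

```shell
#!/bin/sh
# Print "<PCI address> <driver>" for every PCI device matching a vendor:device ID.
# Usage: pci_drivers_for_id 8086:1e31 [sysfs_root]
# sysfs_root defaults to /sys (the override is a hypothetical testing aid).
pci_drivers_for_id() {
    id=$1
    root=${2:-/sys}
    want_vendor="0x${id%%:*}"   # part before the colon
    want_device="0x${id##*:}"   # part after the colon
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] || continue
        [ "$(cat "$dev/vendor")" = "$want_vendor" ] || continue
        [ "$(cat "$dev/device")" = "$want_device" ] || continue
        if [ -e "$dev/driver" ]; then
            # "driver" is a symlink; the last path component is the driver name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        echo "$(basename "$dev") $drv"
    done
}
```

After booting with vfio-pci.ids=8086:1e31, calling pci_drivers_for_id 8086:1e31 should report vfio-pci for that device, while the same call for IDs that were not listed should still report their usual drivers.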

--------------------------------------

Because xhci_hcd is built into the kernel:
nazar-pc@nazar-pc ~> cat '/boot/config-4.15.0-21-generic' | grep CONFIG_USB_XHCI_HCD
CONFIG_USB_XHCI_HCD=y

the following configuration doesn't work:
nazar-pc@nazar-pc ~> cat /etc/modprobe.d/gpu-passthrough.conf
options vfio-pci ids=10de:1b06,10de:10ef,1b21:2142
softdep nouveau pre: vfio-pci
softdep xhci_hcd pre: vfio-pci

The GPU is fine (first 2 IDs), but the USB controller is always claimed by xhci_hcd and I can't change that while xhci_hcd is built-in.
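
For context, softdep lines are only consulted by modprobe when a module is loaded. A driver built with =y never goes through modprobe, so a softdep on it has no effect, and it can claim its devices before a modular vfio-pci is loaded. A minimal sketch for checking how a driver is built, assuming a config file such as the one under /boot shown above (the helper name kconfig_state is hypothetical):

```shell
#!/bin/sh
# Print "built-in", "module", or "not set" for a kernel config option.
# Usage: kconfig_state CONFIG_VFIO_PCI /boot/config-4.15.0-21-generic
kconfig_state() {
    opt=$1
    file=$2
    if grep -q "^${opt}=y\$" "$file"; then
        echo "built-in"     # compiled into the kernel image
    elif grep -q "^${opt}=m\$" "$file"; then
        echo "module"       # loadable, so modprobe.d rules apply
    else
        echo "not set"      # absent or "# ... is not set"
    fi
}
```

With the configuration from this report, CONFIG_USB_XHCI_HCD would come back as built-in, which is why the softdep approach only works for nouveau.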

I'd like you to resolve this issue by either building the vfio-pci module into the kernel as well (preferred solution) or making xhci_hcd optional.

There are some discussions online without a clean solution, like this one: https://www.reddit.com/r/VFIO/comments/4o6wla/cant_get_vfiopci_to_bind_to_usb_30_card_at_boot/

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.15.0-21-generic 4.15.0-21.22
ProcVersionSignature: Ubuntu 4.15.0-21.22-generic 4.15.17
Uname: Linux 4.15.0-21-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.15.0-21-generic.
ApportVersion: 2.20.10-0ubuntu2
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 1: S51 [SB Omni Surround 5.1], device 0: USB Audio [USB Audio]
   Subdevices: 0/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/pcmC1D0c: nazar-pc 2226 F...m pulseaudio
 /dev/snd/pcmC1D0p: nazar-pc 2226 F...m pulseaudio
 /dev/snd/controlC1: nazar-pc 2226 F.... pulseaudio
                      nazar-pc 2267 F.... volumeicon
Card1.Amixer.info:
 Card hw:1 'S51'/'Creative Technology Ltd SB Omni Surround 5.1 at usb-0000:00:14.0-9.2, full spee'
   Mixer name : 'USB Mixer'
   Components : 'USB041e:322c'
   Controls : 12
   Simple ctrls : 5
CurrentDesktop: Custom
Date: Sat May 12 16:44:11 2018
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
MachineType: Micro-Star International Co., Ltd. MS-7B45
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/root/boot/vmlinuz-4.15.0-21-generic root=UUID=5170aca4-061a-4c6c-ab00-bd7fc8ae6030 ro rootflags=subvol=root nosplash intel_pstate=disable scsi_mod.use_blk_mq=1 intel_iommu=on
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-21-generic N/A
 linux-backports-modules-4.15.0-21-generic N/A
 linux-firmware 1.173
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 03/29/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: A.51
dmi.board.asset.tag: Default string
dmi.board.name: Z370 GAMING PRO CARBON (MS-7B45)
dmi.board.vendor: Micro-Star International Co., Ltd.
dmi.board.version: 2.0
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Micro-Star International Co., Ltd.
dmi.chassis.version: 2.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrA.51:bd03/29/2018:svnMicro-StarInternationalCo.,Ltd.:pnMS-7B45:pvr2.0:rvnMicro-StarInternationalCo.,Ltd.:rnZ370GAMINGPROCARBON(MS-7B45):rvr2.0:cvnMicro-StarInternationalCo.,Ltd.:ct3:cvr2.0:
dmi.product.family: Default string
dmi.product.name: MS-7B45
dmi.product.version: 2.0
dmi.sys.vendor: Micro-Star International Co., Ltd.

Nazar Mokrynskyi (nazar-pc) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Nazar Mokrynskyi (nazar-pc) wrote :

What is necessary for this change to happen?
To me it looks like just changing CONFIG_VFIO_PCI=m to CONFIG_VFIO_PCI=y would be enough, and I don't think this would cause any regressions.

Johannes Wüller (jwueller) wrote :

Including VFIO in the kernel would be great! This would resolve all kinds of pass-through crashes.

Currently, the only way to do this is to hope that the USB controller behaves properly during reclamation by the kernel (most of them do not), which varies wildly by vendor. In my case, it gets as bad as having the whole system hang on shutdown due to a crash in the xhci_hcd driver. The ideal solution would be the ability to pass the device through directly to a VM guest, so that no cooperation by the controller is required.

Thadeu Lima de Souza Cascardo (cascardo) wrote :

Doing bind/unbind should work just fine, as reported on that same reddit discussion. libvirt also takes care of that.

I would rather not change anything here, if those solutions work just fine.

Cascardo.
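
For reference, runtime rebinding of the kind mentioned above is commonly done through sysfs with driver_override. This is a minimal sketch, not necessarily what libvirt or the linked discussion does: it assumes the vfio-pci module is already loaded (for example with modprobe vfio-pci) and that it runs as root. The function name rebind_to_vfio and its optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# Rebind one PCI device to vfio-pci at runtime.
# Usage: rebind_to_vfio 0000:00:14.0 [sysfs_root]
# sysfs_root defaults to /sys (the override is a hypothetical testing aid).
rebind_to_vfio() {
    addr=$1
    root=${2:-/sys}
    dev="$root/bus/pci/devices/$addr"
    # From now on only vfio-pci may claim this device.
    echo vfio-pci > "$dev/driver_override" || return 1
    # Detach the current driver, if any. This is the step that depends on
    # the driver releasing the device cleanly.
    if [ -e "$dev/driver" ]; then
        echo "$addr" > "$dev/driver/unbind" || return 1
    fi
    # Ask the PCI core to probe the device again; driver_override picks vfio-pci.
    echo "$addr" > "$root/bus/pci/drivers_probe"
}
```

The unbind step is where a busy USB controller can cause trouble, which is the case a boot-time binding avoids.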

Changed in linux (Ubuntu Cosmic):
status: Triaged → Opinion
Nazar Mokrynskyi (nazar-pc) wrote :

It does work, but not in 100% of cases. For instance, it occasionally failed for me when the controller was occupied by USB devices that were actively used by certain applications.

The only proper and clean solution is to be able to force the device to use vfio-pci, which is impossible with the kernel Ubuntu ships right now, hence this request to change kernel configuration.

Nazar Mokrynskyi (nazar-pc) wrote :

So according to Arch Wiki (https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Isolating_the_GPU):
> Starting with Linux 4.18.16, vfio-pci is compiled-in as opposed to being a module

Does this mean the same will happen in Ubuntu's kernel?

Thadeu Lima de Souza Cascardo (cascardo) wrote :

I will send a patch making vfio-pci built-in.

Changed in linux (Ubuntu Focal):
status: Opinion → In Progress
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-18.22

---------------
linux (5.4.0-18.22) focal; urgency=medium

  * focal/linux: 5.4.0-18.22 -proposed tracker (LP: #1866488)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync getabis
    - [Packaging] update helper scripts

  * Add sysfs attribute to show remapped NVMe (LP: #1863621)
    - SAUCE: ata: ahci: Add sysfs attribute to show remapped NVMe device count

  * [20.04 FEAT] Compression improvements in Linux kernel (LP: #1830208)
    - lib/zlib: add s390 hardware support for kernel zlib_deflate
    - s390/boot: rename HEAP_SIZE due to name collision
    - lib/zlib: add s390 hardware support for kernel zlib_inflate
    - s390/boot: add dfltcc= kernel command line parameter
    - lib/zlib: add zlib_deflate_dfltcc_enabled() function
    - btrfs: use larger zlib buffer for s390 hardware compression
    - [Config] Introducing s390x specific kernel config option CONFIG_ZLIB_DFLTCC

  * [UBUNTU 20.04] s390x/pci: increase CONFIG_PCI_NR_FUNCTIONS to 512 in kernel
    config (LP: #1866056)
    - [Config] Increase CONFIG_PCI_NR_FUNCTIONS from 64 to 512 starting with focal
      on s390x

  * CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set (LP: #1865332)
    - [Config] CONFIG_IP_MROUTE_MULTIPLE_TABLES=y

  * Dell XPS 13 9300 Intel 1650S wifi [34f0:1651] fails to load firmware
    (LP: #1865962)
    - iwlwifi: remove IWL_DEVICE_22560/IWL_DEVICE_FAMILY_22560
    - iwlwifi: 22000: fix some indentation
    - iwlwifi: pcie: rx: use rxq queue_size instead of constant
    - iwlwifi: allocate more receive buffers for HE devices
    - iwlwifi: remove some outdated iwl22000 configurations
    - iwlwifi: assume the driver_data is a trans_cfg, but allow full cfg

  * [FOCAL][REGRESSION] Intel Gen 9 brightness cannot be controlled
    (LP: #1861521)
    - Revert "USUNTU: SAUCE: drm/i915: Force DPCD backlight mode on Dell Precision
      4K sku"
    - Revert "UBUNTU: SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd
      Gen 4K AMOLED panel"
    - SAUCE: drm/dp: Introduce EDID-based quirks
    - SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED
      panel
    - SAUCE: drm/i915: Force DPCD backlight mode for some Dell CML 2020 panels

  * [20.04 FEAT] Enable proper kprobes on ftrace support (LP: #1865858)
    - s390/ftrace: save traced function caller
    - s390: support KPROBES_ON_FTRACE

  * alsa/sof: load different firmware on different platforms (LP: #1857409)
    - ASoC: SOF: Intel: hda: use fallback for firmware name
    - ASoC: Intel: acpi-match: split CNL tables in three
    - ASoC: SOF: Intel: Fix CFL and CML FW nocodec binary names.

  * [UBUNTU 20.04] Enable CONFIG_NET_SWITCHDEV in kernel config for s390x
    starting with focal (LP: #1865452)
    - [Config] Enable CONFIG_NET_SWITCHDEV in kernel config for s390x starting
      with focal

  * Focal update: v5.4.24 upstream stable release (LP: #1866333)
    - io_uring: grab ->fs as part of async offload
    - EDAC: skx_common: downgrade message importance on missing PCI device
    - net: dsa: b53: Ensure the default VID is untagged
    - net: fib_rules: Correctly set table field when table number exceeds 8 bit...

Changed in linux (Ubuntu Focal):
status: In Progress → Fix Released
Tim Gardner (timg-tpi)
Changed in linux-kvm (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Medium
Juerg Haefliger (juergh)
Changed in linux-kvm (Ubuntu Cosmic):
status: New → Invalid
Changed in linux (Ubuntu Cosmic):
status: Opinion → Invalid
Tim Gardner (timg-tpi)
Changed in linux-kvm (Ubuntu Focal):
status: In Progress → Fix Committed
status: Fix Committed → In Progress
Changed in linux-kvm (Ubuntu Focal):
status: In Progress → Invalid
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Tim Gardner (timg-tpi) wrote :

Given that the main kernel has had this set for over a year, I'm going to call it good. Setting verification-done-focal for linux-kvm until proven otherwise. I don't see any interested parties in this bug that could perform verification anyway. It's been 3 years since the last non-Canonical comment.

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-kvm - 5.4.0-1040.41

---------------
linux-kvm (5.4.0-1040.41) focal; urgency=medium

  * focal/linux-kvm: 5.4.0-1040.41 -proposed tracker (LP: #1927609)

  * Make vfio-pci built-in or xhci_hcd optional (LP: #1770845)
    - [Config]: kvm: CONFIG_VFIO=n

  * no memory hot-plugging in cloud-images (LP: #1925008)
    - [Config] kvm: CONFIG_ACPI_HOTPLUG_MEMORY=y

  [ Ubuntu: 5.4.0-74.83 ]

  * focal/linux: 5.4.0-74.83 -proposed tracker (LP: #1927619)
  * Introduce the 465 driver series, fabric-manager, and libnvidia-nscq
    (LP: #1925522)
    - debian/dkms-versions -- add NVIDIA 465 and migrate 450 to 460
  * linux-image-5.0.0-35-generic breaks checkpointing of container
    (LP: #1857257)
    - SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files
  * Enable CIFS GCM256 (LP: #1921916)
    - smb3: add defines for new crypto algorithms
    - smb3.1.1: add new module load parm require_gcm_256
    - smb3.1.1: add new module load parm enable_gcm_256
    - smb3.1.1: print warning if server does not support requested encryption type
    - smb3.1.1: rename nonces used for GCM and CCM encryption
    - smb3.1.1: set gcm256 when requested
    - cifs: Adjust key sizes and key generation routines for AES256 encryption
  * locking/qrwlock: Fix ordering in queued_write_lock_slowpath() (LP: #1926184)
    - locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
  * [Ubuntu 21.04] net/mlx5: Fix HW spec violation configuring uplink
    (LP: #1925452)
    - net/mlx5: Fix HW spec violation configuring uplink
  * Focal update: v5.4.114 upstream stable release (LP: #1926493)
    - Revert "scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure"
    - Revert "scsi: qla2xxx: Fix stuck login session using prli_pend_timer"
    - scsi: qla2xxx: Dual FCP-NVMe target port support
    - scsi: qla2xxx: Fix device connect issues in P2P configuration
    - scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure
    - scsi: qla2xxx: Add a shadow variable to hold disc_state history of fcport
    - scsi: qla2xxx: Fix stuck login session using prli_pend_timer
    - scsi: qla2xxx: Fix fabric scan hang
    - net/sctp: fix race condition in sctp_destroy_sock
    - Input: nspire-keypad - enable interrupts only when opened
    - gpio: sysfs: Obey valid_mask
    - dmaengine: dw: Make it dependent to HAS_IOMEM
    - ARM: dts: Drop duplicate sha2md5_fck to fix clk_disable race
    - ARM: dts: Fix moving mmc devices with aliases for omap4 & 5
    - lockdep: Add a missing initialization hint to the "INFO: Trying to register
      non-static key" message
    - arc: kernel: Return -EFAULT if copy_to_user() fails
    - ASoC: max98373: Added 30ms turn on/off time delay
    - neighbour: Disregard DEAD dst in neigh_update
    - ARM: keystone: fix integer overflow warning
    - ARM: omap1: fix building with clang IAS
    - drm/msm: Fix a5xx/a6xx timestamps
    - ASoC: fsl_esai: Fix TDM slot setup for I2S mode
    - scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state
    - net: ieee802154: stop dump llsec keys for monitors
    - net: ieee802154: forbid monitor for add llsec key
    - net: ieee802154: forbid monitor for del llsec ke...

Changed in linux-kvm (Ubuntu Focal):
status: Invalid → Fix Released