PCI USB card passthrough does not work any more

Bug #1781891 reported by John Bester
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  Medium      Unassigned

Bug Description

System information:
Ubuntu 18.04 LTS (server edition) with kernel 4.15.0-24-generic x86_64
Upgraded from Ubuntu server 17.10

Software:
qemu-kvm:
  Installed: 1:2.11+dfsg-1ubuntu7.4
  Candidate: 1:2.11+dfsg-1ubuntu7.4
  Version table:
 *** 1:2.11+dfsg-1ubuntu7.4 500
        500 http://za.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     1:2.11+dfsg-1ubuntu7.3 500
        500 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages
     1:2.11+dfsg-1ubuntu7 500
        500 http://za.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

Hardware:
Motherboard: X370 Killer SLI
CPU: AMD Ryzen 7 1800X

PCI device:
27:00.0 USB controller: VIA Technologies, Inc. VL805 USB 3.0 Host Controller (rev 01)
IOMMU Group 15 27:00.0 USB controller [0c03]: VIA Technologies, Inc. VL805 USB 3.0 Host Controller [1106:3483] (rev 01)
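
A listing like the IOMMU group line above can be derived from sysfs. The following is only a minimal sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups; the function name and the SYSFS_ROOT variable are made up for illustration (SYSFS_ROOT just lets the function be tried against a copy of the tree), and it prints only the group number and PCI address, not the lspci description.

```shell
#!/bin/sh
# Sketch: print "IOMMU Group <n> <pci-address>" for every grouped device.
# SYSFS_ROOT is a hypothetical override, it is not used by any real tool.
list_iommu_groups() {
    root="${SYSFS_ROOT:-/sys}"
    for dev in "$root"/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no IOMMU groups
        group="${dev%/devices/*}"      # .../iommu_groups/15
        group="${group##*/}"           # 15
        echo "IOMMU Group $group ${dev##*/}"
    done
}
```

Calling `list_iommu_groups` without arguments walks the real /sys tree.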

USB controller of PCI card:
Bus 003 Device 002: ID 2109:3431 VIA Labs, Inc. Hub

Loaded device drivers:

Before upgrading to Ubuntu 18.04, this PCI device was added to pci-stub.ids which allowed the device to be passed to a Windows 10 VM. In turn, all USB devices connected to this card worked in the VM and drivers could successfully be installed.

Since the upgrade from Ubuntu 17.10 to Ubuntu 18.04, I have tried several approaches to keep this device from being bound to the xhci driver, all in vain. (After every change I ran update-initramfs -u as well as update-grub.)

pci-stub.ids does not stop xhci from grabbing the device, so passing the PCI card to the VM does not work.

Adding the device ID to /etc/modprobe.d/vfio.conf (options) does seem to load the vfio driver and connect it to the device, but xhci still binds to it as well, so passing the PCI device to the VM does not work.

Adding "0000:27:00.0,xhci" to /etc/unbindpci also did not work.
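
For each of these attempts it helps to check which driver actually owns the device after boot. A minimal sketch, assuming the usual sysfs layout in which a bound device has a `driver` symlink pointing at its driver; the function name and the SYSFS_ROOT variable are hypothetical and only there so the check can be tried against a copy of the tree.

```shell
#!/bin/sh
# Sketch: print the name of the driver currently bound to a PCI device,
# or "none" when no driver is bound.
current_driver() {
    addr="$1"                                  # full address, e.g. 0000:27:00.0
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"         # last path component of the link target
    else
        echo "none"
    fi
}
```

For the card in this report the call would be `current_driver 0000:27:00.0`; `lspci -k -s 27:00.0` shows the same information in its "Kernel driver in use" line.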

By adding the USB controller to the VM, USB devices connected to it do appear as USB devices in the VM, but some of the drivers do not load correctly in Windows 10. For example, I need to install a device driver for a ROCKEY4 USB dongle. The driver installs (which must be done with the device disconnected), but it never seems to bind correctly to the device, because the software that uses the dongle does not recognise it.

I have successfully bound a PCI graphics adapter to the VM, so in principle PCI passthrough works, but in the case of the USB PCI card there seems to be no way to pass the device to a VM.

Expected result:

PCI passthrough should be available for all types of PCI devices, and instructions should be available in the qemu or kvm documentation even though it involves different parts of the OS (such as making use of /etc/modprobe.d/vfio.conf, /etc/unbindpci etc).
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version k4.15.0-24-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer': 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer': 'amixer'
DistroRelease: Ubuntu 18.04
HibernationDevice: RESUME=UUID=087ca1e6-4fd0-4a4b-a323-8b8ce733b3c7
InstallationDate: Installed on 2018-03-14 (124 days ago)
InstallationMedia: Ubuntu-Server 16.04.3 LTS "Xenial Xerus" - Release amd64 (20170801)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
Package: linux (not installed)
ProcFB: 0 qxldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-24-generic root=UUID=0286b7bc-6ce2-494c-89aa-6c4402876bad ro
ProcVersionSignature: Ubuntu 4.15.0-24.26-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-24-generic N/A
 linux-backports-modules-4.15.0-24-generic N/A
 linux-firmware 1.173
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic
Uname: Linux 4.15.0-24-generic x86_64
UpgradeStatus: Upgraded to bionic on 2018-05-11 (66 days ago)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 02/06/2015
dmi.bios.vendor: EFI Development Kit II / OVMF
dmi.bios.version: 0.0.0
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-artful
dmi.modalias: dmi:bvnEFIDevelopmentKitII/OVMF:bvr0.0.0:bd02/06/2015:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-artful:cvnQEMU:ct1:cvrpc-i440fx-artful:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-artful
dmi.sys.vendor: QEMU

John Bester (john-bester) wrote :
Christian Ehrhardt  (paelzer) wrote :

Since at the time the problem occurs this is just the host kernel/modules not behaving as they should, I'll move this over to the kernel.

You might want to share what kernel/modules you have, but until then I'll just assume the default of 18.04.

affects: qemu-kvm (Ubuntu) → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1781891

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Christian Ehrhardt  (paelzer) wrote :

Trying to recreate this, I have an xHCI host controller (not an extra card, but part of the chipset).
  00:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)

00:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05) (prog-if 30 [XHCI])
        Subsystem: Hewlett-Packard Company C610/X99 series chipset USB xHCI Host Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin D routed to IRQ 19
        NUMA node: 0
        Region 0: Memory at 39ffff00000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: xhci_hcd

First check if it can be unbound:
echo 0000:00:14.0 | sudo tee /sys/bus/pci/devices/0000:00:14.0/driver/unbind

Works and I see:
[2071597.213764] xhci_hcd 0000:00:14.0: remove, state 4
[2071597.213778] usb usb5: USB disconnect, device number 1
[2071597.215019] xhci_hcd 0000:00:14.0: USB bus 5 deregistered
[2071597.215036] xhci_hcd 0000:00:14.0: remove, state 4
[2071597.215046] usb usb4: USB disconnect, device number 1
[2071597.215049] usb 4-3: USB disconnect, device number 2
[2071597.218160] xhci_hcd 0000:00:14.0: USB bus 4 deregistered

FYI: Libvirt should do the unbind/bind for you at runtime if you configured the device as a managed hostdev.
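
As an illustration of the managed hostdev mentioned above, the libvirt device element can be generated from a PCI address. This is only a sketch: the function name is made up, and it assumes the address is given in full domain:bus:slot.function form.

```shell
#!/bin/sh
# Sketch: emit a libvirt <hostdev> element with managed='yes' for a PCI address.
hostdev_xml() {
    addr="$1"                    # e.g. 0000:27:00.0
    domain="${addr%%:*}"         # 0000
    rest="${addr#*:}"            # 27:00.0
    bus="${rest%%:*}"            # 27
    slotfn="${rest#*:}"          # 00.0
    slot="${slotfn%%.*}"         # 00
    fn="${slotfn#*.}"            # 0
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$fn'/>
  </source>
</hostdev>
EOF
}
```

The output could be saved to a file and added with `virsh attach-device <guest> <file> --config`, where the guest name and file name are whatever your setup uses.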

Check ID
$ lspci -n -s 00:14.0
00:14.0 0c03: 8086:8d31 (rev 05)

Tell vfio-pci to handle that
$ echo 8086 8d31 | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
Binding it fails for me with:
  vfio-pci: probe of 0000:00:14.0 failed with error

But then on this system I always failed to get vfio working due to FW issues (not a Linux issue).

You might try the above for your device, to initially rule out all of the modprobe/boot timing that might affect it.
After boot just try to:
1. unbind your device from xhci
2. make the ID known to vfio-pci
   (that should autoload it then)
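
The two steps can be sketched as one small function. This is an illustration under assumptions, not a tested procedure for your machine: it assumes the vfio-pci module is already loaded (so that its directory under /sys/bus/pci/drivers exists) and that it is run as root. The function name and the SYSFS_ROOT variable are hypothetical; SYSFS_ROOT only allows a dry run against a copy of the tree.

```shell
#!/bin/sh
# Sketch of steps 1 and 2 for a single device; run as root on the real /sys.
rebind_to_vfio() {
    addr="$1"                                  # e.g. 0000:27:00.0
    root="${SYSFS_ROOT:-/sys}"
    dev="$root/bus/pci/devices/$addr"
    # Read the numeric IDs from sysfs and drop the leading "0x".
    vendor="$(cat "$dev/vendor")"; vendor="${vendor#0x}"
    device="$(cat "$dev/device")"; device="${device#0x}"
    # Step 1: unbind from whatever driver currently owns the device.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi
    # Step 2: make the ID known to vfio-pci.
    echo "$vendor $device" > "$root/bus/pci/drivers/vfio-pci/new_id"
}
```

For the card in this report that would be `rebind_to_vfio 0000:27:00.0`, followed by a look at dmesg.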

Report back the kernel you have and whether this succeeds or fails, along with a dmesg log of the attempt.
That should clarify whether vfio-pci no longer works at all (the above test fails) or whether it is just a matter of preparing your config correctly so that it works again.

John Bester (john-bester) wrote : apport information

Attachments: AlsaDevices.txt, CRDA.txt, Card0.Codecs.codec.0.txt, CurrentDmesg.txt, Lspci.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt

tags: added: apport-collected
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc5

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key
John Bester (john-bester) wrote :

I followed the instructions to unbind the device from xhci and bind it to vfio, and afterwards I was able to boot the VM with the PCI card set up as pass-through. However, the Windows VM only listed it as an unknown PCI device and could not find the device drivers (VIA VL805). I could not find any driver on the internet either, so I tried to roll back the changes so that the one USB device that did work could still work. This is where I am stuck now, since after a reboot the device is not bound to xhci as before. I was under the impression that unbinding the device using the given instructions would only change it until I reboot, since I have not changed anything in /etc. I tried binding it to xhci again, but could not. I would really appreciate some help getting it back to how it was, since I can only do these tests after hours.

Christian Ehrhardt  (paelzer) wrote :

Hi John,
so you got the forwarding working, but then couldn't use it in the guest - sad to hear after all the effort you put in :-/
And to confirm: now you are trying to revert to the former normal setup and just forward e.g. the USB devices themselves.

I agree - all the settings made in /sys for unbind/bind/vfio-pci are lost on a reboot.
It is odd that after a reboot the host kernel no longer binds the device "normally", but the runtime changes I suggested cannot be the reason, as they do not survive a reboot.
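
For completeness, the runtime changes can also be undone by hand without rebooting. A minimal sketch, assuming the device is currently bound to vfio-pci and that the host driver directory is named xhci_hcd, as in the lspci output earlier in this thread; the function name and the SYSFS_ROOT variable are hypothetical, and on the real /sys this has to run as root.

```shell
#!/bin/sh
# Sketch: hand a device back from vfio-pci to the host xHCI driver at runtime.
restore_to_xhci() {
    addr="$1"; vendor="$2"; device="$3"        # e.g. 0000:27:00.0 1106 3483
    drivers="${SYSFS_ROOT:-/sys}/bus/pci/drivers"
    # Release the device and drop the dynamic ID so vfio-pci does not reclaim it.
    echo "$addr" > "$drivers/vfio-pci/unbind"
    echo "$vendor $device" > "$drivers/vfio-pci/remove_id"
    # Ask the host driver to take the device again.
    echo "$addr" > "$drivers/xhci_hcd/bind"
}
```

If the device is not bound to vfio-pci, the first write is expected to fail, which is itself a hint that something else is holding or ignoring the device.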

Maybe one of your former /etc based modprobe/unbindpci changes now finally kicks in and works?
Try to revert all those.

You can remove all the files you created in that regard, and in addition let dpkg verify the installed packages to spot any other files that are still modified:
$ sudo dpkg --verify

John Bester (john-bester) wrote :

First let me thank you for your quick and informative and very helpful responses.

Installing the mainline kernel is not really an option for the following reason: The server is at a client and was built about a month before Ubuntu 18.04 was released, so I opted to install Ubuntu Server 17.10 to get the latest updates in KVM and then upgrade to 18.04 as soon as it became available. This upgrade led to the PCI graphics adapter no longer being forwarded (needed for 3D graphics used on the VM), as well as the USB PCI card. Fortunately Christian got my heart rate back to normal after pointing out how I could fix the graphics pass-through (which was at the time crucial to get the office firing on all cylinders). In the meantime I used other solutions to get round a number of USB issues (including falling back to an old VirtualBox VM for one application and using Dosbox for interfacing with a USB to Serial device). So at this point I am reluctant to try a kernel that is not released via the Ubuntu repositories, as I think you can understand. How long do you think it will take for the mainline kernel release you mentioned to make its way into the Ubuntu repositories?

There is one side effect of the USB pass-through that I don't think I have mentioned until now, but I am not very clear on its relevance or even whether it is a real issue. Every time I have worked with USB devices, I have relied on the USB device ID (e.g. 1d6b:0002). For some peculiar reason, in the Windows 10 Pro VM this ID is nowhere to be found when going through the device properties in the device manager. I only noticed this while trying to identify the correct USB device for updating a driver. It might be that Windows just never shows the ID, or it could be because of changes to the way USB devices connected to a USB controller card are now shared with a VM when only the card is passed to the VM using pass-through. The most likely explanation is that I try to steer clear of Windows wherever possible and therefore do not know much about what to expect when I have to look deeper. In any event, you would know best what to make of this.

Christian Ehrhardt  (paelzer) wrote :

"I try to steer clear of Windows wherever possible and therefore does not know too much on what to expect when I am faced with a situation where I have to look deeper"

Applies to me as well in this case, but LMGTFY would solve it, first hit https://superuser.com/questions/1106247/how-can-i-get-the-vendor-id-and-product-id-for-a-usb-device/1106248

So Windows would show it.

John Bester (john-bester) wrote :

Thanks. Some very good news: In a previous message I asked how I could get back to having the USB card controlled by the Linux USB driver. I still cannot tell whether a recent kernel upgrade brought about the change, but Linux is still handing over the device to vfio (without me changing anything in /etc) and I figured out the driver problem. I had tried to get a VIA driver because I looked at the output of lspci, whereas I should have looked at the card manufacturer. I got the correct driver and all is good now - all attached USB devices work as they did before the upgrade to Ubuntu 18.04. I did a complete shutdown of the server and even removed the power supply to ensure that all hardware does a complete reset, and after booting up again everything still worked as expected.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released