qemu crashes with VGA pass-through, e-GPU, nvidia 1060

Bug #1897481 reported by Sergiy K
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
QEMU | Expired | Undecided | Unassigned |

Bug Description

I am trying to pass through an NVIDIA 1060 6GB card, which is connected via ExpressCard (EXP-GDC converter).

I can successfully run my virtual machine without pass-through, but when I add the devices, QEMU crashes.

The coredump contains:

Stack trace of thread 3289311:
#0 0x0000000000614c49 memory_region_update_container_subregions (qemu-system-x86_64 + 0x214c49)
#1 0x00000000005c0e8c vfio_probe_nvidia_bar0_quirk (qemu-system-x86_64 + 0x1c0e8c)
#2 0x00000000005bcec0 vfio_realize (qemu-system-x86_64 + 0x1bcec0)
#3 0x000000000079b423 pci_qdev_realize (qemu-system-x86_64 + 0x39b423)
#4 0x00000000006facda device_set_realized (qemu-system-x86_64 + 0x2facda)
#5 0x0000000000887e57 property_set_bool (qemu-system-x86_64 + 0x487e57)
#6 0x000000000088ac48 object_property_set (qemu-system-x86_64 + 0x48ac48)
#7 0x000000000088d1d2 object_property_set_qobject (qemu-system-x86_64 + 0x48d1d2)
#8 0x000000000088b1f7 object_property_set_bool (qemu-system-x86_64 + 0x48b1f7)
#9 0x0000000000693785 qdev_device_add (qemu-system-x86_64 + 0x293785)
#10 0x000000000061aad0 device_init_func (qemu-system-x86_64 + 0x21aad0)
#11 0x000000000098c87b qemu_opts_foreach (qemu-system-x86_64 + 0x58c87b)
#12 0x00000000006211cb qemu_init (qemu-system-x86_64 + 0x2211cb)
#13 0x00000000005002aa main (qemu-system-x86_64 + 0x1002aa)
#14 0x00007fce8af21152 __libc_start_main (libc.so.6 + 0x28152)
#15 0x000000000050087e _start (qemu-system-x86_64 + 0x10087e)

The full command line is fairly long, since I use libvirt to manage my machines:

LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-2-Win10 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-Win10/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-Win10/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-Win10/.config \
QEMU_AUDIO_DRV=spice \
/usr/bin/qemu-system-x86_64 \
-name guest=Win10,debug-threads=on \
-S \
-blockdev '{"driver":"file","filename":"/usr/share/edk2-ovmf/x64/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/Win10_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-cpu host,migratable=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff \
-m 8192 \
-overcommit mem-lock=off \
-smp 2,sockets=2,cores=1,threads=1 \
-uuid 7043c77b-4903-4527-8089-9679d9a17fee \
-no-user-config \
-nodefaults \
-chardev stdio,mux=on,id=charmonitor \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
-device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 \
-blockdev '{"driver":"file","filename":"/home/sergiy/VirtualBox VMs/win4games.img","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"raw","file":"libvirt-2-storage"}' \
-device ide-hd,bus=ide.0,drive=libvirt-2-format,id=sata0-0-0,bootindex=1 \
-blockdev '{"driver":"file","filename":"/home/sergiy/Downloads/Win10_2004_Ukrainian_x64.iso","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"}' \
-device ide-cd,bus=ide.1,drive=libvirt-1-format,id=sata0-0-1 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
-spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 \
-chardev spicevmc,id=charredir0,name=usbredir \
-device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 \
-chardev spicevmc,id=charredir1,name=usbredir \
-device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 \
-device vfio-pci,host=0000:04:00.0,id=hostdev0,bus=pci.4,multifunction=on,addr=0x0 \
-device vfio-pci,host=0000:04:00.1,id=hostdev1,bus=pci.4,addr=0x0.0x1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on

I've forced the vfio_pci module for the VGA device and confirmed that lspci shows

  Kernel driver in use: vfio_pci

My laptop is a ThinkPad X230 with an Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz.
I run the 5.8.6-1-MANJARO kernel and QEMU emulator version 5.1.0.

Thank you for your attention. I'd love to provide more information, but I don't know what else matters.

Alex Williamson (alex-l-williamson) wrote :

Please attach output from `dmesg` and `sudo lspci -vvv`, both from the host. Laptops typically don't provide sufficient resources for GPUs attached like this, so my guess is that we're trying to add a quirk on top of a BAR that isn't mapped. If that's the case, the following host kernel options might help: pci=realloc,assign-busses,nocrs
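
One rough way to check this guess on the host before starting QEMU is to look at how the kernel sees each BAR in sysfs. The sketch below assumes the usual Linux layout, where `/sys/bus/pci/devices/<BDF>/resource` holds one `start end flags` line per resource in hexadecimal. The struct and function names are illustrative and not part of QEMU or the kernel.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One line of the sysfs "resource" file: "start end flags", all in hex. */
struct pci_res {
    uint64_t start;
    uint64_t end;
    uint64_t flags;
};

/* Parse a single resource line; returns 0 on success, -1 on malformed input. */
int parse_resource_line(const char *line, struct pci_res *res)
{
    assert(line != NULL && res != NULL);
    if (sscanf(line, "%" SCNx64 " %" SCNx64 " %" SCNx64,
               &res->start, &res->end, &res->flags) != 3) {
        return -1;
    }
    return 0;
}

/* Length as the host sees it; an all-zero range is treated as empty. */
uint64_t resource_len(const struct pci_res *res)
{
    if (res->start == 0 && res->end == 0) {
        return 0;
    }
    return res->end - res->start + 1;
}

/* Print the first six entries (BAR0..BAR5) of an already opened file. */
void print_bars(FILE *fp)
{
    char line[128];
    struct pci_res res;

    for (int bar = 0; bar < 6 && fgets(line, sizeof(line), fp); bar++) {
        if (parse_resource_line(line, &res) == 0) {
            uint64_t len = resource_len(&res);
            printf("BAR%d: len=0x%" PRIx64 "%s\n", bar, len,
                   len ? "" : " (empty)");
        }
    }
}
```

A caller would open the `resource` file of the passed-through function (0000:04:00.0 in this report) with `fopen()` and hand it to `print_bars()`. A 16M BAR at f0000000 would show up as a line starting `0x00000000f0000000 0x00000000f0ffffff` and yield a length of 0x1000000, while an all-zero line yields 0, which is the unmapped case suspected here.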

Sergiy K (sergey-kukunin) wrote :

dmesg:

[ 0.000000] microcode: microcode updated early to revision 0x21, date = 2019-02-13
[ 0.000000] Linux version 5.8.6-1-MANJARO (builder@db927223e331) (gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP PREEMPT Thu Sep 3 14:19:36 UTC 2020
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.8-x86_64 root=UUID=f04fa3cc-b1c5-433a-896b-7194abdefa13 rw resume=UUID=f04fa3cc-b1c5-433a-896b-7194abdefa13 resume_offset=7829504 intel_iommu=on quiet resume=UUID=f04fa3cc-b1c5-433a-896b-7194abdefa13 resume_offset=7829504 intel_iommu=on
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Hygon HygonGenuine
[ 0.000000] Centaur CentaurHauls
[ 0.000000] zhaoxin Shanghai
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000090000-0x00000000000bffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000020000000-0x00000000201fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000020200000-0x0000000040003fff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000040004000-0x0000000040004fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000040005000-0x00000000cfef6fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000cfef7000-0x00000000d00f8fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000d00f9000-0x00000000d684efff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000d684f000-0x00000000d6a4efff] type 20
[ 0.000000] BIOS-e820: [mem 0x00000000d6a4f000-0x00000000dae9efff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000dae9f000-0x00000000daf9efff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000daf9f000-0x00000000daffefff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000dafff000-0x00000000daffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000db000000-0x00000000df9fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000f80f8000-0x00000000f80f8fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000041e5fffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000041e600000-0x000000041effffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] efi: EFI v2.31 by Lenovo
[ 0.000000] efi: ACPI 2.0=0xdaffe014 ACPI=0xdaffe000 SMBIOS=0xdae9e000
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: LENOVO 2325KZ5/2325KZ5, BIOS G2ETB5WW (2.75 ) 04/09/2019
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 2594.172 MHz processor
[ 0.000921] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000923] e820: remove [mem ...

Sergiy K (sergey-kukunin) wrote :

Thank you Alex for answering me.

It seems I got it working by booting the host with the GPU already connected.
Previously I tried hotplugging the GPU, and that is when QEMU crashed.

So previously I had:
  1. enable the host
  2. enable GPU
  3. connect the cable

And this time I tried:
  1. enable GPU
  2. connect the cable
  3. enable the host

And this works great. I was even able to install the NVIDIA drivers in the Win10 guest, and it runs well.

Now I'm not sure whether there is a bug. On the one hand, excluding hotplug might be an expected requirement. On the other hand, every crash is a bug, so there could be an extra check for that. It's up to you guys.

I'm thankful for your hard work and for the rocket science technologies I can use with my laptop.

I'm attaching dmesg from a freshly booted host with the GPU connected from the very beginning.

P.S. I'm sorry for the big files. I've just noticed the ability to upload attachments.

Sergiy K (sergey-kukunin) wrote :

More interestingly, QEMU doesn't crash if I hotplug the GPU after the host was booted with it connected. So if I do

  1. enable GPU
  2. connect the cable
  3. enable the host
  4. run qemu (I'm not sure if it's mandatory)
  5. disconnect the cable
  6. disable GPU
  7. enable GPU
  8. reconnect the cable
  9. run qemu again

QEMU doesn't crash, but the Windows guest doesn't boot either; it just hangs with a single core at 100% load.

I'm not sure whether it's related, but I'm trying to provide as much information as possible.

Alex Williamson (alex-l-williamson) wrote :

There are definitely resource allocation issues on the host in the crashing case. The quirks currently enumerate the device BARs without testing them; we identify a device and know what the resources should be, which is why I think QEMU crashes. Are you able to test whether the patch below is sufficient to resolve the crash? I'd expect the GPU not to work in the guest, as it doesn't have enough resources, but the goal would be to resolve the crash; QEMU cannot fix the device mappings on the host.

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0d83eb0e47bb..10477af9fc14 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2921,7 +2921,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     }

     for (i = 0; i < PCI_ROM_SLOT; i++) {
-        vfio_bar_quirk_setup(vdev, i);
+        if (vdev->bars[i].size) {
+            vfio_bar_quirk_setup(vdev, i);
+        }
     }

     if (!vdev->igd_opregion &&

Alex Williamson (alex-l-williamson) wrote :

non-mangled patch

Sergiy K (sergey-kukunin) wrote :

I can confirm that QEMU does not crash after applying that patch. I've added an `fprintf` statement there:

        if (vdev->bars[i].size) {
            vfio_bar_quirk_setup(vdev, i);
        } else {
            fprintf(stderr, "%04x:%04x bars for %d are empty\n", vdev->vendor_id, vdev->device_id, i);
        }

and the output is:

    10de:1c03 bars for 0 are empty
    10de:1c03 bars for 1 are empty
    10de:1c03 bars for 2 are empty
    10de:1c03 bars for 3 are empty
    10de:1c03 bars for 4 are empty
    10de:10f1 bars for 1 are empty
    10de:10f1 bars for 2 are empty
    10de:10f1 bars for 3 are empty
    10de:10f1 bars for 4 are empty
    10de:10f1 bars for 5 are empty

Interestingly, BAR 5 is available for the VGA function and BAR 0 is available for the audio function. I don't know whether that is useful information.
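
The effect of the guard can be modelled outside QEMU. The sketch below is a toy stand-in, not QEMU code: `struct toy_bar`, `toy_quirk_setup()` and `toy_setup_quirks()` are hypothetical names, and the assertion only models the assumption that a quirk needs a BAR that actually exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_BARS 6 /* BAR0..BAR5, i.e. everything below the ROM slot */

/* Hypothetical stand-in for the per-BAR state of a passed-through device. */
struct toy_bar {
    uint64_t size; /* 0 when the host never assigned the BAR */
    int quirked;   /* set once a quirk has been layered on top */
};

/* Stand-in for quirk setup: it assumes the BAR is really there. */
void toy_quirk_setup(struct toy_bar *bar)
{
    assert(bar->size != 0);
    bar->quirked = 1;
}

/* Same shape as the patched loop: skip BARs whose size is zero.
 * Returns the number of BARs that were skipped. */
int toy_setup_quirks(struct toy_bar bars[TOY_NUM_BARS])
{
    int skipped = 0;

    for (size_t i = 0; i < TOY_NUM_BARS; i++) {
        if (bars[i].size) {
            toy_quirk_setup(&bars[i]);
        } else {
            skipped++;
        }
    }
    return skipped;
}
```

Filling the table with only BAR 5 non-zero (the VGA function) or only BAR 0 non-zero (the audio function) makes `toy_setup_quirks()` skip five BARs in each case, matching the ten "are empty" lines above. Without the size check, the first empty BAR would trip the assertion, which is the toy equivalent of the crash.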

I understand that this is not QEMU's fault at all, since the underlying layer provides wrong information. Do you have any insight into where the problem might be? Is it purely a hardware issue (the laptop's BIOS, NVIDIA), or can something be done in software? Where should I send a bug report next?

Thank you

Sergiy K (sergey-kukunin) wrote :

I recorded both lspci -vvvv and lspci -xxxx for the following connections:

  - hotplug: the GPU is connected after the host has booted
  - fresh: the GPU is connected before the host is started

The main difference is the following:

1c1
< # hotplug
---
> # fresh
6c6
< Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
---
> Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
8c8
< Interrupt: pin A routed to IRQ 18
---
> Interrupt: pin A routed to IRQ 255
10,13c10,14
< Region 1: Memory at <unassigned> (64-bit, prefetchable) [disabled]
< Region 3: Memory at <unassigned> (64-bit, prefetchable) [disabled]
< Region 5: I/O ports at 4000 [size=128]
< Expansion ROM at f1400000 [virtual] [disabled] [size=512K]
---
> Region 0: Memory at f0000000 (32-bit, non-prefetchable) [size=16M]
> Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
> Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
> Region 5: I/O ports at 4000 [disabled] [size=128]
> Expansion ROM at f1080000 [disabled] [size=512K]
30c31
< LnkSta: Speed 5GT/s (downgraded), Width x1 (downgraded)
---
> LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)
35a37
> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
79c81
< Interrupt: pin B routed to IRQ 19
---
> Interrupt: pin B routed to IRQ 255
81c83
< Region 0: Memory at f1480000 (32-bit, non-prefetchable) [size=16K]
---
> Region 0: Memory at f1000000 (32-bit, non-prefetchable) [size=16K]
98c100
< LnkSta: Speed 5GT/s (downgraded), Width x1 (downgraded)
---
> LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)

I can tell that the hotplug case links at 5GT/s and the fresh case at 2.5GT/s, and there is an obvious difference between the Regions.

Here is the difference between the lspci -xxxx outputs, though I don't know how to interpret the result:

124,125c126,127
< 00: de 10 03 1c 01 00 10 00 a1 00 00 03 00 00 80 00
< 10: 00 00 00 00 0c 00 00 00 00 00 00 00 0c 00 00 00
---
> 00: de 10 03 1c 02 00 10 00 a1 00 00 03 10 00 80 00
> 10: 00 00 00 f0 0c 00 00 c0 00 00 00 00 0c 00 00 d0
127c129
< 30: 00 00 00 00 60 00 00 00 00 00 00 00 00 01 00 00
---
> 30: 00 00 f8 ff 60 00 00 00 00 00 00 00 ff 01 00 00
132c134
< 80: 10 29 09 00 03 3d 45 00 43 01 12 10 00 00 00 00
---
> 80: 10 29 09 00 03 3d 45 00 43 01 11 10 00 00 00 00
198c200
< 4a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78
---
> 4a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff
221c223
< 610: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
---
> 610: 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00
257c259
< 850: 00 00 00 00 78 00 00 00 ff 3f 00 00 00 00 00 00
---
> 850: 00 00 00 00 af 04 00 00 ff 3f 00 00 00 00 00 00
382,383c384,385
< 00: de 10 f1 10 02 00 10 00 a1 00 03 04 00 00 80 00
< 10: 00 00 48 f1 00 00 00 00 00 00 00 00 00 00 00 00
---
> 00: de 10 f1 10 02 00 10 00 a1 00 03 04 10 00 80 00
> 10: 00 00 00 f1 00 00 00 00 00 00 00 00 00 00 00 00
385c387
< 30: 00 00 00 00 60 00 00 00 00 00 00 00 00 02 00 00
---
> 30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 02 00 00
390c392
< 80: 10 29 09 00 03 3d 45 00 43 01 12 10 00 00 00 00
---
> 80: 10 29 09 00 03 3d 45 00 43...
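
As a hedged aid for reading the first two rows of this hex diff, the sketch below decodes the standard PCI header fields they cover: vendor and device ID at offsets 0x00 and 0x02, the command register at 0x04, and the BAR dwords at 0x10 to 0x1c. The byte arrays are copied from the VGA function's `00:` and `10:` rows above; the helper names are illustrative.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* First 32 config-space bytes of the VGA function, from the diff above. */
const uint8_t cfg_hotplug[32] = {
    0xde, 0x10, 0x03, 0x1c, 0x01, 0x00, 0x10, 0x00,
    0xa1, 0x00, 0x00, 0x03, 0x00, 0x00, 0x80, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,
};
const uint8_t cfg_fresh[32] = {
    0xde, 0x10, 0x03, 0x1c, 0x02, 0x00, 0x10, 0x00,
    0xa1, 0x00, 0x00, 0x03, 0x10, 0x00, 0x80, 0x00,
    0x00, 0x00, 0x00, 0xf0, 0x0c, 0x00, 0x00, 0xc0,
    0x00, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0xd0,
};

/* Config space is little-endian: assemble values byte by byte. */
uint16_t cfg_read16(const uint8_t *cfg, unsigned off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

uint32_t cfg_read32(const uint8_t *cfg, unsigned off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Print IDs, the I/O and memory decode bits, and the BAR low dwords. */
void describe(const char *label, const uint8_t *cfg)
{
    assert(label != NULL && cfg != NULL);

    uint16_t cmd = cfg_read16(cfg, 0x04);
    printf("%s: %04x:%04x I/O%c Mem%c\n", label,
           (unsigned)cfg_read16(cfg, 0x00), (unsigned)cfg_read16(cfg, 0x02),
           (cmd & 0x1) ? '+' : '-', (cmd & 0x2) ? '+' : '-');

    for (unsigned off = 0x10; off <= 0x1c; off += 4) {
        uint32_t bar = cfg_read32(cfg, off);
        int is_io = bar & 0x1;                      /* bit 0: I/O space */
        int is_64 = !is_io && ((bar & 0x6) == 0x4); /* bits 2:1 = 10b */
        uint32_t base = bar & (is_io ? ~0x3u : ~0xfu);

        printf("  [0x%02x] raw=%08" PRIx32 " base=%08" PRIx32 "%s%s\n",
               off, bar, base, is_64 ? " 64-bit" : "",
               (!is_io && (bar & 0x8)) ? " prefetchable" : "");
        if (is_64) {
            off += 4; /* the next dword holds the upper 32 address bits */
        }
    }
}
```

Calling `describe("hotplug", cfg_hotplug)` and `describe("fresh", cfg_fresh)` shows the command register as 0x0001 (I/O decode only) in the hotplug dump and 0x0002 (memory decode only) in the fresh one, in line with the `I/O+ Mem-` versus `I/O- Mem+` difference in the lspci -vvvv diff. In the fresh dump the BAR dwords carry the bases f0000000, c0000000 and d0000000, while in the hotplug dump only the type bits (0x0c, 64-bit prefetchable) remain and every base is zero.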


Thomas Huth (th-huth) wrote :

The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

    https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire automatically
after 60 days). Please mention the URL of this bug ticket on Launchpad in
the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket opened, then please switch
the state back to "New" or "Confirmed" within the next 60 days (otherwise
it will get closed as "Expired"). We will then eventually migrate the
ticket automatically to the new system (but you won't be the reporter of
the bug in the new system and thus you won't get notified on changes
anymore).

Thank you and sorry for the inconvenience.

Changed in qemu:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired