GVTd not working (black screen) after upgrade to qemu-5.0.0

Bug #1879175 reported by TheCatFelix
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
New
Undecided
Unassigned

Bug Description

Hi QEMU team,

=== Problem Summary ===

I have recently upgraded from QEMU-3.1.0 to to QEMU-5.0.0 on Debian Unstable. Unfortunately GVTd (legacy passthrough of the integrated intel gpu) stopped working correctly. The guest can still see and loads the driver for the GPU, but the screen stays black.

The following is the version used:

$ /usr/bin/qemu-system-x86_64 --version
QEMU emulator version 5.0.0 (Debian 1:5.0-5)
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers

=== Investigation/Triage done so far ===

Running QEMU with trace flags enabled shows the following behavior change for the same VM (left: 3.1.0, right: 5.0.0):

vfio_realize (0000:00:02.0) group 1 vfio_realize (0000:00:02.0) group 1
vfio_listener_region_add_ram region_add [ram] 0x0 - 0xbffff [0x7f5b41e00000] | vfio_listener_region_add_ram region_add [ram] 0x0 - 0xbffff [0x7f2bb1e00000]
vfio_listener_region_add_ram region_add [ram] 0xc0000 - 0xdffff [0x7f5d1d400000] | vfio_listener_region_add_ram region_add [ram] 0xc0000 - 0xdffff [0x7f2d7c800000]
vfio_listener_region_add_ram region_add [ram] 0xe0000 - 0xfffff [0x7f5d1d620000] | vfio_listener_region_add_ram region_add [ram] 0xe0000 - 0xfffff [0x7f2d84220000]
vfio_listener_region_add_ram region_add [ram] 0x100000 - 0xbfffffff [0x7f5b41f00000] | vfio_listener_region_add_ram region_add [ram] 0x100000 - 0xbfffffff [0x7f2bb1f00000]
vfio_listener_region_add_skip SKIPPING region_add 0xfec00000 - 0xfec00fff vfio_listener_region_add_skip SKIPPING region_add 0xfec00000 - 0xfec00fff
vfio_listener_region_add_skip SKIPPING region_add 0xfee00000 - 0xfeefffff vfio_listener_region_add_skip SKIPPING region_add 0xfee00000 - 0xfeefffff
vfio_listener_region_add_ram region_add [ram] 0xfffc0000 - 0xffffffff [0x7f5d1d600000] | vfio_listener_region_add_ram region_add [ram] 0xfffc0000 - 0xffffffff [0x7f2d84200000]
vfio_listener_region_add_ram region_add [ram] 0x100000000 - 0x201ffffff [0x7f5c01e00000] | vfio_listener_region_add_ram region_add [ram] 0x100000000 - 0x201ffffff [0x7f2c71e00000]
vfio_mdev (0000:00:02.0) is_mdev 0 vfio_mdev (0000:00:02.0) is_mdev 0
vfio_get_device Device 0000:00:02.0 flags: 3, regions: 12, irqs: 5 vfio_get_device Device 0000:00:02.0 flags: 3, regions: 12, irqs: 5
vfio_region_setup Device 0000:00:02.0, region 0 "0000:00:02.0 BAR 0", flags: 0x7, offset: 0x0, s vfio_region_setup Device 0000:00:02.0, region 0 "0000:00:02.0 BAR 0", flags: 0x7, offset: 0x0, s
vfio_region_setup Device 0000:00:02.0, region 1 "0000:00:02.0 BAR 1", flags: 0x0, offset: 0x1000 vfio_region_setup Device 0000:00:02.0, region 1 "0000:00:02.0 BAR 1", flags: 0x0, offset: 0x1000
vfio_region_setup Device 0000:00:02.0, region 2 "0000:00:02.0 BAR 2", flags: 0x7, offset: 0x2000 vfio_region_setup Device 0000:00:02.0, region 2 "0000:00:02.0 BAR 2", flags: 0x7, offset: 0x2000
vfio_region_setup Device 0000:00:02.0, region 3 "0000:00:02.0 BAR 3", flags: 0x0, offset: 0x3000 vfio_region_setup Device 0000:00:02.0, region 3 "0000:00:02.0 BAR 3", flags: 0x0, offset: 0x3000
vfio_region_setup Device 0000:00:02.0, region 4 "0000:00:02.0 BAR 4", flags: 0x3, offset: 0x4000 vfio_region_setup Device 0000:00:02.0, region 4 "0000:00:02.0 BAR 4", flags: 0x3, offset: 0x4000
vfio_region_setup Device 0000:00:02.0, region 5 "0000:00:02.0 BAR 5", flags: 0x0, offset: 0x5000 vfio_region_setup Device 0000:00:02.0, region 5 "0000:00:02.0 BAR 5", flags: 0x0, offset: 0x5000
vfio_populate_device_config Device 0000:00:02.0 config: vfio_populate_device_config Device 0000:00:02.0 config:
 0x1000, offset: 0x70000000000, flags: 0x3 0x1000, offset: 0x70000000000, flags: 0x3
vfio_region_mmap Region 0000:00:02.0 BAR 0 mmaps[0] [0x0 - 0xffffff] vfio_region_mmap Region 0000:00:02.0 BAR 0 mmaps[0] [0x0 - 0xffffff]
vfio_region_mmap Region 0000:00:02.0 BAR 2 mmaps[0] [0x0 - 0xfffffff] vfio_region_mmap Region 0000:00:02.0 BAR 2 mmaps[0] [0x0 - 0xfffffff]
vfio_check_pm_reset 0000:00:02.0 Supports PM reset vfio_check_pm_reset 0000:00:02.0 Supports PM reset
vfio_msi_setup 0000:00:02.0 PCI MSI CAP @0xac vfio_msi_setup 0000:00:02.0 PCI MSI CAP @0xac
vfio_check_pcie_flr 0000:00:02.0 Supports FLR via PCIe cap vfio_check_pcie_flr 0000:00:02.0 Supports FLR via PCIe cap
vfio_get_dev_region 0000:00:02.0 index 9, 80008086/18 <
vfio_get_dev_region 0000:00:02.0 index 9, 80008086/18 <
vfio_get_dev_region 0000:00:02.0 index 10, 80008086/28 <
vfio_get_dev_region 0000:00:02.0 index 9, 80008086/18 <
vfio_get_dev_region 0000:00:02.0 index 10, 80008086/28 <
vfio_get_dev_region 0000:00:02.0 index 11, 80008086/38 <
vfio_listener_region_del region_del 0x0 - 0xbffff <
vfio_listener_region_add_ram region_add [ram] 0x0 - 0x9ffff [0x7f5b41e00000] <
vfio_listener_region_add_skip SKIPPING region_add 0xa0000 - 0xbffff <
vfio_pci_igd_lpc_bridge_enabled 0000:00:02.0 <
vfio_pci_igd_host_bridge_enabled 0000:00:02.0 <
vfio_pci_igd_opregion_enabled 0000:00:02.0 <
vfio_pci_igd_bdsm_enabled 0000:00:02.0 40MB <
vfio_intx_enable_kvm (0000:00:02.0) KVM INTx accel enabled vfio_intx_enable_kvm (0000:00:02.0) KVM INTx accel enabled
vfio_intx_enable (0000:00:02.0) vfio_intx_enable (0000:00:02.0)
 0x100, offset: 0x70000000000, flags: 0x3 0x100, offset: 0x70000000000, flags: 0x3
vfio_populate_device_get_irq_info_failure VFIO_DEVICE_GET_IRQ_INFO failure: Invalid argument vfio_populate_device_get_irq_info_failure VFIO_DEVICE_GET_IRQ_INFO failure: Invalid argument
vfio_pci_reset (0000:00:02.0) vfio_pci_reset (0000:00:02.0)
vfio_intx_disable_kvm (0000:00:02.0) KVM INTx accel disabled vfio_intx_disable_kvm (0000:00:02.0) KVM INTx accel disabled
vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 0 mmaps enabled: 1 vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 0 mmaps enabled: 1
vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 2 mmaps enabled: 1 vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 2 mmaps enabled: 1
vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 4 mmaps enabled: 1 vfio_region_mmaps_set_enabled Region 0000:00:02.0 BAR 4 mmaps enabled: 1
vfio_intx_disable (0000:00:02.0) vfio_intx_disable (0000:00:02.0)
vfio_pci_write_config (0000:00:02.0, @0x4, 0x0, len=0x2) vfio_pci_write_config (0000:00:02.0, @0x4, 0x0, len=0x2)
vfio_listener_region_del region_del 0x0 - 0x9ffff <
vfio_listener_region_del_skip SKIPPING region_del 0xa0000 - 0xbffff <
vfio_listener_region_add_ram region_add [ram] 0x0 - 0xbffff [0x7f5b41e00000] <
vfio_pci_reset_flr 0000:00:02.0 FLR/VFIO_DEVICE_RESET vfio_pci_reset_flr 0000:00:02.0 FLR/VFIO_DEVICE_RESET
vfio_intx_enable (0000:00:02.0) vfio_intx_enable (0000:00:02.0)
vfio_listener_region_del region_del 0x0 - 0xbffff vfio_listener_region_del region_del 0x0 - 0xbffff
vfio_listener_region_del region_del 0xc0000 - 0xdffff vfio_listener_region_del region_del 0xc0000 - 0xdffff
vfio_listener_region_del region_del 0xe0000 - 0xfffff vfio_listener_region_del region_del 0xe0000 - 0xfffff
vfio_listener_region_del region_del 0x100000 - 0xbfffffff vfio_listener_region_del region_del 0x100000 - 0xbfffffff
vfio_listener_region_add_ram region_add [ram] 0x0 - 0xcffff [0x7f5b41e00000] | vfio_listener_region_add_ram region_add [ram] 0x0 - 0xcffff [0x7f2bb1e00000]

We can see here, the following key lines are not printed in 5.0.0:

vfio_pci_igd_lpc_bridge_enabled 0000:00:02.0 <
vfio_pci_igd_host_bridge_enabled 0000:00:02.0 <
vfio_pci_igd_opregion_enabled 0000:00:02.0 <
vfio_pci_igd_bdsm_enabled 0000:00:02.0 40MB <

Looking through the code and bisecting the problem (I can provide more detail if helpful), shows the following ifdef statement lines introduce the problem:

https://github.com/qemu/qemu/blob/master/hw/vfio/pci-quirks.c#L1253

  1246 void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, int nr)
  1247 {
  1248 vfio_probe_ati_bar4_quirk(vdev, nr);
  1249 vfio_probe_ati_bar2_quirk(vdev, nr);
  1250 vfio_probe_nvidia_bar5_quirk(vdev, nr);
  1251 vfio_probe_nvidia_bar0_quirk(vdev, nr);
  1252 vfio_probe_rtl8168_bar2_quirk(vdev, nr);
  1253 #ifdef CONFIG_VFIO_IGD
  1254 vfio_probe_igd_bar4_quirk(vdev, nr);
  1255 #endif
  1256 }

This was added by the following commits:

https://github.com/qemu/qemu/commit/29d62771c81d8fd244a67c14a1d968c268d3fb19#diff-38093e21794c7a4987feb71e498dbdc6

Reading through the commit message, I suspect the something may be happening with the Kconfig switches mentioned there.

=== Validation/Workaround ===

I have rebuilt the package with the following two changes:

root@debian:/home/test/src# diff debian_*/qemu-5.0/hw/vfio/pci-quirks.c
0a1
> #define CONFIG_VFIO_IGD y
root@debian:/home/test/src# diff debian_*/qemu-5.0/hw/vfio/Kconfig
42c42
< default y if PC_PCI
---
> default y
root@debian:/home/test/src#

GVTd started working fine again (Screen shows output again).

I have tried with either change alone:

- with only the ifdef in pci-quirks.c compilation fails with linker errors
- with only the Kconfig it compiles, but GVTd still does not work (black screen)

Please take a look and thank you very much for a fantastic product!

TheCatFelix

Tags: gvt gvtd igd vfio
summary: - GVTd not working after upgrade to qemu-5.0.0
+ GVTd not working (black screen) after upgrade to qemu-5.0.0
Revision history for this message
Shak (sshaikh) wrote :

I've also posted the bug and fix here:

https://bugs.launchpad.net/qemu/+bug/1882784

I may be wrong but Legacy IGD assignment doesn't use GVT-g or GVT-d, which is why I missed this ticket when reporting my own.

Revision history for this message
TheCatFelix (thecatfelix) wrote :

No problem, thanks for getting the issue resolved!

As a note, i've been going by this guide here from Intel. They seem to describe as that GVT-d refers to the concept of attaching a "whole" GPU to a single VM through PCI PT and it has two modes of operation, "Legacy" and "UPT".

https://github.com/intel/gvt-linux/wiki/GVTd_Setup_Guide#561-create-gvt-d-kvm-vm

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.