Unable to passthrough GPUs to guest, due to PCI64 aperture limitation
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
edk2 (Ubuntu) | Confirmed | Medium | Guilherme G. Piccoli | |
Bionic | Confirmed | Medium | Guilherme G. Piccoli | |
Eoan | Won't Fix | Medium | Guilherme G. Piccoli | |
Focal | Confirmed | Medium | Guilherme G. Piccoli | |
Groovy | Won't Fix | Medium | Guilherme G. Piccoli | |
Bug Description
I'm having issues passing Nvidia Tesla GPUs through to an OVMF-mode guest. While I can pass other devices through to an OVMF-mode guest w/o a problem (e.g. Mellanox Connect-X 5 VFs), I'm seeing a couple of different failure modes when passing through a GPU:
1) No output:
---------
$ virsh start virtinst; virsh console virtinst
Domain virtinst started
Connected to domain virtinst
Escape character is ^]
---------
I discovered that I'm able to avoid this by placing the device on a different BSF in the guest.
This results in a hang:
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
Whilst this gets us further:
<address type='pci' domain='0x0000' bus='0x05' slot='0x02' function='0x0'/>
Though that too fails after OS boot as described next:
2) OS boots, device appears within, but the kernel is unable to configure resources:
[ 4.744211] nvidia-nvlink: Nvlink Core is being initialized, major device number 241
[ 4.750811] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
[ 4.750811] NVRM: BAR1 is 0M @ 0x0 (PCI:0000:01:02.0)
[ 4.756960] NVRM: The system BIOS may have misconfigured your GPU.
[ 4.759725] nvidia: probe of 0000:01:02.0 failed with error -1
[ 4.762347] NVRM: The NVIDIA probe routine failed for 1 device(s).
[ 4.766010] NVRM: None of the NVIDIA devices were initialized.
[ 4.769701] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241
I've found that #2 can be worked around w/ 'pci=nocrs'.
Neither issue is reproducible when booting in non-UEFI mode.
I observed this with bionic's ovmf 0~20180205.
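For anyone trying to reproduce failure mode 2, here is a minimal Python sketch that scans guest dmesg output for the "BAR is 0M @ 0x0" signature shown above. The function name `find_unassigned_bars` is made up for illustration, and the regular expression assumes every NVRM BAR line follows the same layout as the single line quoted in this report; other driver versions may format it differently.

```python
import re
import sys

# Matches NVRM lines shaped like the one quoted in this report:
#   NVRM: BAR1 is 0M @ 0x0 (PCI:0000:01:02.0)
# Assumption: other BAR reports from the driver use the same layout.
BAR_RE = re.compile(
    r"NVRM: BAR(?P<bar>\d+) is (?P<size_mb>\d+)M @ (?P<addr>0x[0-9a-fA-F]+) "
    r"\(PCI:(?P<bdf>[0-9a-fA-F:.]+)\)"
)


def find_unassigned_bars(dmesg_text):
    """Return (pci_address, bar_index) pairs for BARs reported as 0M @ 0x0."""
    hits = []
    for line in dmesg_text.splitlines():
        m = BAR_RE.search(line)
        if m is None:
            continue
        size_mb = int(m.group("size_mb"))
        addr = int(m.group("addr"), 16)
        # A zero-sized BAR at address zero means no MMIO window was assigned.
        if size_mb == 0 and addr == 0:
            hits.append((m.group("bdf"), int(m.group("bar"))))
    return hits


if __name__ == "__main__":
    # Reads dmesg text from stdin and reports each unassigned BAR.
    for bdf, bar in find_unassigned_bars(sys.stdin.read()):
        print(f"{bdf}: BAR{bar} was not assigned an address")
```

Saved under a hypothetical name such as `check_bars.py`, it could be run inside the guest as `dmesg | python3 check_bars.py`. It only recognises the symptom; it says nothing about whether the firmware aperture is the cause.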
Changed in edk2 (Ubuntu Bionic):
status: New → Confirmed
tags: added: sts
Changed in edk2 (Ubuntu Eoan):
status: Confirmed → Won't Fix
cpaelzer recommends that we retest w/ the q35 machine type as a next step.