qemu cannot run twice

Bug #1680679 reported by Roger Lawhorn on 2017-04-07
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Undecided
Unassigned

Bug Description

After using qemu with gpu passthrough and then shutting down windows 7 properly I cannot boot windows 7 a second time.
Only a full reboot of linux fixes this issue.
Qemu appears to corrupt something in linux when exiting.
I get no error messages but windows 7 never finishes booting during the 2nd try.
Apparently I do try to run vfiobind each time the script is run.
Wondering if rerunning vfiobind can cause an issue?

My specs:
-----------------------------------------------------------------------------------------
System: Host: GT70-2PE Kernel: 4.5.4-040504-generic x86_64 (64 bit gcc: 5.3.1)
           Desktop: Cinnamon 3.2.7 (Gtk 3.18.9) Distro: Linux Mint 18.1 Serena
Machine: Mobo: Micro-Star model: MS-1763 v: REV:0.C Bios: American Megatrends v: E1763IMS.51B date: 01/29/2015
CPU: Quad core Intel Core i7-4810MQ (-HT-MCP-) cache: 6144 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 22347
           clock speeds: max: 2801 MHz 1: 2801 MHz 2: 2801 MHz 3: 2801 MHz 4: 2801 MHz 5: 2801 MHz 6: 2801 MHz
           7: 2801 MHz 8: 2801 MHz
Graphics: Card-1: Intel 4th Gen Core Processor Integrated Graphics Controller bus-ID: 00:02.0
           Card-2: NVIDIA GK104M [GeForce GTX 880M] bus-ID: 01:00.0
           Display Server: X.Org 1.18.4 drivers: intel (unloaded: fbdev,vesa)
           Resolution: 1920x1080@60.02hz, 1920x1080@60.00hz
           GLX Renderer: Mesa DRI Intel Haswell Mobile GLX Version: 3.0 Mesa 12.0.6 Direct Rendering: Yes

My script:
-------------------------------------------------------------------------------------------
#!/bin/bash

cd ~/qemu
sudo ./up.sh tap0

configfile=~/qemu/vfio-pci1.cfg

vfiobind() {
    dev="$1"
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
        fi
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id

}

modprobe vfio-pci

cat $configfile | while read line;do
    echo $line | grep ^# >/dev/null 2>&1 && continue
        vfiobind $line
done

sudo qemu-system-x86_64 -machine type=q35,accel=kvm -cpu host,kvm=off \
-smp 8,sockets=1,cores=4,threads=2 \
-bios /usr/share/seabios/bios.bin \
-serial none \
-parallel none \
-vga none \
-m 4G \
-mem-path /run/hugepages/kvm \
-mem-prealloc \
-balloon none \
-rtc clock=host,base=localtime \
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,chassis=1,id=root.1 \
-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on \
-device virtio-scsi-pci,id=scsi \
-drive id=disk0,if=virtio,cache=none,format=raw,file=/home/dad/qemu/windows7.img \
-drive file=/home/dad/1TB-Backup/Iso/SP1ForWin7.iso,id=isocd,format=raw,if=none -device scsi-cd,drive=isocd \
-net nic -net tap,ifname=tap0,script=no,downscript=no \
-usbdevice host:413c:a503 \
-usbdevice host:13fe:3100 \
-usbdevice host:0bc2:ab21 \
-boot menu=on \
-boot order=c

sudo ./down.sh tap0

exit 0

If you get one boot where GPU assignment works with a mobile GeForce, you're doing better than most.

Roger Lawhorn (rll-m) wrote :

I have been told that getting this to work with a laptop is rare.
I own an MSI GT70-2PE.

Roger Lawhorn (rll-m) wrote :

NOTE: I only get gpu passthrough to work when using seabios. UEFI will not work with modern nvidia drivers. You get the 'code 43' error because whatever standards nvidia expects when talking to uefi are not met by the OVMF firmware used by qemu. This issue happens to windows users also. They had to update their uefi firmware to get around the code 43 issue.

Roger Lawhorn (rll-m) wrote :

NOTE: Article showing windows users updating their motherboard uefi firmware to get around code 43:
https://devtalk.nvidia.com/default/topic/861244/cuda-setup-and-installation/geforce-740m-asus-x550l-code-43-after-windows-10-update/3

Comments #3 & #4 are not relevant to this bug and inaccurate speculation. There is no known incompatibility between NVIDIA drivers and OVMF. Many users, including myself, use this combination daily. The issue is far more likely to be a VM configuration issue or lack of UEFI support in the GPU ROM.

Perhaps you get Code 43 because mobile NVIDIA chips make use of Optimus which requires significant proprietary firmware support. QEMU/VFIO has never claimed to work with such devices. The further speculation in the original report that QEMU corrupted something in Linux seems unjustified, the device simply doesn't work a second time. This might be lack of proper reset support in the GPU itself or poor interaction with aforementioned proprietary firmware.

Changed in qemu:
status: New → Invalid
Roger Lawhorn (rll-m) wrote :

GTX880M - uefi firmware built in, confirmed

Roger Lawhorn (rll-m) wrote :

My uefi build script. If you see an error or issue that could cause error 43 please confirm it:

#!/bin/bash

cd /home/dad/qemu/qemu2
sudo ./up.sh tap0

configfile=./vfio-pci1.cfg

vfiobind() {
    dev="$1"
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
        fi
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id

}

modprobe vfio-pci

cat $configfile | while read line;do
    echo $line | grep ^# >/dev/null 2>&1 && continue
        vfiobind $line
done

vmname="windows10vm"

if ps -A | grep -q $vmname; then
   echo "$vmname is already running." &
   exit 1

else

# use pulseaudio
export QEMU_AUDIO_DRV=pa

cp /usr/share/OVMF/OVMF_VARS.fd /tmp/my_vars.fd

qemu-system-x86_64 \
  -name $vmname,process=$vmname \
  -machine type=pc-i440fx-2.6,accel=kvm \
  -cpu host,kvm=off \
  -smp 8,sockets=1,cores=4,threads=2 \
  -enable-kvm \
  -m 4G \
  -mem-path /run/hugepages/kvm \
  -mem-prealloc \
  -balloon none \
  -rtc clock=host,base=localtime \
  -serial none \
  -parallel none \
  -vga none \
  -device vfio-pci,host=01:00.0,multifunction=on \
  -device vfio-pci,host=01:00.1 \
  -drive if=pflash,format=raw,readonly,file=/home/dad/qemu/qemu2/OVMF-pure-efi.fd \
  -drive if=pflash,format=raw,file=/tmp/my_vars.fd \
  -boot order=dc -boot menu=on \
  -device virtio-scsi-pci,id=scsi \
  -drive id=disk0,if=virtio,cache=none,format=raw,file=/home/dad/qemu/qemu2/win7.img \
  -drive file=/home/dad/qemu/qemu2/virtio-win-0.1.126.iso,id=virtiocd,format=raw,if=none -device ide-cd,bus=ide.1,drive=virtiocd \
  -netdev type=tap,id=net0,ifname=tap0,vhost=on,script=no,downscript=no \
  -device virtio-net-pci,netdev=net0,mac=00:16:3e:00:01:01 \
  -usb -usbdevice host:413c:a503 -usbdevice host:13fe:3100

sudo ./down.sh tap0
   exit 0
fi

Roger Lawhorn (rll-m) wrote :

ok, so I will research qemu and nvidia optimus. I have a custom BIOS made by an MSI employee. I have hundreds of bios options to play with.

I suspect the 'M' in GTX880M is the biggest contributor to the Code 43. The fact you can get it to work once per host boot on SeaBIOS is a fluke. If you can get custom ROMs, you might try playing with the vfio-pci x-pci-device-id option to masquerade as a discrete card, maybe that would avoid mobile code in the NVIDIA driver that would expect Optimus. Obviously do so at your own risk.

Roger Lawhorn (rll-m) wrote :

It works using seabios. I assume that the nvidia driver cannot see optimus, and expect an intel card also, unless I use OVMF. I do know that you cannot run off the intel card in windows. It's a no-no. Thanks for the bios tip. Maybe I can hide the optimus feature from OVMF and windows.

Roger Lawhorn (rll-m) wrote :

I meant to say you cannot run without the intel card in windows if you have optimus. I am glad seabios somehow hides optimus.

misairu (misairu) wrote :

Does your Subsystem ID and Subsystem Vendor ID (of your GPU) show correctly inside the WindowsVM?

It should be the same ID shown in your host. Otherwise that will trigger the Code 43 error.

I once have this problem but now solve this by some vfio-pci option. Now I have a laptop that passthrough my dGPU with OVMF working perfectly.

Thanks.
I will look into this.

On 11/15/2017 10:56 AM, misairu wrote:
> Does your Subsystem ID and Subsystem Vendor ID (of your GPU) show
> correctly inside the WindowsVM?
>
> It should be the same ID shown in your host. Otherwise that will trigger
> the Code 43 error.
>
> I once have this problem but now solve this by some vfio-pci option. Now
> I have a laptop that passthrough my dGPU with OVMF working perfectly.
>

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers