We're running Kata Containers using QEMU, and with v5.1.0rc0 and rc1 we have noticed a problem at startup where QEMU appears to hang. We do not see this problem on our bare-metal nodes, only on a VSI that supports nested virtualization.
Unfortunately, we see nothing in the QEMU logs that helps explain the problem, so a hung process is just a guess at this point.
Using git bisect we first see the problem with...
---
f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit
commit f19bcdfedd53ee93412d535a842a89fa27cae7f2
Author: Jason Wang <email address hidden>
Date: Wed Jul 1 22:55:28 2020 +0800
virtio-pci: implement queue_enabled method
With version 1, we can detect whether a queue is enabled via
queue_enabled.
Signed-off-by: Jason Wang <email address hidden>
Signed-off-by: Cindy Lu <email address hidden>
Message-Id: <email address hidden>
Reviewed-by: Michael S. Tsirkin <email address hidden>
Signed-off-by: Michael S. Tsirkin <email address hidden>
Acked-by: Jason Wang <email address hidden>
hw/virtio/virtio-pci.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
---
Reverting this commit seems to prevent the hang.
---
Here's how kata ends up launching qemu in our environment --
/opt/kata/bin/qemu-system-x86_64 -name sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu host,pmu=off -qmp unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait -m 4096M,slots=10,maxmem=30978M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait -chardev socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -object memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 -initrd /opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 debug panic=1 nr_cpus=4 agent.use_vsock=false scsi_mod.scan=none init=/usr/bin/kata-agent -pidfile /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/pid -D /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qemu.log -smp 2,cores=1,threads=1,sockets=4,maxcpus=4
---
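Since the first bad commit is in virtio-pci, it may help anyone triaging this to see at a glance which devices on the command line above are started with disable-modern=true. The sketch below is only a convenience for reading the long command line; the helper name legacy_virtio_devices is hypothetical, the input is a shortened excerpt of the -device arguments shown above, and whether the disable-modern setting is actually related to the hang is not established here.

```python
import shlex
from typing import List


def legacy_virtio_devices(cmdline: str) -> List[str]:
    """Return the -device arguments that contain disable-modern=true.

    Hypothetical helper for reading the report's command line; it only
    looks at the token that directly follows each -device option.
    """
    tokens = shlex.split(cmdline)
    devices = [
        tokens[i + 1]
        for i, tok in enumerate(tokens[:-1])
        if tok == "-device"
    ]
    return [d for d in devices if "disable-modern=true" in d.split(",")]


# Shortened excerpt of the -device arguments from the command line above.
cmdline = (
    "qemu-system-x86_64 "
    "-device virtio-serial-pci,disable-modern=true,id=serial0,romfile= "
    "-device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= "
    "-device virtio-rng-pci,rng=rng0,romfile= "
    "-device driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,"
    "disable-modern=true,mq=on,vectors=6,romfile="
)

for dev in legacy_virtio_devices(cmdline):
    # Print only the first comma-separated field (the device/driver name).
    print(dev.split(",")[0])
```

To run it against a live sandbox, the full command line could be read from the process list and passed in instead of the excerpt.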