Hi,
On an arm64 Nova compute node, instances sometimes end up in the Paused state, with the following libvirt log:
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-aarch64 -name instance-001285c8 -S -machine virt,accel=kvm,usb=off -cpu host -bios /usr/share/qemu-efi/QEMU_EFI.fd -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 353d3da4-da46-48c0-9ed2-4acf940bc61e -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-001285c8.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-shutdown -boot strict=on -usb -drive file=/var/lib/nova/instances/353d3da4-da46-48c0-9ed2-4acf940bc61e/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=unsafe -device virtio-blk-device,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/nova/instances/353d3da4-da46-48c0-9ed2-4acf940bc61e/disk.swap,if=none,id=drive-virtio-disk1,format=qcow2,cache=unsafe -device virtio-blk-device,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-device,netdev=hostnet0,id=net0,mac=fa:16:3e:55:bb:d8 -serial file:/var/lib/nova/instances/353d3da4-da46-48c0-9ed2-4acf940bc61e/console.log -serial pty -device virtio-balloon-device,id=balloon0 -msg timestamp=on
Failed to open module: /usr/lib/aarch64-linux-gnu/qemu/block-curl.so: failed to map segment from shared object: Permission denied
Failed to open module: /usr/lib/aarch64-linux-gnu/qemu/block-rbd.so: failed to map segment from shared object: Permission denied
char device redirected to /dev/pts/6 (label serial1)
2016-12-02T09:20:48.382852Z qemu-system-aarch64: binding does not support guest notifiers
2016-12-02T09:20:48.383041Z qemu-system-aarch64: unable to start vhost net: 38: falling back on userspace virtio
error: kvm run failed Function not implemented
PC=000000023faf924c SP=0000000240000020
X00=0000000000000000 X01=000000023fafb064 X02=0000000040000305 X03=0000000000000000
X04=00000000dbadc0de X05=1de7ec7edbadc0de X06=0000000000000004 X07=000000023fa60430
X08=fffffffffffff001 X09=0000000000000400 X10=000000023fb87de8 X11=000000023bdd9a18
X12=00000000a007e03a X13=000000023fdadffc X14=000000023fb87cb8 X15=0000000000000001
X16=0000000000000003 X17=000000000020d0d0 X18=000000001f526ead X19=000000023be36060
X20=000000023be36018 X21=0000000000000010 X22=000000023fb9da38 X23=0000000000000000
X24=0000000000000064 X25=000000004007fb40 X26=0000000000000000 X27=000000004007c278
X28=000000004007c030 X29=0000000000000000 X30=000000023faf9178 PSTATE=60000305 (flags -ZC-)
The two "qemu-system-aarch64:" lines also appear in the logs of successful instances (i.e. instances which don't end up in the Paused state).
This is on trusty, with:
qemu-system-arm 1:2.3+dfsg-5ubuntu9~ubuntu14.04.1
nova-compute-kvm 1:2015.1.1-0ubuntu2+ppa1
libvirt0 1.2.12-0ubuntu14.4+ppa1
Linux kernel 4.8.0-040800rc8-generic
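Since the two "qemu-system-aarch64:" warnings show up on healthy instances too, the useful signature when triaging logs is the "error: kvm run failed" line and the register dump after it. Below is a minimal sketch of how that could be pulled out of a log's text. The function name and the regular expression are illustrative assumptions, not part of libvirt or QEMU:

```python
import re

# Marker line printed by QEMU when the KVM_RUN ioctl fails (taken from the log above).
FAILURE_MARKER = "error: kvm run failed"

# Matches entries such as "PC=000000023faf924c", "X07=...", "PSTATE=60000305".
REGISTER_RE = re.compile(r"\b(PC|SP|X\d{2}|PSTATE)=([0-9a-fA-F]+)")


def parse_kvm_failure(log_text):
    """Return {register: value} if the log contains the failure marker, else None.

    Hypothetical helper: it only looks at text after the marker, so earlier
    warnings that also occur on successful instances are ignored.
    """
    index = log_text.find(FAILURE_MARKER)
    if index == -1:
        return None
    dump = log_text[index:]
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(dump)}
```

Calling it on the contents of an instance's log file returns None for a healthy instance and a dictionary with PC, SP, X00..X30 and PSTATE for a failed one, which makes it easy to compare PC values across failures.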
Description of problem:
This has similarities to the following bug, but is much rarer:
https://bugzilla.redhat.com/show_bug.cgi?id=1194366
I have a script which boots a guest 1000 times on the Fedora/aarch64 host. Guest and host kernels are identical (4.2.0-1.fc24.aarch64).
About 5 in every 1000 boots fail. The host kernel prints:
kvm [3683]: load/store instruction decoding not implemented
The corresponding qemu process hangs after printing:
error: kvm run failed Function not implemented
PC=000000006bbfd238 SP=00000000700000b0
X00=aa1903e1aa0303e2 X01=0000000068e67b40 X02=000000006baa7cec X03=0000000000e68200
X04=000000006bac1398 X05=00000000009ffaf8 X06=0000000000000000 X07=000000006f04c85c
X08=000000006f04cb78 X09=0000000000000000 X10=0000000000000004 X11=0000000000000000
X12=00000000700fe0fa X13=0000000000000000 X14=0000000000000000 X15=0000000000000000
X16=000000006f04cdf0 X17=0000000000000000 X18=0000000000000000 X19=000000006bff0018
X20=0000000000000000 X21=0000000000000000 X22=0000000000000000 X23=0000000000000000
X24=0000000000000000 X25=0000000000000000 X26=0000000000000000 X27=0000000000000000
X28=0000000000000000 X29=0000000000000000 X30=0000000000000000 PSTATE=60000305 (flags -ZC-)
The PC address does not correspond to any kernel address.
Version-Release number of selected component (if applicable):
kernel 4.2.0-1.fc24.aarch64
How reproducible:
Rare, approximately 1 in 200 boots.
Steps to Reproduce:
1. In the libguestfs test suite, run:
./tests/qemu/qemu-boot -n 1000
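For measuring the failure rate of an intermittent boot problem like this one with an arbitrary boot command, a small wrapper can be used. This is only a sketch under stated assumptions: the command is assumed to exit non-zero on a failed boot, a hung qemu process is handled with a timeout, and count_failures and its parameters are hypothetical names, not part of libguestfs:

```python
import subprocess


def count_failures(cmd, runs, timeout=None):
    """Run cmd `runs` times and return how many runs failed.

    A run counts as failed if the command exits non-zero or does not
    finish within `timeout` seconds (subprocess.run kills the child on
    timeout, which covers the case where qemu hangs).
    """
    failures = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode != 0:
                failures += 1
        except subprocess.TimeoutExpired:
            failures += 1
    return failures
```

For example, count_failures(["./boot-one-guest.sh"], 1000, timeout=600) would give a count directly comparable to the "about 5 in 1000" figure; the script name and the timeout value here are placeholders.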
Additional info:
The error message usually indicates that the guest has jumped into random code.
I'm still investigating this bug and will update it with further details
as I collect them.