We upgraded the libvirt UCA packages from 3.6 to 4.0 and qemu from 2.10 to 2.11 as part of a Queens upgrade and noticed that
virtio-balloon breaks when instances that were started under the prior 3.6 version are live migrated. The migration fails with:
2019-07-24T06:46:49.487109Z qemu-system-x86_64: warning: Unknown firmware file in legacy mode: etc/msr_feature_control
2019-07-24T06:47:22.187749Z qemu-system-x86_64: VQ 2 size 0x80 < last_avail_idx 0xb57 - used_idx 0xb59
2019-07-24T06:47:22.187768Z qemu-system-x86_64: Failed to load virtio-balloon:virtio
2019-07-24T06:47:22.187771Z qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:05.0/virtio-balloon'
2019-07-24T06:47:22.188194Z qemu-system-x86_64: load of migration failed: Operation not permitted
2019-07-24 06:47:22.430+0000: shutting down, reason=failed
This seems to be the exact problem reported in https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg02228.html
These are the packages that changed:
Start-Date: 2019-07-06 06:40:55
Commandline: /usr/bin/apt-get -y -o Dpkg::Options::=--force-confdef -o Dpkg::Options::=--force-confold install libvirt-bin python-libvirt qemu qemu-utils qemu-system qemu-system-arm qemu-system-mips qemu-system-ppc qemu-system-sparc qemu-system-x86 qemu-system-misc qemu-block-extra qemu-utils qemu-user qemu-kvm
Install: librdmacm1:amd64 (17.1-1ubuntu0.1~cloud0, automatic), libvirt-daemon-driver-storage-rbd:amd64 (4.0.0-1ubuntu8.10~cloud0, automatic), ipxe-qemu-256k-compat-efi-roms:amd64 (1.0.0+git-20150424.a25a16d-0ubuntu2~cloud0, automatic)
Upgrade: qemu-system-mips:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-system-misc:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-system-ppc:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), python-libvirt:amd64 (3.5.0-1build1~cloud0, 4.0.0-1~cloud0), qemu-system-x86:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), libvirt-clients:amd64 (3.6.0-1ubuntu6.8~cloud0, 4.0.0-1ubuntu8.10~cloud0), qemu-user:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), libvirt-bin:amd64 (3.6.0-1ubuntu6.8~cloud0, 4.0.0-1ubuntu8.10~cloud0), qemu:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-utils:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), libvirt-daemon-system:amd64 (3.6.0-1ubuntu6.8~cloud0, 4.0.0-1ubuntu8.10~cloud0), qemu-system-sparc:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-user-binfmt:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-kvm:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), libvirt0:amd64 (3.6.0-1ubuntu6.8~cloud0, 4.0.0-1ubuntu8.10~cloud0), qemu-system-arm:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-block-extra:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-system-common:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), qemu-system:amd64 (1:2.10+dfsg-0ubuntu3.8~cloud1, 1:2.11+dfsg-1ubuntu7.13~cloud0), libvirt-daemon:amd64 (3.6.0-1ubuntu6.8~cloud0, 4.0.0-1ubuntu8.10~cloud0)
End-Date: 2019-07-06 06:41:08
At this point the instances have to be hard rebooted or stopped and started to fix the issue for future live migration attempts.
Hi Bjoern,
I don't think this is the same bug as the one you reference; that's a config space disagreement, not a virtio queue disagreement.
It almost feels like the old 4a1e48becab81020adfb74b22c76a595f2d02a01 stats migration fix, but that went in with 2.8, so it shouldn't be a problem going from 2.10 to 2.11.
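For reference, the "VQ 2 size 0x80 < last_avail_idx 0xb57 - used_idx 0xb59" line in the log reads like the in-flight sanity check applied when virtio queue state is loaded on the destination. Below is a rough Python sketch of that arithmetic, assuming the ring indices are 16-bit counters that wrap around. The function names are made up for illustration and are not QEMU's.

```python
# Hypothetical helpers illustrating the check behind the
# "VQ n size ... < last_avail_idx ... - used_idx ..." message.
# Assumption: virtio ring indices are 16-bit and wrap around.

def inflight(last_avail_idx: int, used_idx: int) -> int:
    """Buffers taken from the avail ring but not yet returned as used."""
    return (last_avail_idx - used_idx) & 0xFFFF


def queue_state_is_consistent(size: int, last_avail_idx: int, used_idx: int) -> bool:
    """A queue cannot have more buffers in flight than it has entries."""
    return inflight(last_avail_idx, used_idx) <= size


if __name__ == "__main__":
    # Values taken from the log line for VQ 2.
    size, last_avail_idx, used_idx = 0x80, 0xB57, 0xB59
    print(hex(inflight(last_avail_idx, used_idx)))
    print(queue_state_is_consistent(size, last_avail_idx, used_idx))
```

With the values from the log, used_idx is 2 ahead of last_avail_idx, so the 16-bit difference wraps to 0xfffe, which is far larger than the queue size of 0x80. That is consistent with a disagreement about the state of the queue rather than about config space.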