QEMU KVM live migration crashes when the VM is in booting state

Bug #1872107 reported by Rudolph Bott on 2020-04-10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
qemu-kvm
Unknown
Unknown
qemu (Ubuntu)
Undecided
Unassigned

Bug Description

During a QEMU KVM live migration the sending process crashes if the VM is currently in a booting state (and possibly also during a 'soft' reboot from inside the VM). This has been fixed upstream:

https://github.com/qemu/qemu/commit/9b3a31c745b61758aaa5466a3a9fc0526d409188

There are also bug reports available for this problem:

https://bugzilla.redhat.com/show_bug.cgi?id=1771032
https://bugzilla.redhat.com/show_bug.cgi?id=1772774

I stumbled over this problem while testing latest builds of Ganeti (https://github.com/ganeti/ganeti) on Ubuntu Focal Fossa which ships qemu 4.2. The Ganeti QA suite runs a series of tests against a cluster and issues a VM failover (QEMU shutdown on Node A and start on Node B) directly followed by a live migration (QEMU live migration from Node B to Node A). The sending QEMU process dies with this error message:

qemu-system-x86_64: /build/qemu-oknQD6/qemu-4.2/accel/kvm/kvm-all.c:653: kvm_log_clear_one_slot: Assertion `mem->dirty_bmap' failed.
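When going through logs from the sending node, this crash can be recognised by its assertion text. A minimal Python sketch, assuming the log line has the same shape as the message above; the helper name `is_dirty_bmap_crash` is hypothetical and not part of QEMU or Ganeti:

```python
import re

# Match only the function name and assertion text, so that a different
# build path or source line number does not prevent a match.
_SIGNATURE = re.compile(
    r"kvm_log_clear_one_slot: Assertion `mem->dirty_bmap' failed"
)


def is_dirty_bmap_crash(log_line: str) -> bool:
    """Return True if the line carries the assertion failure from this report."""
    return _SIGNATURE.search(log_line) is not None
```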

If you add 'sleep 2' between the reboot and the live migration instructions everything works fine, because the QEMU VM has left the booting state by the time the live migration starts. From a Ganeti point of view, this only happens when using the "sharedfile" storage backend. When you use e.g. DRBD, the Ganeti commands take a bit longer to finish which gives the VM enough time to boot up.
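The 'sleep 2' workaround can be sketched in Python roughly as follows. This is only an illustration, not how the Ganeti QA suite is implemented: the function name `failover_then_migrate` and its parameters are hypothetical, and it assumes `gnt-instance` is on the PATH and that `-f` suppresses the confirmation prompt.

```python
import subprocess
import time


def failover_then_migrate(instance, delay=2.0, run=subprocess.run, sleep=time.sleep):
    """Fail over an instance, pause, then live-migrate it back."""
    # Assumption: -f skips the interactive confirmation of gnt-instance.
    run(["gnt-instance", "failover", "-f", instance], check=True)
    # Workaround from the report: give the guest time to leave the booting
    # state before the live migration starts.
    sleep(delay)
    run(["gnt-instance", "migrate", "-f", instance], check=True)
```

The `run` and `sleep` parameters exist only so the sequence can be exercised without a cluster. The delay merely sidesteps the timing window; the upstream commit linked above is the actual fix.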

Debian Bullseye (which ships the same QEMU version as Focal) shows the exact same behaviour.

Christian Ehrhardt  (paelzer) wrote :

Thanks for the report, Rudolph.

I missed that fix when I was scanning for things that went to stable@qemu, since this wasn't tagged for it.

It went into v5.0.0-rc0 and will be released with it.
There are no follow-on fixes since then in upstream/master.

I'm working on another fix upload for focal anyway and made this part of it.

Changed in qemu-kvm (Ubuntu):
status: New → In Progress

Christian Ehrhardt  (paelzer) wrote :

I did some extra migration regression tests with the patch applied and everything still worked.
This is now in focal-proposed, blocked by some seemingly unrelated tests.
It should be in 20.04 soon.

Christian Ehrhardt  (paelzer) wrote :

Hmm, not sure why the auto-update won't reach this, but:

This bug was fixed in the package qemu - 1:4.2-3ubuntu5

---------------
qemu (1:4.2-3ubuntu5) focal; urgency=medium

  * d/p/ubuntu/lp-1871830-*: avoid crash when using QEMU_MODULE_DIR
    (LP: #1871830)
  * Security and packaging fixes (LP: #1872937)
    - arm-fix-PAuth-sbox-functions-CVE-2020-10702.patch
    - net-tulip-check-frame-size-and-r-w-data-length-CVE-2020-11102.patch
      CVE-2020-10702
      CVE-2020-11102
    - fix external spice UI
      + install ui-spice-app.so in qemu-system-common
      + install ui-spice-app.so only if built, spice is optional
    - switch binfmt registration to use update-binfmts --[un]import (#866756)
    - qemu-system-gui: Multi-Arch=same, not foreign (#956763)
    - qemu-system-data: s/highcolor/hicolor/ (#955741)
  * d/p/ubuntu/lp-1872107*: fix migration while rebooting guests (LP: #1872107)

 -- Christian Ehrhardt <email address hidden> Wed, 15 Apr 2020 11:26:44 +0200

Changed in qemu-kvm (Ubuntu):
status: In Progress → Fix Released

Christian Ehrhardt  (paelzer) wrote :

Oh, this was filed against the wrong package.
qemu-kvm hasn't existed since precise.

affects: qemu-kvm (Ubuntu) → qemu (Ubuntu)