Comment 40 for bug 1647389

Len (lwhite-5) wrote : RE: [Bug 1647389] Re: Regression: Live migrations can still crash after CVE-2016-5403 fix

I also forgot to mention that in our case it didn't matter whether the migration was tunneled or not, and turning off the memory stats in virsh before migrating didn't help at all. I did not have access to the instance to try playing around with blnsrvr.exe, though.

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Dave Chiluk
Sent: Friday, March 31, 2017 7:47 PM
To: Len White <email address hidden>
Subject: [Bug 1647389] Re: Regression: Live migrations can still crash after CVE-2016-5403 fix

I just tested removing CVE-2016-5403-3.patch, but that didn't seem to do it. I still don't understand how upstream qemu functions with the calculation the way it is.

--
You received this bug notification because you are subscribed to the bug report.
https://bugs.launchpad.net/bugs/1647389

Title:
  Regression: Live migrations can still crash after CVE-2016-5403 fix

Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Xenial:
  Confirmed

Bug description:
  [Impact]

   * Live migrations tunnelled through libvirt fail on the destination
  with the error VQ 2 size 0x80 < last_avail_idx 0x9 - used_idx 0xa

   * TBD: justification for backporting the fix to the stable release.
   * TBD: In addition, it is helpful, but not required, to include an
     explanation of how the upload fixes this bug.

  [Test Case]
  1. Create a VM on shared storage (NFS in my case).
  2. set start_libvirtd="yes" in /etc/default/libvirt-bin
  3. systemctl restart libvirt-bin
  4. virsh dommemstat 1 <vm>
  5. virsh -c qemu+ssh://${FROM}/system migrate --live --p2p --tunnelled ${VM} qemu+tcp://ubuntu@${TO}/system
  6. Repeat until the migration fails, then check /var/log/libvirt/qemu/<vm>.log for the error from above.

  * Yes, --live, --p2p, and --tunnelled are all required to reproduce,
  as far as I know.
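
For anyone scripting the test case, here is a rough Python sketch of the migrate step and the log check. It is only an illustration: the helper names (migrate_command, log_has_vq_error, migration_failed) are made up for this example, the virsh arguments are copied from the migrate step above, and moving the VM back between attempts is left to the caller.

```python
import re
import subprocess

# Signature of the failure quoted in the bug description.
VQ_ERROR = re.compile(
    r"VQ \d+ size 0x[0-9a-f]+ < last_avail_idx 0x[0-9a-f]+ - used_idx 0x[0-9a-f]+"
)


def migrate_command(src, dst, vm):
    """Build the virsh invocation from the test case as an argv list."""
    return [
        "virsh", "-c", f"qemu+ssh://{src}/system",
        "migrate", "--live", "--p2p", "--tunnelled",
        vm, f"qemu+tcp://ubuntu@{dst}/system",
    ]


def log_has_vq_error(log_text):
    """Return True if the qemu log text contains the VQ size error."""
    return VQ_ERROR.search(log_text) is not None


def migration_failed(src, dst, vm, run=subprocess.run):
    """Run one migration attempt; True means virsh exited non-zero.

    `run` is injectable (hypothetical convenience) so the function can be
    exercised without a real libvirt setup.
    """
    result = run(migrate_command(src, dst, vm))
    return result.returncode != 0
```

After a failed attempt, the contents of /var/log/libvirt/qemu/<vm>.log on the destination can be passed to log_has_vq_error() to confirm it is this bug and not some other migration failure.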

  [Regression Potential]
  TBD
   * discussion of how regressions are most likely to manifest as a result of this change.
   * It is assumed that any SRU candidate patch is well-tested before
     upload and has a low overall risk of regression, but it's important
     to make the effort to think about what ''could'' happen in the
     event of a regression.
   * This both shows the SRU team that the risks have been considered,
     and provides guidance to testers in regression-testing the SRU.

  [Other Info]
  TBD
   * Anything else you think is useful to include
   * Anticipate questions from users, SRU, +1 maintenance, security teams and the Technical Board
   * and address these questions in advance

  ___________________ Original Description follows _____________________

  See updates at the end of #1612089. Sample error message:

  Dec 05 14:41:07 zbk130713 libvirtd[29690]: internal error: early end of file from monitor, possible problem:
  2016-12-05T14:41:07.903932Z qemu-system-x86_64: VQ 2 size 0x80 < last_avail_idx 0x9 - used_idx 0xa
  2016-12-05T14:41:07.903981Z qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:05.0/virtio-balloon'
  2016-12-05T14:41:07.905180Z qemu-system-x86_64: load of migration failed: Operation not permitted
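
To make the numbers in that message concrete, here is a small Python sketch of the arithmetic. It assumes the destination computes the number of in-flight buffers as the 16-bit difference last_avail_idx - used_idx and rejects the state when that exceeds the queue size, which is what the wording of the message suggests; the actual qemu code is not quoted in this report, and the function name inuse_from_log is made up for the example.

```python
import re

# Matches the error line quoted in this bug and captures its hex fields.
VQ_ERROR = re.compile(
    r"VQ (?P<vq>\d+) size 0x(?P<size>[0-9a-f]+) < "
    r"last_avail_idx 0x(?P<avail>[0-9a-f]+) - used_idx 0x(?P<used>[0-9a-f]+)"
)


def inuse_from_log(line):
    """Return (queue_size, inuse) for a VQ error line, or None if no match.

    inuse is computed modulo 2**16, on the assumption that the ring
    indices are 16-bit counters that wrap around.
    """
    m = VQ_ERROR.search(line)
    if m is None:
        return None
    size = int(m.group("size"), 16)
    avail = int(m.group("avail"), 16)
    used = int(m.group("used"), 16)
    inuse = (avail - used) & 0xFFFF
    return size, inuse


line = ("2016-12-05T14:41:07.903932Z qemu-system-x86_64: "
        "VQ 2 size 0x80 < last_avail_idx 0x9 - used_idx 0xa")
size, inuse = inuse_from_log(line)
print(hex(size), hex(inuse), inuse > size)
```

Under that assumption, used_idx being one ahead of last_avail_idx (0xa versus 0x9) makes the difference wrap to 0xffff, far larger than the 0x80 queue size, so the load is refused.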

  Seems related to this patch series:
  https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg03079.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1647389/+subscriptions