QEMU Does not Send L2 Broadcasts After Live Migration

Bug #1656480 reported by Dmitrii Shcherbakov
This bug affects 1 person
Affects         Status        Importance  Assigned to          Milestone
qemu (Ubuntu)   Fix Released  High        Unassigned
Xenial          Fix Released  High        Dmitrii Shcherbakov

Bug Description

[Impact]
L2 broadcasts are sent to update the MAC tables of the switches in the same broadcast domain. If they are not sent after live migration, clients connecting to services on the migrated VM may see temporary downtime, because switches keep forwarding frames to the old destination. Behavior will also vary with the guest operating system and the software installed in it: in some cases the VM's own network activity will force the MAC table update across the L2 network, but this is not guaranteed.
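
The effect of a missing announcement can be sketched with a toy learning switch. This is only an illustrative model, not code from QEMU or from any real switch; the class name, the method names and the port numbers are all made up for the example.

```python
class LearningSwitch:
    """Toy model of a switch's MAC (forwarding) table. Hypothetical names."""

    def __init__(self):
        # MAC address -> port on which that address was last seen as a source
        self.mac_table = {}

    def receive(self, port, src_mac):
        # A learning switch records the source MAC of every frame it sees.
        self.mac_table[src_mac] = port

    def lookup(self, dst_mac):
        # None stands for "unknown destination" (a real switch would flood).
        return self.mac_table.get(dst_mac)


VM_MAC = "52:54:00:25:49:b7"
switch = LearningSwitch()

switch.receive(port=1, src_mac=VM_MAC)  # VM runs on the source host (port 1)
stale = switch.lookup(VM_MAC)           # after migration, still points at port 1

switch.receive(port=2, src_mac=VM_MAC)  # broadcast arrives from the destination host
fresh = switch.lookup(VM_MAC)           # table now points at port 2

print(stale, fresh)
```

Until some frame with the VM's source MAC arrives from the new port, lookups keep returning the old port, which is the downtime window described above.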

[Test case]
Steps are provided in the bug description.

[Regression Potential]
Small: the change is trivial and is present in both upstream QEMU and the newer Ubuntu releases. This code path also does not change heavily upstream.

--

In short:
1) Get two Xenial hosts (these could also be two LXD containers with QEMU inside, or two VMs with nested virtualization enabled; it doesn't matter);
2) Create a libvirt domain that uses QEMU (it can be a bare instance, even without a disk, with a NIC that has no IP address, since we are testing L2 broadcasts);
3) Launch the instance;
4) Start listening for RARP packets on the destination host's bridge (QEMU uses a RARP L3 header and a broadcast L2 header, which is easy to filter because no other software sends RARPs nowadays);
5) Do virsh migrate --live domain desturi;
6) Observe that no RARP packets are sent.
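
For reference, a frame of the kind being listened for can be sketched as follows. This is a minimal sketch and not QEMU's code: the function name is hypothetical, the field layout follows the generic ARP/RARP packet format, and the zeroed protocol-address fields and the padding to 60 bytes are assumptions chosen to be consistent with a 60-byte frame carrying a 46-byte payload.

```python
import struct

ETHERTYPE_RARP = 0x8035   # matches "ethertype Reverse ARP (0x8035)" in tcpdump
RARP_REQUEST = 3          # "Reverse Request" opcode


def build_self_announce(mac: bytes) -> bytes:
    """Build a broadcast RARP request announcing `mac` (hypothetical helper)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # L2 header: broadcast destination, VM's MAC as source, RARP ethertype.
    eth = b"\xff" * 6 + mac + struct.pack("!H", ETHERTYPE_RARP)
    # RARP body: Ethernet hardware type, IPv4 protocol type, address lengths, opcode.
    rarp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, RARP_REQUEST)
    # Sender and target hardware address are both the VM's MAC;
    # protocol (IP) addresses are left as zeros (assumption).
    rarp += mac + b"\x00" * 4 + mac + b"\x00" * 4
    # Pad to the 60-byte minimum Ethernet frame size (without FCS).
    return (eth + rarp).ljust(60, b"\x00")


frame = build_self_announce(bytes.fromhex("5254002549b7"))
```

The switch only needs the L2 source address and the broadcast destination to relearn the port; the RARP body is what makes the frame easy to filter on.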

The expected result (QEMU actually sends 5 of those, grep for SELF_ANNOUNCE_ROUNDS == 5 in the sources):

sudo tcpdump -e -i br0 rarp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:19:32.460765 52:54:00:25:49:b7 (oui Unknown) > Broadcast, ethertype Reverse ARP (0x8035), length 60: Reverse Request who-is 52:54:00:25:49:b7 (oui Unknown) tell 52:54:00:25:49:b7 (oui Unknown), length 46
18:19:32.504609 52:54:00:25:49:b7 (oui Unknown) > Broadcast, ethertype Reverse ARP (0x8035), length 60: Reverse Request who-is 52:54:00:25:49:b7 (oui Unknown) tell 52:54:00:25:49:b7 (oui Unknown), length
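
When repeating the test, the captured announcements can be counted with a small script instead of by eye. This is a hedged sketch: the function name and the regular expression are my own, and the pattern assumes the `tcpdump -e` line format shown above.

```python
import re

# Matches "<timestamp> <src mac> ... > Broadcast, ethertype Reverse ARP (0x8035)"
RARP_LINE = re.compile(
    r"^\S+ (?P<src>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r".*> Broadcast, ethertype Reverse ARP \(0x8035\)"
)


def count_announcements(tcpdump_output: str, mac: str) -> int:
    """Count broadcast RARP frames whose L2 source is `mac` (hypothetical helper)."""
    count = 0
    for line in tcpdump_output.splitlines():
        match = RARP_LINE.match(line)
        if match and match.group("src") == mac.lower():
            count += 1
    return count


sample = """\
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:19:32.460765 52:54:00:25:49:b7 (oui Unknown) > Broadcast, ethertype Reverse ARP (0x8035), length 60: Reverse Request who-is 52:54:00:25:49:b7 (oui Unknown) tell 52:54:00:25:49:b7 (oui Unknown), length 46
18:19:32.504609 52:54:00:25:49:b7 (oui Unknown) > Broadcast, ethertype Reverse ARP (0x8035), length 60: Reverse Request who-is 52:54:00:25:49:b7 (oui Unknown) tell 52:54:00:25:49:b7 (oui Unknown), length
"""
announcements = count_announcements(sample, "52:54:00:25:49:b7")
```

On an affected host the count for the migrated domain's MAC stays at zero after the migration completes.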

---
lsb_release -r
Release: 16.04

dpkg --status qemu-kvm | grep Version
Version: 1:2.5+dfsg-5ubuntu10.6
--

Fortunately, there is already a fix for this:
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04645.html

http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ca1ee3d6b546e841a1b9db413eb8fa09f13a061b;hp=14e60aaece20a1cfc059a69f6491b0899f9257a8

The issue was introduced in fefe2a78abde932e0f340b21bded2c86def1d242:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=fefe2a78abde932e0f340b21bded2c86def1d242

$ apt-get source qemu

# we are looking at qemu_2.5+dfsg debian branch
$ ls qemu*dsc
qemu_2.5+dfsg-5ubuntu10.6.dsc

$ git remote show origin | grep Fetch
  Fetch URL: git://anonscm.debian.org/pkg-qemu/qemu.git

$ git checkout debian/qemu_2.5+dfsg-5
HEAD is now at aa4dbf2... Uploading version 2.5+dfsg-5 to unstable

The commit that introduced the issue is present in qemu_2.5+dfsg-5.
$ git merge-base --is-ancestor fefe2a78abde932e0f340b21bded2c86def1d242 HEAD ; echo $?
0

There are no patches for it:

grep -RiP 'QEMU_NET_PACKET_FLAG_RAW' qemu-2.5+dfsg/debian/patches/ ; echo $?
1

description: updated
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "net-fix-qemu_announce_self-not-emitting-packets.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
description: updated
Mathew Hodson (mhodson)
Changed in qemu (Ubuntu):
importance: Undecided → High
James Page (james-page)
Changed in qemu (Ubuntu Xenial):
importance: Undecided → High
status: New → Triaged
Changed in qemu (Ubuntu):
status: New → Fix Released
Changed in qemu (Ubuntu Xenial):
assignee: nobody → Dmitrii Shcherbakov (dmitriis)
status: Triaged → In Progress
Dmitrii Shcherbakov (dmitriis) wrote :

The fix is present in yakkety:

> pull-lp-source qemu yakkety
pull-lp-source: Downloading qemu version 1:2.6.1+dfsg-0ubuntu5.2
pull-lp-source: Downloading qemu_2.6.1+dfsg.orig.tar.xz from archive.ubuntu.com (6.021 MiB)
pull-lp-source: Downloading qemu_2.6.1+dfsg-0ubuntu5.2.debian.tar.xz from archive.ubuntu.com (0.117 MiB)
...

> grep 'if (nc->info->receive_iov && !(flags & QEMU_NET_PACKET_FLAG_RAW))' net/net.c
    if (nc->info->receive_iov && !(flags & QEMU_NET_PACKET_FLAG_RAW)) {
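
The role of the flag check in that condition can be illustrated with a toy dispatch model. This is not QEMU code: the backend class, the `deliver` function, the `check_raw_flag` switch and the flag's bit value are all hypothetical, and the "flag ignored" branch is only my assumption about what the unpatched condition amounts to; check it against the upstream commit before relying on it.

```python
QEMU_NET_PACKET_FLAG_RAW = 1 << 0  # assumed bit value, for illustration only


class ToyBackend:
    """Hypothetical backend that records which callback handled a packet."""

    def __init__(self):
        self.handled_by = []

    def receive_iov(self, iov):
        self.handled_by.append("receive_iov")

    def receive_raw(self, buf):
        self.handled_by.append("receive_raw")


def deliver(backend, iov, flags, check_raw_flag=True):
    raw = bool(flags & QEMU_NET_PACKET_FLAG_RAW)
    if check_raw_flag:
        # Mirrors the grepped condition: the iov path is taken only for non-raw packets.
        take_iov_path = not raw
    else:
        # Assumed unpatched behaviour: the raw flag is not consulted.
        take_iov_path = True
    if take_iov_path:
        backend.receive_iov(iov)
    else:
        backend.receive_raw(b"".join(iov))


patched, unpatched = ToyBackend(), ToyBackend()
deliver(patched, [b"announce"], QEMU_NET_PACKET_FLAG_RAW)
deliver(unpatched, [b"announce"], QEMU_NET_PACKET_FLAG_RAW, check_raw_flag=False)
```

In this model the raw-flagged announcement reaches the raw handler only when the flag is checked; ordinary packets take the iov path either way.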

James Page (james-page) wrote :

Thanks for confirming that Dmitrii; marking dev task as fix released, and targeted to Xenial only.

description: updated
Christian Ehrhardt  (paelzer) wrote :

Hi Dmitrii,
the review is fine; the patch is upstream and applies cleanly - thanks for your work.
I additionally ran some upgrade and migration tests, and as expected all passed OK.

That said, I am sponsoring your upload now.
qemu_2.5+dfsg-5ubuntu10.7 just left the queue - in some bits of launchpad it is still listed as in queue - I hope this causes no collision.

As of now https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.7 still lists it in proposed. And it is the one missing on https://bugs.launchpad.net/qemu/+bug/1626972 to migrate.

But other than that I'd expect no issues / delays.

Dmitrii Shcherbakov (dmitriis) wrote :

Hi Christian,

Thanks for the info.

10.7 is still shown as in -proposed here.
http://people.canonical.com/~ubuntu-archive/pending-sru.html

It looks like net/net.c was not modified - only vhost + some other things so there shouldn't be a collision there.
https://launchpadlibrarian.net/294687058/qemu_1%3A2.5+dfsg-5ubuntu10.5_1%3A2.5+dfsg-5ubuntu10.7.diff.gz

Christian Ehrhardt  (paelzer) wrote :

FYI 10.7 migrated a few minutes ago, the SRU Team can now evaluate your fix to move it to proposed.

Andy Whitcroft (apw) wrote : Please test proposed package

Hello Dmitrii, or anyone else affected,

Accepted qemu into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in qemu (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Dmitrii Shcherbakov (dmitriis) wrote :

I've used two lxd containers interconnected via a bridge to verify this. Each container was a privileged one and had a config as follows (basically allowed /dev/kvm and /dev/net/tun usage and did not create a separate user namespace).

name: cq1
profiles:
- default
config:
  environment.LC_ALL: en_US.UTF-8
  linux.kernel_modules: iptable_nat, ip6table_nat, ebtables, openvswitch, kvm, kvm_intel
  raw.lxc: lxc.cgroup.devices.allow = c 10:232 rwm, lxc.cgroup.devices.allow = c 10:200
    rwm
  security.nesting: "true"
  security.privileged: "true"
...
devices:
  kvm:
    path: /dev/kvm
    type: unix-char
  root:
    path: /
    type: disk
  tun:
    path: /dev/net/tun
    type: unix-char
ephemeral: false

lxc list | grep cq
| cq1 | RUNNING | 192.168.122.1 (virbr0) | | PERSISTENT | 0 |
| cq2 | RUNNING | 192.168.122.1 (virbr0) | | PERSISTENT | 0 |

----

Preparation steps:

root@cq1:~# grep proposed /etc/apt/sources.list
deb http://archive.ubuntu.com/ubuntu/ xenial-proposed restricted main multiverse universe

root@cq1:~# apt-get install {qemu,qemu-user,qemu-utils,qemu-system,qemu-system-x86,qemu-system-common}/xenial-proposed
Reading package lists... Done
Building dependency tree
Reading state information... Done
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu'
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu-user'
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu-utils'
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu-system'
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu-system-x86'
Selected version '1:2.5+dfsg-5ubuntu10.8' (Ubuntu:16.04/xenial-proposed [amd64]) for 'qemu-system-common'
The following additional packages will be installed:
  binfmt-support cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0 libbluetooth3 libboost-iostreams1.58.0 libboost-random1.58.0 libboost-system1.58.0 libboost-thread1.58.0 libbrlapi0.6 libcaca0
  libcacard0 libfdt1 libflac8 libiscsi2 libjpeg-turbo8 libjpeg8 libnspr4 libnss3 libnss3-nssdb libogg0 libopus0 libpixman-1-0 libpulse0 librados2 librbd1 libsdl1.2debian libsndfile1 libspice-server1 libusbredirparser1
  libvorbis0a libvorbisenc2 libxen-4.6 libxenstore3.0 libyajl2 msr-tools qemu-block-extra qemu-slof qemu-system-arm qemu-system-mips qemu-system-misc qemu-system-ppc qemu-system-sparc qemu-user-static seabios sharutils
Suggested packages:
  libasound2-plugins alsa-utils opus-tools pulseaudio samba vde2 qemu-block-extra openbios-ppc openhackware sgabios ovmf debootstrap bsd-mailx | mailx
The following NEW packages will be installed:
  binfmt-support cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0 libbluetooth3 libboost-iostreams1.58.0 libboost-random1.58.0 libboost-system1.58.0 libboost-thread1.58.0 libbrlapi0.6 libcaca0
  libcacard0 libfdt1 libflac8 libiscsi2 libjpeg-turbo8 libjpeg8 libnspr4 libnss3 libnss3-nssdb libogg0 libopus0 libpixman-1-0 libpulse0 librados2 librbd1 libsdl1.2debian...

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.5+dfsg-5ubuntu10.8

---------------
qemu (1:2.5+dfsg-5ubuntu10.8) xenial; urgency=medium

  [ Dmitrii Shcherbakov ]
  * d/p/ubuntu/net-fix-qemu_announce_self-not-emitting-packets.patch:
     Cherrypick upstream patch: net: fix qemu_announce_self not emitting
     packets (LP: #1656480)

 -- Christian Ehrhardt <email address hidden> Mon, 23 Jan 2017 15:12:05 +0100

Changed in qemu (Ubuntu Xenial):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Rafael David Tinoco (rafaeldtinoco) wrote :

For Mitaka, the fix for this bug will be included in UCA together with the fix for:

https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1641532

When it becomes available.
