Linux (3.13 to 3.16) as a Router under QEmu +2, does not route VLAN tagged packets, that are originated within the Hypervisor itself.

Bug #1362755 reported by Thiago Martins
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
Linux | Fix Released | Undecided | Unassigned |

Bug Description

Guys,

Trusty QEmu 2.0 Hypervisor fails to create a consistent virtual network. It does not route tagged VLAN packets.

That's it: it is impossible to use Trusty as a QEmu 2.0 Hypervisor (metapackage `ubuntu-virt-server`) to build a basic tagged virtual network within itself. A QEmu 2.X guest does not route traffic when tagged VLANs are in use!

So, a Trusty QEmu 2.0 Hypervisor cannot be used to host guests acting as "firewalls / routers", and it has an easy-to-reproduce connectivity problem.

This network problem affects Ubuntu 14.04.1 (Linux-3.13.0-35-generic) with QEmu 2.0 (it also affects 14.10, Linux 3.16 - QEmu 2.1).

I have this very same setup up and running on about 100 physical servers (other Trusty QEmu 2.0 Hypervisors), and it fails to work as expected on only a few of them: the QEmu Hypervisors dedicated to hosting "guests acting as routers / firewalls", like a "border gateway" for example.

One interesting thing to note is that this BUG appears only on the QEmu Hypervisors dedicated to hosting guests that are used as `routers / firewalls` (as I said above); the other QEmu Hypervisors on my network do not suffer from this problem.

Another interesting point is that it fails to route tagged VLAN packets only when these packets originate from within the Hypervisor itself; I mean, packets from both the host and other guests (not the router/firewall guest itself) suffer from this connectivity problem.

As a workaround / fix, Xen-4.4 can be used instead of QEmu 2.0 as a "border hypervisor". So, this proves that there is something wrong with QEmu.

I already tested it with both `openvswitch-switch` and `bridge-utils`, with the same bad results. So, don't waste your time trying `bridge-utils` (an optional step while reproducing it); you can keep the OVS bridges from the original design.

I think that I'm using the best practices to build this environment, as follows...

* Topology *

QEmu 2.0 Hypervisor - (qemu-host-1.domain.com - the "border hypervisor"):

1- Physical machine with 3 NICs;
2- Minimal Ubuntu 14.04.1 installed and upgraded;
3- Packages installed: "ubuntu-virt-server openvswitch-switch rdnssd tcpdump".

- eth0 connected to the Internet - VLAN tag 10;
- eth1 connected to the LAN1 - VLAN tag 100;
- eth2 connected to the LAN2 - VLAN tag 200;

Guest (guest-fw-1.domain.com - the "border gateway" itself - regular guest acting as a router with iptables/ip6tables):

1- Virtual Machine with 3 NICs (VirtIO);
2- Minimal Virtual Machine Ubuntu 14.04.1 installed and upgraded;
3- Packages installed: "aiccu iptables vlan pv-grub-menu".

Note: You'll need `virt-manager` to connect to `qemu-host-1` and install `guest-fw-1`. Then, use `guest-fw-1` as the default gateway for your (virt-)lab network, including `qemu-host-1` itself.

Steps to reproduce

* Preparing the `qemu-host-1` host:

- Configure the /etc/network/interfaces with:

---
# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual
        up ip link set $IFACE up
        down ip link set $IFACE down

auto eth1
iface eth1 inet manual
 up ip link set dev $IFACE up
 down ip link set dev $IFACE down

auto ovsbr1p1
iface ovsbr1p1 inet6 auto

iface ovsbr1p1 inet static
 address 192.168.1.10
 netmask 24
 gateway 192.168.1.1

auto eth2
iface eth2 inet manual
 up ip link set $IFACE up
 down ip link set $IFACE down
---

- Creating the Hypervisor OVS Bridges:

ovs-vsctl add-br ovsbr0
ovs-vsctl add-br ovsbr1
ovs-vsctl add-br ovsbr2

- Attaching the bridges to the NICs:

ovs-vsctl add-port ovsbr0 eth0
ovs-vsctl add-port ovsbr1 eth1
ovs-vsctl add-port ovsbr2 eth2

- Creating the OVS internal tagged interface (best practice?), so the QEmu Hypervisor itself can have its own IP (v4 and v6):

ovs-vsctl add-port ovsbr1 ovsbr1p1 tag=100 -- set interface ovsbr1p1 type=internal
ovs-vsctl set interface ovsbr1p1 mac=\"32:ac:85:72:ab:fe\"

 NOTE:

 * I'm fixing the MAC address of ovsbr1p1 because I like to use IPv6 with SLAAC, so its address remains the same across host reboots.

- Making Libvirt aware of OVS Bridges:

Create 3 files, one for each bridge, like this (ovsbr0.xml, ovsbr1.xml and ovsbr2.xml):

--- ovsbr0.xml contents:
<network>
 <name>ovsbr0</name>
 <forward mode='bridge'/>
 <bridge name='ovsbr0'/>
 <virtualport type='openvswitch'/>
</network>
---

--- ovsbr1.xml contents:
<network>
 <name>ovsbr1</name>
 <forward mode='bridge'/>
 <bridge name='ovsbr1'/>
 <virtualport type='openvswitch'/>
</network>
---

--- ovsbr2.xml contents:
<network>
 <name>ovsbr2</name>
 <forward mode='bridge'/>
 <bridge name='ovsbr2'/>
 <virtualport type='openvswitch'/>
</network>
---

Run:

virsh net-define ovsbr0.xml
virsh net-define ovsbr1.xml
virsh net-define ovsbr2.xml

virsh net-autostart ovsbr0
virsh net-autostart ovsbr1
virsh net-autostart ovsbr2

virsh net-start ovsbr0
virsh net-start ovsbr1
virsh net-start ovsbr2
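
The three definitions differ only in the bridge name, so the XML files and the `virsh` calls above can be generated in a loop. This is a minimal sketch, assuming a POSIX shell; `ovs_net_xml` and `define_ovs_nets` are hypothetical helper names (not part of libvirt), and `RUN=echo` is only a dry-run convention of this sketch:

```shell
#!/bin/sh
# Emit the libvirt network definition for one OVS bridge ($1 = bridge name).
# Hypothetical helper; the XML mirrors the ovsbrN.xml files shown above.
ovs_net_xml() {
    cat <<EOF
<network>
 <name>$1</name>
 <forward mode='bridge'/>
 <bridge name='$1'/>
 <virtualport type='openvswitch'/>
</network>
EOF
}

# Write NAME.xml in the current directory, then define, autostart and
# start one libvirt network per bridge name given as argument.
# With RUN=echo the virsh commands are printed instead of executed.
define_ovs_nets() {
    for br in "$@"; do
        ovs_net_xml "$br" > "$br.xml"
        ${RUN:-} virsh net-define "$br.xml"
        ${RUN:-} virsh net-autostart "$br"
        ${RUN:-} virsh net-start "$br"
    done
}
```

For example, `RUN=echo define_ovs_nets ovsbr0 ovsbr1 ovsbr2` prints the nine `virsh` commands for review; dropping `RUN=echo` runs them.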

- Creating the "guest-fw-1.domain.com" (Ubuntu 14.04.1 - Minimum Virtual Machine):

1- VM configuration file (network section only, trimmed):

---
    <interface type='network'>
      <mac address='52:54:00:41:8c:3f'/>
      <source network='ovsbr0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:27:b2:7d'/>
      <source network='ovsbr1'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:ff:35:5c'/>
      <source network='ovsbr2'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </interface>
---

2- Configure "guest-fw-1.domain.com" (the router / firewall guest) /etc/network/interfaces file like this:

---
auto vlan10
iface vlan10 inet static
        vlan_raw_device eth0
        address 200.2.1.106
        netmask 29
        gateway 200.2.1.105
        dns-nameserver 8.8.8.8

auto vlan100
iface vlan100 inet6 static
        vlan_raw_device eth1
        address 2001:129X:2XX:810X::2
        netmask 64
        dns-nameserver 2001:4860:4860::8844 2001:4860:4860::8888

iface vlan100 inet static
        vlan_raw_device eth1
        address 192.168.4.1
        netmask 24

auto vlan200
iface vlan200 inet6 static
        vlan_raw_device eth2
        address 2001:1291:2de:10::1
        netmask 64

iface vlan200 inet static
        vlan_raw_device eth2
        address 172.16.0.1
        netmask 24
---

3- Enable radvd for your LANs:

---
# SERVERS
interface vlan100 {
        AdvSendAdvert on;
        MinRtrAdvInterval 3;
        MaxRtrAdvInterval 10;
        AdvLinkMTU 1500;
        AdvDefaultPreference high;
        prefix 2001:1291:200:850a::/64 {
                DeprecatePrefix on;
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr on;
        };
        route ::/0 {
                RemoveRoute on;
        };
        RDNSS 2001:4860:4860::8844 2001:4860:4860::8888 { };
        DNSSL domain.com.br { };
};
# DESKTOPS
interface vlan200 {
        AdvSendAdvert on;
        MinRtrAdvInterval 3;
        MaxRtrAdvInterval 10;
        AdvLinkMTU 1500;
        AdvDefaultPreference high;
        prefix 2001:1291:2de:10::/64 {
                DeprecatePrefix on;
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr on;
        };
        route ::/0 {
                RemoveRoute on;
        };
        RDNSS 2001:4860:4860::8844 2001:4860:4860::8888 { };
        DNSSL igcorp.com.br { };
};
---

4- HIT THE BUG!

 Go to `qemu-host-1.domain.com` and try to run "apt-get update": it will not work! Ping works... TCP connections don't.

 The gateway of `qemu-host-1.domain.com` (through ovsbr1p1) is the QEmu 2.0 Virtual Machine hosted on itself, the guest `guest-fw-1.domain.com`.

Details:

---
root@qemu-host-1:~# ip r
default via 192.168.4.1 dev ovsbr1p1
192.168.4.0/24 dev ovsbr1p1 proto kernel scope link src 192.168.4.2
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1

root@qemu-host-1:~# ip -6 r | grep ovsbr1p1
2001:1291:200:850a::/64 dev ovsbr1p1 proto kernel metric 256 expires 86397sec
fe80::/64 dev ovsbr1p1 proto kernel metric 256
default via fe80::5054:ff:feb5:7744 dev ovsbr1p1 proto ra metric 1024 expires 27sec

# ping6 okay...
root@qemu-host-1:~# ping6 google.com -c1
PING google.com(2800:3f0:4001:815::1007) 56 data bytes
64 bytes from 2800:3f0:4001:815::1007: icmp_seq=1 ttl=55 time=44.5 ms

--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 44.579/44.579/44.579/0.000 ms

# traceroute6 okay...
root@qemu-host-1:~# traceroute6 google.com
traceroute to google.com (2800:3f0:4001:815::1007) from 2001:1291:200:850a:1054:3d86:369:d4b2, 30 hops max, 24 byte packets
 1 2001:1291:200:850a::2 (2001:1291:200:850a::2) 0.394 ms 0.261 ms 0.223 ms
 2 gw-1291.udi-01.br.sixxs.net (2001:1291:200:50a::1) 21.536 ms 20.738 ms 20.902 ms
 3 brudi01.sixxs.net (2001:1291:2::b) 20.684 ms 20.74 ms 20.846 ms
 4 ge-0-2-0-71.seed.ula001.ctbc.com.br (2001:1291:2::a) 197.392 ms 141.706 ms 21.058 ms
 5 ge-5-2-0-0.core-d.ula001.ctbc.com.br (2001:1291:0:98::a) 21.069 ms 20.837 ms 20.903 ms
 6 ae0-0.core-b.fac001.ctbc.com.br (2001:1291:0:d7::a) 24.564 ms 24.464 ms 24.649 ms
 7 et-1-0-0-0.border-a.fac001.ctbc.com.br (2001:1291:0:4b::b) 24.734 ms 24.525 ms 25.273 ms
 8 2001:1291:0:63::2 (2001:1291:0:63::2) 36.619 ms 36.245 ms 36.335 ms
 9 2001:4860::1:0:4f20 (2001:4860::1:0:4f20) 36.285 ms 41.017 ms 36.375 ms
10 2001:4860:0:1::71 (2001:4860:0:1::71) 31.601 ms 31.623 ms 31.512 ms
11 2800:3f0:4001:815::12 (2800:3f0:4001:815::12) 30.826 ms 30.683 ms 30.769 ms

# NOTE: the first hop is the "guest-fw-1".

# "apt-get update", not okay! *BUG*

root@qemu-host-1:~# apt-get update
0% [Connecting to us.archive.ubuntu.com (2001:67c:1562::14)] [Connecting to sec

# it remains "Waiting for headers" forever...

# While waiting for "apt-get update" above, `tcpdump -ni ovsbr1p1` shows:

http://pastebin.com/2BUiNEfQ

---

 (OPTIONAL STEP - replace OpenvSwitch by bridge-utils - does not fix it!)

 Possible workarounds: is this an OpenvSwitch BUG? Let's try it with `bridge-utils` instead...

 * Reconfigure your "qemu-host-1.domain.com" to use `bridge-utils`, instead of openvswitch-switch.

------------------------

1- Preparing the host, now using `bridge-utils` instead of OpenvSwitch:

- Reconfigure `qemu-host-1`'s /etc/network/interfaces file with:

---
auto br0
iface br0 inet manual
        bridge_ports eth0
        bridge_maxwait 5
        bridge_fd 1
        bridge_stp on

auto br1
iface br1 inet manual
        bridge_ports eth1
        bridge_maxwait 5
        bridge_fd 1
        bridge_stp on

auto vlan100
iface vlan100 inet6 auto
        vlan_raw_device br1

iface vlan100 inet static
        vlan_raw_device br1
        address 192.168.1.10
        netmask 24
        gateway 192.168.1.1

auto br2
iface br2 inet manual
        bridge_ports eth2
        bridge_maxwait 5
        bridge_fd 1
        bridge_stp on
---

2- New VM configuration file (network section only, trimmed), adjusted to use the bridges from the `bridge-utils` package:

---
    <interface type='bridge'>
      <mac address='52:54:00:41:8c:3f'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:27:b2:7d'/>
      <source bridge='br1'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:ff:35:5c'/>
      <source bridge='br2'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </interface>
---

* Start `guest-fw-1` as-is:

 virsh start guest-fw-1

New try:

---
root@qemu-host-1:~# ip r
default via 192.168.4.1 dev vlan100
192.168.4.0/24 dev vlan100 proto kernel scope link src 192.168.4.2
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1

root@qemu-host-1:~# ip -6 r | grep vlan100
2001:1291:200:850a::/64 dev vlan100 proto kernel metric 256 expires 86397sec
fe80::/64 dev vlan100 proto kernel metric 256
default via fe80::5054:ff:feb5:7744 dev vlan100 proto ra metric 1024 expires 27sec

# ping6 okay...
root@qemu-host-1:~# ping6 google.com -c1
PING google.com(2800:3f0:4001:815::1007) 56 data bytes
64 bytes from 2800:3f0:4001:815::1007: icmp_seq=1 ttl=55 time=44.5 ms

--- google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 44.579/44.579/44.579/0.000 ms

# traceroute6 okay...
root@qemu-host-1:~# traceroute6 google.com
traceroute to google.com (2800:3f0:4001:815::1007) from 2001:1291:200:850a:1054:3d86:369:d4b2, 30 hops max, 24 byte packets
 1 2001:1291:200:850a::2 (2001:1291:200:850a::2) 0.394 ms 0.261 ms 0.223 ms
 2 gw-1291.udi-01.br.sixxs.net (2001:1291:200:50a::1) 21.536 ms 20.738 ms 20.902 ms
 3 brudi01.sixxs.net (2001:1291:2::b) 20.684 ms 20.74 ms 20.846 ms
 4 ge-0-2-0-71.seed.ula001.ctbc.com.br (2001:1291:2::a) 197.392 ms 141.706 ms 21.058 ms
 5 ge-5-2-0-0.core-d.ula001.ctbc.com.br (2001:1291:0:98::a) 21.069 ms 20.837 ms 20.903 ms
 6 ae0-0.core-b.fac001.ctbc.com.br (2001:1291:0:d7::a) 24.564 ms 24.464 ms 24.649 ms
 7 et-1-0-0-0.border-a.fac001.ctbc.com.br (2001:1291:0:4b::b) 24.734 ms 24.525 ms 25.273 ms
 8 2001:1291:0:63::2 (2001:1291:0:63::2) 36.619 ms 36.245 ms 36.335 ms
 9 2001:4860::1:0:4f20 (2001:4860::1:0:4f20) 36.285 ms 41.017 ms 36.375 ms
10 2001:4860:0:1::71 (2001:4860:0:1::71) 31.601 ms 31.623 ms 31.512 ms
11 2800:3f0:4001:815::12 (2800:3f0:4001:815::12) 30.826 ms 30.683 ms 30.769 ms

# BUG effect! "apt-get update", not okay!

root@qemu-host-1:~# apt-get update
0% [Connecting to us.archive.ubuntu.com (2001:67c:1562::14)] [Connecting to sec

# it remains "Waiting for headers" forever...

- So! It is not an OpenvSwitch BUG! Removing `bridge-utils` bridges, falling back to OpenvSwitch as we started.

** Workaround #2: Use Xen-4.4 instead of QEmu 2.0 / Back to OpenvSwitch.

-- VM conf (`guest-fw-1` needs to have /etc/init/hvc0.conf):

---
name = "guest-fw-1"

uuid = "17e031c7-1264-4979-8f06-c5e016469474"

bootloader = "pygrub"

memory = 2048

vcpus = 2

vif = [ 'bridge=ovsbr0', 'bridge=ovsbr1', 'bridge=ovsbr2', 'bridge=ovsbr3', 'bridge=ovsbr4', 'bridge=ovsbr5' ]

disk = [ 'tap:raw:/var/lib/libvirt/images/guest-fw-1-disk0.img,xvda,rw' ]
---

Details - Working as expected with Xen! Look:

---
root@qemu-host-1:~# ping6 -c 1 google.com
PING google.com(2800:3f0:4001:815::1002) 56 data bytes
64 bytes from 2800:3f0:4001:815::1002: icmp_seq=1 ttl=55 time=37.5 ms

root@qemu-host-1:~# ip -6 r | grep ovsbr1p1
2001:1291:200:850a::/64 dev ovsbr1p1 proto kernel metric 256 expires 86394sec
fe80::/64 dev ovsbr1p1 proto kernel metric 256
default via fe80::5054:ff:feb5:7744 dev ovsbr1p1 proto ra metric 1024 expires 24sec

# *BUG disappeared!*

root@qemu-host-1:~# apt-get update
Ign http://us.archive.ubuntu.com trusty InRelease
Ign http://security.ubuntu.com trusty-security InRelease
Ign http://us.archive.ubuntu.com trusty-proposed InRelease
Ign http://us.archive.ubuntu.com trusty-updates InRelease
Ign http://us.archive.ubuntu.com trusty-backports InRelease
Get:1 http://security.ubuntu.com trusty-security Release.gpg [933 B]
Hit http://us.archive.ubuntu.com trusty Release.gpg
Get:2 http://security.ubuntu.com trusty-security Release [59.7 kB]
........................
Ign http://us.archive.ubuntu.com trusty/main Translation-en_US
Ign http://us.archive.ubuntu.com trusty/multiverse Translation-en_US
Ign http://us.archive.ubuntu.com trusty/restricted Translation-en_US
Ign http://us.archive.ubuntu.com trusty/universe Translation-en_US
Fetched 1,011 kB in 19s (50.7 kB/s)
Reading package lists... Done
---

Now, both the Xen Dom0 (`qemu-host-1`) and the DomU (`guest-fw-1`) work as expected! You can see that `guest-fw-1` is running on top of Xen as-is; I mean, the changes happened only at the Hypervisor itself. Problem solved (but not for QEmu)!

But QEmu still has a problem: if I remove Xen and go back to QEmu, then the host `qemu-host-1` cannot browse the web again (`apt-get update` will not work if its gateway is a QEmu guest).

 ** Workaround #3: Untagging the VLANs with OpenvSwitch and its "fake bridges".

 This workaround has one big downside: while it allows us to keep using QEmu (and KSM), it requires a complete reconfiguration of the `guest-fw-1` interfaces! Also, for each VLAN tag, you'll need to create a fake bridge and a new VirtIO NIC for your guest (this might add a bit of overhead for your hypervisor as a whole, I'm not sure), plus a lot of extra work... If you need to add a new VLAN to your `guest-fw-1`, you'll need to reboot it to add a new VirtIO NIC. This isn't the best way to build hypervisors (not a best practice); it is just a workaround that allows you to keep using QEmu (and benefit from KSM, Libvirt, etc.)...

 When replacing QEmu with Xen, by contrast, you don't need to change a single line within the guest itself...

 So, this network problem lies within the QEmu Virtual Machine!

 Doing this workaround:

1- Untagging the VLANs at OpenvSwitch, because QEmu can't handle it:

ovs-vsctl add-br vlan10 ovsbr0 10
ovs-vsctl add-br vlan100 ovsbr1 100
ovs-vsctl add-br vlan200 ovsbr2 200
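
Since every new VLAN needs its own fake bridge, the `add-br` calls can be driven from a small list. This is a minimal sketch: the `NAME:PARENT:VLAN` argument format and the `add_fake_bridges` name are assumptions of this example, and VLAN 200 is assumed for `ovsbr2`, as in the topology section:

```shell
#!/bin/sh
# Create one OVS fake bridge per argument of the form NAME:PARENT:VLAN.
# With RUN=echo the ovs-vsctl commands are printed instead of executed.
add_fake_bridges() {
    for spec in "$@"; do
        name=${spec%%:*}      # text before the first ':'
        rest=${spec#*:}       # PARENT:VLAN
        parent=${rest%%:*}
        vlan=${rest#*:}
        ${RUN:-} ovs-vsctl add-br "$name" "$parent" "$vlan"
    done
}
```

For example, `RUN=echo add_fake_bridges vlan10:ovsbr0:10 vlan100:ovsbr1:100 vlan200:ovsbr2:200` prints the three commands. Afterwards, `ovs-vsctl br-to-vlan vlan200` and `ovs-vsctl br-to-parent vlan200` can be used to check which tag and parent bridge a fake bridge ended up with.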

2- Making Libvirt aware of OVS Fake Bridges:

Create 3 files, one for each fake bridge, like this (vlan10.xml, vlan100.xml and vlan200.xml):

--- vlan10.xml contents:
<network>
 <name>vlan10</name>
 <forward mode='bridge'/>
 <bridge name='vlan10'/>
 <virtualport type='openvswitch'/>
</network>
---

--- vlan100.xml contents:
<network>
 <name>vlan100</name>
 <forward mode='bridge'/>
 <bridge name='vlan100'/>
 <virtualport type='openvswitch'/>
</network>
---

--- vlan200.xml contents:
<network>
 <name>vlan200</name>
 <forward mode='bridge'/>
 <bridge name='vlan200'/>
 <virtualport type='openvswitch'/>
</network>
---

Run:

virsh net-define vlan10.xml
virsh net-define vlan100.xml
virsh net-define vlan200.xml

virsh net-autostart vlan10
virsh net-autostart vlan100
virsh net-autostart vlan200

virsh net-start vlan10
virsh net-start vlan100
virsh net-start vlan200

3- Reconfigure the `guest-fw-1` to make use of new "fake bridges":

---
    <interface type='network'>
      <mac address='52:54:00:41:8c:3f'/>
      <source network='vlan10'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:27:b2:7d'/>
      <source network='vlan100'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:ff:35:5c'/>
      <source network='vlan200'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </interface>
---

4- Reconfigure `guest-fw-1`'s /etc/network/interfaces file:

---
auto eth0
iface eth0 inet static
# vlan_raw_device eth0
        address 200.2.1.106
        netmask 29
        gateway 200.2.1.105
        dns-nameserver 8.8.8.8

auto eth1
iface eth1 inet6 static
# vlan_raw_device eth1
        address 2001:129X:2XX:810X::2
        netmask 64
        dns-nameserver 2001:4860:4860::8844 2001:4860:4860::8888

iface eth1 inet static
# vlan_raw_device eth1
        address 192.168.4.1
        netmask 24

auto eth2
iface eth2 inet6 static
# vlan_raw_device eth2
        address 2001:1291:2de:10::1
        netmask 64

iface eth2 inet static
# vlan_raw_device eth2
        address 172.16.0.1
        netmask 24
---

5- Details: working as expected with QEmu, but without tagging the VLANs within `guest-fw-1` itself.

---
root@qemu-host-1:~# ping6 -c 1 google.com
PING google.com(2800:3f0:4001:815::1002) 56 data bytes
64 bytes from 2800:3f0:4001:815::1002: icmp_seq=1 ttl=55 time=37.5 ms

root@qemu-host-1:~# ip -6 r | grep ovsbr1p1
2001:1291:200:850a::/64 dev ovsbr1p1 proto kernel metric 256 expires 86394sec
fe80::/64 dev ovsbr1p1 proto kernel metric 256
default via fe80::5054:ff:feb5:7744 dev ovsbr1p1 proto ra metric 1024 expires 24sec

# *BUG disappeared!*

root@qemu-host-1:~# apt-get update
Ign http://us.archive.ubuntu.com trusty InRelease
Ign http://security.ubuntu.com trusty-security InRelease
Ign http://us.archive.ubuntu.com trusty-proposed InRelease
Ign http://us.archive.ubuntu.com trusty-updates InRelease
Ign http://us.archive.ubuntu.com trusty-backports InRelease
Get:1 http://security.ubuntu.com trusty-security Release.gpg [933 B]
Hit http://us.archive.ubuntu.com trusty Release.gpg
Get:2 http://security.ubuntu.com trusty-security Release [59.7 kB]
........................
Ign http://us.archive.ubuntu.com trusty/main Translation-en_US
Ign http://us.archive.ubuntu.com trusty/multiverse Translation-en_US
Ign http://us.archive.ubuntu.com trusty/restricted Translation-en_US
Ign http://us.archive.ubuntu.com trusty/universe Translation-en_US
Fetched 1,011 kB in 19s (50.7 kB/s)
Reading package lists... Done
---

Conclusion:

A QEmu guest router does not route tagged VLAN packets that originate at its host, nor those from other guests hosted on the same hypervisor, making it impossible to create a virtual network within a hypervisor.

Best Regards,
Thiago Martins

Tags: qemu tag vlan
Thiago Martins (martinx)
description: updated
tags: added: qemu tag vlan
Thiago Martins (martinx)
description: updated
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1362755] [NEW] QEmu +2 does not route VLAN tagged packets, that are originated within the Hypervisor itself.

Does this also fail with a build of git://git.qemu.org/qemu.git?

Thiago Martins (martinx) wrote : Re: [Qemu-devel] [Bug 1362755] [NEW] QEmu +2 does not route VLAN tagged packets, that are originated within the Hypervisor itself.

Hey Vlad,

Sorry about this delay...

I tried the following commands within the guest (the router/firewall
"guest-fw-1.domain.com"):

root@guest-fw-1:~# ethtool -K eth1 tso off
root@guest-fw-1:~# ethtool -K eth1 gso off
root@guest-fw-1:~# ethtool -K eth1 tx off

But it did not fix the problem... Then I tried (it was still enabled for
that "vlan device", just to test):

root@guest-fw-1:~# ethtool -K vlan100 gso off

It didn't work either...
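
The `ethtool -K` calls above can be combined into one invocation per device and repeated over several interfaces (for example a raw device and the VLAN device on top of it). This is a minimal sketch; `offloads_off` is a hypothetical helper name and `RUN=echo` is a dry-run convention of this example, not an ethtool feature:

```shell
#!/bin/sh
# Turn off TSO, GSO and TX checksum offload on every interface given.
# Needs root when executed for real; with RUN=echo the commands are
# only printed.
offloads_off() {
    for dev in "$@"; do
        ${RUN:-} ethtool -K "$dev" tso off gso off tx off
    done
}
```

For example, `RUN=echo offloads_off eth1 vlan100` shows what would be run inside the guest. These settings do not persist across reboots unless they are reapplied at interface bring-up.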

---
root@guest-fw-1:~# grep vlan100 /proc/net/vlan/config
vlan100 | 100 | eth1
---
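
To find every VLAN device and its raw device in one go (so that offloads can be toggled on both), the same file can be parsed. This is a minimal sketch, assuming the usual layout of /proc/net/vlan/config: two header lines followed by `name | id | parent` rows like the one above. `vlan_parents` is a hypothetical helper name:

```shell
#!/bin/sh
# Read /proc/net/vlan/config style text on stdin and print one
# "VLANDEV VID PARENT" line per VLAN device. Header lines are skipped
# because they do not have three '|' separated fields with a numeric ID.
vlan_parents() {
    awk -F'|' 'NF == 3 {
        gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2); gsub(/[ \t]/, "", $3)
        if ($2 ~ /^[0-9]+$/) print $1, $2, $3
    }'
}
```

Usage: `vlan_parents < /proc/net/vlan/config`.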

Also, at the KVM host, ovsbr1 (the OpenvSwitch bridge) is attached to
eth1, so I tried to turn `tso/gso/tx off` there too, but no, same bad
results.

I tried to disable tso/gso/tx at another guest, of vlan100 net, it didn't
work either.

Regards,
Thiago

On 29 August 2014 11:20, Vlad Yasevich <email address hidden> wrote:

> [ realized that the bug and reporter were non cc'd, updated cc list]
>
> On 08/28/2014 02:40 PM, Thiago Martins wrote:
> > Public bug reported:
> >
> > Guys,
> >
> > Trusty QEmu 2.0 Hypervisor fails to create a consistent virtual network.
> > It does not route tagged VLAN packets.
> >
>
> There have been a bunch of rather recent changes to the kernel to support
> guest VLANs correctly. The issues have been around the TSO/GSO implementation
> in the kernel.
>
> Could you try disabling TSO/GSO and tx checksums on the vlan devices in the
> guest
> and see if it solves your problem?
>
> If it does, could you try the kernel from
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
> turn the offloads back on and see if the problem is solved?
>
> Thanks
> -vlad

Thiago Martins (martinx) wrote :

Oops!

It seems to be working now! :-D

What I just said:

> I tried to disable tso/gso/tx at another guest, of vlan100 net, it didn't
> work either.

isn't true... I created a new guest, side-by-side with "guest-fw-1", on
vlan100, and after disabling `tso/gso/tx`, its connectivity became stable!

I'll try this: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git

Nevertheless, I'm a bit more confused now: the problem was fixed by disabling
`tso/gso/tx`, but I did this *only within this new "guest-server-1"* that I
created within vlan100... I disabled "tso/gso/tx" only there, so the
problem seems not to be within "guest-fw-1" as I was thinking...
Weird...

Tks,
Thiago


Thiago Martins (martinx) wrote :

Guys,

In fact, the "ethtool" workaround is far from perfect, I'm still seeing
lots of connectivity issues within my KVM Host and its Guests, even after
disabling gso/tso/tx on all interfaces, both virtual (within guests) and
physical (at the host).
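
When offloads have been disabled in many places, it helps to confirm that none of them is still on. This is a minimal sketch, assuming the `ethtool -k DEV` output format with top-level lines such as `tcp-segmentation-offload: on`; `offloads_still_on` is a hypothetical helper name:

```shell
#!/bin/sh
# Read `ethtool -k DEV` output on stdin and print the name of each of the
# three offloads discussed here (tx checksumming, TSO, GSO) that is still
# reported as "on". Indented sub-features are ignored.
offloads_still_on() {
    awk -F': *' '
        /^(tx-checksumming|tcp-segmentation-offload|generic-segmentation-offload):/ &&
        $2 ~ /^on/ { print $1 }'
}
```

Usage: `ethtool -k eth1 | offloads_still_on`; no output means all three are off on that interface.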

I really appreciate any help!

Thanks!
Thiago

On 19 October 2014 03:30, Martinx - ジェームズ <email address hidden> wrote:

> Oops!
>
> It seems to be working now! :-D
>
> This that I just said:
>
> I tried to disable tso/gso/tx at another guest, of vlan100 net, didn't
>> worked either.
>>
>
> Isn't true... I created a new guest, side-by-side with "guest-fw-1", on
> vlan100, and after disabling `tso/gso/tx`, its connectivity becomes stable!
>
> I'll try this: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
>
> Nevertheless, I'm bit more confused now, the problem was fixed by
> disabling `tso/gso/tx` but, I did this *only within this new
> "guest-server-1"* that I created within vlan100... Only at it, I disabled
> "tso/gso/tx", so, the problem seems to not be within the "guest-fw-1" as I
> was thinking... Weird...
>
> Tks,
> Thiago
>
>
>> Regards,
>> Thiago
>>
>> On 29 August 2014 11:20, Vlad Yasevich <email address hidden> wrote:
>>
>>> [ realized that the bug and reporter were non cc'd, updated cc list]
>>>
>>> On 08/28/2014 02:40 PM, Thiago Martins wrote:
>>> > Public bug reported:
>>> >
>>> > Guys,
>>> >
>>> > Trusty QEmu 2.0 Hypervisor fails to create a consistent virtual
>>> network.
>>> > It does not route tagged VLAN packets.
>>> >
>>>
>>> The have a been a bunch of rather recent changes to the kernel to support
>>> guest VLANs correctly. The issues have been around TSO/GSO
>>> implementation
>>> in the kernel.
>>>
>>> Could try disabling TSO/GSO and tx checksums on the vlan devices in the
>>> guest
>>> and see if it solves your problem?
>>>
>>> If it does, could you try the kernel from
>>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
>>> turn the offloads back on and see if the problem is solved?
>>>
>>> Thanks
>>> -vlad
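
The offload test Vlad suggests in the quoted message can be sketched as a small script, to be run inside the guest. This is only a sketch under assumptions: the device name `eth0.100` is a hypothetical VLAN sub-interface, and the helper names `offload_cmd` and `set_offloads` are made up for illustration. It prints the `ethtool -K` command by default and only executes it when `APPLY=1` is set.

```shell
#!/bin/sh
# Sketch: toggle the offloads mentioned in the thread (TSO, GSO, TX
# checksumming) on a VLAN device. "eth0.100" is a hypothetical name.

# Print the ethtool command for a device ($1) and a state ($2: on|off).
offload_cmd() {
    printf 'ethtool -K %s tso %s gso %s tx %s\n' "$1" "$2" "$2" "$2"
}

# Dry run by default; set APPLY=1 (as root) to really run ethtool.
set_offloads() {
    cmd=$(offload_cmd "$1" "$2")
    if [ "${APPLY:-0}" = "1" ]; then
        $cmd
    else
        echo "would run: $cmd"
    fi
}

set_offloads "${VLAN_DEV:-eth0.100}" off
```

After repeating the routing test with the offloads off, the same helper with `on` restores them, which matches the second half of the suggestion (retest with offloads enabled on a newer kernel).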

Thiago Martins (martinx) wrote : Re: QEmu +2 does not route VLAN tagged packets, that are originated within the Hypervisor itself.

Hey guys!

 From what I'm seeing, this problem does not affect Ubuntu Vivid with Linux 3.18 (and with systemd = PID1).

 I'll do more tests tomorrow, but with Linux 3.18 on Trusty (upstart = PID1), just to make sure that this is a Linux problem (and not related to QEmu itself).

Cheers!
Thiago

Thiago Martins (martinx)
affects: qemu → linux
Changed in linux:
status: New → Confirmed
summary: - QEmu +2 does not route VLAN tagged packets, that are originated within
- the Hypervisor itself.
+ Linux (3.13 to 3.16) as a Router under QEmu +2, does not route VLAN
+ tagged packets, that are originated within the Hypervisor itself.
Thiago Martins (martinx) wrote :

Hey guys!

 I can confirm that this problem does not exist when running Linux 3.18.

 So, at the end of the day, it wasn't a QEmu problem, it was a Linux BUG when running as a KVM Guest.

 Right now, all my "Border KVM Hypervisors" are upgraded to Linux 3.18. Problem fixed!

 I'm using my own Ubuntu PPA Repository to deploy Linux 3.18 on Trusty, here it is:

 Linux 3.18 backported from Vivid to Trusty:

 https://launchpad.net/~martinx/+archive/ubuntu/linux

 From what I'm seeing, Trusty NEEDS Linux 3.18 ASAP!

 I'm waiting for the package `linux-generic-lts-vivid` on Trusty, so I can stop using my PPA and start deploying it on my entire infrastructure.

Best!
Thiago

Otto Berger (otto-bergerdata) wrote :

Any chance of getting this fixed in the current longterm kernel? BTW: ubuntu-precise with linux-generic-lts-trusty surely has the same problem.

Thiago Martins (martinx) wrote : Re: [Bug 1362755] Re: Linux (3.13 to 3.16) as a Router under QEmu +2, does not route VLAN tagged packets, that are originated within the Hypervisor itself.

Hello Otto,

I think that it will be "fixed" in Ubuntu 14.04, right after the package
`linux-generic-lts-vivid` becomes available to it (i.e., in Ubuntu 14.04.2).

I don't think that the Kernel Team will backport the fix from Linux 3.18,
to 3.13 (or to 3.16)...

In the meantime, I'm running in production, on Trusty, the Linux 3.18
kernel from Vivid that I manually backported, hosted in my own PPA:

https://launchpad.net/~martinx/+archive/ubuntu/linux

Regards,
Thiago

On 8 February 2015 at 11:30, Otto Berger <email address hidden> wrote:

> Any chance of getting this fixed in the current longterm kernel? BTW:
> ubuntu-precise with linux-generic-lts-trusty surely has the same problem.

Thiago Martins (martinx) wrote :

Fixed with "linux-generic-lts-vivid" installed!

Default kernel of 14.04.2 is still Linux 3.16, which is affected by this bug.

Thiago Martins (martinx) wrote :

Fixed in Ubuntu 14.04.3 with Linux 3.19.
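
To check a guest against the versions mentioned in this report, a minimal sketch follows. It relies only on what the thread states (3.13 to 3.16 affected, 3.18 and 3.19 reported fixed); any other version, including 3.17, is not covered by the reports here. The function name `kernel_status` is hypothetical.

```shell
#!/bin/sh
# Sketch: classify a kernel release string using only the versions
# reported in this bug (3.13 to 3.16 affected; 3.18 and 3.19 fixed).

kernel_status() {
    # Keep only "major.minor", e.g. "3.13.0-35-generic" -> "3.13".
    mm=$(printf '%s\n' "$1" | cut -d. -f1,2)
    case "$mm" in
        3.13|3.14|3.15|3.16) echo "affected" ;;
        3.18|3.19)           echo "reported fixed" ;;
        *)                   echo "not covered" ;;
    esac
}

# Report the status of the running kernel (run this inside the router guest).
echo "$(uname -r): $(kernel_status "$(uname -r)")"
```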

Changed in linux:
assignee: nobody → Thiago Martins (martinx)
assignee: Thiago Martins (martinx) → nobody
status: Confirmed → Fix Released