Canonical Juju

lxd subnet setup by juju will interfere with openstack instance traffic

Bug #1600546 reported by Trent Lloyd on 2016-07-10

This bug report is a duplicate of: Bug #1614364: container addressability: lxc/lxd units are behind NAT on manual and openstack providers.

This bug affects 3 people

	Status	Importance	Assigned to	Milestone
Canonical Juju	Triaged	High	Unassigned	Canonical Juju 2.1-rc2
2.0	Won't Fix	Undecided	Unassigned
nova-compute (Juju Charms Collection)	Invalid	Undecided	Unassigned

Bug Description

the LXD bridge subnet setup by juju will interfere with openstack instance traffic

When juju configures lxd, it sets up an IPv4 subnet from 10.0.0.0/8 in container/lxd/initialisation.go:findNextAvailableIPv4Subnet()
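
As a rough illustration of this kind of selection (a sketch, not the actual Go code in initialisation.go), the following Python picks the first /24 inside 10.0.0.0/8 that does not overlap any network configured on the host. The name `next_available_subnet` and its inputs are hypothetical. The point it shows is that such a search only sees addresses on the host itself, so a tenant network that exists only inside Neutron is invisible to it.

```python
import ipaddress


def next_available_subnet(host_networks, pool="10.0.0.0/8", prefixlen=24):
    """Return the first subnet of `pool` that overlaps none of `host_networks`.

    `host_networks` is a list of CIDR strings taken from the host's own
    interfaces (hypothetical input; tenant networks are not included).
    """
    in_use = [ipaddress.ip_network(n, strict=False) for n in host_networks]
    for candidate in ipaddress.ip_network(pool).subnets(new_prefix=prefixlen):
        if not any(candidate.overlaps(net) for net in in_use):
            return candidate
    return None  # every candidate overlaps something already in use


# A host whose only address is outside 10.0.0.0/8 gets the very first /24,
# which a tenant network may well be using too.
print(next_available_subnet(["192.168.1.10/24"]))
```

With these assumptions, a host that only has 192.168.1.10/24 is given 10.0.0.0/24, which matches the subnet in the MASQUERADE rule in this report.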

This network configuration (ultimately written into /etc/default/lxd-bridge) also configures a MASQUERADE rule in POSTROUTING:

root@unsecretive-raelynn:~# iptables -t nat -nvL POSTROUTING
Chain POSTROUTING (policy ACCEPT 1472 packets, 145K bytes)
pkts bytes target prot opt in out source destination
1476 145K neutron-openvswi-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0
1476 145K neutron-postrouting-bottom all -- * * 0.0.0.0/0 0.0.0.0/0
4 240 MASQUERADE all -- * * 10.0.0.0/24 !10.0.0.0/24

These POSTROUTING rules are applied to instance traffic, including traffic on tenant GRE networks. Because the rule is in POSTROUTING, the traffic still goes into the GRE tunnel, but it is masqueraded to the compute server's IP and will never make it back to the instance.
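
To make the match concrete, here is a minimal Python sketch of the source/destination test in the MASQUERADE rule shown in the output above (`-s 10.0.0.0/24 ! -d 10.0.0.0/24`). The function name `would_masquerade` and the instance address 10.0.0.5 are hypothetical, and real netfilter evaluation also involves conntrack and the other chains, so treat this only as a model of the address match.

```python
import ipaddress


def would_masquerade(src, dst, rule_src="10.0.0.0/24", rule_not_dst="10.0.0.0/24"):
    """Return True if a packet src -> dst matches '-s rule_src ! -d rule_not_dst'."""
    src_net = ipaddress.ip_network(rule_src)
    not_dst_net = ipaddress.ip_network(rule_not_dst)
    return (ipaddress.ip_address(src) in src_net
            and ipaddress.ip_address(dst) not in not_dst_net)


# Hypothetical instance on a tenant network that reuses 10.0.0.0/24.
print(would_masquerade("10.0.0.5", "169.254.169.254"))  # to the metadata service
print(would_masquerade("10.0.0.5", "10.0.0.1"))         # stays inside the subnet
```

The first call returns True and the second False: traffic that stays inside the tenant subnet is left alone, while anything leaving it, including requests to the metadata address, has its source rewritten.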

Instance traffic that is routed externally will not work. This also breaks the metadata service, so the instance will not come up correctly, and the issue can be hard to debug.

See also the same issue caused by libvirt-bin's default network in #1387390

Using this method to configure a network will never be safe, for the same reason lxd recently removed all default network configuration out of the box.

Given the lxd containers are mostly bridged by juju, do we even need to configure this default subnet? It will work fine without it in bridged mode, as it does in the default out of the box configuration. I have no idea if juju can somehow use these non-bridged interfaces, but even if it can I think that this setup needs to be explicitly configured manually.

How to test:
(1) Set up an openstack installation (tested setup was juju 2.0-beta11-xenial-amd64, xenial-mitaka charms)

(2) Check which subnet the lxd containers are configured for. In my case it was simply 10.0.0.0/8, but it can vary because of the detection code if you are already using 10.0.0.0/8 subnets. Set up a GRE tenant network with this subnet.

(3) Set up a router for this tenant network, and optionally create and connect an external network (not strictly necessary, but hard to test/debug/observe without)

(4) Boot a new instance and observe that the instance cannot contact the metadata service. If you tcpdump the qrouter namespace's qg interface or the compute host, you will see the traffic is coming from the compute node's IP and not the instance's
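
The check in step (2) can be partly automated. The sketch below scans rule text in the format printed by `iptables -t nat -S POSTROUTING` and reports MASQUERADE rules whose source network overlaps a given tenant subnet. The function names and the sample rule text are illustrative (the sample is written by hand to mirror the listing earlier in this report, not captured from a host).

```python
import ipaddress
import shlex


def option_value(tokens, flag):
    """Return the token following `flag`, or None if it is absent or negated."""
    if flag not in tokens:
        return None
    i = tokens.index(flag)
    if i > 0 and tokens[i - 1] == "!":
        return None  # negated match ("! -s ..."); ignored in this sketch
    return tokens[i + 1] if i + 1 < len(tokens) else None


def conflicting_masquerade_rules(rules_text, tenant_cidr):
    """List MASQUERADE rules whose '-s' network overlaps `tenant_cidr`."""
    tenant = ipaddress.ip_network(tenant_cidr)
    hits = []
    for line in rules_text.splitlines():
        tokens = shlex.split(line)
        source = option_value(tokens, "-s")
        if option_value(tokens, "-j") != "MASQUERADE" or source is None:
            continue
        if ipaddress.ip_network(source, strict=False).overlaps(tenant):
            hits.append(line.strip())
    return hits


# Hand-written sample in `iptables -S` format.
sample = """\
-P POSTROUTING ACCEPT
-A POSTROUTING -j neutron-openvswi-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A POSTROUTING -s 10.0.0.0/24 ! -d 10.0.0.0/24 -j MASQUERADE
"""
for rule in conflicting_masquerade_rules(sample, "10.0.0.0/24"):
    print(rule)
```

On a compute host, the same function could be fed the real output of `iptables -t nat -S POSTROUTING` (run as root) together with the CIDR of the tenant network under test.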

Trent Lloyd (lathiat) on 2016-07-11

description:	updated
description:	updated

James Page (james-page) wrote on 2016-07-11:

The requirement to configure the internal bridge for LXD + MAAS seems superfluous as Trent points out that all LXD containers get bridged to underlying network segments anyway.

Andrew McDermott (frobware) on 2016-07-11

tags:	added: network

James Tunnicliffe (dooferlad) on 2016-07-11

Changed in juju-core:
status:	New → Triaged
importance:	Undecided → High

Cheryl Jennings (cherylj) on 2016-07-14

tags:	added: 2.0

John A Meinel (jameinel) wrote on 2016-07-19:

My initial reading of this is that if we are setting up juju-br0 etc we should not be setting up lxdbr0 because there isn't anything that will be using it. There may still be places where we haven't moved over to setting up juju-br0, and we'll need some sort of bridging, but we should be moving to setting up a bridge-per-host-interface and not having to configure a separate 'lxdbr0' bridge (none of the containers we create would be using it).

Anastasia (anastasia-macmood) on 2016-08-03

Changed in juju-core:
milestone:	none → 2.0.0

Trent Lloyd (lathiat) wrote on 2016-08-10:

Hit this issue today in the Openstack 2.0 Training Course; it broke the PRIVATE_SUBNET used (10.0.0.0/24).

Canonical Juju QA Bot (juju-qa-bot) on 2016-08-23

affects:	juju-core → juju
Changed in juju:
milestone:	2.0.0 → none
milestone:	none → 2.0.0

Alexis Bruemmer (alexis-bruemmer) on 2016-09-27

Changed in juju:
assignee:	nobody → Richard Harding (rharding)

Alexis Bruemmer (alexis-bruemmer) on 2016-10-05

Changed in juju:
milestone:	2.0.0 → 2.1.0

Trent Lloyd (lathiat) wrote on 2016-10-06:

I am seriously concerned about this being pushed to juju 2.1.0.

This makes juju 2.0 unusable for most Openstack deployments, as all instance traffic for subnets in 10.0.0.0/8 will be broken where nova-compute hosts have lxd containers.

I would imagine that is a good majority of them.

Richard Harding (rharding) on 2016-10-10

Changed in juju:
milestone:	2.1.0 → 2.0.1

Trent Lloyd (lathiat) wrote on 2016-10-21:

It looks like this bug was somehow fixed somewhere. The code is still there, but it doesn't seem to be working, and it's not clear to me why. Also, the most recent lxd got rid of using this file altogether, but that's not in xenial (yet, anyway).

Trent Lloyd (lathiat) wrote on 2016-10-21:

Disregard, the above statement is wrong. The issue is still present.
Also, the following bug is a duplicate of this one: https://bugs.launchpad.net/juju/+bug/1615917

Ryan Beisner (1chb1n) on 2016-10-21

Changed in nova-compute (Juju Charms Collection):
status:	New → Invalid
tags:	added: uosci

Curtis Hovey (sinzui) on 2016-10-28

Changed in juju:
milestone:	2.0.1 → none

Anastasia (anastasia-macmood) wrote on 2017-02-02:

Marking as Won't Fix for 2.0.x as no further 2.0.x releases are planned.

Changed in juju:
assignee:	Richard Harding (rharding) → nobody
milestone:	none → 2.1.0

Anastasia (anastasia-macmood) wrote on 2017-02-15:

As per comment #6, I am marking this as a duplicate of bug #1615917 (which is a duplicate of bug #1614364), which is being actively worked on and is currently scheduled to be fixed in 2.1.1.
Thank you for your report and patience, Trent!
