juju-1.23beta3 breaks glance <-> mysql relation when glance is hosted in a container

Bug #1441811 reported by Jason Hobbs
This bug affects 3 people
Affects         Status        Importance  Assigned to        Milestone
juju-core       Invalid       High        James Tunnicliffe
juju-core 1.23  Fix Released  Critical    James Tunnicliffe

Bug Description

In previous versions of juju, containers were bridged to the host's network and used it directly for networking without any NAT occurring.

In juju-1.23beta3, connections originating from LXC containers seem to be getting NAT'd, so that to other services their source IP appears to be the host's IP instead of the LXC container's IP. This happens even though the container has an IP address on the same subnet as the host, so the NAT'ing doesn't really make sense.

This breaks the mysql <-> glance relation when glance is in a container, because MySQL uses source-IP-based access controls and expects inbound connections to come from the glance container's IP address, not the host's IP address.

OperationalError: (OperationalError) (1130, "Host '10.245.0.168' is not allowed to connect to this MySQL server") None None

More traceback:

http://paste.ubuntu.com/10775536/

juju status yaml:

http://paste.ubuntu.com/10775582/

all-machines.log:
https://pastebin.canonical.com/129182/
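
A note on the error above (not from the original report): MySQL error 1130 comes from its host-based access control, so on the mysql unit one can confirm which source addresses the glance user is actually allowed from. A hedged sketch, with a hypothetical container address:

# list the hosts the glance DB user may connect from (run on the mysql unit)
mysql -u root -p -e "SELECT User, Host FROM mysql.user WHERE User = 'glance';"
# typically only the container's own IP (e.g. 10.245.0.25, hypothetical) is listed,
# so a connection arriving NAT'd from the host's 10.245.0.168 is refused with error 1130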

description: updated
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.23-beta4
milestone: 1.23-beta4 → 1.24-alpha1
Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Can't you add the container's host IP to the MySQL list of allowed addresses (or even the whole internal range)?
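
For illustration, that manual workaround would look roughly like this on the mysql unit (the host IP is taken from the error above; the database name, internal range, and password are assumptions):

# allow the glance user from the container's host IP
mysql -u root -p -e "GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'10.245.0.168' IDENTIFIED BY '<glance-db-password>';"
# or from the whole internal range (assuming 10.245.0.0/24)
mysql -u root -p -e "GRANT ALL PRIVILEGES ON glance.* TO 'glance'@'10.245.0.%' IDENTIFIED BY '<glance-db-password>';"
mysql -u root -p -e "FLUSH PRIVILEGES;"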

Yes, using NAT for the new addressable containers feature is by design in both AWS and MAAS.
Since there's no juju-br0 (or any bridge using a host interface), the containers sit behind their host, which uses NAT and routing/forwarding to give them access to the same subnet as the host. Bridging *any* host interface for the sake of making containers addressable has proven flaky and inefficient on numerous occasions (e.g. we don't even have proper MAAS APIs to discover a node's NICs, *and* we can have biosdevname on or extra preseed scripts configuring bonding, etc.).

So while I'm open to suggestions on how to solve this in general, rather than hot-fixing a specific case and possibly breaking others, I don't think this can be solved effectively by Juju alone, given the lack of support from the MAAS API.

I've retriaged it for 1.23, as I don't want to block tomorrow's 1.23-beta4 release, but there's no beta5 milestone to use.

Revision history for this message
James Page (james-page) wrote :

I think an SNAT rule should do the trick, so that traffic from the glance container appears to come from the correct IP address.

That may be over-simplistic - I'm not that familiar with how the NAT'ing works in 1.23.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

In 1.23 addressable containers and their hosts are configured like this:

(In my case: $HOST_IP=10.14.0.100 on $HOST_IF=eth0, $HOST_NET=10.14.0.0/24, $HOST_GW=10.14.0.1, and $HOST_DNS=10.14.0.1, discovered from /etc/resolv.conf on the host at run-time. lxcbr0 and virbr0 keep their usual addresses and ranges, 10.0.3.0/24 and 192.168.122.0/24 respectively. $CONTAINER#_IP are statically allocated IPs from $HOST_NET via the MAAS API.)

HOST
====

iptables:
-A POSTROUTING -o $HOST_IF -j SNAT --to-source $HOST_IP
-A FORWARD -s $HOST_NET -i virbr0 -j ACCEPT
-A FORWARD -d $HOST_NET -o virbr0 -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -s $HOST_NET -i lxcbr0 -j ACCEPT
-A FORWARD -d $HOST_NET -o lxcbr0 -j ACCEPT
-A FORWARD -o lxcbr0 -j ACCEPT
-A FORWARD -i lxcbr0 -j ACCEPT

IP routes:
default via $HOST_GW dev $HOST_IF
10.0.3.0/24 dev lxcbr0 proto kernel scope link src 10.0.3.1
$HOST_NET dev eth0 proto kernel scope link src $HOST_IP
$CONTAINER1_IP dev virbr0 scope link
$CONTAINER2_IP dev lxcbr0 scope link
$CONTAINER3_IP dev lxcbr0 scope link
$CONTAINER4_IP dev lxcbr0 scope link
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1

sysctl:
net.ipv4.ip_forward=1
net.ipv4.conf.all.proxy_arp=1

CONTAINER
=========

/etc/network/interfaces:
  # loopback interface
  auto lo
  iface lo inet loopback

  # interface "eth0"
  auto eth0
  iface eth0 inet manual
      dns-nameservers $HOST_DNS
      pre-up ip address add $CONTAINER1_IP/32 dev eth0 &> /dev/null || true
      up ip route replace $HOST_IP dev eth0
      up ip route replace default via $HOST_IP
      down ip route del default via $HOST_IP &> /dev/null || true
      down ip route del $HOST_IP dev eth0 &> /dev/null || true
      post-down ip address del $CONTAINER1_IP/32 dev eth0 &> /dev/null || true

lxc.conf generated for LXC containers:

# network config
# interface "eth0"
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.name = eth0
lxc.network.hwaddr = 00:16:3e:xx:xx:xx
lxc.network.ipv4 = $CONTAINER1_IP/32
lxc.network.ipv4.gateway = $HOST_IP
lxc.network.mtu = 1500 # discovered from $HOST_IF

Nothing special is needed for KVM containers, except the iptables rules on the host.
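
A quick way to inspect the resulting host-side setup when debugging this (a sketch; interface names and addresses depend on the deployment):

# NAT and forwarding rules installed on the host
sudo iptables -t nat -S POSTROUTING
sudo iptables -S FORWARD
# per-container /32 routes and the forwarding/proxy-arp sysctls
ip route show
sysctl net.ipv4.ip_forward net.ipv4.conf.all.proxy_arp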

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Maybe we could generate a SNAT rule per container IP:

-A POSTROUTING -s $CONTAINER#_LOCAL_IP -o $HOST_IF -j SNAT --to-source $CONTAINER#_IP

...where $CONTAINER#_LOCAL_IP is from the 10.0.3.0/24 or 192.168.122.0/24 range for LXC and KVM containers respectively.

Alternatively, we could try a couple of SNAT rules - one for LXC, one for KVM (according to http://www.netfilter.org/documentation/HOWTO/NAT-HOWTO-6.html NAT-ing a range over another range should work transparently):

-A POSTROUTING -s 10.0.3.0/24 -o $HOST_IF -j SNAT --to-source $HOST_NET
-A POSTROUTING -s 192.168.122.0/24 -o $HOST_IF -j SNAT --to-source $HOST_NET

Thus the containers' addresses on $HOST_NET will be correct (e.g. mysql will see connections from container-hosted glance using the container's IP instead of its host's IP). However, it might not work for AWS - need to experiment.
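
To make the per-container variant concrete, the generated rules would look roughly like this (the local and MAAS-allocated addresses below are hypothetical, and $HOST_IF is eth0 as in the earlier example):

# LXC container with local address 10.0.3.21 on lxcbr0, allocated 10.14.0.101 from $HOST_NET
iptables -t nat -A POSTROUTING -s 10.0.3.21/32 -o eth0 -j SNAT --to-source 10.14.0.101
# KVM container with local address 192.168.122.21 on virbr0, allocated 10.14.0.102
iptables -t nat -A POSTROUTING -s 192.168.122.21/32 -o eth0 -j SNAT --to-source 10.14.0.102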

Curtis Hovey (sinzui)
tags: added: charms network
Changed in juju-core:
assignee: nobody → James Tunnicliffe (dooferlad)
Revision history for this message
James Tunnicliffe (dooferlad) wrote :

We have an SNAT rule already, which is giving traffic from all containers the host's IP address. Removing it restores the correct behaviour. I have done a quick ping test while watching traffic on EC2 with LXC and on MAAS with KVM, and it looks good now. The fix will land soon.
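
Until that lands, the offending rule can be located and removed by hand (a sketch; substitute the real interface and host IP):

# locate the blanket SNAT rule
sudo iptables -t nat -S POSTROUTING
# delete it by repeating its rule spec with -D
sudo iptables -t nat -D POSTROUTING -o eth0 -j SNAT --to-source 10.14.0.100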

Changed in juju-core:
status: Triaged → In Progress
Curtis Hovey (sinzui)
tags: added: regression
Revision history for this message
Curtis Hovey (sinzui) wrote :

The fix committed to address this bug is not complete. MAAS containers work, but AWS containers are now broken for all charms; see bug 1442801.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

MAAS containers work only because we reverted the addressable containers behaviour for both MAAS and EC2 to only work under a feature flag. So this issue is no longer relevant and I think we should close it. The other bug, 1442801, can be used to track the EC2 work.

Changed in juju-core:
status: In Progress → Invalid
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.24-alpha1 → none