container/lxd/initialisation_linux.go findNextAvailableIPv4Subnet should be careful about subnet allocation

Bug #1657850 reported by Samuel Cozannet
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Horacio Durán
Milestone: 2.2.0

Bug Description

This function aims to find an available subnet for the lxd bridge that Juju will use on each machine to spawn LXD containers.

First of all, it assumes it can use 10.0.0.0/16, so if the machine actually sits on that network it will collide with it. A fix strategy would be to look at where the current machine stands and select a range it is not part of:

* If on 10.0.0.0/16, move LXD to 172.16.0.0/20
* If on 172.16.0.0, move to 192.168.0.0/24

In any case, select something smaller than the network the machine is standing on, to minimize the collision risk.
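
For illustration only (this is not the actual code in container/lxd/initialisation_linux.go; the helper name and the candidate list are assumptions), the range-selection strategy could look roughly like this:

    package main

    import (
        "fmt"
        "net"
    )

    // Hypothetical candidate ranges, not Juju's actual list.
    var candidates = []string{"10.0.0.0/16", "172.16.0.0/20", "192.168.0.0/24"}

    // pickBridgeSubnet returns the first candidate range that does not
    // contain the host's own address, so the lxd bridge cannot shadow
    // the network the machine itself lives on.
    func pickBridgeSubnet(hostAddr net.IP) (*net.IPNet, error) {
        for _, c := range candidates {
            _, subnet, err := net.ParseCIDR(c)
            if err != nil {
                return nil, err
            }
            if !subnet.Contains(hostAddr) {
                return subnet, nil
            }
        }
        return nil, fmt.Errorf("no private range avoids host %s", hostAddr)
    }

    func main() {
        subnet, err := pickBridgeSubnet(net.ParseIP("10.0.1.5"))
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println("use", subnet) // host is on 10/16, so: use 172.16.0.0/20
    }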

Then, once the subnet is selected, there is no test that it is not already used by the host in some manner. It is actually not very simple to assess that the subnet is OK. A couple of ideas:

* When in doubt, ask the user for a proper subnet.
* Probe the selected subnet before picking it, walking the addresses 1, 254, 2, 253... instead of 1, 2, 3... to maximize the chance of catching the gateway (usually the first or the last IP of the subnet); see the sketch below.
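
A minimal sketch of that probe order for a /24, assuming the actual probe (ARP or ICMP) is done elsewhere:

    package main

    import "fmt"

    // probeOrder returns the host offsets of a /24 in the order
    // 1, 254, 2, 253, ..., alternating between both ends so that a
    // gateway sitting at the first or last address is hit early.
    func probeOrder() []int {
        order := make([]int, 0, 254)
        for lo, hi := 1, 254; lo <= hi; lo, hi = lo+1, hi-1 {
            order = append(order, lo)
            if lo != hi {
                order = append(order, hi)
            }
        }
        return order
    }

    func main() {
        fmt.Println(probeOrder()[:6]) // [1 254 2 253 3 252]
    }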

To reproduce a corner case that fails the deployment due to this issue:

1. Set up a VPC with CIDR 10.0.0.0/16 on AWS (or any other cloud) with 3 subnets (one per AZ of the region) that have consecutive CIDRs (e.g. 10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24)
2. Deploy a workload such as k8s core that places one of the charms in an LXD container
3. Scale out so that some consumers of that container land in a second subnet

In k8s core, at this stage, the master goes into the first subnet, 10.0.1.0/24. It gets an LXD container for easyrsa, and lxdbr0 is assigned 10.0.2.0/24 (the next subnet the allocator considers available).
At this point, the master cannot talk to the 'real' 10.0.2.0/24 subnet anymore, as the lxdbr0 route takes precedence.
Any node in that subnet will fail.
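
The overlap itself is easy to demonstrate in a few lines of Go (values taken from the scenario above):

    package main

    import (
        "fmt"
        "net"
    )

    // overlaps reports whether two networks share any addresses.
    func overlaps(a, b *net.IPNet) bool {
        return a.Contains(b.IP) || b.Contains(a.IP)
    }

    func main() {
        _, vpcSubnet, _ := net.ParseCIDR("10.0.2.0/24") // second AZ subnet in the VPC
        _, lxdBridge, _ := net.ParseCIDR("10.0.2.0/24") // subnet picked for lxdbr0
        // Prints true: the local bridge route shadows the real subnet,
        // so traffic for 10.0.2.x never leaves the machine.
        fmt.Println(overlaps(vpcSubnet, lxdBridge))
    }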

Tags: lxd networking
Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.2.0
Revision history for this message
John A Meinel (jameinel) wrote :

It's true the existing subnet code is short-sighted. But the intent has always been to move away from containers behind NAT, and I'd rather spend the effort getting containers into the provider subnet than on a slightly better guessing game.

There are two ways we're looking to do next steps.

1) containers in the host/provider subnet where the substrate allows it
2) containers on a fan bridge with the fan configured in all machines in the model. So everything within the model is able to talk to all the containers as if they were part of the host network.

If we're going to be putting effort into making it better I think we're better off moving toward one of those ends.

Revision history for this message
Samuel Cozannet (samuel-cozannet) wrote : Re: [Bug 1657850] Re: container/lxd/initialisation_linux.go findNextAvailableIPv4Subnet should be careful about subnet allocation

Agreed, I think I was in some sort of corner case and I doubt we had similar issues in other places.

I did some experimentation with the FAN, and there are some things you need to be really careful about when using it in combination with LXD in a non-controlled network environment. I'll get my notes together and forward them to you if that can be helpful.

++
Sam

On Fri, Jan 20, 2017 at 6:52 AM, John A Meinel <email address hidden>
wrote:

> with the fan configured in all machines in the model

--
Samuel Cozannet
Cloud, Big Data and IoT Strategy Team
Business Development - Cloud and ISV Ecosystem
Changing the Future of Cloud
Ubuntu <http://ubuntu.com> / Canonical UK LTD <http://canonical.com> / Juju
<https://jujucharms.com>
<email address hidden>
mob: +33 616 702 389
skype: samnco
Twitter: @SaMnCo_23
LinkedIn: <https://es.linkedin.com/in/scozannet>

Revision history for this message
John A Meinel (jameinel) wrote :

see also: bug #1665648

Changed in juju:
assignee: nobody → Horacio Durán (hduran-8)
status: Triaged → In Progress
Revision history for this message
Anastasia (anastasia-macmood) wrote :

PR against 2.1: https://github.com/juju/juju/pull/7054

From the PR description: It doesn't do everything asked [in this bug report], but it does improve the situation because every machine won't try to choose exactly the same prefix.
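
As a rough illustration of that improvement (not the PR's literal code), randomizing the starting candidate means two machines coming up at the same moment will almost never race for the same /24:

    package main

    import (
        "fmt"
        "math/rand"
    )

    // randomCandidate picks a random 10.x.y.0/24; a real implementation
    // would still verify the pick against local interfaces and routes.
    func randomCandidate(r *rand.Rand) string {
        return fmt.Sprintf("10.%d.%d.0/24", r.Intn(256), r.Intn(256))
    }

    func main() {
        r := rand.New(rand.NewSource(42)) // fixed seed only to make the demo reproducible
        fmt.Println(randomCandidate(r))
    }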

Revision history for this message
Anastasia (anastasia-macmood) wrote :

@Samuel Cozannet,

I think that this has now been addressed. Juju 2.2-alpha1, or the daily snap, is a great candidate for testing :D

I am marking this as Fix Committed based on the fact that the PR mentioned in comment #4 landed in develop (2.2) and the bug referred to in comment #3 has been released.

Please re-test. If you are experiencing an issue, please file a separate bug with logs, steps to reproduce as well as desired behavior.

Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released