MAAS provider with MAAS 1.9 - /etc/network/interfaces "auto eth0" gets removed and bridge is not setup

Bug #1494476 reported by Blake Rouse
This bug affects 5 people
Affects    Status        Importance  Assigned to    Milestone
juju-core  Fix Released  High        Michael Foord
1.24       Fix Released  Critical    Michael Foord
1.25       Fix Released  High        Michael Foord

Bug Description

With the latest MAAS trunk, Juju 1.24 and 1.25 fail to bootstrap the node.

The issue is in how Juju sets up /etc/network/interfaces. Specifically, this code here:
https://github.com/juju/juju/blob/master/provider/maas/environ.go#L1131

MAAS trunk, which will become 1.9, now supports custom network configuration. All static IP addresses are now configured statically on the node instead of being assigned over DHCP. This is the default behavior people want: if the DHCP server goes down and the node's lease expires, the IP address is not removed.

This causes an issue with how Juju parses /etc/network/interfaces. Juju expects it to look specifically like this:

auto eth0
iface eth0 inet dhcp

That is no longer the case and will very rarely be the case; even by default it won't be. It will look like this:

auto eth0
iface eth0 inet static
    address 192.168.122.10/24
    gateway 192.168.122.1

When Juju is bootstrapped on that node, the user-data run by cloud-init changes that configuration to:

iface eth0 inet static
    address 192.168.122.10/24
    gateway 192.168.122.1

Notice that "auto eth0" has been removed and the bridge was not created. When networking is bounced on the node, eth0 will not come up and Juju will not be able to SSH in, causing the bootstrap to fail and leaving the node useless, as no one can SSH in.
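To illustrate the failure mode, here is a minimal Python sketch (not Juju's actual code, which is Go generating a script for cloud-init) of a rewrite that keys on the "iface eth0 inet <method>" stanza header rather than on the exact "inet dhcp" line, so it handles both the DHCP and the static layout and leaves "auto eth0" alone. The function name and the bridge name "juju-br0" are assumptions made for this example.

```python
import re


def bridge_interfaces(config, nic="eth0", bridge="juju-br0"):
    """Return config with nic's address stanza moved onto a bridge.

    Hypothetical helper: the name and the default bridge name are
    assumptions for illustration, not taken from the Juju source.
    """
    # Match the stanza header for any method (dhcp, static, ...).
    # The \s+ after the name means an alias such as eth0:1 is not matched.
    header = re.compile(r"^iface\s+" + re.escape(nic) + r"\s+inet\s+(\S+)\s*$")
    out = []
    for line in config.splitlines():
        m = header.match(line)
        if m is None:
            # Everything else, including "auto eth0", is kept as-is.
            out.append(line)
            continue
        # The physical NIC becomes a plain bridge port...
        out.append("iface %s inet manual" % nic)
        out.append("")
        # ...and the bridge takes over the original method. The indented
        # option lines that follow in the input (address, gateway) are
        # copied next, so they land under the bridge stanza.
        out.append("auto %s" % bridge)
        out.append("iface %s inet %s" % (bridge, m.group(1)))
        out.append("    bridge_ports %s" % nic)
    return "\n".join(out) + "\n"


static_config = (
    "auto eth0\n"
    "iface eth0 inet static\n"
    "    address 192.168.122.10/24\n"
    "    gateway 192.168.122.1\n"
)
print(bridge_interfaces(static_config))
```

A rewrite that only recognises "iface eth0 inet dhcp" has nothing to anchor on in the static case, which is consistent with the bridge never being created in the output shown above.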

Bootstrapping with the environment option "disable-network-management: true" allows bootstrapping to work and services can be deployed, but obviously this prevents deploying services to LXC containers.

Bootstrapping with the "address-allocation" feature flag also works and LXC containers can be used and are registered as devices in MAAS.

Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.25-beta1
tags: added: addressability lxc maas-provider network

Dimiter Naydenov (dimitern) wrote :

This most likely will affect 1.24, 1.25, and master. Retargeting.

Changed in juju-core:
milestone: 1.25-beta1 → 1.26-alpha1

Christian Reis (kiko) wrote :

Does this affect 1.22? If so, unless 1.24 is backported into trusty, then this will block a 1.9 SRU.

Mark Ramm (mark-ramm) wrote :

We will need to get this fixed in 1.25, which we will backport to trusty. What's the timeline for a 1.9 SRU?

Michael Foord (mfoord)
Changed in juju-core:
assignee: nobody → Michael Foord (mfoord)

Dimiter Naydenov (dimitern) wrote :

I've got confirmation that we need this fixed for 1.24 as well, since it could otherwise prevent a MAAS 1.9 SRU into trusty (e.g. if juju 1.24 is in trusty, it won't work properly with MAAS 1.9). Michael, please make sure you backport this to 1.24 too. The code changes in that part of the maas provider should be minimal, so the backport should be easy.

Michael Foord (mfoord) wrote :

How do I reproduce this for manual testing (pretty sure I have a fix as discussed)? Using maas 1.9 (with KVM images) I see the machines have static addresses. However juju seems able to bootstrap and deploy new machines and containers without problem (/etc/network/interfaces has the right contents).

Michael Foord (mfoord) wrote :

Hmmm... my maas controller vm is on utopic, so the version of maas 1.9 may be older than the bug. I'm upgrading the OS to check.

Michael Foord (mfoord) wrote :

I repaved the controller with trusty and then installed maas 1.9 alpha 2 from dailybuilds-qa-ok. I still can't repro the bug. juju 1.25 seems to work fine still.

Blake Rouse (blake-rouse) wrote :

You are pulling from the wrong PPA. It does not have the newest curtin, so MAAS falls back to the old way and you are not using the new storage or networking model.

Please use:

ppa:maas/next-proposed

Do not test with dailybuilds.

Dimiter Naydenov (dimitern) wrote :

There's a fix for 1.25 in a (yet-unproposed) branch here: https://github.com/voidspace/juju/tree/1494476-maas-networking

I have a slightly simpler fix for 1.24, which I still need to test on an actual 1.9 maas.

Andres Rodriguez (andreserl) wrote :

Hi Dimiter,

If you have a PPA with the fix I'm happy to test. This is blocking us at the moment!

Thanks.

Mark Shuttleworth (sabdfl) wrote :

Sign me up for this too, I need to get my autopilot runs rolling again on 1.9!

Mark

Michael Foord (mfoord) wrote :

Fix for juju 1.25 proposed here: https://github.com/juju/juju/pull/3489
Pending review and more manual testing.

Michael Foord (mfoord) wrote :

The fix works with both MAAS 1.8 and 1.9, but we're seeing intermittent failures to initiate the mongo replicaset that we weren't seeing before. Investigating.

2015-10-13 12:16:31 INFO juju.worker.peergrouper initiate.go:77 finished MaybeInitiateMongoServer
2015-10-13 12:16:31 ERROR juju.cmd supercommand.go:429 cannot initiate replica set: cannot get replica set status: can't get local.system.replset config from self or any seed (EMPTYCONFIG)

Michael Foord (mfoord) wrote :

Fix tested and landing on 1.25 now. Ports to master and possibly 1.24 (unless dimiter prefers his fix) will happen next. The replicaset issue was due to a maas interface alias, which the cloudinit code was also mangling (pre-existing problem) - that's fixed too now.
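The comment above does not say exactly how the alias was mangled. As one plausible illustration only, the Python sketch below shows how a loose prefix match on "iface eth0" would also catch an alias stanza such as "iface eth0:1", while a match on the exact interface name would not. The helper names and the sample alias address are assumptions for this example, not taken from the Juju or MAAS source.

```python
def stanza_names(config):
    """List the interface names that have an iface stanza."""
    names = []
    for line in config.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "iface":
            names.append(fields[1])
    return names


def is_alias(name):
    """ifupdown-style aliases are written as <nic>:<label>, e.g. eth0:1."""
    return ":" in name


def loose_match(line, nic):
    """Prefix match: also true for aliases of nic (the risky variant)."""
    return line.strip().startswith("iface " + nic)


def exact_match(line, nic):
    """Compare the whole interface name, so eth0:1 is not taken for eth0."""
    fields = line.split()
    return len(fields) >= 2 and fields[0] == "iface" and fields[1] == nic


# Sample file with a primary interface and one alias (addresses are made up).
sample = (
    "auto eth0\n"
    "iface eth0 inet static\n"
    "    address 192.168.122.10/24\n"
    "\n"
    "auto eth0:1\n"
    "iface eth0:1 inet static\n"
    "    address 192.168.122.11/24\n"
)
for name in stanza_names(sample):
    print(name, "alias" if is_alias(name) else "primary")
```

Any rewrite of the primary interface's stanza would need the exact form of the check, so that alias stanzas pass through unchanged.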

Michael Foord (mfoord)
Changed in juju-core:
status: Triaged → In Progress

Michael Foord (mfoord) wrote :

Also landed in 1.24. Waiting for master to unblock to complete this.

Cheryl Jennings (cherylj) wrote :

As master is unblocked now, can you land your changes there?

Michael Foord (mfoord) wrote :

https://github.com/juju/juju/pull/3498

Landed a while ago, I just didn't set "fix committed". Sorry - done now.

Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released