In some configurations, juju never achieves HA on maas 1.9 with trusty

Bug #1596066 reported by Curtis Hovey
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
juju-core  Triaged  Critical    Unassigned

Bug Description

As seen in this example:
   http://reports.vapour.ws/releases/4099/job/functional-ha-recovery-maas-1-9/attempt/8

Juju fails to set up HA on maas 1.9 with trusty. HA works with maas 1.9 + xenial, AWS + trusty, rackspace + trusty and azure + trusty. Only the maas 1.9 + trusty combination fails.

The functional-ha-recovery-maas-1-9 job now tests xenial. A new job named functional-ha-recovery-maas-1-9-trusty can repeat this test case.

We can see in this example that the model-migration branch can achieve HA on maas 1.9 + trusty
    http://reports.vapour.ws/releases/4097/job/functional-ha-recovery-maas-1-9/attempt/6

Curtis Hovey (sinzui) changed the summary:
- Juju never achieves HA on maas 1.9 with trusty
+ In some configurations, juju never achieves HA on maas 1.9 with trusty
Cheryl Jennings (cherylj) wrote :

This seems to only happen in some MAASes. I have a vMAAS running MAAS 1.9.3 and was able to bring up HA with trusty machines.

In the case seen in CI, the secondary controllers were stuck in cloud-init when executing the bridge script:

"+ /usr/bin/python2 /var/tmp/add-juju-bridge.py --bridge-prefix=br- --one-time-backup --activate /etc/network/interfaces"

This may be related to bug #1590689

James Tunnicliffe (dooferlad) wrote :

Just recreated with master (2.0ish) and MAAS 1.9 on real hardware. What I have seen is that the machines take a long time to come up: the ifdown, wait, ifup sequence that we do in the bridge script for some reason isn't bringing up br-interfaceName, so the script times out waiting for it to appear. While this is better than hanging forever, it isn't great.

I am going to see if I can work out if there is a useful thing we can watch for to find out when it is safe to run ifup.
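
For reference, the "wait for the bridge to appear, then give up" behaviour described above could look roughly like the sketch below. This is not taken from add-juju-bridge.py; the function names, the default timeout and the choice of /sys/class/net as the thing to watch are all assumptions for illustration (Python 3). A device showing up in sysfs only means the kernel has created it, not that it is configured, so this does not by itself answer when it is safe to run ifup.

```python
import os
import time


def interface_exists(name, sysfs_root="/sys/class/net"):
    """Return True if the kernel currently exposes a network device called `name`."""
    return os.path.exists(os.path.join(sysfs_root, name))


def wait_for_interface(name, timeout=60.0, poll=1.0,
                       exists=interface_exists, sleep=time.sleep):
    """Poll until `name` exists or `timeout` seconds have passed.

    Returns True if the device appeared, False if we gave up.
    `exists` and `sleep` are injectable so the loop can be tested
    without touching real network devices.
    """
    deadline = time.monotonic() + timeout
    while True:
        if exists(name):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll)


if __name__ == "__main__":
    # "br-eth1" is the bridge name seen later in this report; the
    # 5 second timeout is an arbitrary value for the example.
    print(wait_for_interface("br-eth1", timeout=5.0))
```

Returning False instead of raising lets the caller decide whether to carry on booting without the bridge or to fail loudly.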

James Tunnicliffe (dooferlad) wrote :

ubuntu@inara:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 br-eth1

Not sure if I like the idea of two routes to 192.168.1.0. I get the same duplicate routes on the other machines because I have >1 network card on all my nodes. This is due to trusty still having the "source /etc/network/interfaces.d/*.cfg" line at the end of the interfaces file. We need to remove this line or we can't actually manage all the networking behaviour.
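
As a rough illustration of what removing that line would involve, here is a minimal Python sketch that drops any "source" stanza pointing into interfaces.d from the text of an interfaces file. The function name and the regular expression are assumptions made up for this example; it is not what the bridge script or curtin actually does, and whether dropping the line is the right fix is only the suggestion made in this comment.

```python
import re

# Matches a "source" stanza that pulls in files from an interfaces.d directory,
# e.g. "source /etc/network/interfaces.d/*.cfg".
SOURCE_LINE = re.compile(r"^\s*source\s+\S*interfaces\.d/")


def strip_interfaces_d_source(text):
    """Return `text` with every interfaces.d "source" line removed.

    All other lines, including blank lines and comments, are kept unchanged.
    """
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not SOURCE_LINE.match(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        "auto eth0\n"
        "iface eth0 inet dhcp\n"
        "\n"
        "source /etc/network/interfaces.d/*.cfg\n"
    )
    print(strip_interfaces_d_source(sample), end="")
```

The sketch works on a string rather than on /etc/network/interfaces directly, so it can be tried out without touching a machine's network configuration.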

Cheryl Jennings (cherylj) wrote :

Talked with James and Andy this morning on this, and it was determined to be a dup of bug #1590689. Andy is checking that the curtin fix for that bug is currently in proposed. If it is, the MAASes should be updated to that level, and this test should be re-run.
