Comment 3 for bug 2039436

Joseph Phillips (manadart) wrote:

The work-around mentioned above creates a latent issue.

Progressive subnet discovery was added for the manual provider under this patch:
https://github.com/juju/juju/pull/11899

The problem is that it only discovers subnets from new *NICs*, not from new addresses on NICs that Juju already knows about.

All devices but one were disabled in order to bootstrap and enter HA, so those devices were already recorded in Juju, but without addresses. When we later added addresses to them, the subnets for those addresses were not added to Juju.
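
To illustrate the gap, here is a minimal Go sketch of discovery that is keyed on device names. It is not the code from the pull request above: the NIC type, the discoverFromNewNICs function, and the CIDRs are all hypothetical and only model the behaviour described in this comment.

```go
package main

import "fmt"

// NIC is a hypothetical, simplified view of a network device on a machine.
type NIC struct {
	Name  string
	CIDRs []string // subnets of the addresses currently set on the device
}

// discoverFromNewNICs models discovery that only looks at devices it has not
// seen before. It records new devices and their subnets, and returns the
// CIDRs that were added on this pass.
func discoverFromNewNICs(knownNICs, subnets map[string]bool, observed []NIC) []string {
	var added []string
	for _, nic := range observed {
		if knownNICs[nic.Name] {
			// Already-known device: addresses added later are never examined.
			continue
		}
		knownNICs[nic.Name] = true
		for _, cidr := range nic.CIDRs {
			if !subnets[cidr] {
				subnets[cidr] = true
				added = append(added, cidr)
			}
		}
	}
	return added
}

func main() {
	knownNICs := map[string]bool{}
	subnets := map[string]bool{}

	// Bootstrap: eth1 is disabled, so it is recorded without addresses.
	first := discoverFromNewNICs(knownNICs, subnets, []NIC{
		{Name: "eth0", CIDRs: []string{"10.0.0.0/24"}},
		{Name: "eth1"},
	})
	fmt.Println(first) // [10.0.0.0/24]

	// Later: eth1 gets an address, but the device is not new.
	second := discoverFromNewNICs(knownNICs, subnets, []NIC{
		{Name: "eth0", CIDRs: []string{"10.0.0.0/24"}},
		{Name: "eth1", CIDRs: []string{"192.168.1.0/24"}},
	})
	fmt.Println(second) // []
}
```

In this model the second pass adds nothing, because eth1 was first seen while it had no addresses. That matches the symptom here: the subnet for the re-enabled device never appears in Juju.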

Had they been added, we could have carved the different subnets into spaces and set one of those spaces as the value of "juju-ha-space". That would have ensured a unique local-cloud address that the peer-grouper could use to maintain the Mongo control plane.

Once we had a restart (soft or hard), the peer-grouper, now in an error loop, could not broadcast the address information needed to establish the Raft transport. No Raft - no leases - no API.