Juju 2.0 uses random IP for 'PUBLIC-ADDRESS' with MAAS 2.0

Bug #1616098 reported by Ante Karamatić
This bug affects 5 people
Affects         Status        Importance  Assigned to       Milestone
Canonical Juju  Fix Released  Critical    Dimiter Naydenov

Bug Description

When deploying beta13 with MAAS 2.0, we noticed that Juju picks one of each host's IPs, seemingly at random, as its public address. As a result, juju status reports:

UNIT      WORKLOAD  AGENT       MACHINE  PORTS  PUBLIC-ADDRESS  MESSAGE
ubuntu/0  unknown   idle        6               192.168.1.5
ubuntu/1  unknown   allocating  7               172.16.10.6     Waiting for agent initialization to finish

While in MAAS logs I can see:

2016-08-23 13:07:46 INFO juju.worker.instancepoller updater.go:269 machine "6" has new addresses: [public:192.168.1.5 public:172.16.10.5]
2016-08-23 13:09:54 INFO juju.worker.instancepoller updater.go:269 machine "7" has new addresses: [public:172.16.10.6 public:192.168.1.6]

Both 192.168.1.x addresses belong to the same space in MAAS, and the hostnames of both machines resolve to 192.168.1.x IPs, while the 172.16.10.x IPs resolve to 'interface hostnames'.

I would expect consistency: Juju should at least pick the public IPs from the same space, ideally a configurable one. Failing that, it should fall back to DNS resolution and try to look up the host's hostname.

Tags: 4010 cpec
Ante Karamatić (ivoks)
tags: added: cpe
tags: added: cpec
removed: cpe
no longer affects: juju-core
Changed in juju:
milestone: none → 2.0-beta17
Changed in juju:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Dimiter Naydenov (dimitern)
Dimiter Naydenov (dimitern) wrote :

I've prepared a hotfix patch for Ante to try on site, to verify a way to fix this.
A proper fix, based on the outcome of those tests, will be proposed later.

Changed in juju:
status: Triaged → In Progress
Ante Karamatić (ivoks)
tags: added: 4010
Dimiter Naydenov (dimitern) wrote :

The first step towards fixing this, on MAAS 1.9, is proposed: https://github.com/juju/juju/pull/6120

I'll propose a follow-up, which does the same for MAAS 2.0.

Dimiter Naydenov (dimitern) wrote :

I'm putting this on hold, because John has expressed concerns about returning MAAS hostnames, which will be unresolvable on the Juju client machine unless it is configured to use the MAAS DNS resolver.

Instead, John suggests using the planned Juju NSS plugin, which can trivially resolve hostnames to IPs. We should have a decision soon and I'll post an update.

Changed in juju:
status: In Progress → Won't Fix
status: Won't Fix → Opinion
Mark Shuttleworth (sabdfl) wrote :

I think the important piece of this though is that we should explicitly map MAAS spaces into networks in the Juju model. Juju has the idea of a "public IP" but that's not really well defined in a MAAS setting, and we need to think through the problem deeply enough to have a tasteful approach.

Ante Karamatić (ivoks) wrote :

Until a proper solution is in place, I'd just like to point out that this is more serious than it looks. Not all IPs on nodes are accessible to the Juju controller or the Juju client. Some might be connected to a space that's not routed.

This means that choosing an IP from such a space will impact usability: one will not be able to ssh to the node, and it will stay in the allocating state forever.
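
One way to detect this condition before relying on an address is a plain TCP probe against the SSH port. The snippet below is only a sketch of that idea, not anything Juju does: the function name can_reach, the default port 22 and the 2 second timeout are all assumptions made for illustration.

```python
import socket


def can_reach(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: a refused, unroutable or timed-out connection
    all raise OSError (or a subclass), which is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use (run from the machine that needs to reach the node):
#   can_reach("172.16.10.6")   # False if that space is not routed from here
```

A probe like this only says whether the address is reachable from where it runs, so the controller and the client could get different answers for the same address.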

Ante Karamatić (ivoks) wrote :

Update on behavior with 2.0rc2.

Machines now continue deployment just fine, so they do establish a connection to the controller. However, the initial problem persists: juju status presents an IP that is not routable, so 'juju ssh 34' doesn't work. One needs to look up the machine ID in MAAS to figure out the IP address.

Another data point: there appears to be a pattern in IP selection. It selects the IP that starts with the lowest number, but in such a way that '100' is lower than '80', which suggests the addresses are compared as strings rather than numerically. So, if one has a setup with:

machine A - 100.80.0.1, 100.90.0.1
machine B - 100.80.0.2, 100.100.0.1

For machine A the public IP will be 100.80.0.1, and for machine B it will be 100.100.0.1.

I see this pattern with every deploy.
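
The pattern is what one would get from comparing the addresses as text instead of as numbers. The sketch below is not Juju's code; it only reproduces the two machines from the example above and compares a plain string minimum with a numeric one. The function names are made up for illustration.

```python
import ipaddress


def lowest_as_text(addresses):
    """Pick the 'lowest' address by plain string comparison."""
    return min(addresses)


def lowest_numerically(addresses):
    """Pick the lowest address by its numeric value."""
    return min(addresses, key=ipaddress.ip_address)


machine_a = ["100.80.0.1", "100.90.0.1"]
machine_b = ["100.80.0.2", "100.100.0.1"]

for name, addrs in (("A", machine_a), ("B", machine_b)):
    print(name, lowest_as_text(addrs), lowest_numerically(addrs))
# A 100.80.0.1 100.80.0.1
# B 100.100.0.1 100.80.0.2
```

With string comparison the character '1' in "100.100" sorts before the '8' in "100.80", which matches the address reported for machine B.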

Michael Foord (mfoord) wrote :

Juju knows about MAAS spaces but it doesn't know about routes. The algorithm for picking "the" public address, from all those available, is unaware of spaces. It should be aware of spaces, and pick an address from a space the controller is in. Alternatively/additionally the controller could try to connect via the address before selecting it as the preferred public address (once picked, an address with the correct scope/type will remain the public address until it is no longer available).
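
A rough sketch of the selection rule described above, using the addresses from the original report. This is an illustration under assumptions, not the eventual fix: pick_public_address and the can_connect callback are hypothetical names, and the /24 prefix for the 192.168.1.x space is assumed because the report does not state the subnet mask.

```python
import ipaddress


def pick_public_address(candidates, controller_subnets, can_connect=None):
    """Return the first candidate inside one of the controller's subnets.

    If can_connect is given (a hypothetical probe taking an address string),
    the candidate must also pass it. Returns None when nothing qualifies.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in controller_subnets]
    for addr in candidates:
        ip = ipaddress.ip_address(addr)
        if not any(ip in net for net in networks):
            continue  # address is in a space the controller is not in
        if can_connect is not None and not can_connect(addr):
            continue  # in the right space, but the probe failed
        return addr
    return None


controller_subnets = ["192.168.1.0/24"]  # assumed prefix length
machine_6 = ["192.168.1.5", "172.16.10.5"]
machine_7 = ["172.16.10.6", "192.168.1.6"]

print(pick_public_address(machine_6, controller_subnets))  # 192.168.1.5
print(pick_public_address(machine_7, controller_subnets))  # 192.168.1.6
```

Filtering by subnet makes the result independent of the order in which the provider lists the addresses, which is the consistency asked for in the bug description.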

Changed in juju:
status: Opinion → In Progress
milestone: 2.0-beta17 → 2.0.0
Michael Foord (mfoord) wrote :

$ juju bootstrap foo lxd
Creating Juju controller "foo" on lxd/localhost
Looking for packaged Juju agent version 2.0.0 for amd64
No packaged binary found, preparing local Juju agent binary
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on lxd/localhost...
 - preparing image

Dimiter Naydenov (dimitern) wrote :

Proposed https://github.com/juju/juju/pull/6426 as a prerequisite to the actual fix.

Changed in juju:
milestone: 2.0.0 → 2.0.1
Dimiter Naydenov (dimitern) wrote :

The prerequisite PR #6426 was superseded by https://github.com/juju/juju/pull/6454, which is already approved and waiting to land. Another prerequisite is needed before the actual fix can be done: adding an AllAddresses() method to the SSHClient API facade, which is used by juju ssh|scp.

Dimiter Naydenov (dimitern) wrote :

The final two prerequisite PRs are proposed: https://github.com/juju/juju/pull/6467 (state/ changes) and https://github.com/juju/juju/pull/6468 (api/ and apiserver/ changes).

Dimiter Naydenov (dimitern) wrote :

Finally, proposed the actual fix: https://github.com/juju/juju/pull/6481

Changed in juju:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju:
status: Fix Committed → Fix Released