Cluster relation reporting public IP instead of internal network IP on manually added machine

Bug #1773359 reported by Nick Moffitt
This bug affects 6 people
Affects: Canonical Juju
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

We have production services that use the cluster relation to share internal IPs so that the systems can connect to one another without going through the firewall. A relevant function docstring:

    """
    Use rsync to push a directory to all configured peers.

    Peers are configured in the 'cluster' relationship using
    charmhelpers.contrib.unison.ssh_authorized_peers(); this list can be
    overridden with the peers parameter.

    A list of the peers whose transfers were successful will be
    returned.
    """

When we manually deployed a new unit, this environment started using that unit's public IP instead of the internally routed one. This is on Azure, so the internal network is a virtual network that we define.

This has caused a production outage (as cluster members could not connect), and we need to discover the private IPs without Azure's assistance.

Nick Moffitt (nick-moffitt) wrote :

On review, the symptoms occur only for the manually provisioned unit in this Azure environment. Our suspicion is that the code paths behind the cluster relation look up addresses by querying on the instance ID: Azure tells Juju the private IP for the original unit, but the ID for the manual unit is just manual:<public-ip>, so the address is derived from that.
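
To make the suspicion concrete, here is a minimal sketch of the behavior being described. It is a hypothetical model, not Juju's actual code: the function name `advertised_address`, the `provider_private_ips` mapping, and the instance ID `azure-vm-0` are all invented for illustration. Only the `manual:<public-ip>` ID shape comes from this report.

```python
from typing import Dict, Optional

# ID prefix reported in this bug for manually added machines.
MANUAL_PREFIX = "manual:"


def advertised_address(instance_id: str,
                       provider_private_ips: Dict[str, str]) -> Optional[str]:
    """Hypothetical model of the suspected lookup (not Juju's implementation)."""
    if instance_id.startswith(MANUAL_PREFIX):
        # No provider record exists for a manual machine, so the only
        # address available is the one embedded in its ID.
        return instance_id[len(MANUAL_PREFIX):]
    # A provider-managed machine can be resolved through its instance ID.
    return provider_private_ips.get(instance_id)


# Made-up example data: one provider-managed unit and one manual unit.
provider_private_ips = {"azure-vm-0": "10.0.0.4"}
print(advertised_address("azure-vm-0", provider_private_ips))           # 10.0.0.4
print(advertised_address("manual:203.0.113.10", provider_private_ips))  # 203.0.113.10
```

If the lookup really works along these lines, the manual unit can only ever advertise the address it was added with, which would match the symptoms.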

This does not, on reflection, appear to be unique to 2.3.7 and may exist in all the 2.x series.

description: updated
John A Meinel (jameinel)
Changed in juju:
importance: Undecided → Medium
status: New → Triaged
summary: - Cluster relation reporting public IP instead of internal network IP
+ Cluster relation reporting public IP instead of internal network IP on
+ manually added machine
John A Meinel (jameinel) wrote :

We don't have a way of knowing that the machine is actually an instance from Azure (vs being any other machine that you might have running on some other cloud, or in some other data center).

It would be good to understand why it was necessary to bring it up manually, and what mechanisms we should be providing. Would it be better to "juju add-machine azure:instance-id", so that we know it is part of the same group, rather than adding it by IP address?

I'm also wondering *how* you added the machine. Did you "juju add-machine PUBLIC_IP"? That would be why we advertise "this machine is known as PUBLIC_IP" to everyone else. We may know a private IP on the machine, but we don't necessarily know that the private IP is actually in the same network segment as the private IPs of other machines in your model.

It is true that Juju's modeling of what network spaces a manually added machine is part of is limited. IIRC, we only really use the one address you gave as part of the 'add-machine' call.
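
The point about network segments can be illustrated with a short sketch using Python's standard `ipaddress` module. The helper name `pick_internal_address` and the example addresses and CIDR are assumptions made up for illustration; nothing here is part of Juju or charmhelpers. It shows that an address being private is not enough: it has to fall inside the specific internal network the other units share.

```python
import ipaddress
from typing import Iterable, Optional


def pick_internal_address(candidates: Iterable[str],
                          internal_cidr: str) -> Optional[str]:
    """Return the first candidate inside internal_cidr, or None if none match.

    Hypothetical helper: the operator supplies the internal network, so
    no cloud provider query is needed.
    """
    network = ipaddress.ip_network(internal_cidr)
    for candidate in candidates:
        # Membership is false for addresses outside the range or of a
        # different IP version than the network.
        if ipaddress.ip_address(candidate) in network:
            return candidate
    return None


# The second address is inside the assumed internal network.
print(pick_internal_address(["203.0.113.10", "10.0.0.5"], "10.0.0.0/24"))     # 10.0.0.5
# 192.168.1.7 is a private address, but in a different segment.
print(pick_internal_address(["203.0.113.10", "192.168.1.7"], "10.0.0.0/24"))  # None
```

A charm could apply a check like this to the addresses found on the local machine, given an operator-configured CIDR, but whether that fits a particular deployment is a judgment call rather than something this report establishes.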

Nick Moffitt (nick-moffitt) wrote :

Manual provisioning was necessary because juju add-unit does not add new machines to an existing Availability Set, and Azure does not allow us to change the Availability Set of a given VM.

Is it possible to manually provision via azure:<instance-id>? We have only ever used ssh:<ip-address>, and yes we used the public IP address because that was the only one reachable on port 22 from the system where we run juju.

John A Meinel (jameinel) wrote :

We don't actually support the azure instance-id syntax, but I was wondering
if we could enable that syntax and if it would lead to better experience
for you.

Nick Moffitt (nick-moffitt) wrote :

On 12Jul2018 07:13AM (-0000), John A Meinel wrote:
> We don't actually support the azure instance-id syntax, but I was wondering
> if we could enable that syntax and if it would lead to better experience
> for you.

It seems like a good syntax, but I'm mainly concerned that it needs to
then include enough information on the unit based on its azure profile.
Is there an existing syntax that will keep more context than the
ssh:<hostname/ip> one?

--
Nick Moffitt

John A Meinel (jameinel) wrote :

So if we were adding by instance-id, presumably we would ask the provider
for any additional details that we need to know about the machine.
"ssh:hostname" just treats it as any machine (it could be a machine in a
completely different datacenter, etc), so we don't make any assumptions
about it being 'similar' to the other machines in your model.

Nick Moffitt (nick-moffitt) wrote :

On 12Jul2018 08:15AM (-0000), John A Meinel wrote:
> So if we were adding by instance-id, presumably we would ask the
> provider for any additional details that we need to know about the
> machine. "ssh:hostname" just treats it as any machine (it could be a
> machine in a completely different datacenter, etc), so we don't make
> any assumptions about it being 'similar' to the other machines in your
> model.

Right. You were asking me about the syntax of azure:<instance-id>: I'm
curious if you focused on the syntax because the functionality exists
already with a different syntax, or if that was synecdoche for the
overall functionality. Because we'd be happy to document any other way
to achieve this for our procedures, and consider syntax changes a
wishlist in that case.

--
Nick Moffitt

Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot