virtual IP addresses should not be registered

Bug #1733968 reported by Felipe Reyes
This bug affects 3 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Joseph Phillips

Bug Description

juju registers every IP address configured on a machine, including virtual IPs managed by pacemaker (or equivalent clustering solutions like keepalived). So far this hasn't been a problem, but we found an environment where, for an undetermined reason, the virtual IP is stored as the first element of the machineaddresses array:

...
    "machineaddresses": [
    {
        "origin": "machine",
        "networkscope": "local-cloud",
        "addresstype": "ipv4",
        "value": "192.168.140.21"
    },
    {
        "origin": "machine",
        "networkscope": "local-cloud",
        "addresstype": "ipv4",
        "value": "192.168.140.33"
    },
    {
        "origin": "machine",
        "networkscope": "local-machine",
        "addresstype": "ipv4",
        "value": "127.0.0.1"
    },
    {
        "origin": "machine",
        "networkscope": "local-machine",
        "addresstype": "ipv6",
        "value": "::1"
    }
    ],
...

In this case the IP address 192.168.140.21 is a virtual IP that was migrated to another node. When the user tries to use 'juju run' against this unit, the following error message is printed:

- MachineId: 0/lxc/1
  ReturnCode: 1
  Stderr: |
    error: unit "unit/0" not found on this machine
  Stdout: ""
  UnitId: unit/0

This happens because jujud-machine-0 tries to talk to unit/0 via 192.168.140.21. According to juju's database that virtual IP belongs to machine 0/lxc/1, but the address is actually configured on another machine (e.g. 1/lxc/1), so juju detects the mismatch and aborts the operation.
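
A minimal sketch of the failure mode described above, written in Python rather than Juju's own Go. It assumes, purely for illustration, that the caller picks the first local-cloud entry of machineaddresses; `first_cloud_address` and its `vips` parameter are hypothetical names and not part of Juju.

```python
from typing import Iterable, Optional

# Trimmed copy of the "machineaddresses" array from the report.
MACHINE_ADDRESSES = [
    {"networkscope": "local-cloud", "addresstype": "ipv4", "value": "192.168.140.21"},
    {"networkscope": "local-cloud", "addresstype": "ipv4", "value": "192.168.140.33"},
    {"networkscope": "local-machine", "addresstype": "ipv4", "value": "127.0.0.1"},
]


def first_cloud_address(
    addresses: Iterable[dict], vips: frozenset = frozenset()
) -> Optional[str]:
    """Return the first local-cloud address that is not a known VIP (hypothetical helper)."""
    for addr in addresses:
        if addr["networkscope"] != "local-cloud":
            continue  # skip loopback and other non-cloud scopes
        if addr["value"] in vips:
            continue  # skip addresses flagged as virtual IPs
        return addr["value"]
    return None


# Without VIP knowledge the migrated virtual IP wins because it is listed first.
print(first_cloud_address(MACHINE_ADDRESSES))
# With the VIP excluded, the machine's own address is selected instead.
print(first_cloud_address(MACHINE_ADDRESSES, vips=frozenset({"192.168.140.21"})))
```

The second call shows the behaviour this bug asks for: once the VIP is known, it is skipped no matter where it sits in the array.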

Steps to reproduce:

cat << EOF>cfg.yaml
percona-cluster:
  vip: 192.168.11.5
  min-cluster-size: 3
EOF
juju deploy --config cfg.yaml -n 3 cs:trusty/percona-cluster
juju deploy cs:trusty/hacluster
juju add-relation hacluster percona-cluster

This makes juju record the IP address 192.168.11.5 as a machine address, which it shouldn't.

[Workaround]

juju ssh 0 sudo su -
apt-get install mongodb-clients
mongo --ssl -u admin -p $(grep oldpassword /var/lib/juju/agents/machine-0/agent.conf | awk -e '{print $2}') localhost:37017/admin

In the mongodb shell:

use juju;
db.machines.update({"machineid": "1"}, {$pull: {"machineaddresses": {"value": "192.168.11.5"}}}, {multi: false});

This removes the IP address 192.168.11.5 from machine 1. It is needed when 192.168.11.5 has been migrated away from machine 1 and the IP is the first element of the machineaddresses array.
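
To make the effect of the `$pull` update concrete, here is an in-memory Python illustration. It does not talk to MongoDB; `pull_address` is a hypothetical helper, and the second address (192.168.11.20) is a made-up placeholder for the machine's real address.

```python
import copy


def pull_address(machine: dict, value: str) -> dict:
    """Return a copy of `machine` without any address entry whose "value" matches."""
    updated = copy.deepcopy(machine)  # leave the caller's document untouched
    updated["machineaddresses"] = [
        addr for addr in updated.get("machineaddresses", [])
        if addr.get("value") != value
    ]
    return updated


# Hypothetical machine document: the VIP is listed first, as in the report.
machine = {
    "machineid": "1",
    "machineaddresses": [
        {"origin": "machine", "value": "192.168.11.5"},   # the VIP
        {"origin": "machine", "value": "192.168.11.20"},  # placeholder address
    ],
}

cleaned = pull_address(machine, "192.168.11.5")
print([addr["value"] for addr in cleaned["machineaddresses"]])
```

Only the entry matching the VIP is dropped; the remaining entries keep their relative order, so the machine's own address becomes the first element.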

Tags: network sts
Felipe Reyes (freyes)
tags: added: sts
Anastasia (anastasia-macmood) wrote :

@Felipe Reyes,

Could you please clarify Juju version?

Changed in juju:
status: New → Incomplete
Changed in juju-core:
status: New → Incomplete
Felipe Reyes (freyes) wrote : Re: [Bug 1733968] Re: virtual IP addresses should not be registered

On Tue, Dec 05, 2017 at 11:38:54AM -0000, Anastasia wrote:
> Could you please clarify Juju version?

Using juju 1.25.13. We don't consider it critical enough to be fixed in a 1.25 release, but the problem should still be present in juju 2.x.

Changed in juju:
status: Incomplete → New
Changed in juju-core:
status: Incomplete → New
Anastasia (anastasia-macmood) wrote :

Thank you, @freyes \o/
I'll mark it as Won't Fix for juju-core and will keep it on juju.

Changed in juju-core:
status: New → Won't Fix
John A Meinel (jameinel) wrote :

I believe you should be able to set --config ignore-machine-addresses=true

Which tells Juju to only pay attention to IP addresses assigned by the underlying provider, and not local-only IP addresses.
That said, if we're on that machine, and it has that IP, and it's in the same subnet as the rest of the IP addresses, it is hard for us to distinguish what is and isn't a valid IP address.

A different option is that Juju could be told more explicitly about VIP for a given application, and thus know that a given address should be handled differently than the other addresses it sees.

Changed in juju:
importance: Undecided → Medium
status: New → Incomplete
Drew Freiberger (afreiberger) wrote :

We just had this affect a percona-cluster environment when the VIP for the cluster was assigned to two hosts at once due to a failure in percona clustering. See also lp:1742811.

Dmitrii Shcherbakov (dmitriis) wrote :

John,

There are "secondaries" so for VIPs there is a way to filter:

1. The Linux kernel has a concept of a "secondary" address. If you do `ip addr add x.x.x.x/y dev z`, it loops through all addresses on the interface and checks whether there is already an address with the same prefix as the one you are trying to add. If there is, it marks the newly added address as a secondary.

inet_insert_ifa which contains primary/secondary flagging code (IPv4, not sure about IPv6, probably similar)
https://elixir.free-electrons.com/linux/latest/source/net/ipv4/devinet.c#L440

the flag itself
https://elixir.free-electrons.com/linux/latest/source/include/uapi/linux/if_addr.h#L42
#define IFA_F_SECONDARY 0x01

2. iproute2 is aware of secondaries when you do `ip addr show`
https://git.launchpad.net/~usd-import-team/ubuntu/+source/iproute2/tree/ip/ipaddress.c?h=applied/ubuntu/xenial-updates#n1068

3. there's a sysctl to nuke secondaries on removal of a primary - seems to be set to 0 by default which means "to nuke".

http://elixir.free-electrons.com/linux/v4.14.13/source/Documentation/networking/ip-sysctl.txt#L1266
"promote_secondaries - BOOLEAN when a primary IP address is removed from this interface promote a corresponding secondary IP address instead of removing all the corresponding secondary IP addresses."

Nuking secondaries works like this:
https://elixir.free-electrons.com/linux/v4.14.13/source/net/ipv4/devinet.c#L326

4. heartbeat/IPaddr2 uses `ip addr add ...` and for our VIP use-cases it adds an address on a primary address' subnet => a secondary will be created.
https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/ocf-binaries.in#L29
https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/IPaddr2#L600

And notes promote_secondaries:
https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/IPaddr2#L149-L156
"There must be at least one static IP address, which is not managed by the cluster, assigned to the network interface. If you can not assign any static IP address on the interface, modify this kernel parameter: sysctl -w net.ipv4.conf.all.promote_secondaries=1 # (or per device)"

Going back to point 1, there are two cases that are handled in the kernel:

1) multiple addresses per interface in different subnets (different prefix => no secondaries);
2) multiple addresses per interface in the same subnet (same prefix => one primary, k secondaries).
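
The two cases can be modelled with a simplified Python sketch. This is not the kernel code: `classify` is a hypothetical helper that only compares network prefixes and ignores the scope checks in inet_insert_ifa. The /24 prefix length and the 10.0.0.5 address are assumptions made for the example.

```python
import ipaddress


def classify(assignments):
    """Label addresses "primary" or "secondary" in the order they are added.

    An address is secondary when an earlier primary on the same interface
    has the same network (same prefix and same mask).
    """
    primaries = []
    labels = []
    for text in assignments:
        iface = ipaddress.ip_interface(text)
        if any(iface.network == primary.network for primary in primaries):
            labels.append((str(iface.ip), "secondary"))
        else:
            primaries.append(iface)
            labels.append((str(iface.ip), "primary"))
    return labels


# Case 2: the VIP lands in the primary's subnet, so it becomes a secondary.
# Case 1: an address in a different subnet stays a primary.
for address, label in classify(
    ["192.168.140.33/24", "192.168.140.21/24", "10.0.0.5/24"]
):
    print(address, label)
```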

I don't think we ever assign an additional IP on an interface in the same subnet as the primary IP on the same interface for any purpose other than VIPs/floating IPs. Seems like a reliable criterion to ignore in addition to provider-level IPAM awareness.
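
As a rough sketch of that criterion, the following Python parses `ip -o -4 addr show` style text and keeps only addresses that are not marked `secondary`. The sample text is hand-written and trimmed (the trailing lifetime fields are omitted), and `non_secondary_addresses` is a hypothetical helper; how Juju itself would read the flag is not shown here.

```python
def non_secondary_addresses(ip_output: str) -> list:
    """Return IPv4 addresses from `ip -o -4 addr show` text, skipping secondaries."""
    addresses = []
    for line in ip_output.splitlines():
        fields = line.split()
        if "inet" not in fields:
            continue  # not an IPv4 address line
        if "secondary" in fields:
            continue  # same-subnet additional address, e.g. a VIP
        cidr = fields[fields.index("inet") + 1]
        addresses.append(cidr.split("/")[0])
    return addresses


# Hand-written sample; the /24 prefix length is an assumption.
SAMPLE = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.168.140.33/24 brd 192.168.140.255 scope global eth0
2: eth0    inet 192.168.140.21/24 brd 192.168.140.255 scope global secondary eth0
"""

print(non_secondary_addresses(SAMPLE))
```

This covers only the same-subnet case; a VIP placed in a different subnet would not carry the flag and would still need provider-level information.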

Felipe Reyes (freyes) wrote :

On Sat, Jan 13, 2018 at 12:13:36AM -0000, Dmitrii Shcherbakov wrote:
> John,
>
> There are "secondaries" so for VIPs there is a way to filter:
>
> 1. Linux kernel has a concept of a "secondary" address. If you do `ip
> addr add x.x.x.x/y dev z` it loops through all addresses on an
> interface and will check if there's already an address with the same
> prefix that you are trying to add. If there is, it will mark the newly
> added address as a secondary.
>
> inet_insert_ifa which contains primary/secondary flagging code (IPv4, not sure about IPv6, probably similar)
> https://elixir.free-electrons.com/linux/latest/source/net/ipv4/devinet.c#L440
>
> the flag itself
> https://elixir.free-electrons.com/linux/latest/source/include/uapi/linux/if_addr.h#L42
> #define IFA_F_SECONDARY 0x01
>
> 2. iproute2 is aware of secondaries when you do `ip addr show`
> https://git.launchpad.net/~usd-import-team/ubuntu/+source/iproute2/tree/ip/ipaddress.c?h=applied/ubuntu/xenial-updates#n1068
>
>
> 3. there's a sysctl to nuke secondaries on removal of a primary - seems to be set to 0 by default which means "to nuke".
>
> http://elixir.free-electrons.com/linux/v4.14.13/source/Documentation/networking/ip-sysctl.txt#L1266
> "promote_secondaries - BOOLEAN when a primary IP address is removed from this interface promote a corresponding secondary IP address instead of removing all the corresponding secondary IP addresses."
>
> Nuking secondaries works like this:
> https://elixir.free-electrons.com/linux/v4.14.13/source/net/ipv4/devinet.c#L326
>
>
> 4. heartbeat/IPaddr2 uses `ip addr add ...` and for our VIP use-cases it adds an address on a primary address' subnet => a secondary will be created.
> https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/ocf-binaries.in#L29
> https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/IPaddr2#L600
>
> And notes promote_secondaries:
> https://github.com/ClusterLabs/resource-agents/blob/v4.1.0/heartbeat/IPaddr2#L149-L156
> "There must be at least one static IP address, which is not managed by the cluster, assigned to the network interface. If you can not assign any static IP address on the interface, modify this kernel parameter: sysctl -w net.ipv4.conf.all.promote_secondaries=1 # (or per device)"
>

Great research, I wasn't aware of this hint.

> Going back to p.1, there are two cases that are handled in the kernel:
>
> 1) multiple addresses per interface in different subnets (different prefix => no secondaries);

I've seen this scenario in the field.

Dmitrii Shcherbakov (dmitriis) wrote :

Felipe, thanks!

Multiple addresses per interface in different subnets is a valid scenario (nothing prevents having 2 subnets on a single broadcast domain or L2 segment). It's also natural for IPv6, if only because you get a link-local address in addition to a global-scoped address. In my view this should be accounted for too, but it seems to be possible only by querying IPAM, AFAICS.

It would seem like case 2 would give a simple resolution for many field cases.

Changed in juju:
status: Incomplete → Triaged
Tim Penhey (thumper) wrote :

During a discussion with Mario and Billy, we think we came up with what sounds like a good solution.

A key problem here is that of identifying the VIPs. The simple solution here is to have the charm tell Juju that a particular IP address is a VIP and is owned by the unit and not the machine, so Juju shouldn't use it. Juju would still know about the address, but it would be flagged so as not to be one of the choices with the network jujuc commands like network-get.

This seems to fit the modeling approach quite nicely from our points of view, as often the VIP is passed in as application config, and then used by the unit to set up the local interface and set as relation data for other applications related to it. The unit would also need to call another jujuc command to tell Juju not to use that address in any networking call.

The command would need to be able to flag an IP address as not to be used, and to take multiple IP addresses. It should also be able to re-enable an IP address for use, again taking multiple IP addresses.
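
The proposed behaviour could look roughly like the Python sketch below. Everything in it is hypothetical: `AddressRegistry`, `exclude`, `include` and `usable` are illustrative names, not an existing or planned Juju API. It only models the idea that Juju keeps knowing about a flagged address but stops offering it.

```python
class AddressRegistry:
    """Hypothetical model of machine addresses with per-address exclusion flags."""

    def __init__(self, addresses):
        self._addresses = list(addresses)
        self._excluded = set()

    def exclude(self, *values):
        """Flag one or more addresses (e.g. VIPs) so they are not offered for use."""
        self._excluded.update(values)

    def include(self, *values):
        """Re-enable one or more previously flagged addresses."""
        self._excluded.difference_update(values)

    def known(self):
        """All addresses, flagged or not: the registry still knows about VIPs."""
        return list(self._addresses)

    def usable(self):
        """Addresses that may be returned by networking calls."""
        return [a for a in self._addresses if a not in self._excluded]


registry = AddressRegistry(["192.168.140.21", "192.168.140.33"])
registry.exclude("192.168.140.21")  # the unit reports its VIP
print(registry.usable())
registry.include("192.168.140.21")  # the VIP is released again
print(registry.usable())
```

Keeping the flag separate from the address list means that re-enabling an address restores its original position rather than appending it at the end.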

tags: added: netw
tags: added: network
removed: netw
Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.4-beta1
Changed in juju:
milestone: 2.4-beta1 → none
Felipe Reyes (freyes) wrote :

I tested juju from edge (2.8-beta1+develop-59bbda0) to determine whether this issue was still present. No luck: it's still there.

To reproduce, it is possible to use this bundle (cdk+keepalived): https://pastebin.ubuntu.com/p/Vkxf3rCPTW/

Here is the journal of my testing: https://pastebin.ubuntu.com/p/8rWVFC23Fb/

Ian Booth (wallyworld)
no longer affects: juju-core
Changed in juju:
milestone: none → 2.8-rc1
importance: Medium → High
Changed in juju:
assignee: nobody → Simon Richardson (simonrichardson)
Tim Penhey (thumper)
Changed in juju:
assignee: Simon Richardson (simonrichardson) → Joseph Phillips (manadart)
Pen Gale (pengale) wrote :

This requires some significant work still, as it involves adding features that Juju doesn't currently have.

It's still on our list of priorities, but it is unlikely to be done by the end of the week.

Going to bump it from the rc1 milestone for now.

Changed in juju:
milestone: 2.8-rc1 → 2.8.1
Changed in juju:
status: Triaged → In Progress
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8.1 → 2.8.2
Joseph Phillips (manadart) wrote :
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released
Steven Parker (sbparke) wrote :

I was able to replicate this bug in version 2.8.7.
So this may not be fixed completely.

To replicate I had an existing barbarian/ha-cluster installation.
Removed the relation between these charms which resulted in ha-cluster being removed.

I then recreated the relation and the VIP was grabbed for the subnet IP for the charm which had the pre-existing VIP running.
