keepalived loses VIP on netplan apply or systemd restart/upgrade
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack-Ansible | Won't Fix | Undecided | Unassigned | | |
Bug Description
Environment:
Underlay o/s: Ubuntu Bionic 18.04.1
Networking configured with Netplan with networkd renderer
$ cat /etc/openstack-
# Ansible managed
DISTRIB_ID="OSA"
DISTRIB_
DISTRIB_
DISTRIB_
Issue:
On a HAProxy host with keepalived installed, the VIPs are dropped from the interfaces whenever systemd is upgraded or restarted, or when netplan apply is run, so services become unavailable. Because keepalived does not appear to monitor the VIPs, VRRP failover never occurs when they disappear.
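To illustrate the condition described above, here is a minimal sketch of a check that reports whether a VIP is still configured on an interface. It assumes the input is the one-line output of `ip -o addr show`; the function name, the sample text, the interface name and the addresses are hypothetical and not taken from this report.

```python
import ipaddress


def vip_present(ip_addr_output: str, vip: str, interface: str) -> bool:
    """Return True if `vip` is configured on `interface`.

    `ip_addr_output` is assumed to be the text printed by `ip -o addr show`.
    """
    wanted = ipaddress.ip_address(vip)
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # One-line format: "<index>: <ifname> <family> <addr>/<prefix> ..."
        if len(fields) < 4 or fields[1] != interface:
            continue
        if fields[2] not in ("inet", "inet6"):
            continue
        address = fields[3].split("/")[0]
        if ipaddress.ip_address(address) == wanted:
            return True
    return False


# Hypothetical sample: the host address plus one VIP on eth0.
SAMPLE = (
    "2: eth0    inet 192.0.2.11/24 brd 192.0.2.255 scope global eth0\n"
    "2: eth0    inet 192.0.2.10/32 scope global eth0\n"
)

if __name__ == "__main__":
    print(vip_present(SAMPLE, "192.0.2.10", "eth0"))
```

A check along these lines could in principle be run from a keepalived `vrrp_script` so that a missing VIP is at least detected, but that is an assumption and not something this report has tested.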
Workaround:
A workaround solution has been documented here https:/
Problem with workaround:
Currently the keepalived override variables haproxy_
vrrp_instance internal {
interface <haproxy_
}
virtual_ipaddress {
<haproxy_
}
vrrp_instance external {
interface <haproxy_
}
virtual_ipaddress {
<haproxy_
}
However, for the workaround to work, the 'interface' knob of the stanza must still reference a physical interface. In addition, neither dummy interface may be in the DOWN state (bringing them up manually leaves them in the UNKNOWN state).
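The shape of stanza the workaround appears to need can be sketched as follows: VRRP stays bound to a physical interface while the VIP is placed on a dummy interface through the `dev` keyword inside `virtual_ipaddress`. This is an illustration only; the helper name, interface names, address, router id and priority are all hypothetical values, not the ones generated by OpenStack-Ansible.

```python
def render_vrrp_instance(name, bind_interface, vip_cidr, vip_interface,
                         virtual_router_id, priority):
    """Render a keepalived vrrp_instance whose VIP lives on a separate device.

    bind_interface: interface VRRP itself binds to (the 'interface' knob).
    vip_interface:  interface the VIP is added to (the 'dev' keyword).
    """
    lines = [
        f"vrrp_instance {name} {{",
        f"    interface {bind_interface}",
        f"    virtual_router_id {virtual_router_id}",
        f"    priority {priority}",
        "    virtual_ipaddress {",
        f"        {vip_cidr} dev {vip_interface}",
        "    }",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values: eth0 is physical, dummy0 carries the VIP.
    print(render_vrrp_instance("external", "eth0", "192.0.2.10/32",
                               "dummy0", 10, 100))
```

With a single variable feeding both positions, `bind_interface` and `vip_interface` cannot differ, which is the limitation described above.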
Request:
To use the workaround so that VIPs persist through a netplan apply and/or a systemd upgrade/restart, each interface would need a pair of configurable variables: one bound at the 'interface' knob and one bound at the 'virtual_ipaddress' knob.
We can:
1) Add a new feature by adding an extra var in this line:
https://github.com/openstack/openstack-ansible/blob/442d53a4d58d2a58d91813dee9ff96b51ef5063e/inventory/group_vars/haproxy/keepalived.yml#L60
replacing:
dev {{ haproxy_keepalived_external_interface | default(management_bridge) }}
with:
dev {{ haproxy_keepalived_external_vip_interface | default(haproxy_keepalived_external_interface | default(management_bridge)) }}
2) Add a release note describing the known issue with netplan and telling users to set said variable. Until a fix is released in the keepalived packages, the same reno can also propose a global group_var/extra var override for users.
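The fallback order proposed in option 1 can be sketched in plain Python. This only mimics how nested Jinja `default()` filters pick the first defined variable; it is not the template code itself, and the values `dummy0`, `eth0` and `br-mgmt` are hypothetical.

```python
# Lookup order implied by the nested default() filters in option 1.
FALLBACK_ORDER = (
    "haproxy_keepalived_external_vip_interface",
    "haproxy_keepalived_external_interface",
    "management_bridge",
)


def resolve_vip_interface(variables: dict) -> str:
    """Return the device the external VIP would be bound to.

    Like Jinja's default() without its boolean flag, a variable only
    falls through when it is undefined, not when it is merely empty.
    """
    for key in FALLBACK_ORDER:
        if key in variables:
            return variables[key]
    raise KeyError("none of %s is defined" % ", ".join(FALLBACK_ORDER))


if __name__ == "__main__":
    # Existing deployments that set nothing new keep their current behaviour.
    print(resolve_vip_interface({"management_bridge": "br-mgmt"}))
    # Deployments using the workaround point the VIP at a dummy interface.
    print(resolve_vip_interface({
        "haproxy_keepalived_external_vip_interface": "dummy0",
        "haproxy_keepalived_external_interface": "eth0",
        "management_bridge": "br-mgmt",
    }))
```

Because the new variable sits at the front of the chain and falls back to the existing ones, deployments that never set it would render the same `dev` value as before.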