[zabbix plugin] VIP becomes unavailable after its Controller reboot if Zabbix with OVS bridges are used

Bug #1644821 reported by Alexander Rubtsov on 2016-11-25
This bug affects 2 people
Affects: Fuel for OpenStack
Importance: High
Assigned to: Olivier Bourdon

Bug Description

 === Environment ===
MOS: 9.0
Mode: HA
Zabbix: 2.5.1
Network template: br-mgmt and br-ex use OVS provider

 === Description ===
After the Controller running a VIP is rebooted, that VIP becomes unreachable

 === Steps to reproduce ===
Create an HA environment with Zabbix plugin enabled
[root@fuel ~]# fuel plugins
id | name | version | package_version | releases
---+-------------------+---------+-----------------+------------------------------------------------------------
1 | zabbix_monitoring | 2.5.1 | 3.0.0 | ubuntu (2015.1.0-7.0, liberty-8.0, liberty-9.0, mitaka-9.0)

Prepare and upload a network template which makes "br-mgmt" and "br-ex" bridges use OVS provider. Deploy the environment
...
- action: add-br
  name: br-mgmt
  provider: ovs
...
- action: add-br
  name: br-ex
  provider: ovs
...

Determine which Controller node is running VIP
# pcs status | grep "vip__management"
 vip__management (ocf::fuel:ns_IPaddr2): Started cic-0-1.domain.tld

Reboot this Controller
root@cic-0-1:~# reboot
Broadcast message from <email address hidden>
        (/dev/pts/23) at 20:30 ...

Ensure that VIP has migrated to another Controller
# pcs status | grep "vip__management"
 vip__management (ocf::fuel:ns_IPaddr2): Started cic-0-2.domain.tld

Wait until the initial Controller is back from reboot
root@cic-0-1:~# uptime
 11:46:26 up 3 min, 1 user, load average: 8.12, 2.66, 0.94

During boot, Open vSwitch re-creates the bridges according to the ovsdb records.
ovsdb still contains a record for the "v_management" port/interface in the br-mgmt bridge.
However, the "v_management" interface no longer exists in the system (it is present only on the node where the mgmt VIP is running).
Therefore, the br-mgmt bridge looks like this:
root@cic-0-1:~# ovs-vsctl show
.....
    Bridge br-mgmt
.....
        Port v_management
            Interface v_management
                error: "could not open network device v_management (No such device)"
.....

root@cic-0-1:~/arp# ovs-vsctl list interface v_management
...
admin_state : []
...
error : "could not open network device v_management (No such device)"
...
name : v_management
...

Wait for Pacemaker to migrate the VIP back to the initial Controller according to resource stickiness (or you can migrate it manually to speed up the process)
# pcs status | grep "vip__management"
 vip__management (ocf::fuel:ns_IPaddr2): Started cic-0-1.domain.tld

 === Actual behavior ===
Although Pacemaker shows the VIP as "started", it is not actually reachable
root@cic-0-3:~# ping -DOnv 192.168.2.25
PING 192.168.2.25 (192.168.2.25) 56(84) bytes of data.
[1480020008.953425] no answer yet for icmp_seq=1
[1480020009.953287] no answer yet for icmp_seq=2
[1480020010.952807] no answer yet for icmp_seq=3
[1480020011.959394] no answer yet for icmp_seq=4

 === Expected behavior ===
Normally, after the migration, Pacemaker starts monitoring the VIP resource and detects that it doesn't respond:
less /var/log/daemon.log
...
ocf-ns_IPaddr2: ERROR: ARPING 192.168.2.25 from 192.168.2.6 br-ex Sent 3 probes (3 broadcast(s)) Received 0 response(s)
...

Once monitoring detects the failure, Pacemaker restarts the resource, which includes deleting the ports/interfaces from the bridge and adding them back.
This brings the OVS bridge to the correct state:
root@cic-0-1:~# ovs-vsctl show
.....
    Bridge br-mgmt
.....
        Port v_management
            Interface v_management
                type: internal
.....

And it makes VIP actually available:
root@cic-0-3:~# ping -DOnv 192.168.2.25
PING 192.168.2.25 (192.168.2.25) 56(84) bytes of data.
[1480020001.956581] 64 bytes from 192.168.2.25: icmp_seq=1 ttl=64 time=0.438 ms
[1480020002.956785] 64 bytes from 192.168.2.25: icmp_seq=2 ttl=64 time=0.332 ms
[1480020003.955785] 64 bytes from 192.168.2.25: icmp_seq=3 ttl=64 time=0.368 ms
[1480020004.954933] 64 bytes from 192.168.2.25: icmp_seq=4 ttl=64 time=0.490 ms
[1480020005.953754] 64 bytes from 192.168.2.25: icmp_seq=5 ttl=64 time=0.352 ms
[1480020006.953911] 64 bytes from 192.168.2.25: icmp_seq=6 ttl=64 time=0.332 ms

 === Possible reason ===
Probably, Zabbix VIP responds to Pacemaker's ARPING requests instead of mgmt-VIP which causes false-positive monitoring results
less /var/log/daemon.log
...
2016-11-24T20:40:15.120809+00:00 cic-0-1 ocf-ns_IPaddr2: INFO: ARPING 192.168.2.26 from 192.168.2.23 br-mgmt \
Unicast reply from 192.168.2.26 [9A:F3:C2:78:81:5B] 0.816ms \
Unicast reply from 192.168.2.26 [9A:F3:C2:78:81:5B] 0.570ms \
Unicast reply from 192.168.2.26 [9A:F3:C2:78:81:5B] 0.540ms \
Sent 3 probes (1 broadcast(s)) Received 3 response(s)
...

Alexander Rubtsov (arubtsov) wrote :

sla1 for 9.0-updates

Andras Olah (andras-olah) wrote :

Hi,

Let me add some info to the analysis. The false positive tests by arping happen because the default Linux settings allow any interface to respond to ARP requests (see e.g., https://lwn.net/Articles/45373/).

Here're the relevant settings in the haproxy namespace:
root@cic-0-1:~# ip netns exec haproxy sysctl net.ipv4.conf.all | grep arp
net.ipv4.conf.all.arp_accept = 1
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.arp_notify = 0
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.proxy_arp_pvlan = 0
root@cic-0-1:~# ip netns exec haproxy sysctl net.ipv4.conf.b_management | grep arp
net.ipv4.conf.b_management.arp_accept = 1
net.ipv4.conf.b_management.arp_announce = 0
net.ipv4.conf.b_management.arp_filter = 0
net.ipv4.conf.b_management.arp_ignore = 0
net.ipv4.conf.b_management.arp_notify = 0
net.ipv4.conf.b_management.proxy_arp = 0
net.ipv4.conf.b_management.proxy_arp_pvlan = 0

Therefore, the false positive arping tests happen if there are two VIP addresses managed by ns_IPaddr2 on the same subnet and one of them is up while the other is not accessible due to the OVS interface problem shown in the analysis above. The broadcast ARP requests are received and answered by the "other" VIP, while the ARPed IP is not accessible for normal IP traffic.

In my view, the proper solution would be to add checks to the service start script so that it checks not only that the port exists in the OVS bridge, but it also checks that the port is operational (e.g., the ofport of the corresponding interface is not -1).
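
The ofport check suggested above could look roughly like the following. This is a Python sketch for illustration only (the real start script is an OCF resource agent, and the function names here are hypothetical). It relies on `ovs-vsctl get Interface <name> ofport`, and treats both -1 and an unset value (printed as `[]`) as "not operational".

```python
import subprocess


def ofport_is_valid(raw):
    """Interpret the output of `ovs-vsctl get Interface <name> ofport`.

    OVS sets ofport to -1 when the interface could not be added to the
    datapath; an unassigned column is printed as "[]". Both count as invalid.
    """
    value = raw.strip()
    try:
        return int(value) > 0
    except ValueError:
        return False


def interface_is_operational(name):
    """Ask OVS for the interface's ofport and check that it is usable.

    Requires ovs-vsctl on PATH; a non-zero exit status (for example, no such
    interface record) is treated as "not operational".
    """
    result = subprocess.run(
        ["ovs-vsctl", "get", "Interface", name, "ofport"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and ofport_is_valid(result.stdout)
```

With a check like `interface_is_operational("v_management")`, a start action could delete and re-add a port whose ovsdb record exists but whose device is missing, instead of assuming the port is ready.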

As an additional measure, the ARP sysctl settings could be changed in such a way that interfaces respond to ARP requests only with their own IP. In my view, this would make sense in hosts having multiple interfaces on the same subnet.
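
To illustrate the sysctl suggestion: in the Linux kernel, `arp_ignore=0` lets a host reply for any local address on any interface, while `arp_ignore=1` restricts replies to addresses configured on the interface that received the request, and the kernel uses the larger of the `all` and per-interface values. The Python sketch below (hypothetical helper names, not a tested fix) evaluates the output shown above and builds the command that would tighten the setting inside a namespace.

```python
def parse_sysctl(output):
    """Parse `sysctl` output lines of the form "key = value" into a dict of strings."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def effective_arp_ignore(settings, interface):
    """The kernel applies max(conf/all, conf/<interface>) for arp_ignore."""
    return max(
        int(settings.get("net.ipv4.conf.all.arp_ignore", "0")),
        int(settings.get("net.ipv4.conf.%s.arp_ignore" % interface, "0")),
    )


def arp_ignore_command(namespace, interface, value=1):
    """Build (but do not run) the command that sets arp_ignore in a namespace."""
    return [
        "ip", "netns", "exec", namespace,
        "sysctl", "-w",
        "net.ipv4.conf.%s.arp_ignore=%d" % (interface, value),
    ]


# Values copied from the haproxy namespace output above.
SAMPLE = """\
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.b_management.arp_ignore = 0
"""
```

For the values in this report the effective setting is 0, which matches the observed behavior of one VIP's interface answering ARP requests for another VIP's address.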

Andras

Changed in fuel:
assignee: nobody → Fuel Plugin Zabbix (fuel-plugin-zabbix)
status: New → Confirmed
Swann Croiset (swann-w) wrote :

The Fuel plugin framework allows plugins to allocate a VIP in any namespace. The VIP description for the zabbix plugin is here [0]
IMO this bug should be fixed in Fuel rather than in every plugin.

[0] https://github.com/openstack/fuel-plugin-external-zabbix/blob/2.5/network_roles.yaml#L1-L8

We have handled a similar situation in another plugin [1], which itself uses 2 VIPs in the same namespace "infrastructure_alerting" w/ the same subnet, by configuring source-based IP routing and an ARP filter [2]

[1] https://github.com/openstack/fuel-plugin-lma-infrastructure-alerting/blob/master/network_roles.yaml#L14
[2] https://github.com/openstack/fuel-plugin-lma-infrastructure-alerting/blob/master/deployment_scripts/puppet/modules/lma_infra_alerting/files/configure_arp_filter_for_vip#L16-L18

Changed in fuel:
assignee: Fuel Plugin Zabbix (fuel-plugin-zabbix) → Fuel Sustaining (fuel-sustaining-team)
Dmitry Pyzhov (dpyzhov) on 2016-12-13
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Stanislaw Bogatkin (sbogatkin)
Changed in fuel:
assignee: Stanislaw Bogatkin (sbogatkin) → Fuel Sustaining (fuel-sustaining-team)
Dmitry Pyzhov (dpyzhov) on 2016-12-15
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Stanislaw Bogatkin (sbogatkin)
Stanislaw Bogatkin (sbogatkin) wrote :

After some discussion we agreed that enabling arp_filter can lead to more problems than advantages. A much easier solution is to use a different namespace for VIPs if plugin developers need one.

Changed in fuel:
assignee: Stanislaw Bogatkin (sbogatkin) → Fuel Plugin Zabbix (fuel-plugin-zabbix)
Swann Croiset (swann-w) wrote :

This is an easy response IMO. How do you expect us to keep the plugin compatible and upgradable when introducing such a change? Do you have any guidance?

Can you elaborate on the disadvantages w/ arp_filter?

Stanislaw Bogatkin (sbogatkin) wrote :

Hi Swann,
you can easily use another namespace instead of the system one named 'haproxy'. Just name yours accordingly ('zabbix' is okay, for example).

The disadvantage of arp_filter is that on Linux IP addresses are owned by the complete host, not by particular interfaces, which is why filtering is disabled by default. So it is better to just create another namespace - it is cheap and we have already done this for the vrouter namespace, for example. Why do you want to mix different VIPs in one namespace?

Currently working on a fix

summary: - VIP becomes unavailable after its Controller reboot if Zabbix with OVS
- bridges are used
+ [zabbix plugin] VIP becomes unavailable after its Controller reboot if
+ Zabbix with OVS bridges are used
Changed in fuel:
assignee: Fuel Plugin Zabbix (fuel-plugin-zabbix) → Olivier Bourdon (olivierbourdon38)
Changed in fuel:
status: Confirmed → In Progress
Changed in fuel:
status: In Progress → Fix Committed
Changed in fuel:
status: Fix Committed → Fix Released