[L3HA] Keepalived 2.x.x now tracks state of virtual_ipaddresses interfaces and routes

Bug #1874211 reported by Slawek Kaplonski
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Critical
Assigned to: Slawek Kaplonski

Bug Description

Patch https://review.opendev.org/#/c/707406/ introduced a new mechanism which brings all of an HA router's interfaces DOWN while the router is in backup mode.
That works fine with keepalived 1.4.x, but the behaviour changed in keepalived 2.x.x: by default it now tracks the interfaces used by virtual_ipaddresses and routes, and it goes to FAULT state if such an interface is DOWN.
As a result, the router is never transitioned to master state.

We should now add the "no_track" option to the qg- and qr- interfaces in the keepalived config file.
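
As a rough illustration of the proposal above, the sketch below renders keepalived address entries with "no_track" appended. It is not neutron's actual code: the helper names, the device names and the addresses are all hypothetical, and "virtual_ipaddress_excluded" is chosen only as an example block.

```python
def render_vip(cidr: str, device: str, no_track: bool = True) -> str:
    """Render one address entry, e.g. '10.0.0.1/24 dev qr-x no_track'."""
    parts = [cidr, "dev", device]
    if no_track:
        # Ask keepalived 2.x not to track the state of this entry's interface.
        parts.append("no_track")
    return " ".join(parts)


def render_vip_block(block_name: str, vips, no_track: bool = True) -> str:
    """Render a whole address block from (cidr, device) pairs."""
    lines = [f"{block_name} {{"]
    lines += [f"    {render_vip(cidr, dev, no_track)}" for cidr, dev in vips]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical addresses and interface names, for illustration only.
example = render_vip_block(
    "virtual_ipaddress_excluded",
    [("10.0.0.1/24", "qr-aaaaaaaa-aa"), ("203.0.113.10/24", "qg-bbbbbbbb-bb")],
)
print(example)
```

The HA interface itself would not get the option, so keepalived would still track it.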

The errors can be seen, for example, in the results of a tripleo job: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2b3/721574/1/check/tripleo-ci-centos-8-scenario007-standalone/2b3f794/logs/undercloud/var/log/journal.txt

Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: Starting Keepalived v2.0.10 (11/12,2018)
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: Running on Linux 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020 (built for Linux 4.18.0)
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: Command line: '/usr/sbin/keepalived' '-n' '-l' '-D' '-P' '-f'
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: '/var/lib/neutron/ha_confs/24a8996a-5d64-446d-afcd-e08c3d72d64c/keepalived.conf' '-p'
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: '/var/lib/neutron/ha_confs/24a8996a-5d64-446d-afcd-e08c3d72d64c.pid.keepalived' '-r'
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: '/var/lib/neutron/ha_confs/24a8996a-5d64-446d-afcd-e08c3d72d64c.pid.keepalived-vrrp'
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: '-D'
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: Opening file '/var/lib/neutron/ha_confs/24a8996a-5d64-446d-afcd-e08c3d72d64c/keepalived.conf'.
Apr 21 12:21:45 standalone.localdomain Keepalived[147861]: Starting VRRP child process, pid=147864
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Registering Kernel netlink reflector
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Registering Kernel netlink command channel
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Opening file '/var/lib/neutron/ha_confs/24a8996a-5d64-446d-afcd-e08c3d72d64c/keepalived.conf'.
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (Line 22) Cannot specify scope for IPv6 addresses (fe80::f816:3eff:fe0a:3675/64) - ignoring scope
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (Line 23) Cannot specify scope for IPv6 addresses (fe80::f816:3eff:fea8:ae56/64) - ignoring scope
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) Ignoring track_interface ha-dcefeeaa-6d since own interface
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Assigned address 169.254.195.203 for interface ha-dcefeeaa-6d
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Assigned address fe80::f816:3eff:fec7:3b26 for interface ha-dcefeeaa-6d
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) entering FAULT state
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Registering gratuitous ARP shared channel
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: Registering gratuitous NDISC shared channel
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) removing Virtual Routes
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) removing VIPs.
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) removing E-VIPs.
Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: (VR_99) removing Virtual Routes
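
A minimal sketch for spotting this symptom in a journal dump like the one above: it scans log lines for VRRP instances that report entering FAULT state. The function name is hypothetical and the regular expression is based only on the message format shown in this log.

```python
import re

# Matches e.g. "Keepalived_vrrp[147864]: (VR_99) entering FAULT state"
FAULT_RE = re.compile(
    r"Keepalived_vrrp\[\d+\]: \((?P<instance>[^)]+)\) entering FAULT state"
)


def instances_in_fault(lines):
    """Return VRRP instance names that logged a FAULT transition, in order."""
    found = []
    for line in lines:
        match = FAULT_RE.search(line)
        if match and match.group("instance") not in found:
            found.append(match.group("instance"))
    return found


sample = [
    "Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: "
    "Assigned address 169.254.195.203 for interface ha-dcefeeaa-6d",
    "Apr 21 12:21:45 standalone.localdomain Keepalived_vrrp[147864]: "
    "(VR_99) entering FAULT state",
]
print(instances_in_fault(sample))
```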

Changed in neutron:
milestone: none → ussuri-rc1
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

I agree with the analysis here. According to [1][2], this option should be used, though on the IP addresses rather than on the interfaces.

Another question we need to answer is whether this config option will be accepted in 1.x (even though the feature does not exist in that version) or whether we need to generate a specific config depending on the keepalived version.

Regards.

[1] https://github.com/acassen/keepalived/issues/1110
[2] https://manpages.debian.org/unstable/keepalived/keepalived.conf.5.en.html
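
If a version-dependent config turned out to be necessary, one possible shape is sketched below: parse the version from a banner such as the "Starting Keepalived v2.0.10 (11/12,2018)" line in the journal and emit "no_track" only from some minimum version on. The function names and the (2, 0, 0) threshold are assumptions made for illustration, and nothing in this report says the merged fix works this way.

```python
import re

BANNER_RE = re.compile(r"Keepalived v(\d+)\.(\d+)\.(\d+)")


def parse_keepalived_version(banner: str):
    """Extract (major, minor, patch) from a keepalived version banner."""
    match = BANNER_RE.search(banner)
    if match is None:
        raise ValueError(f"no keepalived version found in: {banner!r}")
    return tuple(int(part) for part in match.groups())


def should_emit_no_track(version, minimum=(2, 0, 0)) -> bool:
    """Assumed policy: add 'no_track' only for versions at or above `minimum`."""
    return tuple(version) >= tuple(minimum)


banner = "Starting Keepalived v2.0.10 (11/12,2018)"
version = parse_keepalived_version(banner)
print(version, should_emit_no_track(version))
```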

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-tempest-plugin (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/721805

Changed in neutron:
status: Confirmed → In Progress
LIU Yulong (dragon889) wrote :

The fix directly addresses the issue:
https://review.opendev.org/#/c/721799/

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/721799
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=dc9084a8ec9db08d7ea947e0b73581b748be5819
Submitter: Zuul
Branch: master

commit dc9084a8ec9db08d7ea947e0b73581b748be5819
Author: Slawek Kaplonski <email address hidden>
Date: Wed Apr 22 09:56:40 2020 +0000

    [L3 HA] Add "no_track" option to VIPs in keepalived config

    Patch [1] introduced new mechanism which only brings UP interfaces
    on master node of HA router. It works fine with keepalived 1.x
    but it is broken when keepalived 2.x was used (e.g. on Centos 8) as
    in this new version of keepalived by default all interfaces of VIPs
    and routes are tracked, and if one of them is DOWN, keepalived is
    going to FAULT state. Because of that router will never be
    transitioned to MASTER on any node.

    This patch fixes it by adding "no_track" option to all VIPs
    and routes in keepalived's config file.

    This "no_track" option isn't added to ha interface so this one
    is still tracked by keepalived.

    [1] https://review.opendev.org/#/c/707406/

    Closes-bug: #1874211

    Change-Id: Ic16cf83fe1d1576d91047adb2d4f9e07d57185b6

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/722212

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/722213

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/722214

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/722215

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/722212
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f2d234e459ce7ca4ab049364920e11d770ce3615
Submitter: Zuul
Branch: stable/train

commit f2d234e459ce7ca4ab049364920e11d770ce3615
Author: Slawek Kaplonski <email address hidden>
Date: Wed Apr 22 09:56:40 2020 +0000

    [L3 HA] Add "no_track" option to VIPs in keepalived config

    Closes-bug: #1874211

    Change-Id: Ic16cf83fe1d1576d91047adb2d4f9e07d57185b6
    (cherry picked from commit dc9084a8ec9db08d7ea947e0b73581b748be5819)

tags: added: in-stable-train
tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/722213
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=13f01238b6f077c1547b5f7d8b2d85c405580b4b
Submitter: Zuul
Branch: stable/stein

commit 13f01238b6f077c1547b5f7d8b2d85c405580b4b
Author: Slawek Kaplonski <email address hidden>
Date: Wed Apr 22 09:56:40 2020 +0000

    [L3 HA] Add "no_track" option to VIPs in keepalived config

    Closes-bug: #1874211

    Change-Id: Ic16cf83fe1d1576d91047adb2d4f9e07d57185b6
    (cherry picked from commit dc9084a8ec9db08d7ea947e0b73581b748be5819)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/722214
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e24bdb3e9dea261232f5e94a919387e2d4515b64
Submitter: Zuul
Branch: stable/rocky

commit e24bdb3e9dea261232f5e94a919387e2d4515b64
Author: Slawek Kaplonski <email address hidden>
Date: Wed Apr 22 09:56:40 2020 +0000

    [L3 HA] Add "no_track" option to VIPs in keepalived config

    Closes-bug: #1874211

    Change-Id: Ic16cf83fe1d1576d91047adb2d4f9e07d57185b6
    (cherry picked from commit dc9084a8ec9db08d7ea947e0b73581b748be5819)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/queens)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: stable/queens
Review: https://review.opendev.org/722215

tags: added: neutron-proactive-backport-potential
tags: removed: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol

This issue was fixed in the openstack/neutron rocky-eol release.
