Default gateway in HA router namespace not set if using Keepalived 1.x

Bug #1890400 reported by Alexander Gräb
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Slawek Kaplonski

Bug Description

In Rocky, commit f2d234e introduces a change that makes Neutron more compatible with Keepalived 2.x. The L3 agent now passes a new option, `no_track`, to Keepalived. Keepalived 1.x doesn't recognize this option, so it applies the configuration only partially, which results in a missing default gateway. As a consequence, instances that use HA routers to communicate with the outside world cannot reach the internet and cannot be reached via their floating IP addresses.

There are some workarounds that trigger the creation of the default gateway, such as disabling and re-enabling the router, or disabling or restarting the L3 agent that hosts the master namespace of the HA router.

Steps to reproduce:
1. Create an HA router
2. Add the gateway network (the default gateway should now be set within the router's master network namespace, but there is none)
3. Connect a tenant subnet to the router
4. Create an instance connected to the tenant network created in step 3
5. Try to reach the internet from within the instance created in step 4
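
The missing gateway from step 2 can also be checked directly on the network node. The following is a minimal sketch, not part of the original report: it assumes the router namespace follows the usual `qrouter-<router id>` naming, that the script runs with root privileges, and the helper names are made up for illustration.

```python
import subprocess


def has_default_route(ip_route_output: str) -> bool:
    """Return True if any line of `ip route` output is a default route."""
    return any(
        line.split()[:1] == ["default"]
        for line in ip_route_output.splitlines()
    )


def namespace_has_default_route(router_id: str) -> bool:
    """Run `ip route show` inside the router namespace and inspect it.

    Assumption: the namespace is named "qrouter-<router id>" and the
    caller is allowed to run `ip netns exec` (normally root only).
    """
    result = subprocess.run(
        ["ip", "netns", "exec", f"qrouter-{router_id}", "ip", "route", "show"],
        capture_output=True,
        text=True,
        check=True,
    )
    return has_default_route(result.stdout)
```

On an affected master node, `namespace_has_default_route("<router id>")` would be expected to return `False` after step 2 until one of the workarounds is applied.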

I was able to get some log output out of Keepalived:
Tue Aug 4 11:59:58 2020: Starting Keepalived v1.3.9 (10/21,2017)
Tue Aug 4 11:59:58 2020: Opening file '/var/lib/neutron/ha_confs/019f8036-8730-4584-9e07-c4a6504447ab/keepalived.conf'.
Tue Aug 4 11:59:58 2020: Starting VRRP child process, pid=2864
Tue Aug 4 11:59:58 2020: Registering Kernel netlink reflector
Tue Aug 4 11:59:58 2020: Registering Kernel netlink command channel
Tue Aug 4 11:59:58 2020: Registering gratuitous ARP shared channel
Tue Aug 4 11:59:58 2020: Opening file '/var/lib/neutron/ha_confs/019f8036-8730-4584-9e07-c4a6504447ab/keepalived.conf'.
Tue Aug 4 11:59:58 2020: Unknown configuration entry 'no_track' for ip address - ignoring
Tue Aug 4 11:59:58 2020: Unknown configuration entry 'no_track' for ip address - ignoring
Tue Aug 4 11:59:58 2020: Cannot specify scope for IPv6 addresses (fe80::f816:3eff:fe2c:6622/64) - ignoring scope
Tue Aug 4 11:59:58 2020: VRRP parsed invalid IP no_track. skipping IP...
Tue Aug 4 11:59:58 2020: unknown route keyword no_track
Tue Aug 4 11:59:58 2020: VRRP_Instance(VR_23) removing protocol VIPs.
Tue Aug 4 11:59:58 2020: VRRP_Instance(VR_23) removing protocol E-VIPs.
Tue Aug 4 11:59:58 2020: Using LinkWatch kernel netlink reflector...
Tue Aug 4 11:59:58 2020: VRRP_Instance(VR_23) Entering BACKUP STATE
Tue Aug 4 11:59:58 2020: VRRP sockpool: [ifindex(1033), proto(112), unicast(0), fd(9,10)]

As you can see, it complains about the 'no_track' option.
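
To spot the same problem on other nodes, the Keepalived log can be filtered for these complaints. This is only a rough sketch: the function name is hypothetical, and it simply looks for the literal string `no_track` in each log line, which matches the messages shown above but is not a general Keepalived log parser.

```python
def no_track_complaints(log_text: str) -> list:
    """Return the log lines that mention the 'no_track' keyword."""
    return [line for line in log_text.splitlines() if "no_track" in line]


# Sample taken from the log output quoted in this report.
SAMPLE_LOG = """\
Tue Aug 4 11:59:58 2020: Unknown configuration entry 'no_track' for ip address - ignoring
Tue Aug 4 11:59:58 2020: Cannot specify scope for IPv6 addresses (fe80::f816:3eff:fe2c:6622/64) - ignoring scope
Tue Aug 4 11:59:58 2020: VRRP parsed invalid IP no_track. skipping IP...
Tue Aug 4 11:59:58 2020: unknown route keyword no_track
"""

if __name__ == "__main__":
    for line in no_track_complaints(SAMPLE_LOG):
        print(line)
```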

We use Kolla containers with an Ubuntu base. Even though Keepalived 2 was released quite a while ago, the Ubuntu base still only provides Keepalived 1.x via its package repositories. Even the latest version of Kolla still uses Ubuntu 18.04 as its base, with Keepalived 1.x. In theory, all users of Kolla containers with an Ubuntu base are affected (other base images were not tested). There seem to be no apt sources for Keepalived 2.x for Ubuntu 18.04, so you need to compile it from source to get a newer version.

Maybe whether the 'no_track' option is passed should depend on the Keepalived version, or it should be made configurable.

tags: added: l3-ha
Changed in neutron:
status: New → Confirmed
importance: Undecided → High
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

Although this is not recommended, we'll need to make a runtime check. We can't enforce the keepalived version and we should support both.

We need something similar to https://review.opendev.org/#/c/726079/1/neutron/agent/linux/dhcp.py.

Regards.

Alexander Gräb (alexander.graeb) wrote :

Sounds reasonable. The Keepalived version could be determined with `keepalived -v`.
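
A runtime check along these lines could look roughly like the sketch below. It is not the code that was eventually merged. It assumes the banner has the form `Keepalived v1.3.9 (10/21,2017)`, as in the log output above, reads both stdout and stderr because the banner may appear on either, and treats any 2.x or later version as accepting `no_track`, which is an assumption drawn from this report. All function names are hypothetical.

```python
import re
import subprocess
from typing import Optional, Tuple

VERSION_RE = re.compile(r"Keepalived v(\d+)\.(\d+)\.(\d+)")


def parse_keepalived_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from a Keepalived version banner."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_keepalived_version() -> Optional[Tuple[int, ...]]:
    """Run `keepalived -v` and parse its banner.

    Raises FileNotFoundError if keepalived is not installed.
    """
    proc = subprocess.run(["keepalived", "-v"], capture_output=True, text=True)
    return parse_keepalived_version(proc.stdout + proc.stderr)


def supports_no_track(version: Optional[Tuple[int, ...]]) -> bool:
    """Assumption from this report: 1.x rejects 'no_track', 2.x accepts it."""
    return version is not None and version[0] >= 2
```

For the version in the log above, `supports_no_track(parse_keepalived_version("Keepalived v1.3.9 (10/21,2017)"))` evaluates to `False`.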

Slawek Kaplonski (slaweq) wrote :

Hi,

I was just about to propose something similar to what Rodolfo proposed.
If you want to propose a patch, please assign this bug to yourself; otherwise I can propose a patch today.

Alexander Gräb (alexander.graeb) wrote :

If you propose a patch I would be glad to try it out.

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/745641

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/745641
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7abe0ee34c367b4abf84820048b4aed643fc1162
Submitter: Zuul
Branch: master

commit 7abe0ee34c367b4abf84820048b4aed643fc1162
Author: Slawek Kaplonski <email address hidden>
Date: Tue Aug 11 10:47:24 2020 +0200

    Add 'keepalived_use_no_track' config option

    Patch [1] added option "no_track" to the keepalived's config file which
    is generated by L3 agent in HA mode.
    This was added to handle properly keepalived 2.x and interfaces which
    are in DOWN state in the backup nodes.
    But this "no_track" option is not compatible with keepalived 1.x series
    which is available e.g. on Ubuntu 18.04.

    As there is no easy way to check automatically if keepalived supports or
    not this config flag, this patch introduces new config option
    "keepalived_use_no_track".
    If this config option will be set to False, neutron L3 agent will not
    add "no_track" to the keepalived's config.

    As master branch is moving to gate on Ubuntu 20.04 where keepalived 2.x
    is already available, this new config option default value is set to
    True.

    [1] https://review.opendev.org/#/c/721799/

    Change-Id: I2dfdb9f56de28d56ca0f240ff34fa7c3a12e339b
    Closes-Bug: #1890400
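
The effect of the new option on the generated configuration can be illustrated with a small sketch. This is not Neutron's actual code: `build_vip_line` and its parameters are hypothetical, and only the `no_track` keyword and the option's default of True are taken from the commit message above.

```python
def build_vip_line(ip_cidr: str, interface: str, use_no_track: bool = True) -> str:
    """Build one virtual IP address line for a keepalived.conf block.

    `use_no_track` stands in for the 'keepalived_use_no_track' option;
    when it is False the 'no_track' keyword is left out, so that
    Keepalived 1.x can parse the line.
    """
    line = f"{ip_cidr} dev {interface}"
    if use_no_track:
        line += " no_track"
    return line


if __name__ == "__main__":
    # Keepalived 2.x style (the default)
    print(build_vip_line("169.254.0.1/24", "ha-example"))
    # Keepalived 1.x compatible
    print(build_vip_line("169.254.0.1/24", "ha-example", use_no_track=False))
```

Operators running Keepalived 1.x would set `keepalived_use_no_track` to False in the L3 agent configuration so that the keyword is omitted.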

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/747855

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/747856

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/747857

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/747862

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/747855
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=43079e3fc919abd15f6b352ccc38cc8b6218e9e9
Submitter: Zuul
Branch: stable/ussuri

commit 43079e3fc919abd15f6b352ccc38cc8b6218e9e9
Author: Slawek Kaplonski <email address hidden>
Date: Tue Aug 11 10:47:24 2020 +0200

    Add 'keepalived_use_no_track' config option

    Patch [1] added option "no_track" to the keepalived's config file which
    is generated by L3 agent in HA mode.
    This was added to handle properly keepalived 2.x and interfaces which
    are in DOWN state in the backup nodes.
    But this "no_track" option is not compatible with keepalived 1.x series
    which is available e.g. on Ubuntu 18.04.

    As there is no easy way to check automatically if keepalived supports or
    not this config flag, this patch introduces new config option
    "keepalived_use_no_track".
    If this config option will be set to False, neutron L3 agent will not
    add "no_track" to the keepalived's config.

    As master branch is moving to gate on Ubuntu 20.04 where keepalived 2.x
    is already available, this new config option default value is set to
    True.

    [1] https://review.opendev.org/#/c/721799/

    Change-Id: I2dfdb9f56de28d56ca0f240ff34fa7c3a12e339b
    Closes-Bug: #1890400
    (cherry picked from commit 7abe0ee34c367b4abf84820048b4aed643fc1162)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/747862
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8ffbcb5a854ac12977e275f96f5f68f6cf41e24f
Submitter: Zuul
Branch: stable/rocky

commit 8ffbcb5a854ac12977e275f96f5f68f6cf41e24f
Author: Slawek Kaplonski <email address hidden>
Date: Tue Aug 11 10:47:24 2020 +0200

    Add 'keepalived_use_no_track' config option

    Patch [1] added option "no_track" to the keepalived's config file which
    is generated by L3 agent in HA mode.
    This was added to handle properly keepalived 2.x and interfaces which
    are in DOWN state in the backup nodes.
    But this "no_track" option is not compatible with keepalived 1.x series
    which is available e.g. on Ubuntu 18.04.

    As there is no easy way to check automatically if keepalived supports or
    not this config flag, this patch introduces new config option
    "keepalived_use_no_track".
    If this config option will be set to False, neutron L3 agent will not
    add "no_track" to the keepalived's config.

    As master branch is moving to gate on Ubuntu 20.04 where keepalived 2.x
    is already available, this new config option default value is set to
    True.

    [1] https://review.opendev.org/#/c/721799/

    Conflicts:
        neutron/conf/agent/l3/config.py
        neutron/tests/functional/agent/linux/test_keepalived.py

    Change-Id: I2dfdb9f56de28d56ca0f240ff34fa7c3a12e339b
    Closes-Bug: #1890400
    (cherry picked from commit 7abe0ee34c367b4abf84820048b4aed643fc1162)

tags: added: in-stable-rocky
tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/747856
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=b2b9fd66b9e673a4cbc43258ab0ac306db2f38d1
Submitter: Zuul
Branch: stable/train

commit b2b9fd66b9e673a4cbc43258ab0ac306db2f38d1
Author: Slawek Kaplonski <email address hidden>
Date: Tue Aug 11 10:47:24 2020 +0200

    Add 'keepalived_use_no_track' config option

    Patch [1] added option "no_track" to the keepalived's config file which
    is generated by L3 agent in HA mode.
    This was added to handle properly keepalived 2.x and interfaces which
    are in DOWN state in the backup nodes.
    But this "no_track" option is not compatible with keepalived 1.x series
    which is available e.g. on Ubuntu 18.04.

    As there is no easy way to check automatically if keepalived supports or
    not this config flag, this patch introduces new config option
    "keepalived_use_no_track".
    If this config option will be set to False, neutron L3 agent will not
    add "no_track" to the keepalived's config.

    As master branch is moving to gate on Ubuntu 20.04 where keepalived 2.x
    is already available, this new config option default value is set to
    True.

    [1] https://review.opendev.org/#/c/721799/

    Conflicts:
        neutron/conf/agent/l3/config.py

    Change-Id: I2dfdb9f56de28d56ca0f240ff34fa7c3a12e339b
    Closes-Bug: #1890400
    (cherry picked from commit 7abe0ee34c367b4abf84820048b4aed643fc1162)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/747857
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=abd4e849257b401a5e7c2d61b753cc6ab52bdc6d
Submitter: Zuul
Branch: stable/stein

commit abd4e849257b401a5e7c2d61b753cc6ab52bdc6d
Author: Slawek Kaplonski <email address hidden>
Date: Tue Aug 11 10:47:24 2020 +0200

    Add 'keepalived_use_no_track' config option

    Patch [1] added option "no_track" to the keepalived's config file which
    is generated by L3 agent in HA mode.
    This was added to handle properly keepalived 2.x and interfaces which
    are in DOWN state in the backup nodes.
    But this "no_track" option is not compatible with keepalived 1.x series
    which is available e.g. on Ubuntu 18.04.

    As there is no easy way to check automatically if keepalived supports or
    not this config flag, this patch introduces new config option
    "keepalived_use_no_track".
    If this config option will be set to False, neutron L3 agent will not
    add "no_track" to the keepalived's config.

    As master branch is moving to gate on Ubuntu 20.04 where keepalived 2.x
    is already available, this new config option default value is set to
    True.

    [1] https://review.opendev.org/#/c/721799/

    Conflicts:
        neutron/conf/agent/l3/config.py

    Change-Id: I2dfdb9f56de28d56ca0f240ff34fa7c3a12e339b
    Closes-Bug: #1890400
    (cherry picked from commit 7abe0ee34c367b4abf84820048b4aed643fc1162)

tags: added: in-stable-stein
Corey Bryant (corey.bryant) wrote :

I opened the following bug against this fix because there is no run-time check of the keepalived version, which makes this fix undesirable when running with keepalived < 2.x:
https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1896506

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol

This issue was fixed in the openstack/neutron rocky-eol release.
