When neutron L3 HA is enabled, restarting the l3 agent may result in multiple active routers

Bug #1703078 reported by Jeffrey Zhang
This bug affects 3 people
Affects          Status         Importance   Assigned to     Milestone
kolla            Fix Released   Undecided    Jeffrey Zhang
  Ocata          Fix Released   Undecided    Jeffrey Zhang
kolla-ansible    Fix Released   Undecided    Jeffrey Zhang
  Ocata          Fix Released   Undecided    Jeffrey Zhang

Bug Description

The root cause is that keepalived does not remove the VIP address from the ha-xxx interface when it starts.
A workaround is to run the "neutron-netns-cleanup" script before starting the neutron l3 agent, which removes all router namespaces and ensures the ha-xxx interface is removed.

neutron-netns-cleanup requires the netstat command.
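
The ordering described above (clean up namespaces first, then start the agent) can be sketched roughly as follows. This is only an illustration, not the code from the actual fix: the helper names are hypothetical, the config file paths are assumptions about a typical deployment, and the `--force` and `--config-file` options should be checked against `neutron-netns-cleanup --help` for the installed release.

```python
import shutil
import subprocess


def build_cleanup_command(config_files):
    """Build the neutron-netns-cleanup command line (hypothetical helper).

    --force and --config-file are assumed to be accepted by the installed
    neutron-netns-cleanup; verify with --help before relying on them.
    """
    cmd = ["neutron-netns-cleanup", "--force"]
    for path in config_files:
        cmd += ["--config-file", path]
    return cmd


def plan_restart(config_files, agent_cmd):
    """Return the commands in the order they should run: cleanup, then agent."""
    return [build_cleanup_command(config_files), list(agent_cmd)]


def run_plan(plan):
    """Execute the plan, refusing to start if netstat is missing."""
    # The cleanup script needs netstat (from net-tools), per the bug report.
    if shutil.which("netstat") is None:
        raise RuntimeError("netstat not found; install net-tools first")
    for cmd in plan:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Paths below are assumptions, not taken from the bug report.
    plan = plan_restart(
        ["/etc/neutron/neutron.conf", "/etc/neutron/l3_agent.ini"],
        ["neutron-l3-agent", "--config-file", "/etc/neutron/neutron.conf"],
    )
    for cmd in plan:
        print(" ".join(cmd))  # print only; call run_plan(plan) to execute
```

Keeping the planning step separate from execution makes it possible to review the exact commands before anything touches the namespaces on a controller.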

gongysh@ubuntu64:~$ neutron l3-agent-list-hosting-router 15d2ad43-eef6-4fa3-a73d-213ea904b0b4 2>/dev/null
+--------------------------------------+-----------+----------------+-------+----------+
| id                                   | host      | admin_state_up | alive | ha_state |
+--------------------------------------+-----------+----------------+-------+----------+
| 28154fed-87c6-4f7f-afc1-075d006310a5 | control03 | True           | :-)   | standby  |
| 72a93457-04f0-43c5-9e6f-1fb806b5e77d | control01 | True           | :-)   | active   |
| fb129ab1-bff6-4cb7-bad1-89d3d1083a6e | control02 | True           | :-)   | active   |
+--------------------------------------+-----------+----------------+-------+----------+
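
The listing above shows the symptom: two agents report `active` for the same router. A minimal sketch of detecting this from the CLI table output might look like the following. The function names are hypothetical, and the parser only assumes the pipe-delimited layout shown in the listing.

```python
SAMPLE = """\
+----+-----------+----------------+-------+----------+
| id | host | admin_state_up | alive | ha_state |
+----+-----------+----------------+-------+----------+
| 28154fed-87c6-4f7f-afc1-075d006310a5 | control03 | True | :-) | standby |
| 72a93457-04f0-43c5-9e6f-1fb806b5e77d | control01 | True | :-) | active |
| fb129ab1-bff6-4cb7-bad1-89d3d1083a6e | control02 | True | :-) | active |
"""


def parse_table(table_text):
    """Parse a pipe-delimited CLI table into a list of row dicts."""
    header = None
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line is the header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def active_hosts(rows):
    """Return the hosts whose agent reports ha_state == 'active'."""
    return [row["host"] for row in rows if row.get("ha_state") == "active"]


if __name__ == "__main__":
    hosts = active_hosts(parse_table(SAMPLE))
    if len(hosts) > 1:
        print("multiple active agents:", ", ".join(hosts))
```

For a healthy HA router, `active_hosts` would be expected to return exactly one host.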

Changed in kolla:
milestone: none → pike-3
description: updated
Changed in kolla:
assignee: nobody → Jeffrey Zhang (jeffrey4l)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (master)

Fix proposed to branch: master
Review: https://review.openstack.org/481798

Changed in kolla:
status: New → In Progress
Changed in kolla-ansible:
milestone: none → pike-3
assignee: nobody → Jeffrey Zhang (jeffrey4l)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.openstack.org/481969

Changed in kolla-ansible:
status: New → In Progress
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (master)

Reviewed: https://review.openstack.org/481798
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=985255acfeced0f77142185d8f38cf9f7bcbfde0
Submitter: Jenkins
Branch: master

commit 985255acfeced0f77142185d8f38cf9f7bcbfde0
Author: Jeffrey Zhang <email address hidden>
Date: Sat Jul 8 14:31:05 2017 +0800

    Install net-tools for neutron-base container

    neutron-netns-cleanup script requires netstat command which is provided
    by net-tools package.

    Change-Id: Ic9417d2eb03e0dd93f7c668b189b4ad9c72eae0f
    Closes-Bug: #1703078

Changed in kolla:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/482998

MarginHu (margin2017) wrote :

I want to know what impact multiple active routers have.
In my environment, I enabled the neutron l3 agent with the following configuration:

l3_ha = true
max_l3_agents_per_router = 3
min_l3_agents_per_router = 3

I tested with your steps and found that it did not cause any problems.
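
For reference, the three options quoted in this comment can be read back from a config file with a short sketch like the one below. It assumes the options sit in the `[DEFAULT]` section of neutron.conf, which the comment does not state, and the function name is hypothetical.

```python
import configparser

# Example content; the [DEFAULT] section header is an assumption.
EXAMPLE_CONF = """\
[DEFAULT]
l3_ha = true
max_l3_agents_per_router = 3
min_l3_agents_per_router = 3
"""


def read_l3_ha_settings(conf_text):
    """Return the L3 HA related options found in the given config text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["DEFAULT"]
    return {
        "l3_ha": section.getboolean("l3_ha", fallback=False),
        "max_l3_agents_per_router": section.getint(
            "max_l3_agents_per_router", fallback=None
        ),
        "min_l3_agents_per_router": section.getint(
            "min_l3_agents_per_router", fallback=None
        ),
    }


if __name__ == "__main__":
    print(read_l3_ha_settings(EXAMPLE_CONF))
```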

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla (stable/ocata)

Reviewed: https://review.openstack.org/482998
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=293d36b54f10fd930362b5ac16dc62ec611ddd14
Submitter: Jenkins
Branch: stable/ocata

commit 293d36b54f10fd930362b5ac16dc62ec611ddd14
Author: Jeffrey Zhang <email address hidden>
Date: Sat Jul 8 14:31:05 2017 +0800

    Install net-tools for neutron-base container

    neutron-netns-cleanup script requires netstat command which is provided
    by net-tools package.

    Change-Id: Ic9417d2eb03e0dd93f7c668b189b4ad9c72eae0f
    Closes-Bug: #1703078
    (cherry picked from commit 985255acfeced0f77142185d8f38cf9f7bcbfde0)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 5.0.0.0b3

This issue was fixed in the openstack/kolla 5.0.0.0b3 development milestone.

shake.chen (shake-chen)
Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.openstack.org/481969
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=58964d6825492636405497567e0de968ea81f222
Submitter: Jenkins
Branch: master

commit 58964d6825492636405497567e0de968ea81f222
Author: Jeffrey Zhang <email address hidden>
Date: Sat Jul 8 13:58:46 2017 +0800

    Clear all l3 related namespace before starting neutron-l3-agent

    Remove all l3 related namespaces in case of multiple active routers in
    l3 high available mode. The root cause is that keepalived does not
    remove the vip address from nic during starting.

    neutron-vpnaas-agent is subclass of l3 agent, so should remove all l3
    related namespace before starting vpnaas agent.

    Closes-Bug: #1703078
    Depends-On: Ic9417d2eb03e0dd93f7c668b189b4ad9c72eae0f
    Change-Id: I05c1faf2551bb5e70c299e884adf58cd2af52739

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/494078

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 5.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 5.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 4.0.3

This issue was fixed in the openstack/kolla 4.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/ocata)

Reviewed: https://review.openstack.org/494078
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=0bb7b386cb11ac0feeab15ec04c94688153383d1
Submitter: Zuul
Branch: stable/ocata

commit 0bb7b386cb11ac0feeab15ec04c94688153383d1
Author: Jeffrey Zhang <email address hidden>
Date: Sat Jul 8 13:58:46 2017 +0800

    Clear all l3 related namespace before starting neutron-l3-agent

    Remove all l3 related namespaces in case of multiple active routers in
    l3 high available mode. The root cause is that keepalived does not
    remove the vip address from nic during starting.

    neutron-vpnaas-agent is subclass of l3 agent, so should remove all l3
    related namespace before starting vpnaas agent.

    Closes-Bug: #1703078
    Depends-On: Ic9417d2eb03e0dd93f7c668b189b4ad9c72eae0f
    Change-Id: I05c1faf2551bb5e70c299e884adf58cd2af52739
    (cherry picked from commit 58964d6825492636405497567e0de968ea81f222)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 4.0.4

This issue was fixed in the openstack/kolla-ansible 4.0.4 release.

Jesús Arias Gil (jesusarias95) wrote :

I have the same error with the Rocky release.

I have 3 controllers in HA. When router HA is disabled, everything works correctly. However, when it is enabled, the router state switches from standby to active every 5 seconds on all 3 controllers.
