neutron router shows active on a dead agent

Bug #1682145 reported by on 2017-04-12
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
neutron   Fix Released   Medium       Brian Haley

Bug Description

When a router is active on only one network node and that node goes down for any reason, the router still shows active status on the controller.

neutron l3-agent-list-hosting-router e5bae5bd-40ae-45b2-837d-9d00a74a1e1b
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
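+--------------------------------------+---------------+----------------+-------+----------+
| id                                   | host          | admin_state_up | alive | ha_state |
+--------------------------------------+---------------+----------------+-------+----------+
| db94053c-f7a7-4bf8-a7ed-aa01f7c8ef34 | netowkr-node1 | True           | xxx   | active   |
+--------------------------------------+---------------+----------------+-------+----------+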

The logic that checks whether to update the status based on dead agents needs to be executed in this scenario too.

Trevor McCasland (twm2016) wrote :

Versioned github link from bug description:

Adding a case for the scenario described in the bug description will fix the issue, but is it HA if you only have one node? "an HA cluster is a two-node cluster, since that is the minimum required to provide redundancy"

I think the bug can be reworded to exit HA mode if only one network node is detected, unless you have a use case for this?

Changed in neutron:
status: New → Opinion
tags: added: l3-ha

(ymadhavi) wrote :

The following is the scenario/use case we are trying:

1. networknode1 and networknode2 exist in the environment.
2. An HA router is created; it is active on networknode1 and standby on networknode2.
3. networknode1 goes down for some reason; the router is now active on networknode2 and standby on networknode1.
4. networknode2 also goes down for some reason; its L3 agent is dead.

But the router still shows active on networknode2.
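
The inconsistency in this scenario can be illustrated with a small, self-contained Python sketch. The `Binding` class and the `stale_active_bindings` helper are hypothetical names made up for illustration, not Neutron code; they only model the `alive` and `ha_state` columns shown by `l3-agent-list-hosting-router`.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    """Hypothetical model of one row of the hosting-agent listing."""
    host: str
    alive: bool
    ha_state: str  # "active" or "standby"


def stale_active_bindings(bindings):
    """Return bindings that still report 'active' although their agent is dead."""
    return [b for b in bindings if b.ha_state == "active" and not b.alive]


# State after step 4: both network nodes are down, yet networknode2
# is still reported as hosting the active router instance.
after_step_4 = [
    Binding(host="networknode1", alive=False, ha_state="standby"),
    Binding(host="networknode2", alive=False, ha_state="active"),
]

stale_hosts = [b.host for b in stale_active_bindings(after_step_4)]
print(stale_hosts)
```

Any host returned by the helper is one where the reported `ha_state` cannot be trusted, which is the situation described in this report.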

Changed in neutron:
status: Opinion → New
Ann Taraday (akamyshnikova) wrote :

The original bug about this is
Actually, at first we were setting standby for all dead agents, but this caused another bug, so I had to change the logic here.

Fix proposed to branch: master

Changed in neutron:
assignee: nobody → Drew Thorstensen (thorst)
status: New → In Progress
Ann Taraday (akamyshnikova) wrote :

In my opinion, this bug should be marked as a known issue with a proper description in the docs.

Drew Thorstensen (thorst) wrote :

Ann - why do you say that? It does not seem to be functionally correct from my perspective.

If an L3 agent is down, the router is still active. It has failed over. The router is not in standby, it just failed over.

If we had a more granular state of 'degraded', I could see that being useful. But that seems more pervasive.

Ann Taraday (akamyshnikova) wrote :

Pay attention to the links that I put in my first comment; I don't want us to go through the same cycle.

Changed in neutron:
assignee: Drew Thorstensen (thorst) → (ymadhavi)
Changed in neutron:
assignee: (ymadhavi) → Matthew Edmonds (edmondsw)
Changed in neutron:
assignee: Matthew Edmonds (edmondsw) → Brian Haley (brian-haley)
Changed in neutron:
assignee: Brian Haley (brian-haley) → Ann Taraday (akamyshnikova)
zhaobo (zhaobo6) on 2018-03-19
Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
assignee: Ann Taraday (akamyshnikova) → Brian Haley (brian-haley)

Submitter: Zuul
Branch: master

commit b62d1bfdf71c2f8810d9b143d50127b8f3a4942d
Author: Drew Thorstensen <email address hidden>
Date: Fri Apr 21 08:02:17 2017 -0400

    Router should flip to standby if all L3 nodes down

    A HA router should always be active unless all of the agents hosting
    that router go down. In that event, the router should switch to
    standby. This behavior changed with review:

    That review seemed to be accounting for a flakey message bus. This
    change should account for that, but also revert to the original behavior
    of the router state only changing when its backing agent hosts are down.

    Change-Id: I89c3b2546382624f175f8de4de621c3e53adf527
    Closes-Bug: 1682145
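
One possible reading of the rule stated in the commit message can be sketched as follows. This is a minimal illustration under assumptions, not the code from the change itself: `AgentBinding` and `reported_ha_states` are hypothetical names, and the sketch only covers the "all hosting agents are down" case described above.

```python
from dataclasses import dataclass


@dataclass
class AgentBinding:
    """Hypothetical stand-in for a router-to-L3-agent binding."""
    host: str
    alive: bool
    ha_state: str  # "active" or "standby"


def reported_ha_states(bindings):
    """Return a {host: ha_state} mapping to report for one HA router.

    Assumed rule: states are left untouched while at least one hosting
    agent is alive; only when every hosting agent is dead is the router
    reported as standby everywhere.
    """
    if any(b.alive for b in bindings):
        return {b.host: b.ha_state for b in bindings}
    return {b.host: "standby" for b in bindings}


# Both hosting agents are dead: nothing should be reported as active.
all_down = [
    AgentBinding("networknode1", alive=False, ha_state="standby"),
    AgentBinding("networknode2", alive=False, ha_state="active"),
]

# One agent is still alive and active: states are reported unchanged.
one_up = [
    AgentBinding("networknode1", alive=False, ha_state="standby"),
    AgentBinding("networknode2", alive=True, ha_state="active"),
]

print(reported_ha_states(all_down))
print(reported_ha_states(one_up))
```

Leaving the states alone while any agent is alive is meant to mirror the commit's concern about a flaky message bus: a single missed heartbeat should not flip a router that has in fact failed over correctly.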

Changed in neutron:
status: In Progress → Fix Released

This issue was fixed in the openstack/neutron development milestone.
