L3 HA: 2 masters after reboot of controller
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | venkata anil |
Bug Description
ENV: Mitaka, 3 controllers, 45 computes, DVR + L3 HA (plain L3 HA is affected as well)
After rebooting the controller on which the l3 agent is active, another l3 agent becomes active. When the rebooted node recovers, its l3 agent becomes active as well, which leads to an additional loss of external connectivity in the tenant network. After some time only one agent remains active: the one on the rebooted node. Sometimes connectivity does not come back, because the snat port ends up on the wrong host.
The root cause of this problem is that routers are processed by the l3 agent before the openvswitch agent sets up the appropriate ha ports, so for some time the recovered ha routers are isolated from the ha routers on other hosts and become active.
A possible solution is to serialize these steps properly, so that the l3 agent processes ha routers only after the ha network is set up on the controller.
With 100 routers and networks, this issue has been reproduced on every reboot.
Actually this is an L3 HA problem; it is just amplified with DVR, as the number of ports that the openvswitch agent has to handle is higher.
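The serialization proposed above can be sketched roughly as follows. This is only an illustration of the idea, not the fix that was merged into neutron: `wait_until`, `process_ha_router`, `ha_port_is_active` and `start_keepalived` are hypothetical names, and the two callbacks stand in for whatever the agent would really use to query port state and to spawn keepalived.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


def process_ha_router(router_id, ha_port_is_active, start_keepalived,
                      timeout=60.0, interval=1.0, sleep=time.sleep):
    """Start keepalived for a router only once its HA port is wired.

    ha_port_is_active(router_id) -> bool and start_keepalived(router_id)
    are hypothetical callbacks supplied by the caller. If the port never
    becomes active within `timeout`, keepalived is not started, so the
    router cannot elect itself master while isolated from its peers.
    """
    ready = wait_until(lambda: ha_port_is_active(router_id),
                       timeout=timeout, interval=interval, sleep=sleep)
    if ready:
        start_keepalived(router_id)
    return ready
```

The point of the gate is that a recovered router instance only joins the election once it can actually hear advertisements from the instances on the other hosts.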
summary: | L3 HA + DVR: 2 masters after reboot of controller → L3 HA: 2 masters after reboot of controller |
description: | updated |
description: | updated |
Changed in neutron: | |
status: | New → Confirmed |
Changed in neutron: | |
importance: | Undecided → High |
Changed in neutron: | |
assignee: | Ann Taraday (akamyshnikova) → venkata anil (anil-venkata) |
Changed in neutron: | |
assignee: | venkata anil (anil-venkata) → Ann Taraday (akamyshnikova) |
Changed in neutron: | |
assignee: | Ann Taraday (akamyshnikova) → venkata anil (anil-venkata) |
Changed in neutron: | |
assignee: | venkata anil (anil-venkata) → Ann Taraday (akamyshnikova) |
Changed in neutron: | |
assignee: | Ann Taraday (akamyshnikova) → John Schwarz (jschwarz) |
Changed in neutron: | |
status: | Fix Released → Confirmed |
Changed in neutron: | |
assignee: | John Schwarz (jschwarz) → venkata anil (anil-venkata) |
We were hitting this problem too; we solved it by making sure that the l3 agent is started only after the l2 agent is running.
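A minimal sketch of that ordering workaround, assuming the agents run as systemd units. The unit names `neutron-openvswitch-agent` and `neutron-l3-agent` are assumptions that vary by distribution, and `start_l3_after_l2` is a hypothetical helper, not something shipped with neutron.

```python
import subprocess
import time

# Assumed unit names; adjust them to match the distribution in use.
L2_UNIT = "neutron-openvswitch-agent"
L3_UNIT = "neutron-l3-agent"


def unit_is_active(unit):
    """Return True if systemd reports the unit as active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit],
                            check=False)
    return result.returncode == 0


def start_unit(unit):
    """Ask systemd to start the unit; raises if systemctl fails."""
    subprocess.run(["systemctl", "start", unit], check=True)


def start_l3_after_l2(is_active=unit_is_active, start=start_unit,
                      retries=30, delay=2.0, sleep=time.sleep):
    """Start the l3 agent only after the l2 agent is seen running.

    Returns True if the l3 agent was started, False if the l2 agent
    did not become active within `retries` checks.
    """
    for _ in range(retries):
        if is_active(L2_UNIT):
            start(L3_UNIT)
            return True
        sleep(delay)
    return False
```

Note that a unit being active only shows that the l2 agent process is up, not that it has finished wiring every ha port, so this narrows the race rather than closing it. On systemd hosts the same ordering could probably also be expressed declaratively with an `After=` dependency on the l3 agent unit.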