L3 HA state transitions raceful; Metadata proxy not always opened and closed correctly
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Fix Released | Medium | Assaf Muller | |
Bug Description
L3 HA configures keepalived to invoke a bash script asynchronously whenever it performs a state transition. This script writes the new state to disk and spawns or destroys the metadata proxy depending on the new state. Manual testing has revealed that when keepalived changes state twice in quick succession, for example to standby and then to master, the standby and master scripts are invoked in the correct order, but the code inside them may then execute out of order.
For example, keepalived changes state to standby and then immediately to master.
notify_standby is executed, followed by notify_master. However, notify_master writes 'master' to disk first, then notify_standby writes 'standby'. Spawning and destroying the metadata proxy may also be performed out of order.
Currently, the state written to disk is used only for debugging purposes, so the practical effect of the bug is that a new master may not have the metadata proxy up, and that routers going to standby may not shut down the proxy.
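One way to avoid this kind of reordering is to funnel every transition through a single worker, so that each one is applied in the order it was received. The following is only a minimal sketch of that idea, not necessarily how the fix was implemented; `TransitionSerializer` and the `apply_state` callback are hypothetical names used for illustration.

```python
import logging
import queue
import threading


class TransitionSerializer:
    """Apply state transitions one at a time, in the order received."""

    def __init__(self, apply_state):
        # apply_state is a hypothetical callback that would write the state
        # to disk and spawn or destroy the metadata proxy.
        self._apply_state = apply_state
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def notify(self, state):
        """Record a transition; returns immediately, like an async notifier."""
        self._pending.put(state)

    def wait(self):
        """Block until every queued transition has been processed."""
        self._pending.join()

    def _run(self):
        while True:
            state = self._pending.get()
            try:
                self._apply_state(state)
            except Exception:
                # Keep the worker alive so later transitions still apply.
                logging.exception("failed to apply state %r", state)
            finally:
                self._pending.task_done()


applied = []
serializer = TransitionSerializer(applied.append)
serializer.notify("standby")
serializer.notify("master")
serializer.wait()
print(applied)
```

Because a single thread drains the queue, the last transition received is always the last one applied, regardless of how long each handler takes.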
Note that the bash notifier scripts will be replaced by Python scripts during the Kilo cycle. The Python script will write the new state to disk, then inform the agent of a state transition via a Unix domain socket. The agent will then manage the metadata proxy and perform additional actions. Since bp/report-
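The flow described for the Python notifier (write the state to disk, then tell the agent over a Unix domain socket) might look roughly like the sketch below. The function names, the file and socket paths passed in, and the `"<router_id> <state>"` message format are all assumptions made for illustration; they are not taken from the Neutron code.

```python
import os
import socket
import tempfile


def write_state(state_path, state):
    """Atomically replace the state file so readers never see a partial write."""
    directory = os.path.dirname(state_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(state)
    os.replace(tmp_path, state_path)


def notify_agent(socket_path, router_id, state):
    """Send one '<router_id> <state>' line over a Unix domain socket.

    The wire format here is hypothetical.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(f"{router_id} {state}\n".encode())


def on_transition(state_path, socket_path, router_id, state):
    """Write the new state first, then inform the agent of the transition."""
    write_state(state_path, state)
    notify_agent(socket_path, router_id, state)
```

In this arrangement the notifier itself does nothing but record and forward the state, which leaves decisions such as starting or stopping the metadata proxy to a single long-lived process.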
How to reproduce:
Three L3 agents, max_l3_
tags: | added: juno-backport-potential |
Changed in neutron: | |
importance: | Undecided → Medium |
Changed in neutron: | |
milestone: | none → kilo-3 |
status: | Fix Committed → Fix Released |
Changed in neutron: | |
milestone: | kilo-3 → 2015.1.0 |
Regarding the severity of the bug: apart from what's outlined above, the bug becomes very visible with bp/report-ha-router-master, because at that point the router states are reported to the controller and exposed via the API. Once the bug reproduces, I see two masters and the like.