L3 HA: check scripts are written after keepalived is (re)started

Bug #1674780 reported by Ihar Hrachyshka
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Low
Assigned to: Ihar Hrachyshka
Milestone: (none)

Bug Description

Code inspection showed that the L3 HA implementation writes the config file for keepalived, then (re)starts the daemon, and only then attempts to write the check scripts. This is a race condition: it would show up if the agent took longer than expected to write the check scripts, or if it failed to write them at all because of some other bug. In either case, the check would fail and the router could fall back to backup state.

We should first prepare all files, then (re)start keepalived.
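
A minimal sketch of the proposed ordering, not the actual neutron code: every name here (`prepare_files`, `spawn_or_reload`, `start_daemon`, and the two file names) is hypothetical and chosen for illustration. The point is only that all files are written before the daemon is touched, so a failure while writing the check script is raised before keepalived is (re)started.

```python
import os


def prepare_files(conf_dir, config_text, check_script_text):
    """Write the keepalived config and the check script; return both paths.

    File names are hypothetical, not the ones neutron uses.
    """
    config_path = os.path.join(conf_dir, "keepalived.conf")
    script_path = os.path.join(conf_dir, "ha_check_script.sh")
    with open(config_path, "w") as f:
        f.write(config_text)
    with open(script_path, "w") as f:
        f.write(check_script_text)
    # The check script has to be executable for the daemon to run it.
    os.chmod(script_path, 0o755)
    return config_path, script_path


def spawn_or_reload(conf_dir, config_text, check_script_text, start_daemon):
    """Prepare all files first, then (re)start the daemon.

    `start_daemon` is a caller-supplied callable (an assumption of this
    sketch) that takes the config path and launches or reloads keepalived.
    If writing any file raises, `start_daemon` is never called.
    """
    config_path, script_path = prepare_files(
        conf_dir, config_text, check_script_text)
    start_daemon(config_path)
    return config_path, script_path
```

Passing the daemon launcher in as a callable keeps the sketch testable without keepalived installed: a stub can assert that the check script already exists at the moment the daemon would start.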

Tags: l3-ha
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Low importance because this does not look like an actual production issue: the check script is executed twice, at a 2-second interval, before the result is considered a failure, and writing the script should not take long.

Changed in neutron:
importance: Undecided → Low
status: New → In Progress
Changed in neutron:
assignee: nobody → Ihar Hrachyshka (ihar-hrachyshka)
Ihar Hrachyshka (ihar-hrachyshka) wrote :
tags: added: l3-ha
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/447679
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9ee58550e0db2b96eef982621a6a3715a285fb6c
Submitter: Jenkins
Branch: master

commit 9ee58550e0db2b96eef982621a6a3715a285fb6c
Author: Ihar Hrachyshka <email address hidden>
Date: Mon Mar 20 20:36:29 2017 +0000

    Write vrrp_script before (re)starting keepalived

    Otherwise if keepalived decides to trigger the script before we write it
    out, or when we fail to generate the script, then the router may become
    broken.

    Closes-Bug: 1674780
    Change-Id: I2cea3159fd84c40506254fbd688cb1d745c9bf1c

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b1

This issue was fixed in the openstack/neutron 11.0.0.0b1 development milestone.
