During L->N upgrade, the floating ip connectivity is lost.

Bug #1698373 reported by Sofer Athlan-Guyot on 2017-06-16
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Sofer Athlan-Guyot

Bug Description

Hi,

After a discussion with Miguel Angel Ajo, it seems that the interruption of floating IP connectivity is not needed, i.e. downtime of the L3 agents shouldn't translate into floating IP unreachability. Full explanation here [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1419751#c13

Ben Nemec (bnemec) wrote :

Is this actually M->N? Direct L->N upgrades are not supported.

Changed in tripleo:
status: Confirmed → Triaged
assignee: nobody → Sofer Athlan-Guyot (sofer-athlan-guyot)
importance: Critical → High

For reference, another report of failure: https://bugzilla.redhat.com/show_bug.cgi?id=1434621

Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
milestone: pike-3 → pike-rc1
Ben Nemec (bnemec) wrote :
Changed in tripleo:
status: In Progress → Fix Released

Change abandoned by Athlan-Guyot sofer (<email address hidden>) on branch: stable/newton
Review: https://review.openstack.org/474967
Reason: Fixed with updated code in packaging. Not clear what fixed it.

Reviewed: https://review.openstack.org/474967
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=bce61783bc175e98b535c678d90829344dab5c47
Submitter: Jenkins
Branch: stable/newton

commit bce61783bc175e98b535c678d90829344dab5c47
Author: Sofer Athlan-Guyot <email address hidden>
Date: Fri Jun 16 14:53:14 2017 +0200

    Keep floating ip reachability during pacemaker migration.

    neutron-netns-cleanup-clone and neutron-ovs-cleanup-clone resources
    are only one shot resources that need to be activated during bootup
    and shutdown of node. Triggering a stop has the side effect of
    removing entries in the ovs db, making floating ips unreachable.

    Those are triggered by the
    /usr/lib/ocf/resource.d/neutron/{OVSCleanup,NetnsCleanup} pacemaker
    resources. They, in turn, use the
    /usr/lib/ocf/lib/neutron/neutron-{ovs,netns}-cleanup scripts (not the
    systemd unit file). We temporarily disable any action by configuring
    the executable to be "/bin/echo" instead of
    /usr/bin/neutron-{ovs,netns}-cleanup and by removing the "--force"
    option in neutron-netns-cleanup.

    As those resources are cloned resources we need to make sure that the
    modification is done on all controller nodes before we take action on
    the controller bootstrapping node. To do that we move most of Step1 to
    Step0 and make the bootstrap node action happen at Step1 of the
    pacemaker controller upgrade.

    Furthermore we make sure that the ext bridges, if ovs is used, are not
    in secure mode by setting them to standalone during the upgrade
    process and back to whatever they were before.

    Finally, we need to take care of the os-net-config upgrade. It can
    add new parameters to the interface definition which will force a
    restart of the interfaces. To avoid that we add the --no-activate
    option. Currently no major changes in os-net-config are required for
    the overcloud to continue running after the upgrade.

    Co-Authored-By: Raoul Scarazzini <email address hidden>
    Change-Id: Ib5d7b447808b51f6e436eaf6d661606132155a23
    Depends-On: Ieb5ad6ad429c8388a1cbbd650339b6eecd9b7997
    Closes-Bug: #1698373
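
To illustrate the "/bin/echo" substitution described in the commit message above, here is a minimal sketch in Python. It is not the code from the actual change: the helper name neutralize_cleanup_script, the strip_force parameter and the sample script text are all hypothetical, and the real cleanup scripts may be laid out differently. It only shows the idea of swapping the executable for /bin/echo and optionally dropping "--force".

```python
import re


def neutralize_cleanup_script(text: str, tool: str, strip_force: bool = False) -> str:
    """Return script text where /usr/bin/<tool> is replaced by /bin/echo.

    Hypothetical helper for illustration only; the real scripts may reference
    the executable in a different way.
    """
    real_binary = f"/usr/bin/{tool}"
    # Swap the executable so that running the script only prints its arguments.
    patched = text.replace(real_binary, "/bin/echo")
    if strip_force:
        # Drop a standalone --force flag together with the blanks before it.
        patched = re.sub(r"[ \t]+--force(?=\s|$)", "", patched)
    return patched


# Assumed sample content, not copied from the real neutron-netns-cleanup script.
sample = (
    'exec="/usr/bin/neutron-netns-cleanup"\n'
    "$exec --force --config-file /etc/neutron/neutron.conf\n"
)

if __name__ == "__main__":
    print(neutralize_cleanup_script(sample, "neutron-netns-cleanup", strip_force=True))
```

With the executable replaced, a pacemaker stop of the cleanup resources only echoes its arguments, so no entries are removed from the ovs db. The original text would have to be restored once the upgrade is finished.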

tags: added: in-stable-newton

This issue was fixed in the openstack/tripleo-heat-templates 5.3.3 release.
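
The commit message above also mentions setting the external bridges to standalone during the upgrade and back to whatever they were before. A minimal sketch of that save-and-restore pattern follows, assuming the ovs-vsctl get-fail-mode, set-fail-mode and del-fail-mode commands are available on the node. The function names and the bridge name "br-ex" are assumptions for illustration, not taken from the actual change.

```python
import subprocess
from typing import Callable, List, Optional

# A runner takes ovs-vsctl arguments and returns the command's stdout.
Runner = Callable[[List[str]], str]


def run_ovs_vsctl(args: List[str]) -> str:
    """Run ovs-vsctl with the given arguments and return stripped stdout."""
    result = subprocess.run(
        ["ovs-vsctl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def force_standalone(bridge: str, run: Runner = run_ovs_vsctl) -> Optional[str]:
    """Switch the bridge to standalone and return its previous fail mode.

    Returns None when no fail mode was configured on the bridge.
    """
    previous = run(["get-fail-mode", bridge]) or None
    run(["set-fail-mode", bridge, "standalone"])
    return previous


def restore_fail_mode(
    bridge: str, previous: Optional[str], run: Runner = run_ovs_vsctl
) -> None:
    """Put the bridge back into the fail mode it had before the upgrade."""
    if previous is None:
        # Nothing was configured before, so remove the setting again.
        run(["del-fail-mode", bridge])
    else:
        run(["set-fail-mode", bridge, previous])


if __name__ == "__main__":
    saved = force_standalone("br-ex")  # "br-ex" is an assumed bridge name
    try:
        pass  # the upgrade steps would run here
    finally:
        restore_fail_mode("br-ex", saved)
```

Passing the runner as a parameter keeps the logic testable without a running Open vSwitch, and the try/finally makes sure the previous mode is restored even if an upgrade step fails.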
