L3 HA network management is racy

Bug #1548285 reported by Kevin Benton
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: John Schwarz

Bug Description

The logic surrounding the creation of the L3 HA network doesn't handle races where the network could be deleted after its existence is checked for. It also doesn't handle the case where the network doesn't exist but another creation happens before it gets to create the network.
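
The two windows described above can be illustrated with a small, single-threaded sketch that replays the interleavings by hand. Everything here is hypothetical: `FakeStore`, its methods, and the exception names are stand-ins invented for illustration and are not Neutron code.

```python
class NetworkNotFound(Exception):
    pass


class NetworkAlreadyExists(Exception):
    pass


class FakeStore:
    """Hypothetical in-memory stand-in for the per-tenant HA network table."""

    def __init__(self):
        self.networks = {}

    def get(self, tenant_id):
        return self.networks.get(tenant_id)

    def create(self, tenant_id):
        if tenant_id in self.networks:
            raise NetworkAlreadyExists(tenant_id)
        self.networks[tenant_id] = {"ports": []}
        return self.networks[tenant_id]

    def delete(self, tenant_id):
        self.networks.pop(tenant_id, None)

    def add_port(self, tenant_id, port):
        if tenant_id not in self.networks:
            raise NetworkNotFound(tenant_id)
        self.networks[tenant_id]["ports"].append(port)


def deleted_after_check(store):
    """Worker A sees the network, worker B deletes it, then A uses it."""
    store.create("tenant-1")
    assert store.get("tenant-1") is not None   # A: existence check passes
    store.delete("tenant-1")                   # B: deletes the network
    try:
        store.add_port("tenant-1", "ha-port")  # A: acts on a stale check
    except NetworkNotFound:
        return "NetworkNotFound"
    return "ok"


def created_before_create(store):
    """Worker A sees no network, worker B creates one, then A creates too."""
    assert store.get("tenant-2") is None       # A: nothing there yet
    store.create("tenant-2")                   # B: wins the creation
    try:
        store.create("tenant-2")               # A: duplicate creation
    except NetworkAlreadyExists:
        return "NetworkAlreadyExists"
    return "ok"
```

In both cases the failure comes from acting on the result of an earlier check without handling the case where that result is no longer true.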

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
tags: added: l3-ha
Changed in neutron:
status: New → In Progress
Kevin Benton (kevinbenton) wrote :
Changed in neutron:
importance: Undecided → Medium
LIU Yulong (dragon889) wrote :
Changed in neutron:
assignee: Kevin Benton (kevinbenton) → John Schwarz (jschwarz)
Changed in neutron:
assignee: John Schwarz (jschwarz) → Kevin Benton (kevinbenton)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/285572

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kevin Benton (<email address hidden>) on branch: master
Review: https://review.openstack.org/282876

Changed in neutron:
milestone: none → mitaka-rc1
Changed in neutron:
milestone: mitaka-rc1 → newton-1
Changed in neutron:
assignee: Kevin Benton (kevinbenton) → John Schwarz (jschwarz)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/292950

tags: added: mitaka-rc-potential
tags: removed: mitaka-rc-potential
Changed in neutron:
assignee: John Schwarz (jschwarz) → Kevin Benton (kevinbenton)
Changed in neutron:
assignee: Kevin Benton (kevinbenton) → John Schwarz (jschwarz)
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/282876
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7512d8aa26a945a695e889e0a97c6414cec6ac10
Submitter: Jenkins
Branch: master

commit 7512d8aa26a945a695e889e0a97c6414cec6ac10
Author: Kevin Benton <email address hidden>
Date: Sun Feb 21 21:31:59 2016 -0800

    Make L3 HA interface creation concurrency safe

    This patch creates a function to handle the creation of the
    L3HA interfaces for a router in a manner that handles the
    HA network not existing or an existing one being deleted
    by another worker before the interfaces could be created.

    Closes-Bug: #1548285
    Change-Id: Ibac0c366362aa76615e448fbe11d6d6b031732fe
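
A minimal sketch of the general approach the commit message describes (tolerate the network being missing, created by someone else, or deleted mid-operation) is a bounded retry loop. This is not the code from the patch: the function name, the callables, the exception classes, and the attempt count are all assumptions made for illustration.

```python
class NotFound(Exception):
    """Hypothetical: the network disappeared before it could be used."""


class Conflict(Exception):
    """Hypothetical: another worker created the network first."""


def attach_with_retries(get_network, create_network, attach, attempts=5):
    """Attach to a shared network, tolerating concurrent create/delete.

    get_network()    -> network or None
    create_network() -> network, may raise Conflict
    attach(network)  -> result, may raise NotFound
    """
    for _ in range(attempts):
        network = get_network()
        if network is None:
            try:
                network = create_network()
            except Conflict:
                # Another worker created it first; re-read on the next pass.
                continue
        try:
            return attach(network)
        except NotFound:
            # The network vanished between lookup and use; start over.
            continue
    raise RuntimeError("could not attach after %d attempts" % attempts)
```

Bounding the number of attempts keeps a persistently failing backend from turning the retry into an endless loop.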

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/301695

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/301695
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=93cdf8eb559cf78895200b3ec6775afe60b6c638
Submitter: Jenkins
Branch: stable/mitaka

commit 93cdf8eb559cf78895200b3ec6775afe60b6c638
Author: Kevin Benton <email address hidden>
Date: Sun Feb 21 21:31:59 2016 -0800

    Make L3 HA interface creation concurrency safe

    This patch creates a function to handle the creation of the
    L3HA interfaces for a router in a manner that handles the
    HA network not existing or an existing one being deleted
    by another worker before the interfaces could be created.

    Closes-Bug: #1548285
    Change-Id: Ibac0c366362aa76615e448fbe11d6d6b031732fe
    (cherry-picked from commit 7512d8aa26a945a695e889e0a97c6414cec6ac10)

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/305772

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 8.1.0

This issue was fixed in the openstack/neutron 8.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/314250

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/314250
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3bf73801df169de40d365e6240e045266392ca63
Submitter: Jenkins
Branch: master

commit a323769143001d67fd1b3b4ba294e59accd09e0e
Author: Ryan Moats <email address hidden>
Date: Tue Oct 20 15:51:37 2015 +0000

    Revert "Improve performance of ensure_namespace"

    This reverts commit 81823e86328e62850a89aef9f0b609bfc0a6dacd.

    Unneeded optimization: this commit only improves execution
    time on the order of milliseconds, which is less than 1% of
    the total router update execution time at the network node.

    This also

    Closes-bug: #1574881

    Change-Id: Icbcdf4725ba7d2e743bb6761c9799ae436bd953b

commit 7fcf0253246832300f13b0aa4cea397215700572
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 21 07:05:16 2016 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I9e930750dde85a9beb0b6f85eeea8a0962d3e020

commit 643b4431606421b09d05eb0ccde130adbf88df64
Author: OpenStack Proposal Bot <email address hidden>
Date: Tue Apr 19 06:52:48 2016 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I52d7460b3265b5460b9089e1cc58624640dc7230

commit 1ffea42ccdc14b7a6162c1895bd8f2aae48d5dae
Author: OpenStack Proposal Bot <email address hidden>
Date: Mon Apr 18 15:03:30 2016 +0000

    Updated from global requirements

    Change-Id: Icb27945b3f222af1d9ab2b62bf2169d82b6ae26c

commit b970ed5bdac60c0fa227f2fddaa9b842ba4f51a7
Author: Kevin Benton <email address hidden>
Date: Fri Apr 8 17:52:14 2016 -0700

    Clear DVR MAC on last agent deletion from host

    Once all agents are deleted from a host, the DVR MAC generated
    for that host should be deleted as well to prevent a buildup of
    pointless flows generated in the OVS agent for hosts that don't
    exist.

    Closes-Bug: #1568206
    Change-Id: I51e736aa0431980a595ecf810f148ca62d990d20
    (cherry picked from commit 92527c2de2afaf4862fddc101143e4d02858924d)

commit eee9e58ed258a48c69effef121f55fdaa5b68bd6
Author: Mike Bayer <email address hidden>
Date: Tue Feb 9 13:10:57 2016 -0500

    Add an option for WSGI pool size

    Neutron currently hardcodes the number of
    greenlets used to process requests in a process to 1000.
    As detailed in
    http://lists.openstack.org/pipermail/openstack-dev/2015-December/082717.html

    this can cause requests to wait within one process
    for available database connection while other processes
    remain available.

    By adding a wsgi_default_pool_size option functionally
    identical to that of Nova, we can lower the number of
    greenlets per process to be more in line with a typical
    max database connection pool size.

    DocImpact: a previously unused configuration value
               wsgi_default_pool_size is now used to a...

tags: added: neutron-proactive-backport-potential
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 9.0.0.0b1

This issue was fixed in the openstack/neutron 9.0.0.0b1 development milestone.

John Schwarz (jschwarz) wrote :

Regarding the discussion on whether to include this in stable/liberty: when an HA router is created while the last of a tenant's HA routers is being deleted, the HA network can be deleted underneath the new router, even though creating an HA router requires an HA network. As a result, a router might not be created properly, or might be created without the HA resources it requires (so the HA router cannot operate properly). The workaround is to manually re-create the HA router.

In some cases, the missing resources can send the L3 agents into an infinite loop, which additionally requires restarting the agent.

This race condition is easy to hit when running rally's create_and_delete_routers sample task, and on large deployments where HA routers are frequently created and removed.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/liberty)

Change abandoned by John Schwarz (<email address hidden>) on branch: stable/liberty
Review: https://review.openstack.org/305772

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/292950
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2a076272688487650192d75ff261f6fdf7835516
Submitter: Jenkins
Branch: master

commit 2a076272688487650192d75ff261f6fdf7835516
Author: Kevin Benton <email address hidden>
Date: Mon Mar 14 09:54:07 2016 -0700

    Make create_object_with_dependency cleanup

    This adjusts the create_object_with_dependency helper function
    to attempt to cleanup any dependency it was responsible for creating
    if it encounters a failure in trying to attach a child to the
    dependency.

    Change-Id: I363f3a299c55e5063b4239028728bb5593132010
    Related-Bug: #1548285
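
The cleanup behaviour described in this commit message can be sketched roughly as follows. This is a simplified illustration only: the function name `create_with_dependency`, its parameters, and its return value are assumptions and do not reproduce the signature or retry logic of Neutron's `create_object_with_dependency`.

```python
def create_with_dependency(creator, dep_getter, dep_creator, dep_deleter):
    """Create a child object bound to a dependency, cleaning up on failure.

    All four callables are hypothetical:
      dep_getter()      -> existing dependency or None
      dep_creator()     -> newly created dependency
      creator(dep)      -> child object attached to the dependency
      dep_deleter(dep)  -> removes a dependency
    """
    dependency = dep_getter()
    created_here = False
    if dependency is None:
        dependency = dep_creator()
        created_here = True
    try:
        return creator(dependency), dependency
    except Exception:
        # Only remove the dependency if this call was the one that made it;
        # a pre-existing dependency may be in use by other objects.
        if created_here:
            try:
                dep_deleter(dependency)
            except Exception:
                # Best-effort cleanup; the original failure matters more.
                pass
        raise
```

Tracking whether the dependency was created by the current call is what keeps the cleanup from deleting a network that other routers still rely on.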

tags: removed: neutron-proactive-backport-potential