create_security_group code may get into endless loop

Bug #1475938 reported by Eugene Nikanorov
This bug affects 3 people
Affects   Status         Importance   Assigned to     Milestone
neutron   Fix Released   High         Oleg Bondarev
Kilo      Fix Released   Undecided    Unassigned

Bug Description

That damn piece of code again.

In some cases, when a network is created for a tenant and the default security group is created in the process, a concurrent network or sg creation may be happening at the same time.
That leads to the following condition: the code fetches the default sg and it's not there; it tries to add it, but it's already there (inserted by the concurrent request); it then tries to fetch it again, but because of the REPEATABLE READ isolation level the query returns an empty result, as in the first attempt.
As a result, this logic stays in the loop forever.
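
A toy illustration of the condition described above, not the Neutron code: the classes and function names here (ToyDB, ToyTransaction, ensure_default_sg_in_one_txn) are hypothetical, and the "database" is a Python set. It assumes only that a REPEATABLE READ transaction keeps reading from the snapshot it started with, while unique constraints are checked against committed data. The loop is bounded so that the example terminates.

```python
class DuplicateEntry(Exception):
    """Stand-in for a unique-constraint violation."""


class ToyDB:
    """Hypothetical in-memory 'database': a set of committed keys."""

    def __init__(self):
        self.committed = set()

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Models REPEATABLE READ: reads only ever see the initial snapshot."""

    def __init__(self, db):
        self.db = db
        self.snapshot = frozenset(db.committed)

    def fetch(self, key):
        return key if key in self.snapshot else None

    def insert(self, key):
        # The uniqueness check sees committed data, not the snapshot.
        # Simplification: a successful insert is committed immediately.
        if key in self.db.committed:
            raise DuplicateEntry(key)
        self.db.committed.add(key)


def ensure_default_sg_in_one_txn(txn, tenant, max_attempts=5):
    """Fetch-or-create that retries inside the same transaction.

    Returns the attempt number on success, or None after max_attempts.
    With 'while True' instead of a bounded loop this would never return.
    """
    for attempt in range(1, max_attempts + 1):
        if txn.fetch(tenant) is not None:
            return attempt
        try:
            txn.insert(tenant)
            return attempt
        except DuplicateEntry:
            continue  # row exists, but our snapshot still cannot see it
    return None


db = ToyDB()
txn_a = db.begin()                 # request A: snapshot has no default sg
db.begin().insert("tenant-1")      # concurrent request B creates it
stuck = ensure_default_sg_in_one_txn(txn_a, "tenant-1")
fresh = db.begin().fetch("tenant-1")  # a new transaction does see the row
```

Here `stuck` is None because every fetch inside `txn_a` reads the stale snapshot and every insert hits the duplicate, while `fresh` finds the row because it reads from a new transaction.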

Reproducible with the rally create_and_delete_ports test.

Changed in fuel:
assignee: nobody → Eugene Nikanorov (enikanorov)
tags: added: sg-fw
affects: fuel → neutron
Changed in neutron:
importance: Undecided → High
description: updated
description: updated
description: updated
Changed in neutron:
status: New → In Progress
Changed in neutron:
assignee: Eugene Nikanorov (enikanorov) → Oleg Bondarev (obondarev)
Sandhya Dasu (sadasu) wrote :

If the fix is in progress, could you please add a link to the code review?

Changed in neutron:
assignee: Oleg Bondarev (obondarev) → Henry Gessau (gessau)
Changed in neutron:
assignee: Henry Gessau (gessau) → Oleg Bondarev (obondarev)
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/203384
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=80ee562dec3f397ea6c18a4ca3a1e69ab996341e
Submitter: Jenkins
Branch: master

commit 80ee562dec3f397ea6c18a4ca3a1e69ab996341e
Author: Eugene Nikanorov <email address hidden>
Date: Sun Jul 19 03:17:43 2015 +0400

    Fix _ensure_default_security_group logic

    In a case when first attempt to fetch default security group
    fails and attempt to add it fails too due to a concurrent insertion,
    later attempt to fetch the same default sg may fail due to
    REPEATABLE READ transaction isolation level.
    For this case RetryRequest should be issued to restart the
    whole transaction and be able to see default group.

    The patch also removes 'while True' logic as it's unsafe

    Closes-Bug: #1475938
    Change-Id: I20f65d3eae9421429aced1f4586cb6988ab577ff
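
The approach in the commit message above can be sketched roughly as follows. This is a simplified model, not the merged patch: RetryRequest and DuplicateEntry are local stand-in exceptions, and ToyDB, ToyTxn, ensure_default_sg and run_with_retries are hypothetical names. The `race` callback simulates a concurrent request committing the default sg right after the first transaction has taken its snapshot.

```python
class DuplicateEntry(Exception):
    """Stand-in for a unique-constraint violation."""


class RetryRequest(Exception):
    """Local stand-in: ask the caller to restart the whole transaction."""


class ToyDB:
    def __init__(self):
        self.rows = set()


class ToyTxn:
    """Reads come from the snapshot taken when the transaction starts."""

    def __init__(self, db):
        self.db = db
        self.snapshot = frozenset(db.rows)

    def fetch(self, key):
        return key in self.snapshot

    def insert(self, key):
        if key in self.db.rows:
            raise DuplicateEntry(key)
        self.db.rows.add(key)


def ensure_default_sg(txn, tenant):
    """Single pass, no inner loop: fetch, else create, else ask for a retry."""
    if txn.fetch(tenant):
        return "found"
    try:
        txn.insert(tenant)
        return "created"
    except DuplicateEntry as exc:
        # The row exists but is invisible to this snapshot, so fetching
        # again here cannot help. Restart the transaction instead.
        raise RetryRequest(exc)


def run_with_retries(db, tenant, max_retries=3, race=None):
    """Outer retry: every attempt begins a fresh transaction (new snapshot)."""
    for attempt in range(1, max_retries + 1):
        txn = ToyTxn(db)
        if attempt == 1 and race is not None:
            race()  # concurrent writer commits after our snapshot was taken
        try:
            return attempt, ensure_default_sg(txn, tenant)
        except RetryRequest:
            continue
    raise RuntimeError("gave up after %d attempts" % max_retries)


db = ToyDB()
result = run_with_retries(db, "tenant-1",
                          race=lambda: db.rows.add("tenant-1"))
```

In this model the first attempt raises RetryRequest and the second attempt, running on a fresh snapshot, finds the group, so `result` is `(2, "found")`. The retry budget is bounded, which is the property the 'while True' loop lacked.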

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/218710

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/218710
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2c5f44e1b3bd4ed8a0b7232fd293b576cc8c1c87
Submitter: Jenkins
Branch: feature/pecan

commit f35d1c5c50dccbef1a2e079f967b82f0df0e22e9
Author: Adelina Tuvenie <email address hidden>
Date: Thu Aug 27 02:27:28 2015 -0700

    Fixes wrong neutron Hyper-V Agent name in constants

    Change Id03fb147e11541be309c1cd22ce27e70fadc28b5 moved the
    AGENT_TYPE_HYPERV constant from common.constants to
    plugins.ml2.drivers.hyperv.constants but changed the value of the
    constant from 'HyperV agent' to 'hyperv'. This patch changes
    the name back to 'HyperV agent'

    Change-Id: If74b4b2a84811e266c8b12e70bf6bfe74ed4ea21
    Partial-Bug: #1487598

commit de604de334854e2eb6b4312ff57920564cbd4459
Author: OpenStack Proposal Bot <email address hidden>
Date: Sun Aug 30 01:39:06 2015 +0000

    Updated from global requirements

    Change-Id: Ie52aa3b59784722806726e4046bd07f4a4d97328

commit f0415ac20eaf5ab4abb9bd4839bf6d04ceee85d0
Author: armando-migliaccio <email address hidden>
Date: Fri Aug 28 13:53:04 2015 -0700

    Revert "Add support for unaddressed port"

    This implementation may expose a vulnerability where a malicious
    user can seize the opportunity of a time window where a port
    may land unaddressed on a shared network, thus allowing him/her
    to suck up all the tenant traffic he/she wants....oh the shivers.

    This reverts commit d4c52b7f5a36a103a92bf9dcda7f371959112292.

    Change-Id: I7ebdaa8d3defa80eab90e460fde541a5bdd8864c

commit 013fdcd2a6d45dbe4de5d6e7077e5e9b60985ef9
Author: Assaf Muller <email address hidden>
Date: Fri Aug 28 16:41:07 2015 -0400

    Improve logging upon failure in iptables functional tests

    This will help us nail down a more accurate and efficient logstash
    query.

    Change-Id: Iee4238e358f7b056e373c7be8d6aa3202117a680
    Related-Bug: #1478847

commit 622dea818d851224a43d5276a81d5ce8a6eebb76
Author: Ivar Lazzaro <email address hidden>
Date: Mon Aug 17 17:17:42 2015 -0700

    handle gw_info outside of the db transaction on router creation

    Move the gateway interface creation outside the DB transaction
    to avoid lock timeout.

    Change-Id: I5a78d7f32e8ca912016978105221d5f34618af19
    Closes-bug: 1485809

commit 5b27d290a0a95f6247fc5a0fe6da1e7d905e6b2d
Author: Assaf Muller <email address hidden>
Date: Wed Aug 26 10:07:03 2015 -0400

    Remove ml2 resource extension success logging

    This is the cause of a tremendous amount of logs, for no
    perceivable gain. A normal dvr run in the gate shows this debug
    message around 120K times, which is way too much.

    Closes-Bug: #1489952

    Change-Id: I26fca8515d866a7cc1638d07fa33bc04479ae221

commit 8d3faf549cba2f58c872ef4121b2481e73464010
Author: huangpengtao <email address hidden>
Date: Fri Aug 28 23:20:46 2015 +0800

    Replace "prt" variable by "port"

    The local variable name prt is meaningless,
    and port is the commonly used name.

    Change-Id: I20849102cf5b4d84433c46791b4b1e2a22dc4739

commit ee374e7a5f4dea538fcd942f5...

tags: added: in-feature-pecan
Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-3
status: Fix Committed → Fix Released
tags: added: kilo-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/229153

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/229153
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=96276d5f9b092a226aec6657796c4a1c5d25e6af
Submitter: Jenkins
Branch: stable/kilo

commit 96276d5f9b092a226aec6657796c4a1c5d25e6af
Author: Eugene Nikanorov <email address hidden>
Date: Sun Jul 19 03:17:43 2015 +0400

    Fix _ensure_default_security_group logic

    In a case when first attempt to fetch default security group
    fails and attempt to add it fails too due to a concurrent insertion,
    later attempt to fetch the same default sg may fail due to
    REPEATABLE READ transaction isolation level.
    For this case RetryRequest should be issued to restart the
    whole transaction and be able to see default group.

    The patch also removes 'while True' logic as it's unsafe

    Closes-Bug: #1475938
    Change-Id: I20f65d3eae9421429aced1f4586cb6988ab577ff
    (cherry-picked from commit 80ee562dec3f397ea6c18a4ca3a1e69ab996341e)

tags: added: in-stable-kilo
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-3 → 7.0.0
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/231081
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=aaaa07277b4493e40675a8d29dd8b40681ef8610
Submitter: Jenkins
Branch: master

commit aaaa07277b4493e40675a8d29dd8b40681ef8610
Author: Kevin Benton <email address hidden>
Date: Mon Oct 5 09:27:46 2015 -0700

    eliminate retries inside of _ensure_default_security_group

    Since we have to worry about REPEATABLE READ and have logic to
    deal with that by throwing a RetryRequest, let's just simplify
    the code and eliminate the other retry mechanism completely.

    Related-bug: #1475938
    Change-Id: I0aab460f60e690a369f09d59a75bb4ca5a7c33f6
