Failure to allocate tunnel id when creating networks concurrently
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | | High | Eugene Nikanorov |
Juno | | Undecided | Unassigned |
Kilo | | Undecided | Unassigned |
Bug Description
When multiple networks are created concurrently, the following trace is observed:
WARNING neutron.
DEBUG neutron.context [req-2995f877-
DEBUG neutron.
DEBUG neutron.context [req-6dcfb91d-
ERROR neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
TRACE neutron.
Additional conditions: a multi-server deployment with a MySQL backend.
Changed in neutron: | |
assignee: | nobody → Eugene Nikanorov (enikanorov) |
importance: | Undecided → High |
tags: | added: ml2 |
Changed in neutron: | |
status: | New → In Progress |
tags: | added: juno-backport-potential |
Jay Pipes (jaypipes) wrote : | #2 |
Is this using MySQL Galera as the backend database server? And if so, is the Galera setup using only a single writer node?
Eugene Nikanorov (enikanorov) wrote : | #3 |
Yes, it's with galera, single writer node.
But that doesn't matter. The issue would be the same with a single MySQL backend.
Oleg Bondarev (obondarev) wrote : | #4 |
Seems following patch reveals the problem: https:/
OpenStack Infra (hudson-openstack) wrote : | #5 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
assignee: | Eugene Nikanorov (enikanorov) → Ed Bak (ed-bak2) |
Changed in neutron: | |
assignee: | Ed Bak (ed-bak2) → Eugene Nikanorov (enikanorov) |
Changed in neutron: | |
assignee: | Eugene Nikanorov (enikanorov) → Russell Bryant (russellb) |
Changed in neutron: | |
assignee: | Russell Bryant (russellb) → Eugene Nikanorov (enikanorov) |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 6617f8fccc8d995
Author: Eugene Nikanorov <email address hidden>
Date: Mon Nov 17 11:00:49 2014 +0400
Change transaction isolation so retry logic could work properly
Lower isolation level from REPEATABLE READ to READ COMMITTED for
transaction that is used to create a network.
This allows retry logic to see changes done in other connections
while doing the same query.
Perform that only for mysql db backend.
Change-Id: I6b9d9212c37fe0
Closes-Bug: #1382064
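The idea in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual neutron change: `set_read_committed`, `backend_name`, and `RecordingCursor` are hypothetical names introduced here. The only real piece is the MySQL statement itself, which, when issued without `GLOBAL` or `SESSION`, affects only the next transaction on the connection.

```python
def set_read_committed(cursor, backend_name):
    """Lower the isolation level for the next transaction, MySQL only.

    Returns True if the statement was issued, False if the backend was skipped.
    """
    if backend_name != "mysql":
        # Other backends keep their default isolation level.
        return False
    # Without GLOBAL or SESSION, MySQL applies this to the next transaction only.
    cursor.execute("SET TRANSACTION ISOLATION LEVEL READ COMMITTED")
    return True


class RecordingCursor:
    """Hypothetical stand-in for a DB-API cursor; it only records statements."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
```

With a real driver, the statement would have to be issued on the same connection immediately before the transaction that creates the network.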
Changed in neutron: | |
status: | In Progress → Fix Committed |
Fix proposed to branch: stable/juno
Review: https:/
Change abandoned by Ryan Tidwell (<email address hidden>) on branch: stable/juno
Review: https:/
Reason: Cherry-picked change reverted on master
Fix proposed to branch: master
Review: https:/
YAMAMOTO Takashi (yamamoto) wrote : | #10 |
copy-and-paste from https:/
enikanorov
12-16 19:51
Patch Set 8:
Ok, I'll add this to the bug.
So my current understanding is the following:
When REPEATABLE READ is used, each distinct query in the transaction creates a snapshot on the DB backend side, which is used while iterating over the query results or when issuing the same query again in that transaction. When READ COMMITTED is used, each fetch reaches the table directly, and that, IMO, increases contention and leads to much more frequent deadlocks. The previous version of the patch (which set the isolation level globally for each connection) demonstrated this in the gates. But that's only my guess at the technical reasons; I could be wrong here.
Context manager is nice suggestion, thanks.
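The difference between the two isolation levels discussed in this comment can be illustrated with a toy in-memory model. It is only a sketch: `ToyDatabase` and `ToyTransaction` are hypothetical classes, and the model assumes the InnoDB-style behaviour where the first read in a REPEATABLE READ transaction fixes the snapshot for the rest of that transaction. No real locking or deadlock behaviour is modelled.

```python
class ToyDatabase:
    """Holds the committed state shared by all transactions."""

    def __init__(self, rows):
        self.committed = dict(rows)


class ToyTransaction:
    """Reads rows according to a simplified isolation-level model."""

    def __init__(self, db, isolation):
        self.db = db
        self.isolation = isolation
        self._snapshot = None

    def read(self, key):
        if self.isolation == "READ COMMITTED":
            # Every read sees the latest committed data.
            return self.db.committed[key]
        # REPEATABLE READ: the first read fixes a snapshot that all
        # later reads in this transaction keep using.
        if self._snapshot is None:
            self._snapshot = dict(self.db.committed)
        return self._snapshot[key]


db = ToyDatabase({"tunnel-5": "free"})
repeatable = ToyTransaction(db, "REPEATABLE READ")
committed = ToyTransaction(db, "READ COMMITTED")
repeatable.read("tunnel-5")
committed.read("tunnel-5")

# Another server commits an allocation in the meantime.
db.committed["tunnel-5"] = "allocated"

stale = repeatable.read("tunnel-5")   # still reads the old snapshot
fresh = committed.read("tunnel-5")    # sees the concurrent commit
```

In this model a retry loop running inside the REPEATABLE READ transaction keeps re-reading the same stale row, which is the situation the retry logic in this bug runs into.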
Fix proposed to branch: master
Review: https:/
Change abandoned by enikanorov (<email address hidden>) on branch: master
Review: https:/
Changed in neutron: | |
milestone: | none → kilo-2 |
status: | Fix Committed → Fix Released |
Changed in neutron: | |
status: | Fix Released → In Progress |
milestone: | kilo-2 → kilo-3 |
Change abandoned by Ed Bak (<email address hidden>) on branch: master
Review: https:/
Reason: Not needed any longer. https:/
Changed in neutron: | |
assignee: | Eugene Nikanorov (enikanorov) → Assaf Muller (amuller) |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 5dbb34b56fc42d9
Author: Eugene Nikanorov <email address hidden>
Date: Thu Jan 22 15:54:29 2015 +0300
Refactor retry mechanism used in some DB operations
Use oslo_db helper that will allow to restart the whole
transaction in case it needs a certain operation to be repeated.
This is a workaround for the REPEATABLE READ problem where
retrying logic will not work because queries inside a transaction
will not see updates made by other transactions.
So, run every attempt in a separate transaction.
Change-Id: I68f9ae80198797
Closes-Bug: #1382064
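The approach described in this commit message, running every attempt in its own transaction, can be sketched with the standard library alone. This is not the oslo_db helper and not the neutron code: the decorator `retry_in_new_transaction`, the `RetryRequest` exception, and the `allocations` table are all hypothetical names chosen for illustration, and SQLite stands in for MySQL.

```python
import functools
import sqlite3


class RetryRequest(Exception):
    """Raised by an attempt that lost a race and wants a fresh transaction."""


def retry_in_new_transaction(max_retries):
    """Run each attempt of the wrapped function in its own transaction."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(conn, *args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    # The connection context manager commits on success
                    # and rolls back if the attempt raises.
                    with conn:
                        return func(conn, *args, **kwargs)
                except RetryRequest:
                    if attempt == max_retries:
                        raise
        return wrapper
    return decorator


@retry_in_new_transaction(max_retries=10)
def allocate_tunnel_id(conn):
    row = conn.execute(
        "SELECT tunnel_id FROM allocations WHERE allocated = 0 "
        "ORDER BY tunnel_id LIMIT 1").fetchone()
    if row is None:
        raise LookupError("no free tunnel ids")
    # Guarded update: it changes the row only if it is still free.
    updated = conn.execute(
        "UPDATE allocations SET allocated = 1 "
        "WHERE tunnel_id = ? AND allocated = 0", (row[0],)).rowcount
    if updated == 0:
        # Another writer took the id first; retry in a new transaction.
        raise RetryRequest()
    return row[0]
```

Because the `SELECT` is reissued in a new transaction on every attempt, each retry reads fresh data rather than the snapshot of the failed attempt.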
Changed in neutron: | |
status: | In Progress → Fix Committed |
Changed in neutron: | |
assignee: | Assaf Muller (amuller) → Eugene Nikanorov (enikanorov) |
Changed in neutron: | |
status: | Fix Committed → Fix Released |
Changed in neutron: | |
milestone: | kilo-3 → 2015.1.0 |
Related fix proposed to branch: master
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 1d9fd2aec00cb85
Author: Eugene Nikanorov <email address hidden>
Date: Mon May 11 01:34:35 2015 +0400
Randomize tunnel id query to avoid contention
When networks are created rapidly, neutron-servers compete
for segmentation ids which creates too much contention and
may lead to inability to choose available id in hardcoded amount
of attempts (11)
Randomize tunnel id selection so that condition is not hit.
Change-Id: I7068f90fe4927e
Related-Bug: #1382064
Closes-Bug: #1454434
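The effect of randomizing the candidate id can be sketched with a small simulation. It is an illustration under simplifying assumptions, not the neutron query: all workers are assumed to read the same set of free ids at the same moment, and `pick_first_free`, `pick_random_free`, and `count_collisions` are hypothetical helpers.

```python
import random


def pick_first_free(free_ids):
    """Deterministic strategy: every worker targets the lowest free id."""
    return min(free_ids)


def pick_random_free(free_ids, rng):
    """Randomized strategy: each worker targets a random free id."""
    return rng.choice(sorted(free_ids))


def count_collisions(picks):
    """Number of workers whose pick duplicated another worker's pick."""
    return len(picks) - len(set(picks))


free_ids = set(range(1, 1001))
workers = 5
rng = random.Random(0)  # seeded only to make the sketch repeatable

first_picks = [pick_first_free(free_ids) for _ in range(workers)]
random_picks = [pick_random_free(free_ids, rng) for _ in range(workers)]
```

With the deterministic strategy all five workers pick the same id, so four of them must retry. With random picks over a large pool, a collision becomes unlikely, which keeps the number of retries below the fixed attempt limit.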
Related fix proposed to branch: neutron-pecan
Review: https:/
Related fix proposed to branch: stable/kilo
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/kilo
commit 9bc323316a27676
Author: Eugene Nikanorov <email address hidden>
Date: Mon May 11 01:34:35 2015 +0400
Randomize tunnel id query to avoid contention
When networks are created rapidly, neutron-servers compete
for segmentation ids which creates too much contention and
may lead to inability to choose available id in hardcoded amount
of attempts (11)
Randomize tunnel id selection so that condition is not hit.
Change-Id: I7068f90fe4927e
Related-Bug: #1382064
Closes-Bug: #1454434
(cherry picked from commit 1d9fd2aec00cb85
tags: | added: in-stable-kilo |
Related fix proposed to branch: stable/juno
Review: https:/
Change abandoned by stephen-ma (<email address hidden>) on branch: stable/juno
Review: https:/
Fix proposed to branch: stable/juno
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/juno
commit 4cd1b58c8c4ae2a
Author: Eugene Nikanorov <email address hidden>
Date: Thu Jan 22 15:54:29 2015 +0300
Refactor retry mechanism used in some DB operations
Use oslo_db helper that will allow to restart the whole
transaction in case it needs a certain operation to be repeated.
This is a workaround for the REPEATABLE READ problem where
retrying logic will not work because queries inside a transaction
will not see updates made by other transactions.
So, run every attempt in a separate transaction.
Conflicts:
(cherry picked from commit 5dbb34b56fc42d9
Change-Id: I68f9ae80198797
Closes-Bug: #1382064
tags: | added: in-stable-juno |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/juno
commit 07d3d4401f34a19
Author: Eugene Nikanorov <email address hidden>
Date: Mon May 11 01:34:35 2015 +0400
Randomize tunnel id query to avoid contention
When networks are created rapidly, neutron-servers compete
for segmentation ids which creates too much contention and
may lead to inability to choose available id in hardcoded amount
of attempts (11)
Randomize tunnel id selection so that condition is not hit.
(cherry picked from commit 1d9fd2aec00cb85
Conflicts:
Related-Bug: #1382064
Closes-Bug: #1454434
Change-Id: I7068f90fe4927e
Fix proposed to branch: master
Review: https://review.openstack.org/129288