floating ip delete deadlock

Bug #1422504 reported by Kevin Fox
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Oleg Bondarev
Juno
Fix Released
Undecided
Unassigned
Kilo
New
Undecided
Unassigned

Bug Description

rdo juno:

2015-02-16 13:54:11.772 3612 ERROR neutron.api.v2.resource [req-5c6e13d3-56d6-476b-a961-e767aea637e5 None] delete failed
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource Traceback (most recent call last):
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 87, in resource
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource result = method(request=request, **args)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/api/v2/base.py", line 476, in delete
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource obj_deleter(request.context, id, **kwargs)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/db/l3_dvr_db.py", line 183, in delete_floatingip
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource self).delete_floatingip(context, id)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 1178, in delete_floatingip
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource router_id = self._delete_floatingip(context, id)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 840, in _delete_floatingip
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource l3_port_check=False)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 984, in delete_port
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource port_db, binding = db.get_locked_port_and_binding(session, id)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/db.py", line 141, in get_locked_port_and_binding
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource with_lockmode('update').
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2369, in one
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource ret = list(self)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2411, in __iter__
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource self.session._autoflush()
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1198, in _autoflush
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource self.flush()
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1919, in flush
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource self._flush(objects)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource transaction.rollback(_capture_exception=True)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource compat.reraise(exc_type, exc_value, exc_tb)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2001, in _flush
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource flush_context.execute()
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource rec.execute(self)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 555, in execute
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource uow
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 117, in delete_obj
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource cached_connections, mapper, table, delete)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 694, in _emit_delete_statements
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource c = connection.execute(statement, del_objects)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 729, in execute
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource return meth(self, multiparams, params)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 321, in _execute_on_connection
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource return connection._execute_clauseelement(self, multiparams, params)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource compiled_sql, distilled_params
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 958, in _execute_context
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource context)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1156, in _handle_dbapi_exception
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource util.raise_from_cause(newraise, exc_info)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource reraise(type(exception), exception, tb=exc_tb)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 951, in _execute_context
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource context)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 436, in do_execute
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource cursor.execute(statement, parameters)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource self.errorhandler(self, exc, value)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource raise errorclass, errorvalue
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource DBDeadlock: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'DELETE FROM floatingips WHERE floatingips.id = %s' ('1423b851-2a7e-471f-b790-c35c7dade6cf',)
2015-02-16 13:54:11.772 3612 TRACE neutron.api.v2.resource
2015-02-16 13:54:11.784 3612 INFO neutron.wsgi [req-5c6e13d3-56d6-476b-a961-e767aea637e5 None] 172.20.96.65 - - [16/Feb/2015 13:54:11] "DELETE //v2.0/floatingips/1423b851-2a7e-471f-b790-c35c7dade6cf.json HTTP/1.1" 500 359 4.735289

Elena Ezhova (eezhova)
tags: added: db
Revision history for this message
Vikram Hosakote (vhosakot) wrote :

Hi Kevin,

Can you provide the exact steps to reproduce this bug.

Changed in neutron:
assignee: nobody → Vikram Hosakote (vhosakot)
Revision history for this message
Kevin Fox (kevpn) wrote :

Afraid not. Was too long ago.

Revision history for this message
Vikram Hosakote (vhosakot) wrote :

Not enough data to reproduce this bug.

Hence, moved to "Incomplete" state.

Changed in neutron:
status: New → Incomplete
assignee: Vikram Hosakote (vhosakot) → nobody
Changed in neutron:
assignee: nobody → Eugene Nikanorov (enikanorov)
status: Incomplete → Confirmed
importance: Undecided → Medium
Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

This might be already fixed for Kilo.
Will look into it.

Revision history for this message
Eugene Nikanorov (enikanorov) wrote :

This probably was not seen in gates because gates do not do multinode deployment and dvr tests.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/180466

Changed in neutron:
assignee: Eugene Nikanorov (enikanorov) → Oleg Bondarev (obondarev)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/180466
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=45ea2cf10033e12c63b8ce2cd78b04755d0aba64
Submitter: Jenkins
Branch: master

commit 45ea2cf10033e12c63b8ce2cd78b04755d0aba64
Author: Oleg Bondarev <email address hidden>
Date: Wed May 6 12:50:11 2015 +0300

    Wrap ML2 delete_port with db retry decorator

    ML2 delete_port operation currently involves locking ports
    and bindings tables which may lead to DBDeadlock errors in certain
    cases when several ports are deleted concurrently.
    That may happen due to specifics of Galera working in active-active
    mode: it may throw deadlock errors when it fails to validate
    a change with other members of the cluster.
    The fix adds retries to delete port operation to overcome such
    deadlocks

    Closes-Bug: #1422504
    Change-Id: I684691d59c5ac370d74314c3c91857dc709b2d9b

Changed in neutron:
status: In Progress → Fix Committed
tags: added: kilo-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/194174

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196701

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (feature/pecan)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: feature/pecan
Review: https://review.openstack.org/196701
Reason: This is lacking the functional fix [1], so I'll propose a new merge commit which includes that one.

[1] https://review.openstack.org/#/c/196711/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196920

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/194174
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=151b28cf1ce697f481d65b8d5bee1e3a4becbbc5
Submitter: Jenkins
Branch: stable/kilo

commit 151b28cf1ce697f481d65b8d5bee1e3a4becbbc5
Author: Oleg Bondarev <email address hidden>
Date: Wed May 6 12:50:11 2015 +0300

    Wrap ML2 delete_port with db retry decorator

    ML2 delete_port operation currently involves locking ports
    and bindings tables which may lead to DBDeadlock errors in certain
    cases when several ports are deleted concurrently.
    That may happen due to specifics of Galera working in active-active
    mode: it may throw deadlock errors when it fails to validate
    a change with other members of the cluster.
    The fix adds retries to delete port operation to overcome such
    deadlocks

    Closes-Bug: #1422504
    Change-Id: I684691d59c5ac370d74314c3c91857dc709b2d9b
    (cherry picked from commit 45ea2cf10033e12c63b8ce2cd78b04755d0aba64)

tags: added: in-stable-kilo
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/231985

tags: added: juno-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/juno)

Reviewed: https://review.openstack.org/231985
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f0bb982bfc81bb7c5bafde1430c51bea8f8ab9c9
Submitter: Jenkins
Branch: stable/juno

commit f0bb982bfc81bb7c5bafde1430c51bea8f8ab9c9
Author: Oleg Bondarev <email address hidden>
Date: Wed May 6 12:50:11 2015 +0300

    Wrap ML2 delete_port with db retry decorator

    ML2 delete_port operation currently involves locking ports
    and bindings tables which may lead to DBDeadlock errors in certain
    cases when several ports are deleted concurrently.
    That may happen due to specifics of Galera working in active-active
    mode: it may throw deadlock errors when it fails to validate
    a change with other members of the cluster.
    The fix adds retries to delete port operation to overcome such
    deadlocks

    Conflicts:
        neutron/plugins/ml2/plugin.py
        neutron/tests/unit/ml2/test_ml2_plugin.py

    Closes-Bug: #1422504
    Change-Id: I684691d59c5ac370d74314c3c91857dc709b2d9b
    (cherry picked from commit 45ea2cf10033e12c63b8ce2cd78b04755d0aba64)
    (cherry picked from commit 151b28cf1ce697f481d65b8d5bee1e3a4becbbc5)

tags: added: in-stable-juno
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-1 → 7.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.