[Pluggable IPAM] Deadlock on simultaneous update subnet and ip allocation from subnet

Bug #1572474 reported by Pavel Bondar
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Pavel Bondar

Bug Description

Observed 'Lock wait timeout exceeded; try restarting transaction' in the logs [1] when two requests are executed concurrently in neutron:
- request A calls update subnet req-5f9fc363-4b22-48e0-97e2-504aa7c3dda3
- request B calls create port on the same subnet req-ccd11684-ad2b-4937-a3c1-dc46aaa36b2d
As a result, both requests failed.

Request A tries to delete the 'ipamallocationpools' rows for the subnet_id, which effectively removes the related 'ipamavailabilityranges' rows through the foreign key.
Request B allocates an ip and modifies the av_range record in 'ipamavailabilityranges'.
So the collision appears to be caused by concurrent access to the 'ipamavailabilityranges' table.

[1] http://logs.openstack.org/23/181023/68/check/gate-tempest-dsvm-neutron-full/a9180e0/logs/screen-q-svc.txt.gz#_2016-04-19_15_43_05_837
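
To illustrate the "removes by foreign key" step described above, here is a minimal sketch using Python's stdlib sqlite3. The two-table schema is a simplified assumption (only the table names and the ipam_subnet_id value come from this report), and SQLite does not reproduce the MySQL lock wait itself. It only shows why a delete on 'ipamallocationpools' also touches 'ipamavailabilityranges' rows.

```python
import sqlite3

# ipam_subnet_id taken from the DELETE statement in the log
SUBNET_ID = "0b896671-8cc2-4e08-bbfe-05655e6c479c"


def build_demo_db():
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE ipamallocationpools (
            id TEXT PRIMARY KEY,
            ipam_subnet_id TEXT NOT NULL
        );
        CREATE TABLE ipamavailabilityranges (
            allocation_pool_id TEXT NOT NULL
                REFERENCES ipamallocationpools (id) ON DELETE CASCADE,
            first_ip TEXT NOT NULL,
            last_ip TEXT NOT NULL
        );
        """
    )
    conn.execute(
        "INSERT INTO ipamallocationpools VALUES (?, ?)", ("pool-1", SUBNET_ID)
    )
    conn.execute(
        "INSERT INTO ipamavailabilityranges VALUES (?, ?, ?)",
        ("pool-1", "2003::2", "2003::ffff"),
    )
    conn.commit()
    return conn


def count_ranges(conn):
    """Return the number of availability range rows."""
    return conn.execute(
        "SELECT COUNT(*) FROM ipamavailabilityranges"
    ).fetchone()[0]


def delete_allocation_pools(conn, ipam_subnet_id):
    """Same statement shape as request A's DELETE in the log."""
    conn.execute(
        "DELETE FROM ipamallocationpools WHERE ipam_subnet_id = ?",
        (ipam_subnet_id,),
    )
    conn.commit()


conn = build_demo_db()
ranges_before = count_ranges(conn)
delete_allocation_pools(conn, SUBNET_ID)
ranges_after = count_ranges(conn)
print(ranges_before, ranges_after)
```

The range row disappears even though the statement never names 'ipamavailabilityranges', which is the table request B is updating at the same time.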

Stack traces for both failed requests:
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters [req-5f9fc363-4b22-48e0-97e2-504aa7c3dda3 tempest-NetworksIpV6Test-714183411 -] DBAPIError exception wrapped from (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'DELETE FROM ipamallocationpools WHERE ipamallocationpools.ipam_subnet_id = %(ipam_subnet_id_1)s'] [parameters: {u'ipam_subnet_id_1': u'0b896671-8cc2-4e08-bbfe-05655e6c479c'}]
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters context)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 158, in execute
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 308, in _query
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 820, in query
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1002, in _read_query_result
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters result.read()
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1285, in read
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet()
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 966, in _read_packet
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error()
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 394, in check_error
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 120, in raise_mysql_exception
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters _check_mysql_exception(errinfo)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 115, in _check_mysql_exception
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters raise InternalError(errno, errorvalue)
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1205, u'Lock wait timeout exceeded; try restarting transaction')
2016-04-19 15:43:05.837 17992 ERROR oslo_db.sqlalchemy.exc_filters
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource [req-5f9fc363-4b22-48e0-97e2-504aa7c3dda3 tempest-NetworksIpV6Test-714183411 -] update failed
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource Traceback (most recent call last):
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/resource.py", line 84, in resource
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource result = method(request=request, **args)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/base.py", line 579, in update
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource return self._update(request, id, body, **kwargs)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 148, in wrapper
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource ectxt.value = e.inner_exc
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource self.force_reraise()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource return f(*args, **kwargs)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/base.py", line 624, in _update
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource obj = obj_updater(request.context, id, **kwargs)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/plugins/ml2/plugin.py", line 912, in update_subnet
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource context, id, subnet)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/db/db_base_plugin_v2.py", line 764, in update_subnet
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource db_pools)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/db/ipam_pluggable_backend.py", line 381, in update_db_subnet
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource self._ipam_update_allocation_pools(context, ipam_driver, subnet_copy)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/db/ipam_pluggable_backend.py", line 159, in _ipam_update_allocation_pools
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource ipam_driver.update_subnet(subnet_request)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/ipam/drivers/neutrondb_ipam/driver.py", line 440, in update_subnet
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource subnet.update_allocation_pools(subnet_request.allocation_pools, cidr)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/ipam/drivers/neutrondb_ipam/driver.py", line 377, in update_allocation_pools
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource self.subnet_manager.delete_allocation_pools(session)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/ipam/drivers/neutrondb_ipam/db_api.py", line 99, in delete_allocation_pools
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource ipam_subnet_id=self._ipam_subnet_id).delete()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 3048, in delete
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource delete_op.exec_()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 1127, in exec_
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource self._do_exec()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 1311, in _do_exec
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource mapper=self.mapper)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1034, in execute
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource bind, close_with_result=True).execute(clause, params or {})
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 914, in execute
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource return meth(self, multiparams, params)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource return connection._execute_clauseelement(self, multiparams, params)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource compiled_sql, distilled_params
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource context)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1337, in _handle_dbapi_exception
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource util.raise_from_cause(newraise, exc_info)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource reraise(type(exception), exception, tb=exc_tb, cause=cause)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource context)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource cursor.execute(statement, parameters)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 158, in execute
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource result = self._query(query)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 308, in _query
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource conn.query(q)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 820, in query
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1002, in _read_query_result
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource result.read()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1285, in read
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource first_packet = self.connection._read_packet()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 966, in _read_packet
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource packet.check_error()
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 394, in check_error
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource err.raise_mysql_exception(self._data)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 120, in raise_mysql_exception
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource _check_mysql_exception(errinfo)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 115, in _check_mysql_exception
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource raise InternalError(errno, errorvalue)
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource DBError: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'DELETE FROM ipamallocationpools WHERE ipamallocationpools.ipam_subnet_id = %(ipam_subnet_id_1)s'] [parameters: {u'ipam_subnet_id_1': u'0b896671-8cc2-4e08-bbfe-05655e6c479c'}]
2016-04-19 15:43:05.847 17992 ERROR neutron.api.v2.resource
2016-04-19 15:43:05.863 17992 INFO neutron.wsgi [req-5f9fc363-4b22-48e0-97e2-504aa7c3dda3 tempest-NetworksIpV6Test-714183411 -] 127.0.0.1 - - [19/Apr/2016 15:43:05] "PUT /v2.0/subnets/f5613ce4-2fd8-4ab7-a87e-558b1b1b7bc2 HTTP/1.1" 500 363 51.651743
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters [req-ccd11684-ad2b-4937-a3c1-dc46aaa36b2d - -] DBAPIError exception wrapped from (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'INSERT INTO ipallocations (port_id, ip_address, subnet_id, network_id) VALUES (%(port_id)s, %(ip_address)s, %(subnet_id)s, %(network_id)s)'] [parameters: {'subnet_id': u'f5613ce4-2fd8-4ab7-a87e-558b1b1b7bc2', 'network_id': u'33d94f9c-e24b-448f-900c-0249fbd6c96d', 'port_id': 'b81c0244-1eb9-4716-b85f-253deed19b19', 'ip_address': u'2003::3'}]
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters context)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 158, in execute
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 308, in _query
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 820, in query
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1002, in _read_query_result
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters result.read()
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1285, in read
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet()
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 966, in _read_packet
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error()
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 394, in check_error
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 120, in raise_mysql_exception
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters _check_mysql_exception(errinfo)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 115, in _check_mysql_exception
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters raise InternalError(errno, errorvalue)
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1205, u'Lock wait timeout exceeded; try restarting transaction')
2016-04-19 15:43:05.862 17994 ERROR oslo_db.sqlalchemy.exc_filters

Revision history for this message
Pavel Bondar (pasha117) wrote :

Merging https://review.openstack.org/#/c/292207/ will effectively resolve this issue, since AV Ranges are removed altogether in this case.

description: updated
summary: - [Pluggable IPAM] Deadlock on simultanious update subnet and ip
+ [Pluggable IPAM] Deadlock on simultaneous update subnet and ip
allocation from subnet
Doug Wiegley (dougwig)
Changed in neutron:
status: New → Confirmed
importance: Undecided → High
milestone: none → newton-1
Revision history for this message
Pavel Bondar (pasha117) wrote :

Backportable fix for that: https://review.openstack.org/#/c/309067/

Revision history for this message
Miguel Lavalle (minsel) wrote :
Changed in neutron:
assignee: Pavel Bondar (pasha117) → Carl Baldwin (carl-baldwin)
status: Confirmed → In Progress
Changed in neutron:
assignee: Carl Baldwin (carl-baldwin) → Pavel Bondar (pasha117)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/309067
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=310074b2d457a6de2fdf141d4ede6b6044efc002
Submitter: Jenkins
Branch: master

commit 310074b2d457a6de2fdf141d4ede6b6044efc002
Author: Pavel Bondar <email address hidden>
Date: Thu Apr 21 17:31:13 2016 +0300

    Check if pool update is needed in reference driver

    Commit 6ed8f45fdf529bacae32b074844aa1320b005b51 had some negative impact
    on concurrent ip allocations. To make ipam driver aware of subnet
    updates (mostly for third-party drivers) ipam driver is always called with
    allocation pools even if pools are not changed.

    Current way of handling that event is deleting old pools and creating
    new pools. But at scale it may cause issues, because of this:
    - deleting allocation pools removes availability ranges by foreign key;
    - any ip allocation modifies availability range;
    These events concurrently modify availability range records, causing
    deadlocks.

    This fix prevents deleting and recreating pools and availability ranges
    in cases where allocation pools are not changed. So it eliminates
    negative impact on concurrency added by always calling ipam driver on
    subnet update.
    This fix aims to provide backportable solution to be used with
    6ed8f45fdf529bacae32b074844aa1320b005b51.

    Complete solution that eliminates concurrent modifications in
    availability range table is expected to be delivered with
    ticket #1543094, but it will not be backportable because of the scope of
    the change.

    Change-Id: I29e03a79c34b150a822697f7b556ed168a57c064
    Related-Bug: #1534625
    Closes-Bug: #1572474
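
The check described in the commit message above (skip the delete and recreate when the allocation pools are unchanged) could look roughly like the sketch below. This is not the code from the review: the function names, the (first_ip, last_ip) tuple representation, and the `recreate` callback are hypothetical and chosen only for illustration.

```python
import ipaddress


def normalize_pools(pools):
    """Turn (first_ip, last_ip) string pairs into a comparable set."""
    return {
        (ipaddress.ip_address(first), ipaddress.ip_address(last))
        for first, last in pools
    }


def pools_changed(current_pools, requested_pools):
    """Return True only if the two pool lists describe different ranges."""
    return normalize_pools(current_pools) != normalize_pools(requested_pools)


def update_allocation_pools(current_pools, requested_pools, recreate):
    """Call recreate() (delete old pools, create new ones) only when needed.

    Returning early when nothing changed avoids touching the availability
    range rows that concurrent ip allocations are modifying.
    """
    if not pools_changed(current_pools, requested_pools):
        return False
    recreate(requested_pools)
    return True


recreated = []
current = [("2003::2", "2003::ffff")]

# Same range written differently: no delete and recreate happens.
unchanged_result = update_allocation_pools(
    current, [("2003:0::2", "2003::ffff")], recreated.append
)
# A genuinely different range: pools are recreated.
changed_result = update_allocation_pools(
    current, [("2003::10", "2003::ffff")], recreated.append
)
print(unchanged_result, changed_result, len(recreated))
```

Comparing parsed addresses rather than raw strings is a design choice of this sketch, so that equivalent spellings of the same IPv6 address do not trigger an unnecessary recreate.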

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 9.0.0.0b1

This issue was fixed in the openstack/neutron 9.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/305788
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bf631b226d3fa66e57069fcd9a49b7746b69c91c
Submitter: Jenkins
Branch: stable/mitaka

commit bf631b226d3fa66e57069fcd9a49b7746b69c91c
Author: Pavel Bondar <email address hidden>
Date: Thu Jan 21 17:01:22 2016 +0300

    Always call ipam driver on subnet update

    Backport contains two commits:
    - 6ed8f45fdf529bacae32b074844aa1320b005b51 had some negative impact on
      concurrency;
    - 310074b2d457a6de2fdf141d4ede6b6044efc002 fixes that negative impact;

    COMMIT 1:
    Previously ipam driver was not called on subnet update
    if allocation_pools are not in request.
    Changing it to call ipam driver each time subnet update is requested.
    If subnet_update is called without allocation_pools, then old allocation
    pools are passed to ipam driver.

    Contains a bit of refactoring to make that happen:

    - validate_allocation_pools is already called during update
    subnet workflow in update_subnet method, so just removing it;

    - reworked update_db_subnet workflow;

    previous workflow was:
    call driver allocation -> make local allocation -> rollback driver
    allocation in except block if local allocation failed.

    new workflow:
    make local allocation -> call driver allocation.
    By changing order of execution we eliminate the need for rollback in this
    method, since failure in local allocation is rolled back by database
    transaction rollback.

    - make a copy of incoming subnet dict;
    _update_subnet_allocation_pools from ipam_backend_mixin removes
    'allocation_pools' from subnet_dict, so create an unchanged copy to
    pass it to ipam driver

    COMMIT 2:
    Check if pool update is needed in reference driver

    Commit 6ed8f45fdf529bacae32b074844aa1320b005b51 had some negative impact
    on concurrent ip allocations. To make ipam driver aware of subnet
    updates (mostly for third-party drivers) ipam driver is always called
    with allocation pools even if pools are not changed.

    Current way of handling that event is deleting old pools and creating
    new pools. But at scale it may cause issues, because of this:
    - deleting allocation pools removes availability ranges by foreign key;
    - any ip allocation modifies availability range;
    These events concurrently modify availability range records, causing
    deadlocks.

    This fix prevents deleting and recreating pools and availability ranges
    in cases where allocation pools are not changed. So it eliminates
    negative impact on concurrency added by always calling ipam driver on
    subnet update.
    This fix aims to provide backportable solution to be used with
    6ed8f45fdf529bacae32b074844aa1320b005b51.

    Complete solution that eliminates concurrent modifications in
    availability range table is expected to be delivered with
    ticket #1543094, but it will not be backportable because of the scope of
    the change.

    Change-Id: Ie4bd85d2ff2ab39bf803c5ef6c6ead151dbb74d7
    Closes-Bug: #1534625
    Closes-Bug: #1572474
    (cherry picked fr...


tags: added: in-stable-mitaka
tags: added: neutron-proactive-backport-potential
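
The reordered update_db_subnet workflow described under COMMIT 1 in the backport above (make the local allocation first, then call the driver, and pass the driver an unchanged copy of the subnet dict) can be sketched as follows. This is a simplified illustration, not neutron's code: `RecordingDriver`, `update_local_subnet`, and the function signature are hypothetical stand-ins, and the enclosing database transaction is only implied.

```python
import copy


class RecordingDriver:
    """Hypothetical stand-in for an ipam driver; records what it receives."""

    def __init__(self):
        self.requests = []

    def update_subnet(self, subnet_request):
        self.requests.append(subnet_request)


def update_local_subnet(subnet):
    """Stand-in for the local update, which removes 'allocation_pools'."""
    return subnet.pop("allocation_pools")


def update_db_subnet(subnet, driver, local_update=update_local_subnet):
    # Keep an unchanged copy, because the local update mutates the dict.
    subnet_copy = copy.deepcopy(subnet)
    # Local changes go first. If this raises, the surrounding database
    # transaction rolls them back and the driver has not been called yet,
    # so there is no driver-side state to undo.
    local_update(subnet)
    # The driver is called last, with the pools still present in the copy.
    driver.update_subnet(subnet_copy)


driver = RecordingDriver()
subnet = {
    "id": "f5613ce4-2fd8-4ab7-a87e-558b1b1b7bc2",
    "allocation_pools": [{"start": "2003::2", "end": "2003::ffff"}],
}
update_db_subnet(subnet, driver)
print("allocation_pools" in subnet, "allocation_pools" in driver.requests[0])
```

With the old order (driver first, local update second) a failure in the local step required an explicit driver rollback in an except block; with this order a failing local step simply leaves the driver untouched.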
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/344890

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/344890
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=25de5633545ee62c0f7551cf237fa1d3420352b4
Submitter: Jenkins
Branch: stable/liberty

commit 25de5633545ee62c0f7551cf237fa1d3420352b4
Author: Pavel Bondar <email address hidden>
Date: Thu Jan 21 17:01:22 2016 +0300

    Always call ipam driver on subnet update

    Backport contains two commits:
    - 6ed8f45fdf529bacae32b074844aa1320b005b51 had some negative impact on
      concurrency;
    - 310074b2d457a6de2fdf141d4ede6b6044efc002 fixes that negative impact;

    COMMIT 1:
    Previously ipam driver was not called on subnet update
    if allocation_pools are not in request.
    Changing it to call ipam driver each time subnet update is requested.
    If subnet_update is called without allocation_pools, then old allocation
    pools are passed to ipam driver.

    Contains a bit of refactoring to make that happen:

    - validate_allocation_pools is already called during update
    subnet workflow in update_subnet method, so just removing it;

    - reworked update_db_subnet workflow;

    previous workflow was:
    call driver allocation -> make local allocation -> rollback driver
    allocation in except block if local allocation failed.

    new workflow:
    make local allocation -> call driver allocation.
    By changing order of execution we eliminate the need for rollback in this
    method, since failure in local allocation is rolled back by database
    transaction rollback.

    - make a copy of incoming subnet dict;
    _update_subnet_allocation_pools from ipam_backend_mixin removes
    'allocation_pools' from subnet_dict, so create an unchanged copy to
    pass it to ipam driver

    COMMIT 2:
    Check if pool update is needed in reference driver

    Commit 6ed8f45fdf529bacae32b074844aa1320b005b51 had some negative impact
    on concurrent ip allocations. To make ipam driver aware of subnet
    updates (mostly for third-party drivers) ipam driver is always called
    with allocation pools even if pools are not changed.

    Current way of handling that event is deleting old pools and creating
    new pools. But at scale it may cause issues, because of this:
    - deleting allocation pools removes availability ranges by foreign key;
    - any ip allocation modifies availability range;
    These events concurrently modify availability range records, causing
    deadlocks.

    This fix prevents deleting and recreating pools and availability ranges
    in cases where allocation pools are not changed. So it eliminates
    negative impact on concurrency added by always calling ipam driver on
    subnet update.
    This fix aims to provide backportable solution to be used with
    6ed8f45fdf529bacae32b074844aa1320b005b51.

    Complete solution that eliminates concurrent modifications in
    availability range table is expected to be delivered with
    ticket #1543094, but it will not be backportable because of the scope of
    the change.

    Manually resolved conflicts.

    Change-Id: Ie4bd85d2ff2ab39bf803c5ef6c6ead151dbb74d7
    Closes-Bug: #1534625
    Clo...


tags: added: in-stable-liberty
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.1.2

This issue was fixed in the openstack/neutron 7.1.2 release.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 8.2.0

This issue was fixed in the openstack/neutron 8.2.0 release.

tags: removed: neutron-proactive-backport-potential