DBDeadlock occurs when multiple neutron servers start at the same time

Bug #1617499 reported by Kahou Lei
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Kahou Lei
Milestone: (none)

Bug Description

I have multiple controllers, each running neutron-server. When the neutron-server instances start at the same time, some of them occasionally crash with DBDeadlock exceptions:

2016-08-27 00:32:11.937 193309 CRITICAL neutron [-] DBDeadlock: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'INSERT INTO ml2_vlan_allocations (physical_network, vlan_id, allocated) VALUES (%s, %s, %s)'] [parameters: (('bond0', 1, 0), ('bond0', 2, 0), ('bond0', 3, 0), ('bond0', 4, 0), ('bond0', 5, 0), ('bond0', 6, 0), ('bond0', 7, 0), ('bond0', 8, 0) ... displaying 10 of 4095 total bound parameter sets ... ('bond0', 4094, 0), ('bond0', 4095, 0))]
2016-08-27 00:32:11.937 193309 ERROR neutron Traceback (most recent call last):
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/bin/neutron-server", line 10, in <module>
2016-08-27 00:32:11.937 193309 ERROR neutron sys.exit(main_wsgi_eventlet())
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/cmd/eventlet/server/__init__.py", line 19, in main_wsgi_eventlet
2016-08-27 00:32:11.937 193309 ERROR neutron wsgi_eventlet.main()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/server/wsgi_eventlet.py", line 50, in main
2016-08-27 00:32:11.937 193309 ERROR neutron server.boot_server(_eventlet_wsgi_server)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/server/__init__.py", line 35, in boot_server
2016-08-27 00:32:11.937 193309 ERROR neutron server_func()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/server/wsgi_eventlet.py", line 27, in _eventlet_wsgi_server
2016-08-27 00:32:11.937 193309 ERROR neutron neutron_api = service.serve_wsgi(service.NeutronApiService)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/service.py", line 117, in serve_wsgi
2016-08-27 00:32:11.937 193309 ERROR neutron LOG.exception(_LE('Unrecoverable error: please check log '
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2016-08-27 00:32:11.937 193309 ERROR neutron six.reraise(self.type_, self.value, self.tb)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/service.py", line 114, in serve_wsgi
2016-08-27 00:32:11.937 193309 ERROR neutron service.start()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/service.py", line 87, in start
2016-08-27 00:32:11.937 193309 ERROR neutron self.wsgi_app = _run_wsgi(self.app_name)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/service.py", line 225, in _run_wsgi
2016-08-27 00:32:11.937 193309 ERROR neutron app = config.load_paste_app(app_name)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/common/config.py", line 246, in load_paste_app
2016-08-27 00:32:11.937 193309 ERROR neutron app = deploy.loadapp("config:%s" % config_path, name=app_name)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
2016-08-27 00:32:11.937 193309 ERROR neutron return loadobj(APP, uri, name=name, **kw)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
2016-08-27 00:32:11.937 193309 ERROR neutron return context.create()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
2016-08-27 00:32:11.937 193309 ERROR neutron return self.object_type.invoke(self)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2016-08-27 00:32:11.937 193309 ERROR neutron **context.local_conf)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 55, in fix_call
2016-08-27 00:32:11.937 193309 ERROR neutron val = callable(*args, **kw)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/urlmap.py", line 25, in urlmap_factory
2016-08-27 00:32:11.937 193309 ERROR neutron app = loader.get_app(app_name, global_conf=global_conf)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2016-08-27 00:32:11.937 193309 ERROR neutron name=name, global_conf=global_conf).create()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
2016-08-27 00:32:11.937 193309 ERROR neutron return self.object_type.invoke(self)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2016-08-27 00:32:11.937 193309 ERROR neutron **context.local_conf)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 55, in fix_call
2016-08-27 00:32:11.937 193309 ERROR neutron val = callable(*args, **kw)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/auth.py", line 71, in pipeline_factory
2016-08-27 00:32:11.937 193309 ERROR neutron app = loader.get_app(pipeline[-1])
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2016-08-27 00:32:11.937 193309 ERROR neutron name=name, global_conf=global_conf).create()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
2016-08-27 00:32:11.937 193309 ERROR neutron return self.object_type.invoke(self)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 146, in invoke
2016-08-27 00:32:11.937 193309 ERROR neutron return fix_call(context.object, context.global_conf, **context.local_conf)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 55, in fix_call
2016-08-27 00:32:11.937 193309 ERROR neutron val = callable(*args, **kw)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/api/v2/router.py", line 81, in factory
2016-08-27 00:32:11.937 193309 ERROR neutron return cls(**local_config)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/api/v2/router.py", line 85, in __init__
2016-08-27 00:32:11.937 193309 ERROR neutron plugin = manager.NeutronManager.get_plugin()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/manager.py", line 248, in get_plugin
2016-08-27 00:32:11.937 193309 ERROR neutron return weakref.proxy(cls.get_instance().plugin)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/manager.py", line 242, in get_instance
2016-08-27 00:32:11.937 193309 ERROR neutron cls._create_instance()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 254, in inner
2016-08-27 00:32:11.937 193309 ERROR neutron return f(*args, **kwargs)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/manager.py", line 228, in _create_instance
2016-08-27 00:32:11.937 193309 ERROR neutron cls._instance = cls()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/manager.py", line 121, in __init__
2016-08-27 00:32:11.937 193309 ERROR neutron plugin_provider)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/manager.py", line 161, in _get_plugin_instance
2016-08-27 00:32:11.937 193309 ERROR neutron return plugin_class()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/quota/resource_registry.py", line 121, in wrapper
2016-08-27 00:32:11.937 193309 ERROR neutron return f(*args, **kwargs)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 146, in __init__
2016-08-27 00:32:11.937 193309 ERROR neutron self.type_manager.initialize()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/managers.py", line 182, in initialize
2016-08-27 00:32:11.937 193309 ERROR neutron driver.obj.initialize()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/type_vlan.py", line 167, in initialize
2016-08-27 00:32:11.937 193309 ERROR neutron self._sync_vlan_allocations()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/type_vlan.py", line 161, in _sync_vlan_allocations
2016-08-27 00:32:11.937 193309 ERROR neutron session.delete(alloc)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 490, in __exit__
2016-08-27 00:32:11.937 193309 ERROR neutron self.rollback()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2016-08-27 00:32:11.937 193309 ERROR neutron compat.reraise(exc_type, exc_value, exc_tb)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 487, in __exit__
2016-08-27 00:32:11.937 193309 ERROR neutron self.commit()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
2016-08-27 00:32:11.937 193309 ERROR neutron self._prepare_impl()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
2016-08-27 00:32:11.937 193309 ERROR neutron self.session.flush()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2004, in flush
2016-08-27 00:32:11.937 193309 ERROR neutron self._flush(objects)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2122, in _flush
2016-08-27 00:32:11.937 193309 ERROR neutron transaction.rollback(_capture_exception=True)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2016-08-27 00:32:11.937 193309 ERROR neutron compat.reraise(exc_type, exc_value, exc_tb)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2086, in _flush
2016-08-27 00:32:11.937 193309 ERROR neutron flush_context.execute()
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
2016-08-27 00:32:11.937 193309 ERROR neutron rec.execute(self)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 532, in execute
2016-08-27 00:32:11.937 193309 ERROR neutron uow
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 174, in save_obj
2016-08-27 00:32:11.937 193309 ERROR neutron mapper, table, insert)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 748, in _emit_insert_statements
2016-08-27 00:32:11.937 193309 ERROR neutron execute(statement, multiparams)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
2016-08-27 00:32:11.937 193309 ERROR neutron return meth(self, multiparams, params)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
2016-08-27 00:32:11.937 193309 ERROR neutron return connection._execute_clauseelement(self, multiparams, params)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
2016-08-27 00:32:11.937 193309 ERROR neutron compiled_sql, distilled_params
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
2016-08-27 00:32:11.937 193309 ERROR neutron context)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1337, in _handle_dbapi_exception
2016-08-27 00:32:11.937 193309 ERROR neutron util.raise_from_cause(newraise, exc_info)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
2016-08-27 00:32:11.937 193309 ERROR neutron reraise(type(exception), exception, tb=exc_tb)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1116, in _execute_context
2016-08-27 00:32:11.937 193309 ERROR neutron context)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 95, in do_executemany
2016-08-27 00:32:11.937 193309 ERROR neutron rowcount = cursor.executemany(statement, parameters)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 206, in executemany
2016-08-27 00:32:11.937 193309 ERROR neutron r = r + self.execute(query, a)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
2016-08-27 00:32:11.937 193309 ERROR neutron self.errorhandler(self, exc, value)
2016-08-27 00:32:11.937 193309 ERROR neutron File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2016-08-27 00:32:11.937 193309 ERROR neutron raise errorclass, errorvalue
2016-08-27 00:32:11.937 193309 ERROR neutron DBDeadlock: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'INSERT INTO ml2_vlan_allocations (physical_network, vlan_id, allocated) VALUES (%s, %s, %s)'] [parameters: (('bond0', 1, 0), ('bond0', 2, 0), ('bond0', 3, 0), ('bond0', 4, 0), ('bond0', 5, 0), ('bond0', 6, 0), ('bond0', 7, 0), ('bond0', 8, 0) ... displaying 10 of 4095 total bound parameter sets ... ('bond0', 4094, 0), ('bond0', 4095, 0))]

The problem occurs more often in the following situations:

1. network_vlan_ranges is large (in my case, 1 to 4095)
2. slow controller
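
For a sense of scale, here is a minimal sketch of how a single VLAN range expands into the 4095 parameter sets seen in the INSERT above. The helper name `expand_vlan_range` is hypothetical, and the entry `bond0:1:4095` is reconstructed from the log and the range quoted above, assuming the usual `physnet:min:max` form of network_vlan_ranges. It is not neutron's actual sync code.

```python
def expand_vlan_range(entry):
    """Expand 'physnet:min:max' into (physical_network, vlan_id, allocated) rows.

    Hypothetical helper for illustration only; 'allocated' starts at 0
    (unallocated), matching the parameter sets shown in the log.
    """
    physical_network, vlan_min, vlan_max = entry.split(":")
    return [
        (physical_network, vlan_id, 0)
        for vlan_id in range(int(vlan_min), int(vlan_max) + 1)
    ]


# Assumed configuration entry, reconstructed from the log output.
rows = expand_vlan_range("bond0:1:4095")
print(len(rows), rows[0], rows[-1])
# prints: 4095 ('bond0', 1, 0) ('bond0', 4095, 0)
```

Every server that finds the table empty at startup tries to insert all of these rows in one transaction, so a larger range or a slower controller keeps the locks held longer and widens the window for a collision.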

Kahou Lei (kahou82)
Changed in neutron:
assignee: nobody → Kahou Lei (kahou82)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/361534

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/361534
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=55b6c6a63a97c86ce06ff4dc87d6f253f0f6fea4
Submitter: Jenkins
Branch: master

commit 55b6c6a63a97c86ce06ff4dc87d6f253f0f6fea4
Author: Kahou Lei <email address hidden>
Date: Fri Aug 26 17:43:38 2016 -0700

    Fixes DBDeadlock race condition during driver initialization.

    When multiple controllers are initially built, the ml2_vlan_allocations
    will be empty and the neutron servers under different controllers will
    try to populate the entry at the same time. This will cause this
    DBDeadlock error as they try to access the db at the same time.

    This patch set is to add db_api.retry_db_error decorator in the
    _sync_vlan_allocations method to avoid this race condition.

    Change-Id: I7c0e3ae515f592a5852e6decf6820b103f146761
    Closes-bug: 1617499
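
The retry approach described in the commit message can be sketched roughly as follows. This is a standalone illustration, not the neutron decorator itself: `DBDeadlock` here is a local stand-in for the oslo.db exception, and `retry_on_deadlock`, its parameters, and the simulated `sync_vlan_allocations` are hypothetical names chosen for the example.

```python
import functools
import random
import time


class DBDeadlock(Exception):
    """Local stand-in for the deadlock exception seen in the traceback."""


def retry_on_deadlock(max_retries=10, base_delay=0.05, sleep=time.sleep):
    """Re-run the wrapped function whenever it raises DBDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the deadlock
                    # Exponential backoff plus jitter, so competing servers
                    # are less likely to collide again in lockstep.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, base_delay))
        return wrapper
    return decorator


attempts = []


@retry_on_deadlock(max_retries=3, sleep=lambda seconds: None)
def sync_vlan_allocations():
    """Simulated sync that deadlocks twice and then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise DBDeadlock("Deadlock found when trying to get lock")
    return "synced"


result = sync_vlan_allocations()
print(result, len(attempts))
# prints: synced 3
```

A retry like this only helps if the whole transaction lives inside the decorated function, so that each attempt opens a fresh transaction after the database has rolled back the deadlocked one.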

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b2

This issue was fixed in the openstack/neutron 10.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/447467

Ihar Hrachyshka (ihar-hrachyshka) wrote :

Total service crash, set to High.

Changed in neutron:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/447467
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2da513f856af5453d796acb55866873ff7991662
Submitter: Jenkins
Branch: stable/newton

commit 2da513f856af5453d796acb55866873ff7991662
Author: Kahou Lei <email address hidden>
Date: Fri Aug 26 17:43:38 2016 -0700

    Fixes DBDeadlock race condition during driver initialization.

    When multiple controllers are initially built, the ml2_vlan_allocations
    will be empty and the neutron servers under different controllers will
    try to populate the entry at the same time. This will cause this
    DBDeadlock error as they try to access the db at the same time.

    This patch set is to add db_api.retry_db_error decorator in the
    _sync_vlan_allocations method to avoid this race condition.

    Change-Id: I7c0e3ae515f592a5852e6decf6820b103f146761
    Closes-bug: 1617499
    (cherry picked from commit 55b6c6a63a97c86ce06ff4dc87d6f253f0f6fea4)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.3.0

This issue was fixed in the openstack/neutron 9.3.0 release.
