DBDeadlock during router creation
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Mirantis OpenStack | Fix Released | High | Oleg Bondarev | | |
Bug Description
Heat OSTF test failed in http://
Here is an error in heat test logs:
2015-07-22 05:15:28 ERROR (nose_storage_
Traceback (most recent call last):
File "/usr/lib/
testMethod()
File "/usr/lib/
parameters[
File "/usr/lib/
router = self._create_
File "/usr/lib/
router = self.neutron_
File "/usr/lib/
ret = self.function(
File "/usr/lib/
return self.post(
File "/usr/lib/
headers=
File "/usr/lib/
self.
File "/usr/lib/
exception_
File "/usr/lib/
status_
InternalServerE
Here is an error in neutron server log:
DBDeadlock: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'SELECT ml2_network_
Full traceback http://
Full logs snapshot will be attached below.
Environment configuration:
ISO fuel-7.
Changed in mos: | |
milestone: | none → 7.0 |
tags: | added: neutron |
Changed in mos: | |
status: | New → Confirmed |
Changed in mos: | |
importance: | Undecided → Medium |
Changed in mos: | |
assignee: | MOS Neutron (mos-neutron) → Oleg Bondarev (obondarev) |
Changed in mos: | |
status: | Confirmed → In Progress |
status: | In Progress → Fix Committed |
Changed in mos: | |
status: | Confirmed → Fix Committed |
Changed in mos: | |
status: | Fix Committed → Fix Released |
I inspected the neutron server logs on all controllers and found that the deadlock usually happens when a router port is created in parallel with DHCP port creation on other servers. More generally, it occurs when ports are created simultaneously. Port creation involves locking the 'ports' and 'binding' tables through the get_locked_port_and_binding() ml2 db method, which essentially does:

    port = (session.query(models_v2.Port).
            enable_eagerloads(False).
            filter_by(id=port_id).
            with_lockmode('update').
            one())
    binding = (session.query(models.PortBinding).
               enable_eagerloads(False).
               filter_by(port_id=port_id).
               with_lockmode('update').
               one())

I'm not sure exactly how this leads to a deadlock. It probably happens due to the specifics of Galera working in active-active
mode: it throws deadlock errors when it fails to validate a change with the other members of the cluster.
I'm going to apply a fix similar to https://review.openstack.org/#/c/180466/. Though it's more of a workaround, it should fix the issue, with the only downside being a slight delay in port creation in very rare circumstances.
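
For illustration only, one common shape for this kind of workaround is to retry the operation when the database reports a deadlock, which would account for the occasional slight delay. Whether the referenced review takes exactly this approach is an assumption; the DBDeadlock class and the retry_on_deadlock decorator below are hypothetical stand-ins written for this sketch, not neutron or oslo.db code.

```python
import functools
import time


class DBDeadlock(Exception):
    """Hypothetical stand-in for the DB layer's deadlock exception."""


def retry_on_deadlock(max_retries=3, delay=0.1):
    """Retry the wrapped callable when it raises DBDeadlock.

    Hypothetical helper: max_retries is the number of extra attempts
    after the first one, delay is the pause in seconds between attempts.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    # Give up once the retry budget is exhausted.
                    if attempt == max_retries:
                        raise
                    # This pause is the "slight delay" paid on a deadlock.
                    time.sleep(delay)
        return wrapper
    return decorator
```

A port-creation function wrapped with @retry_on_deadlock() would then be re-run from the start of its transaction after a deadlock error, instead of surfacing an internal server error to the API caller. The wrapped function has to be safe to re-run as a whole for this to be correct.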