FlushError for automatic IPAllocations when subnet create is retried

Bug #1706750 reported by Ihar Hrachyshka on 2017-07-26
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
neutron
High
Ihar Hrachyshka

Bug Description

The error is seen in Red Hat CI: https://bugzilla.redhat.com/show_bug.cgi?id=1470784
There are hits in Kibana too: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22FlushError%3A%20New%20instance%20%3CIPAllocation%20at%5C%22

In server.log:

2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation [req-3c97d83e-3beb-4b28-a29f-697fc1ee8bc6 5f2e0d72ef8e4fb18b500759cfe5e503 205f2a3630a0426f8da20b4467ea003b - default default] POST failed.: FlushError: New instance <IPAllocation at 0x8b3f7d0> with identity key (<class 'neutron.db.models_v2.IPAllocation'>, ('2003::1:f816:3eff:fe3f:aa00', '8032a1cd-9236-4126-83fa-2f039ed2b890', 'c0542b84-bcd0-48e1-ace1-6da1f50246dc')) conflicts with persistent instance <IPAllocation at 0x89ac310>
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation Traceback (most recent call last):
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/pecan/core.py", line 678, in __call__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.invoke_controller(controller, args, kwargs, state)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/pecan/core.py", line 569, in invoke_controller
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation result = controller(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 94, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation setattr(e, '_RETRY_EXCEEDED', True)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 90, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 150, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation ectxt.value = e.inner_exc
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 138, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 129, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation LOG.debug("Retry wrapper got retriable exception: %s", e)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 125, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*dup_args, **dup_kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/pecan_wsgi/controllers/utils.py", line 70, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/pecan_wsgi/controllers/resource.py", line 152, in post
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return self.create(resources)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/pecan_wsgi/controllers/resource.py", line 170, in create
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return {key: creator(*creator_args, **creator_kwargs)}
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 665, in inner
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(self, context, *args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 164, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return method(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 94, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation setattr(e, '_RETRY_EXCEEDED', True)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 90, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 150, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation ectxt.value = e.inner_exc
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 138, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 129, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation LOG.debug("Retry wrapper got retriable exception: %s", e)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 125, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*dup_args, **dup_kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 985, in create_subnet
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation result, mech_context = self._create_subnet_db(context, subnet)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 977, in _create_subnet_db
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self._create_subnet_postcommit(context, result, net_db, ipam_sub)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 164, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return method(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 94, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation setattr(e, '_RETRY_EXCEEDED', True)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 90, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 150, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation ectxt.value = e.inner_exc
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 138, in wrapper
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 129, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation LOG.debug("Retry wrapper got retriable exception: %s", e)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 125, in wrapped
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation return f(*dup_args, **dup_kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/db_base_plugin_v2.py", line 675, in _create_subnet_postcommit
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation result, ipam_subnet)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/ipam_pluggable_backend.py", line 491, in add_auto_addrs_on_network_ports
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation allocated.create()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/objects/base.py", line 225, in decorator
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation res = func(self, *args, **kwargs)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/objects/base.py", line 632, in create
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.modify_fields_to_db(fields))
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/objects/db/api.py", line 61, in create_object
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation context.session.add(db_obj)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 567, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.rollback()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation compat.reraise(exc_type, exc_value, exc_tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 564, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.commit()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 461, in commit
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self._prepare_impl()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 441, in _prepare_impl
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self.session.flush()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2171, in flush
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation self._flush(objects)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2291, in _flush
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation transaction.rollback(_capture_exception=True)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation compat.reraise(exc_type, exc_value, exc_tb)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2255, in _flush
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation flush_context.execute()
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 389, in execute
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation rec.execute(self)
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 548, in execute
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation uow
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 156, in save_obj
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation base_mapper, states, uowtransaction
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 312, in _organize_states_for_save
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation state_str(existing)))
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation FlushError: New instance <IPAllocation at 0x8b3f7d0> with identity key (<class 'neutron.db.models_v2.IPAllocation'>, ('2003::1:f816:3eff:fe3f:aa00', '8032a1cd-9236-4126-83fa-2f039ed2b890', 'c0542b84-bcd0-48e1-ace1-6da1f50246dc')) conflicts with persistent instance <IPAllocation at 0x89ac310>
2017-07-13 15:20:51.064 85185 ERROR neutron.pecan_wsgi.hooks.translation

Just before the traceback, we see retry triggered by StaleDataError:

2017-07-13 15:20:50.668 85185 DEBUG neutron.db.api [req-3c97d83e-3beb-4b28-a29f-697fc1ee8bc6 5f2e0d72ef8e4fb18b500759cfe5e503 205f2a3630a0426f8da20b4467ea003b - default default] Retry wrapper got retriable exception: UPDATE statement on table 'standardattributes' expected to update 1 row(s); 0 were matched. wrapped /usr/lib/python2.7/site-packages/neutron/db/api.py:129
2017-07-13 15:20:50.669 85185 DEBUG oslo_db.api [req-3c97d83e-3beb-4b28-a29f-697fc1ee8bc6 5f2e0d72ef8e4fb18b500759cfe5e503 205f2a3630a0426f8da20b4467ea003b - default default] Performing DB retry for function neutron.db.db_base_plugin_v2._create_subnet_postcommit wrapper /usr/lib/python2.7/site-packages/oslo_db/api.py:152

This is probably related to https://review.openstack.org/#/c/396718/

Changed in neutron:
importance: Undecided → High
assignee: nobody → Ihar Hrachyshka (ihar-hrachyshka)
status: New → Confirmed
tags: added: db l3-ipam-dhcp
Changed in neutron:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/485328
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d207fb92354858e7443113a4b5d642a465a3af27
Submitter: Jenkins
Branch: master

commit d207fb92354858e7443113a4b5d642a465a3af27
Author: Ihar Hrachyshka <email address hidden>
Date: Wed Jul 19 11:56:55 2017 -0700

    Fixed FlushError on subnet creation retry

    Automatically allocated fixed IP addresses were not reflected on cached
    port records, which triggered FlushError if the allocating method failed
    and triggered a retry.

    Change-Id: Ia7725094827b554aa6d928c7daa026959a237991
    Closes-Bug: #1706750

Changed in neutron:
status: In Progress → Fix Released

This issue was fixed in the openstack/neutron 11.0.0.0rc1 release candidate.

tags: added: neutron-proactive-backport-potential
tags: removed: neutron-proactive-backport-potential

We still see this failure happening in Red Hat CI with the same symptoms for OSP12 (Pike), reopening.

Changed in neutron:
status: Fix Released → Confirmed

For reference, another OSP bug for Pike with the failure: https://bugzilla.redhat.com/show_bug.cgi?id=1506668

Some more info:

just before the failure of subnet create request, the following error happens for a port delete request sent by dhcp agent:

2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server [req-95a13ee8-93a2-4142-b83f-8640752f5523 - - - - -] Exception during message handling: IpAddressAllocationNotFound: Unable to find IP address 2003::f816:3eff:fe73:8a72 on subnet 0c005759-7305-419e-9d0e-3b1ddd4c7373
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/api/rpc/handlers/dhcp_rpc.py", line 246, in release_dhcp_port
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server plugin.delete_ports_by_device_id(context, device_id, network_id)
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/db/db_base_plugin_v2.py", line 1348, in delete_ports_by_device_id
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server self.delete_port(context, port_id)
[...]
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/db/db_base_plugin_v2.py", line 1336, in delete_port
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server self.ipam.delete_port(context, id)
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/db/ipam_pluggable_backend.py", line 429, in delete_port
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server port['fixed_ips'])
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/db/ipam_pluggable_backend.py", line 95, in _ipam_deallocate_ips
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server "external system for %s", addresses)
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/neutron/ipam/drivers/neutrondb_ipam/driver.py", line 227, in deallocate
2017-10-25 21:54:45.861 130093 ERROR oslo_messaging.rpc.server ip_address=address)

The attempts was retried:

controller-0/var/log/neutron/server.log:2017-10-25 21:54:45.343 130093 DEBUG oslo_db.api [req-95a13ee8-93a2-4142-b83f-8640752f5523 - - - - -] Performing DB retry for function neutron.plugins.ml2.plugin.delete_port wrapper /usr/lib/python2.7/site-packages/oslo_db/api.py:152

It may be the case where the latter retry recreates the ipam allocation object that server later clashes on.

One thing to clarify (that took me a while to realize) is that the clash happens on IPAllocation objects, but DHCP agent port delete failure happens on IpamAllocation object.

Related fix proposed to branch: master
Review: https://review.openstack.org/518989

With https://review.openstack.org/#/c/518989/ the delete_port retry succeeds in the server log:

2017-11-21 01:07:39.953 112684 DEBUG neutron.db.ipam_pluggable_backend [req-06d844ff-e5f5-44cf-b039-42947abb7148 - - - - -] Address 2003::f816:3eff:fe23:4bfe already deallocated. _ipam_deallocate_ips /usr/lib/python2.7/site-packages/neutron/db/ipam_pluggable_backend.py:79

Sadly, the subnet_create part is still messed by the retry and bails with FlushError. We may need to combine the deallocation fix (for delete port) with allocation one (for subnet create) to fix the race completely.

Related fix proposed to branch: master
Review: https://review.openstack.org/529169

Download full text (4.2 KiB)

This is what we see in the neutron database when it happens:

MariaDB [ovs_neutron]> select * from ipallocations;
+--------------------------------------+---------------------------+--------------------------------------+--------------------------------------+
| port_id | ip_address | subnet_id | network_id |
+--------------------------------------+---------------------------+--------------------------------------+--------------------------------------+
| 05c2f5b2-3f78-4056-a228-1570f9b33a86 | 169.254.192.3 | 88144aa4-c7e4-4add-bc25-ef7a84c48b90 | 30f0d27e-28ca-4c2f-9328-dcb8f60b8b3b |
| 1e9fde35-e827-4a84-9864-3f27ba133922 | 169.254.192.11 | 88144aa4-c7e4-4add-bc25-ef7a84c48b90 | 30f0d27e-28ca-4c2f-9328-dcb8f60b8b3b |
| 32f6ff50-e947-45ef-8c7d-e3dc2ee6134b | 2003::f816:3eff:fec4:2a93 | eed623f4-cbad-488c-b230-7f4a8b1514eb | eb341132-6d55-4fb6-ba47-9dc2c483279a |
| 492fcd7b-3765-4a7b-a3e4-08a7691b81b5 | 2003::f816:3eff:fecb:6ecd | 6460e359-02c5-4414-a5ed-b0c0b16442d8 | 1b5068f9-0137-4ffd-a728-59ebf475da89 |
| 687018c4-33ea-46c3-b222-08f5eccde4a5 | 2003::f816:3eff:fed2:5006 | eed623f4-cbad-488c-b230-7f4a8b1514eb | eb341132-6d55-4fb6-ba47-9dc2c483279a |
| 839c36b3-7163-456f-8abc-5cb0379289bd | 2003::f816:3eff:fea0:8a1b | ae170789-7e4a-4bad-8e71-ad450b080ec3 | ad34d9bd-af3f-4c14-a6ce-c6f71755b010 |
| 91219016-93b9-4bba-92ad-e9420a68e073 | 2003::f816:3eff:fe2f:c74 | ae170789-7e4a-4bad-8e71-ad450b080ec3 | ad34d9bd-af3f-4c14-a6ce-c6f71755b010 |
| 9b57e452-53a4-4f06-984d-93b5be91e1fb | 2003::f816:3eff:feef:a17c | eed623f4-cbad-488c-b230-7f4a8b1514eb | eb341132-6d55-4fb6-ba47-9dc2c483279a |
| bc07db93-e417-4dd3-91d2-110b125d5c6d | 2003::f816:3eff:fef3:90e6 | ae170789-7e4a-4bad-8e71-ad450b080ec3 | ad34d9bd-af3f-4c14-a6ce-c6f71755b010 |
| c2cbbb71-9297-413c-9e8e-3ad6c480eb3d | 10.0.0.222 | 77cf5146-93fd-4568-b3a9-bd5f01ae4933 | 6ad06166-6413-4316-93ed-26a4f13d22f5 |
| c2f43e30-b7d1-45f3-ba1b-12354fb3a03b | 2003::f816:3eff:feb1:17e | 6460e359-02c5-4414-a5ed-b0c0b16442d8 | 1b5068f9-0137-4ffd-a728-59ebf475da89 |
| d0b0acfc-351d-44cf-996c-7feb940def55 | 169.254.192.9 | 88144aa4-c7e4-4add-bc25-ef7a84c48b90 | 30f0d27e-28ca-4c2f-9328-dcb8f60b8b3b |
| f0a5b408-8cdb-47ce-839d-11c26538ece3 | 2003::f816:3eff:feae:40c6 | 6460e359-02c5-4414-a5ed-b0c0b16442d8 | 1b5068f9-0137-4ffd-a728-59ebf475da89 |
+--------------------------------------+---------------------------+--------------------------------------+--------------------------------------+
13 rows in set (0.00 sec)

MariaDB [ovs_neutron]> select * from ipamallocations;
+---------------------------+-----------+--------------------------------------+
| ip_address | status | ipam_subnet_id |
+---------------------------+-----------+--------------------------------------+
| 10.0.0.222 | ALLOCATED | 0a5b582e-cd96-485d-8706-a81d881be94c |
| 169.254.192.11 | ALLOCATED | fb50227f-d8b7-4b67-8f8e-b1294f2ef4fb |
| 169.254.192.3 | ALLOCATED | fb50227f-d8b7-4b67-8f8e-b1294f2ef4fb |
| 169.254.192.9 ...

Read more...

Changed in neutron:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/529169
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d4c41071645d0c25485d93886fbc09ca671ea18b
Submitter: Zuul
Branch: master

commit d4c41071645d0c25485d93886fbc09ca671ea18b
Author: Ihar Hrachyshka <email address hidden>
Date: Tue Dec 19 13:54:02 2017 -0800

    ipam: don't commit IPAllocation while IpamAllocation is rolled back

    Don't mix old and new session management styles. Mixing them is not
    guaranteed to work. The method is not switched to new enginefacade yet,
    so stick to the old style. Since OVO create() method already opens a
    nested transaction, do nothing in IPAM layer.

    This change fixes an issue where sometimes, whenever
    add_auto_addrs_on_network_ports raises a retriable exception like
    StaleDataError, corresponding IPAllocation is still persisted in the
    database without being rolled back and without the corresponding
    IpamAllocation model persisted. Later, this situation may break port and
    network deletion for resources affected by the issue, failing with the
    following error:

    Unable to find IP address 2003::f816:3eff:fed2:5006 on subnet
    eed623f4-cbad-488c-b230-7f4a8b1514eb

    It happens because whenever writer.using context manager exits, and it's
    top level manager, it commits the changes to database. Since
    writer.using doesn't know that it's being executed in scope of the old
    facade .begin() call, it doesn't postpone commitment as needed.

    The bug was introduced in Pike as part of the switch to the new
    enginefacade (that was never fully completed), specifically by:
    I50be115ea69f805b48b02aebe4259ec2c839830e

    For existing setups that already have IPAllocation objects without
    IpamAllocation counterparts, the solution is to remove those
    dangling IPAllocations manually from the database. (That's of course
    assuming they use Neutron DB IPAM plugin in the first place.)

    Note: If we ever decide to remove nested transactions from OVO base
    layer, we may need to revisit this code to keep ignore-on-failure
    semantics.

    Change-Id: Ic9e146f51d9a120011892828d3a67b0630f4c5ce
    Closes-Bug: #1706750

Changed in neutron:
status: In Progress → Fix Released
tags: added: neutron-proactive-backport-potential

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: master
Review: https://review.openstack.org/518989

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: master
Review: https://review.openstack.org/518988

Reviewed: https://review.openstack.org/529616
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=78e3dbc4756652721b8a7dc9b5b0bb9d33022f08
Submitter: Zuul
Branch: stable/pike

commit 78e3dbc4756652721b8a7dc9b5b0bb9d33022f08
Author: Ihar Hrachyshka <email address hidden>
Date: Tue Dec 19 13:54:02 2017 -0800

    ipam: don't commit IPAllocation while IpamAllocation is rolled back

    Don't mix old and new session management styles. Mixing them is not
    guaranteed to work. The method is not switched to new enginefacade yet,
    so stick to the old style. Since OVO create() method already opens a
    nested transaction, do nothing in IPAM layer.

    This change fixes an issue where sometimes, whenever
    add_auto_addrs_on_network_ports raises a retriable exception like
    StaleDataError, corresponding IPAllocation is still persisted in the
    database without being rolled back and without the corresponding
    IpamAllocation model persisted. Later, this situation may break port and
    network deletion for resources affected by the issue, failing with the
    following error:

    Unable to find IP address 2003::f816:3eff:fed2:5006 on subnet
    eed623f4-cbad-488c-b230-7f4a8b1514eb

    It happens because whenever writer.using context manager exits, and it's
    top level manager, it commits the changes to database. Since
    writer.using doesn't know that it's being executed in scope of the old
    facade .begin() call, it doesn't postpone commitment as needed.

    The bug was introduced in Pike as part of the switch to the new
    enginefacade (that was never fully completed), specifically by:
    I50be115ea69f805b48b02aebe4259ec2c839830e

    For existing setups that already have IPAllocation objects without
    IpamAllocation counterparts, the solution is to remove those
    dangling IPAllocations manually from the database. (That's of course
    assuming they use Neutron DB IPAM plugin in the first place.)

    Note: If we ever decide to remove nested transactions from OVO base
    layer, we may need to revisit this code to keep ignore-on-failure
    semantics.

    Change-Id: Ic9e146f51d9a120011892828d3a67b0630f4c5ce
    Closes-Bug: #1706750
    (cherry picked from commit d4c41071645d0c25485d93886fbc09ca671ea18b)

tags: added: in-stable-pike

This issue was fixed in the openstack/neutron 12.0.0.0b3 development milestone.

This issue was fixed in the openstack/neutron 11.0.3 release.

tags: removed: neutron-proactive-backport-potential
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.