Possible race condition between n-cpu and n-api when deleting a building instance

Bug #1914777 reported by David Vallee Delisle
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
melanie witt
Ussuri
Fix Released
High
melanie witt
Victoria
Fix Released
High
melanie witt

Bug Description

Description
===========
In tripleo, we have a ci test that tries to delete an instance while it's building. Basically, we make sure that n-cpu can't boot it, so n-cpu destroys and recreate it all the time, until n-api receives a delete.

When the delete happens in between spawn from n-cpu, it's failing.

https://bugs.launchpad.net/tripleo/+bug/1914174 comment #6 has all the details.

Steps to reproduce
==================
* Create a server
* Delete right between the moment where n-cpu destroys it and respawns it.

Expected result
===============
We should delete

Actual result
=============
This traceback [1]

[1]
~~~
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi [req-c29af162-9e0a-4c1d-b55f-c11470e412ba 629d59dfb295418a93fbc75ef8a4c36e 2ea8047d0ed14eb48b8a6ee5070df67a - default default] Unexpected exception in API method: AttributeError: 'NoneType' object has no attribute 'uuid'
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/objects/instance.py", line 655, in destroy
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi hard_delete=hard_delete)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/db/api.py", line 549, in instance_destroy
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi hard_delete=hard_delete)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/api.py", line 179, in wrapper
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi ectxt.value = e.inner_exc
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi self.force_reraise()
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi six.reraise(self.type_, self.value, self.tb)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi raise value
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/api.py", line 222, in wrapped
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return f(context, *args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/api.py", line 1166, in instance_destroy
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi raise exception.ConstraintNotMet()
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi nova.exception.ConstraintNotMet: Constraint not met.
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi During handling of the above exception, another exception occurred:
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2170, in _delete
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi if self._delete_while_booting(context, instance):
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2112, in _delete_while_booting
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi instance.destroy()
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return fn(self, *args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/objects/instance.py", line 659, in destroy
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi reason='host changed')
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi nova.exception.ObjectActionError: Object action destroy failed because: host changed
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi During handling of the above exception, another exception occurred:
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/api/openstack/wsgi.py", line 671, in wrapped
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/api/openstack/compute/servers.py", line 993, in delete
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi self._delete(req.environ['nova.context'], req, id)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/api/openstack/compute/servers.py", line 798, in _delete
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi self.compute_api.delete(context, instance)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 225, in inner
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return function(self, context, instance, *args, **kwargs)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 152, in inner
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi return f(self, context, instance, *args, **kw)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2492, in delete
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi self._delete_instance(context, instance)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2484, in _delete_instance
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi task_state=task_states.DELETING)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2207, in _delete
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi self._local_delete_cleanup(context, instance)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/site-packages/nova/compute/api.py", line 2124, in _local_delete_cleanup
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi context, instance.uuid)
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi AttributeError: 'NoneType' object has no attribute 'uuid'
nova-api.log.1:2021-02-02 08:51:20.093 19 ERROR nova.api.openstack.wsgi
~~~

Revision history for this message
melanie witt (melwitt) wrote :

Copy pasting my comment from the other bug:

"It looks like the following nova change introduced a regression in our handling of "delete while booting" instances:

https://review.opendev.org/c/openstack/nova/+/702368

and this bug has apparently already been reported:

https://bugs.launchpad.net/tripleo/+bug/1892837

We will need to fix this in nova and backport it through to ussuri."

Marking this as High because it's a regression.

Changed in nova:
assignee: nobody → melanie witt (melwitt)
importance: Undecided → High
status: New → In Progress
tags: added: api
Revision history for this message
melanie witt (melwitt) wrote :

Regression tests and fix proposed starting here:

https://review.opendev.org/c/openstack/nova/+/775307

Revision history for this message
melanie witt (melwitt) wrote :
Changed in nova:
status: In Progress → Fix Released
Revision history for this message
melanie witt (melwitt) wrote :
Revision history for this message
melanie witt (melwitt) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 21.2.0

This issue was fixed in the openstack/nova 21.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 22.2.0

This issue was fixed in the openstack/nova 22.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/805605

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/805605
Committed: https://opendev.org/openstack/nova/commit/14e43f385e6d243b6efd11a777d082e63b66367c
Submitter: "Zuul (22348)"
Branch: master

commit 14e43f385e6d243b6efd11a777d082e63b66367c
Author: Balazs Gibizer <email address hidden>
Date: Mon Aug 23 10:56:58 2021 +0200

    Avoid unbound instance_uuid var during delete

    The patch I03cf285ad83e09d88cdb702a88dfed53c01610f8 fixed most of the
    possible cases for this to happen but missed one. An early enough
    exception during _delete() can cause that the instance_uuid never gets
    defined but then we try to use it during the finally block. This patch
    moves the saving of the instance_uuid to the top of the try block to
    avoid the issue.

    Change-Id: Ib3073d7f595c8927532b7c49fc7e5ffe80d508b9
    Closes-Bug: #1940812
    Related-Bug: #1914777

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/816488

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/816488
Committed: https://opendev.org/openstack/nova/commit/00cba396134846dbbcf68cc9cf08e80618322bbe
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 00cba396134846dbbcf68cc9cf08e80618322bbe
Author: Balazs Gibizer <email address hidden>
Date: Mon Aug 23 10:56:58 2021 +0200

    Avoid unbound instance_uuid var during delete

    The patch I03cf285ad83e09d88cdb702a88dfed53c01610f8 fixed most of the
    possible cases for this to happen but missed one. An early enough
    exception during _delete() can cause that the instance_uuid never gets
    defined but then we try to use it during the finally block. This patch
    moves the saving of the instance_uuid to the top of the try block to
    avoid the issue.

    Change-Id: Ib3073d7f595c8927532b7c49fc7e5ffe80d508b9
    Closes-Bug: #1940812
    Related-Bug: #1914777
    (cherry picked from commit 14e43f385e6d243b6efd11a777d082e63b66367c)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/828839

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/828839
Committed: https://opendev.org/openstack/nova/commit/fc20dc2ea23d511a0f741901d7f03135bb269165
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit fc20dc2ea23d511a0f741901d7f03135bb269165
Author: Balazs Gibizer <email address hidden>
Date: Mon Aug 23 10:56:58 2021 +0200

    Avoid unbound instance_uuid var during delete

    The patch I03cf285ad83e09d88cdb702a88dfed53c01610f8 fixed most of the
    possible cases for this to happen but missed one. An early enough
    exception during _delete() can cause that the instance_uuid never gets
    defined but then we try to use it during the finally block. This patch
    moves the saving of the instance_uuid to the top of the try block to
    avoid the issue.

    Change-Id: Ib3073d7f595c8927532b7c49fc7e5ffe80d508b9
    Closes-Bug: #1940812
    Related-Bug: #1914777
    (cherry picked from commit 14e43f385e6d243b6efd11a777d082e63b66367c)
    (cherry picked from commit 00cba396134846dbbcf68cc9cf08e80618322bbe)

tags: added: in-stable-wallaby
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.