DELETE fails with HTTP 500, StaleDataError: UPDATE statement on table 'instance_mappings' expected to update 1 row(s); 0 were matched

Bug #1882608 reported by melanie witt on 2020-06-08
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Medium
melanie witt
Train
Medium
melanie witt
Ussuri
Medium
melanie witt

Bug Description

Noticed in a failed nova-grenade-multinode gate job where a resource cleanup (server delete) during a ServersNegativeTestJSON test results in a 500 error and the job fails with:

Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi [None req-ab8b5ad1-c168-4f7e-9bfc-42b202b9894f tempest-ServersNegativeTestJSON-1435542876 tempest-ServersNegativeTestJSON-1435542876] Unexpected exception in API method: sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'instance_mappings' expected to update 1 row(s); 0 were matched.
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi Traceback (most recent call last):
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/api/openstack/wsgi.py", line 671, in wrapped
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 990, in delete
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._delete(req.environ['nova.context'], req, id)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 798, in _delete
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self.compute_api.delete(context, instance)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 224, in inner
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi return function(self, context, instance, *args, **kwargs)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 151, in inner
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi return f(self, context, instance, *args, **kw)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 2479, in delete
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._delete_instance(context, instance)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 2471, in _delete_instance
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi task_state=task_states.DELETING)
Jun 01 14:33:57.523020 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 2158, in _delete
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._local_delete_cleanup(context, instance)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 2117, in _local_delete_cleanup
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._update_queued_for_deletion(context, instance, True)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/compute/api.py", line 2434, in _update_queued_for_deletion
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi im.save()
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_versionedobjects/base.py", line 226, in wrapper
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi return fn(self, *args, **kwargs)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/opt/stack/new/nova/nova/objects/instance_mapping.py", line 165, in save
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi changes)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1014, in wrapper
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi return fn(*args, **kwargs)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi next(self.gen)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1064, in _transaction_scope
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi yield resource
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi next(self.gen)
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 666, in _session
Jun 01 14:33:57.524852 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self.session.rollback()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self.force_reraise()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi six.reraise(self.type_, self.value, self.tb)
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi raise value
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 663, in _session
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._end_session_transaction(self.session)
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 691, in _end_session_transaction
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi session.commit()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 1042, in commit
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self.transaction.commit()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 504, in commit
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._prepare_impl()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self.session.flush()
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2523, in flush
Jun 01 14:33:57.526814 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi self._flush(objects)
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2664, in _flush
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi transaction.rollback(_capture_exception=True)
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi exc_value, with_traceback=exc_tb,
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 178, in raise_
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi raise exception
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/session.py", line 2624, in _flush
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi flush_context.execute()
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi rec.execute(self)
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi uow,
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 236, in save_obj
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi update,
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 1028, in _emit_update_statements
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi % (table.description, len(records), rows)
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'instance_mappings' expected to update 1 row(s); 0 were matched.
Jun 01 14:33:57.528398 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: ERROR nova.api.openstack.wsgi
Jun 01 14:33:57.530257 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: INFO nova.api.openstack.wsgi [None req-ab8b5ad1-c168-4f7e-9bfc-42b202b9894f tempest-ServersNegativeTestJSON-1435542876 tempest-ServersNegativeTestJSON-1435542876] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
Jun 01 14:33:57.530257 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: <class 'sqlalchemy.orm.exc.StaleDataError'>
Jun 01 14:33:57.530257 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: DEBUG nova.api.openstack.wsgi [None req-ab8b5ad1-c168-4f7e-9bfc-42b202b9894f tempest-ServersNegativeTestJSON-1435542876 tempest-ServersNegativeTestJSON-1435542876] Returning 500 to user: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
Jun 01 14:33:57.530257 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: <class 'sqlalchemy.orm.exc.StaleDataError'> {{(pid=13723) __call__ /opt/stack/new/nova/nova/api/openstack/wsgi.py:941}}
Jun 01 14:33:57.530257 ubuntu-bionic-rax-iad-0016890725 <email address hidden>[13722]: INFO nova.api.openstack.requestlog [None req-ab8b5ad1-c168-4f7e-9bfc-42b202b9894f tempest-ServersNegativeTestJSON-1435542876 tempest-ServersNegativeTestJSON-1435542876] 10.208.194.105 "DELETE /compute/v2.1/servers/d87b9767-d6ac-4c23-ad5b-d1fd139f1662" status: 500 len: 207 microversion: 2.1 time: 0.414127

melanie witt (melwitt) wrote :

I found that this happens when a server is requested to be deleted while it's in the middle of booting, as seen in the ServersNegativeTestJSON code:

    @classmethod
    def resource_setup(cls):
        super(ServersNegativeTestJSON, cls).resource_setup()
        server = cls.create_test_server(wait_until='ACTIVE')
        cls.server_id = server['id']

        server = cls.create_test_server()
        cls.client.delete_server(server['id'])
        waiters.wait_for_server_termination(cls.client, server['id'])
        cls.deleted_server_id = server['id']

There is a race where nova-conductor will delete the instance mapping while nova-api is trying to update the queued_for_delete field for the instance mapping record. When that happens, nova-conductor deletes the instance mapping after nova-api has retrieved it for the intended update, and then nova-api fails with StaleDataError when it tries to save the instance mapping record to the database. We see the following log in screen-n-cond.txt[1]:

Jun 01 14:33:57.487787 ubuntu-bionic-rax-iad-0016890725 nova-conductor[14142]: DEBUG nova.conductor.manager [None req-e73643cb-efb2-445d-a6dc-5c6fd956c989 tempest-ServersNegativeTestJSON-1435542876 tempest-ServersNegativeTestJSON-1435542876] [instance: d87b9767-d6ac-4c23-ad5b-d1fd139f1662] While scheduling instance, the build request was already deleted. {{(pid=15387) schedule_and_build_instances /opt/stack/new/nova/nova/conductor/manager.py:1515}}

which triggers nova-conductor to delete the instance mapping [2]. Then we fail in the delete path while trying to update queued_for_delete [3].

I think we could fix this with a try-except to catch StaleDataError and then raise InstanceMappingNotFound to treat it as a missing instance mapping.

[1] https://zuul.opendev.org/t/openstack/build/58647aab9847469cb1dc474a7e7a1e6d/log/logs/screen-n-cond.txt#2379
[2] https://github.com/openstack/nova/blob/2061ce1125039f3595999457da3a6ad3c202ea2a/nova/conductor/manager.py#L1514-L1524
[3] https://github.com/openstack/nova/blob/2061ce1125039f3595999457da3a6ad3c202ea2a/nova/compute/api.py#L2423-L2434

Fix proposed to branch: master
Review: https://review.opendev.org/734400

Changed in nova:
status: New → In Progress
melanie witt (melwitt) on 2020-06-09
Changed in nova:
importance: Undecided → Medium

Reviewed: https://review.opendev.org/734400
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=16df22dcd57a73fe3be15c64c41b4081b4826ef2
Submitter: Zuul
Branch: master

commit 16df22dcd57a73fe3be15c64c41b4081b4826ef2
Author: melanie witt <email address hidden>
Date: Tue Jun 9 00:27:39 2020 +0000

    Raise InstanceMappingNotFound if StaleDataError is encountered

    We have a race where if a user issues a delete request while an
    instance is in the middle of booting, we could fail to update the
    'queued_for_delete' field on the instance mapping with:

      sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table
      'instance_mappings' expected to update 1 row(s); 0 were matched.

    This happens if we've retrieved the instance mapping record from the
    database and then it gets deleted by nova-conductor before we attempt
    to save() it.

    This handles the situation by adding try-except around the update call
    to catch StaleDataError and raise InstanceMappingNotFound instead,
    which the caller does know how to handle.

    Closes-Bug: #1882608

    Change-Id: I2cdcad7226312ed81f4242c8d9ac919715524b48

Changed in nova:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers