Cells: DBReferenceError possible deleting unscheduled instance

Bug #1462128 reported by melanie witt
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Low
melanie witt
Kilo
Fix Released
Low
Matt Riedemann

Bug Description

In cells, when a delete request comes in before an instance has been scheduled, cells will broadcast a "delete everywhere" to delete the instance in all cells. During the delete code path in compute, instance.save() attempts to save some state along the way before instance.destroy(). If there isn't an instance record in the child database, the FK constraint to save state (for example, flavor), will fail and DBReference error will be raised by oslo.db. In cells, InstanceNotFound is caught and handled for these scenarios. Since a FK constraint failure does mean the instance doesn't exist, it would make sense to raise InstanceNotFound for DBReferenceError.

Logstash query:

message:"DBReferenceError" AND build_name:"check-tempest-dsvm-cells"

Example trace:

http://logs.openstack.org/49/188249/2/check/check-tempest-dsvm-cells/a74468a/logs/screen-n-cell-child.txt.gz#_2015-06-04_07_22_45_806

2015-06-04 07:22:45.806 ERROR nova.cells.messaging [req-388635a4-8e75-4986-af34-99e6eca82f3b ServersNegativeTestJSON-34383368 ServersNegativeTestJSON-350156579] Error processing message locally
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging Traceback (most recent call last):
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/cells/messaging.py", line 201, in _process_locally
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging resp_value = self.msg_runner._process_message_locally(self)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/cells/messaging.py", line 1277, in _process_message_locally
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return fn(message, **message.method_kwargs)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/cells/messaging.py", line 1090, in instance_delete_everywhere
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging self.compute_api.delete(message.ctxt, instance)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 227, in wrapped
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return func(self, context, target, *args, **kwargs)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 216, in inner
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return function(self, context, instance, *args, **kwargs)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 244, in _wrapped
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return fn(self, context, instance, *args, **kwargs)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 197, in inner
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return f(self, context, instance, *args, **kw)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 1820, in delete
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging self._delete_instance(context, instance)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 1810, in _delete_instance
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging task_state=task_states.DELETING)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 1622, in _delete
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging quotas.rollback()
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging six.reraise(self.type_, self.value, self.tb)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/compute/api.py", line 1538, in _delete
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging instance.save()
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/objects/base.py", line 205, in wrapper
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return fn(self, *args, **kwargs)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/objects/instance.py", line 826, in save
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging getattr(self, '_save_%s' % field)(context)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/objects/instance.py", line 716, in _save_flavor
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging {'flavor': jsonutils.dumps(flavor_info)})
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/db/api.py", line 919, in instance_extra_update_by_uuid
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging updates)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2643, in instance_extra_update_by_uuid
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging _instance_extra_create(context, create_values)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2631, in _instance_extra_create
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging inst_extra_ref.save()
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/opt/stack/new/nova/nova/db/sqlalchemy/models.py", line 82, in save
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging super(NovaBase, self).save(session=session)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/models.py", line 48, in save
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging session.flush()
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1985, in flush
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging self._flush(objects)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2103, in _flush
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging transaction.rollback(_capture_exception=True)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging compat.reraise(exc_type, exc_value, exc_tb)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2067, in _flush
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging flush_context.execute()
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging rec.execute(self)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 526, in execute
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging uow
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 65, in save_obj
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging mapper, table, insert)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 602, in _emit_insert_statements
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging execute(statement, params)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return meth(self, multiparams, params)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging return connection._execute_clauseelement(self, multiparams, params)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 938, in _execute_clauseelement
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging compiled_sql, distilled_params
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging context)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/compat/handle_error.py", line 155, in _handle_dbapi_exception
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging e, statement, parameters, cursor, context)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1267, in _handle_dbapi_exception
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging util.raise_from_cause(newraise, exc_info)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging reraise(type(exception), exception, tb=exc_tb)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging context)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 442, in do_execute
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging cursor.execute(statement, parameters)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging self.errorhandler(self, exc, value)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging raise errorclass, errorvalue
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging DBReferenceError: (IntegrityError) (1452, 'Cannot add or update a child row: a foreign key constraint fails (`nova_cell`.`instance_extra`, CONSTRAINT `instance_extra_instance_uuid_fkey` FOREIGN KEY (`instance_uuid`) REFERENCES `instances` (`uuid`))') 'INSERT INTO instance_extra (created_at, updated_at, deleted_at, deleted, instance_uuid, numa_topology, pci_requests, flavor, vcpu_model) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)' (datetime.datetime(2015, 6, 4, 7, 22, 45, 797823), None, None, 0, '29f21049-ae1d-492f-a7f4-616b57abc755', None, None, '{"new": null, "old": null, "cur": {"nova_object.version": "1.1", "nova_object.name": "Flavor", "nova_object.data": {"disabled": false, "root_gb": 0, "name": "m1.nano", "flavorid": "42", "deleted": false, "created_at": "2015-06-04T07:19:00Z", "ephemeral_gb": 0, "updated_at": null, "memory_mb": 64, "vcpus": 1, "extra_specs": {}, "swap": 0, "rxtx_factor": 1.0, "is_public": true, "deleted_at": null, "vcpu_weight": 0, "id": 6}, "nova_object.namespace": "nova"}}', None)
2015-06-04 07:22:45.806 16038 ERROR nova.cells.messaging

Tags: cells
Changed in nova:
status: Triaged → In Progress
melanie witt (melwitt)
tags: added: cells
Revision history for this message
melanie witt (melwitt) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/188261
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5f869d0c04271d4fa058843eebe56e0bd524b12b
Submitter: Jenkins
Branch: master

commit 5f869d0c04271d4fa058843eebe56e0bd524b12b
Author: melanie witt <email address hidden>
Date: Thu Jun 4 02:38:13 2015 +0000

    Raise InstanceNotFound when save FK constraint fails

    Currently, in the Instance.save method, if an ObjectField is dirty,
    it will be updated in the db before the instance record is updated
    (separate transaction). If the instance does not have a row, the
    foreign key constraint fails and DBReferenceError is raised from
    oslo.db. This scenario can happen in cells if an instance delete
    request comes in before it's scheduled, during the broadcast to
    delete in all cells. This is expected and InstanceNotFound is
    caught and handled.

    This change catches DBReferenceError and raises InstanceNotFound.

    Closes-Bug: #1462128

    Change-Id: I821f8c121760e0fb74d7ae5c680bd0c361f92128

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/227451

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/228660

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/227451
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d864603c9df0552441312243ecc4452ed0c96272
Submitter: Jenkins
Branch: stable/kilo

commit d864603c9df0552441312243ecc4452ed0c96272
Author: melanie witt <email address hidden>
Date: Thu Jun 4 02:38:13 2015 +0000

    Raise InstanceNotFound when save FK constraint fails

    Currently, in the Instance.save method, if an ObjectField is dirty,
    it will be updated in the db before the instance record is updated
    (separate transaction). If the instance does not have a row, the
    foreign key constraint fails and DBReferenceError is raised from
    oslo.db. This scenario can happen in cells if an instance delete
    request comes in before it's scheduled, during the broadcast to
    delete in all cells. This is expected and InstanceNotFound is
    caught and handled.

    This change catches DBReferenceError and raises InstanceNotFound.

    Conflicts:
            nova/tests/unit/objects/test_instance.py

    NOTE(mriedem): The test conflict is because we don't have commit
    6d6fdeeff880cc5fe8edcc24ef94d3bc4ca924e2 in stable/kilo and that's
    not needed for this bug fix.

    Closes-Bug: #1462128

    Change-Id: I821f8c121760e0fb74d7ae5c680bd0c361f92128
    (cherry picked from commit 5f869d0c04271d4fa058843eebe56e0bd524b12b)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/228660
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=881488c5d5d9d52cf0b38c35cb929cffa0bcd70d
Submitter: Jenkins
Branch: master

commit 881488c5d5d9d52cf0b38c35cb929cffa0bcd70d
Author: melanie witt <email address hidden>
Date: Mon Sep 28 20:22:37 2015 +0000

    Check DBReferenceError foreign key in Instance.save

    In commit 5f869d0c04271d4fa058843eebe56e0bd524b12b handling of
    DBReferenceError from oslo.db was added to raise InstanceNotFound.
    However, if the target row has foreign key constraints to other
    tables, the DBReferenceError raised doesn't necessarily imply that
    the instance row is missing.

    This change adds a check on the key attribute in the DBReferenceError
    exception object and raises InstanceNotFound only if the key is
    'instance_uuid' -- otherwise, it reraises DBReferenceError.

    Related-Bug: #1462128
    Related-Bug: #1497076

    Change-Id: I131dbc27639626e89735ac0591dc266f35e6ac80

Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-1 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.