On compute restart, quotas are not updated when instance vm_state is 'DELETED' but instance is not destroyed in db

Bug #1493694 reported by Rajesh Tailor
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Undecided
Rajesh Tailor
Kilo
Fix Released
Undecided
Unassigned

Bug Description

This is a timing issue and can occur if instance delete call reaches to _delete_instance method in nova/compute/manager.py module and nova-compute crashes after setting instance vm_state to 'DELETED' but before destroying the instance in db.

Now on restarting nova-compute service, _init_instance method call checks whether instance vm_state is 'DELETED' or not, if yes, then it tries to call _complete_partial_deletion method and destroys instance in db then raises "ValueError: Circular reference detected" and quota was not updated for that instance which is not as expected.

Steps to reproduce:
1) Put a break point in nova/compute/manager.py module in _delete_instance method, just after updating instance vm_state to 'DELETED' but before destroying instance in db.
2) Create instance and wait until instance vm_state become 'ACTIVE'.
$ nova boot --image <image_id> --flavor <flavor_id> <name>

3) Send request to delete instance.
$ nova delete <instance_id>

4) When delete request reaches to break point in nova-compute, make sure instance vm_state is marked as 'DELETED' and stop the nova-compute service.
5) Restart nova-compute service and in _init_instance method call below error (ValueError: Circular reference detected) will be raised and instance will be marked as deleted in db but quota for that instance will never be updated.

2015-09-08 00:36:34.133 ERROR nova.compute.manager [req-3222b8a4-0542-48cf-a2e1-c92e5fd91e5e None None] [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] Failed to complete a deletion
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] Traceback (most recent call last):
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/opt/stack/nova/nova/compute/manager.py", line 952, in _init_instance
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] self._complete_partial_deletion(context, instance)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/opt/stack/nova/nova/compute/manager.py", line 879, in _complete_partial_d
eletion
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", lin
e 197, in wrapper
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] ctxt, self, fn.__name__, args, kwargs)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/opt/stack/nova/nova/conductor/rpcapi.py", line 246, in object_action
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] objmethod=objmethod, args=args, kwargs=kwargs)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] retry=self.retry)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] timeout=timeout, retry=retry)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 431, in send
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] retry=retry)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 399, in _send
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] msg = rpc_common.serialize_msg(msg)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/common.py", line 286, in serialize_msg
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] _MESSAGE_KEY: jsonutils.dumps(raw_msg)}
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/local/lib/python2.7/dist-packages/oslo_serialization/jsonutils.py", line 185, in dumps
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] return json.dumps(obj, default=default, **kwargs)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] sort_keys=sort_keys, **kw).encode(obj)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] chunks = self.iterencode(o, _one_shot=True)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] return _iterencode(o, 0)
2015-09-08 00:36:34.133 TRACE nova.compute.manager [instance: 00c7a9ae-bff1-461f-ab95-0e8f15327536] ValueError: Circular reference detected

Changed in nova:
assignee: nobody → Rajesh Tailor (rajesh-tailor)
tags: added: compute db quotas
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/226168

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/226168
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fdd85c10de35c21700dd33bb9b2e0a3ff425d83a
Submitter: Jenkins
Branch: master

commit fdd85c10de35c21700dd33bb9b2e0a3ff425d83a
Author: Rajesh Tailor <email address hidden>
Date: Thu Sep 10 02:01:47 2015 -0700

    Fix quota update in init_instance on nova-compute restart

    During instance deletion if nova-compute crashes after setting
    instance vm_state to 'DELETED' but before destroying instance from db,
    then on restart of nova-compute _init_instance method deletes instance
    from db as well but fails to update quota for that instance.

    Since on nova-compute restart that instance was marked as deleted in
    db also, quota for that instance never gets updated.

    Added code to get instance metadata in init_host as it is required in
    'notify_about_instance_usage' method and getting instance
    system_metadata before destroying instance from db
    in _complete_partial_deletion.

    Closes-Bug: 1493694
    Change-Id: Ib4ff381848957b33774a4d7c1bfcf1410d67a657

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-rc1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/229261

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/229261
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=68e9359d88646760030a6fe769af122e70986d8c
Submitter: Jenkins
Branch: stable/kilo

commit 68e9359d88646760030a6fe769af122e70986d8c
Author: Rajesh Tailor <email address hidden>
Date: Thu Sep 10 02:01:47 2015 -0700

    Fix quota update in init_instance on nova-compute restart

    During instance deletion if nova-compute crashes after setting
    instance vm_state to 'DELETED' but before destroying instance from db,
    then on restart of nova-compute _init_instance method deletes instance
    from db as well but fails to update quota for that instance.

    Since on nova-compute restart that instance was marked as deleted in
    db also, quota for that instance never gets updated.

    Added code to get instance metadata in init_host as it is required in
    'notify_about_instance_usage' method and getting instance
    system_metadata before destroying instance from db
    in _complete_partial_deletion.

    Closes-Bug: 1493694
    Change-Id: Ib4ff381848957b33774a4d7c1bfcf1410d67a657
    (cherry picked from commit fdd85c10de35c21700dd33bb9b2e0a3ff425d83a)

tags: added: in-stable-kilo
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-rc1 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.