cells: Object action destroy failed because: host changed

Bug #1448316 reported by Matt Riedemann
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Invalid
Medium
Unassigned

Bug Description

http://logs.openstack.org/58/175458/3/check/check-tempest-dsvm-cells/ce91d7e/logs/screen-n-api.txt.gz?level=TRACE#_2015-04-24_15_21_05_563

2015-04-24 15:21:05.563 ERROR nova.api.openstack [req-99d82867-7bd3-4ca7-9102-27bacb4ab6a0 ListServersNegativeTestJSON-1811537006 ListServersNegativeTestJSON-968745101] Caught error: Object action destroy failed because: host changed
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack Traceback (most recent call last):
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/__init__.py", line 125, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return req.get_response(self.application)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1317, in send
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack application, catch_exc_info=False)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1281, in call_application
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack app_iter = application(self.environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return resp(environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/keystonemiddleware/auth_token/__init__.py", line 639, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return self._call_app(env, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/keystonemiddleware/auth_token/__init__.py", line 559, in _call_app
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return self._app(env, _fake_start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return resp(environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return resp(environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/routes/middleware.py", line 136, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack response = self.app(environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return resp(environ, start_response)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 130, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack resp = self.call_func(req, *args, **self.kwargs)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 195, in call_func
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return self.func(req, *args, **kwargs)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/wsgi.py", line 756, in __call__
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack content_type, body, accept)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/wsgi.py", line 821, in _process_stack
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack action_result = self.dispatch(meth, request, action_args)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/wsgi.py", line 911, in dispatch
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return method(req=request, **action_args)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 834, in delete
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack self._delete(req.environ['nova.context'], req, id)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/api/openstack/compute/servers.py", line 669, in _delete
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack self.compute_api.delete(context, instance)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/compute/cells_api.py", line 212, in delete
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack self._handle_cell_delete(context, instance, 'delete')
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/compute/cells_api.py", line 226, in _handle_cell_delete
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack self._do_delete)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/compute/api.py", line 1766, in _local_delete
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack instance.destroy()
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/objects/base.py", line 208, in wrapper
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack return fn(self, *args, **kwargs)
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/objects/instance.py", line 646, in destroy
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack reason='host changed')
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack ObjectActionError: Object action destroy failed because: host changed
2015-04-24 15:21:05.563 32393 TRACE nova.api.openstack

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiX2hhbmRsZV9jZWxsX2RlbGV0ZVwiIEFORCBtZXNzYWdlOlwiT2JqZWN0QWN0aW9uRXJyb3I6IE9iamVjdCBhY3Rpb24gZGVzdHJveSBmYWlsZWQgYmVjYXVzZTogaG9zdCBjaGFuZ2VkXCIgQU5EIHRhZ3M6XCJzY3JlZW4tbi1hcGkudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0Mjk5MDk4OTQ5NDl9

21 hits in 7 days. It's 90% failure when it happens though.

I also noticed this in the compute logs:

http://logs.openstack.org/58/175458/3/check/check-tempest-dsvm-cells/ce91d7e/logs/screen-n-cpu.txt.gz?level=TRACE#_2015-04-24_15_16_26_866

A separate query on that shows it happens quite a bit though in normal gate runs:

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiTm8gY29tcHV0ZSBub2RlIHJlY29yZCBmb3IgaG9zdFwiIEFORCB0YWdzOlwic2NyZWVuLW4tY3B1LnR4dFwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxNDI5OTA4OTUwOTE2LCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9

Matt Riedemann (mriedem)
tags: added: cells unified-objects
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

By investigating a separate issue, we know that the following pattern (ie. calling twice an instance.save() method) can be racy :
https://github.com/openstack/tempest/blob/2b7373dbb1524ff7be4165714678fdd2fac41590/tempest/api/compute/servers/test_list_servers_negative.py#L45-L46

I just wonder how much https://review.openstack.org/#/c/177356/ could help

Changed in nova:
assignee: nobody → tianzichen306 (tianzichen306)
Revision history for this message
Matt Riedemann (mriedem) wrote :

This is not only showing up with cells jobs:

http://goo.gl/qZs8C3

It looks like it started ramping up hits after 5/16.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This merged on 5/15 and hits the delete path in compute API, and is a bit ugly, so I'm wondering if it introduced a regression:

https://review.openstack.org/#/c/145738/

^ also looks like the cause for a spike in bug 1456771.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/184554

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/184554
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1ba72081715869fe6d0ffba3436fb09ef9499335
Submitter: Jenkins
Branch: master

commit 1ba72081715869fe6d0ffba3436fb09ef9499335
Author: Matt Riedemann <email address hidden>
Date: Wed May 20 17:36:21 2015 +0000

    Revert "Detach volume after deleting instance with no host"

    This reverts commit d1baa9fe7eb342b63fc85cbb5ef70bb676de6566

    The change introduced a race on delete where there isn't a host while
    prepping block devices and we hit a NoneType error in nova-compute.

    There was also a recheck grind on the original change showing it wasn't
    safe and needs to be reworked.

    Change-Id: Id4e405e7579530ed1c1f22ccc972d45b6d185f41
    Closes-Bug: #1456771
    Related-Bug: #1448316

Changed in nova:
assignee: tianzichen306 (tianzichen306) → nobody
Revision history for this message
Matt Riedemann (mriedem) wrote :

This was either fixed by 1ba72081715869fe6d0ffba3436fb09ef9499335 or is a latent issue in cells v1 which we aren't going to focus on fixing since cells v1 is deprecated and will hopefully be removed in the Rocky release.

Changed in nova:
status: Confirmed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.