Fail to unrescue server with quobyte volume attached

Bug #1606136 reported by Silvan Kaiser
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

This error occurs randomly in the Quobyte CI on arbitrary changes.
Nova tries to detach a volume from a VM that is in vm_state error.

CI run examples can be found at [1][2].
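
To make the failure mode concrete, here is a minimal, hedged sketch of the detach call that Nova rejects (raw compute API via python-requests; the endpoint and token are placeholders, only the server and volume IDs are taken from the logs below):

    # Hedged sketch: the detach request that Nova rejects with HTTP 409
    # while the server sits in vm_state 'error'. Endpoint and token are
    # placeholders, not values from the CI logs.
    import requests

    compute_endpoint = 'http://controller:8774/v2.1'    # placeholder
    token = 'TOKEN'                                      # placeholder
    server_id = '35a2d53a-d80d-46b2-87a6-a7a82f9d5244'   # instance from the logs
    volume_id = '351db2c5-9724-410f-b1d8-8680065c0788'   # volume from the logs

    resp = requests.delete(
        '%s/servers/%s/os-volume_attachments/%s'
        % (compute_endpoint, server_id, volume_id),
        headers={'X-Auth-Token': token})
    # Expected while the server is in vm_state 'error':
    # 409 "Cannot 'detach_volume' instance ... while it is in vm_state error"
    print(resp.status_code, resp.text)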

Example output:
==============================
Failed 1 tests - output below:
==============================

tempest.api.compute.servers.test_server_rescue_negative.ServerRescueNegativeTestJSON.test_rescued_vm_detach_volume[id-f56e465b-fe10-48bf-b75d-646cda3a8bc9,negative,volume]
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "tempest/api/compute/servers/test_server_rescue_negative.py", line 80, in _unrescue
        server_id, 'ACTIVE')
      File "tempest/common/waiters.py", line 77, in wait_for_server_status
        server_id=server_id)
    tempest.exceptions.BuildErrorException: Server 35a2d53a-d80d-46b2-87a6-a7a82f9d5244 failed to build and is in ERROR status
    Details: {u'code': 500, u'message': u"Failed to open file '/mnt/quobyte-volume/abfa1002557ab2b21ec218a86487dd92/volume-351db2c5-9724-410f-b1d8-8680065c0788': No such file or directory", u'created': u'2016-07-23T13:03:20Z'}

Captured traceback-2:
~~~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "tempest/api/compute/base.py", line 346, in delete_volume
        cls._delete_volume(cls.volumes_extensions_client, volume_id)
      File "tempest/api/compute/base.py", line 277, in _delete_volume
        volumes_client.delete_volume(volume_id)
      File "tempest/lib/services/compute/volumes_client.py", line 63, in delete_volume
        resp, body = self.delete("os-volumes/%s" % volume_id)
      File "tempest/lib/common/rest_client.py", line 301, in delete
        return self.request('DELETE', url, extra_headers, headers, body)
      File "tempest/lib/services/compute/base_compute_client.py", line 48, in request
        method, url, extra_headers, headers, body, chunked)
      File "tempest/lib/common/rest_client.py", line 664, in request
        resp, resp_body)
      File "tempest/lib/common/rest_client.py", line 828, in _error_checker
        message=message)
    tempest.lib.exceptions.ServerFault: Got server fault
    Details: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
    <class 'nova.exception.InvalidInput'>

Captured traceback-1:
~~~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "tempest/api/compute/servers/test_server_rescue_negative.py", line 73, in _detach
        self.servers_client.detach_volume(server_id, volume_id)
      File "tempest/lib/services/compute/servers_client.py", line 342, in detach_volume
        (server_id, volume_id))
      File "tempest/lib/common/rest_client.py", line 301, in delete
        return self.request('DELETE', url, extra_headers, headers, body)
      File "tempest/lib/services/compute/base_compute_client.py", line 48, in request
        method, url, extra_headers, headers, body, chunked)
      File "tempest/lib/common/rest_client.py", line 664, in request
        resp, resp_body)
      File "tempest/lib/common/rest_client.py", line 777, in _error_checker
        raise exceptions.Conflict(resp_body, resp=resp)
    tempest.lib.exceptions.Conflict: An object with that identifier already exists
    Details: {u'code': 409, u'message': u"Cannot 'detach_volume' instance 35a2d53a-d80d-46b2-87a6-a7a82f9d5244 while it is in vm_state error"}

[1] http://78.46.57.153:8081/refs-changes-38-346438-3/
[2] http://78.46.57.153:8081/refs-changes-58-346358-1/

Silvan Kaiser (2-silvan) wrote:

I'm attaching the logs from one of the referenced CI runs.

Silvan Kaiser (2-silvan)
tags: added: quobyte volumes
Matt Riedemann (mriedem) wrote:

It's a negative test, so it's likely that something is getting torn down in an unexpected order.

The test is here:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/servers/test_server_rescue_negative.py#n147

The test creates a volume, attaches it to the server, rescues the server, and then attempts to detach the volume, which is expected to fail with a 409 Conflict.

It looks like the problem you're hitting is in the unrescue part of the cleanup:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/servers/test_server_rescue_negative.py#n77

When the cleanup unrescues the server, the unrescue fails and the server goes to ERROR.
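
For reference, a condensed, hedged sketch of that flow written against the tempest client APIs (a paraphrase of the linked test, not the verbatim code; the clients and IDs are assumed to be set up elsewhere):

    # Hedged, condensed sketch of test_rescued_vm_detach_volume and the
    # cleanup step that fails in the Quobyte CI. servers_client is assumed
    # to be a pre-built tempest ServersClient.
    from tempest.common import waiters
    from tempest.lib import exceptions as lib_exc

    def rescued_vm_detach_volume(servers_client, server_id, volume_id):
        servers_client.rescue_server(server_id)
        waiters.wait_for_server_status(servers_client, server_id, 'RESCUE')

        # The negative assertion: detaching while rescued must return 409.
        try:
            servers_client.detach_volume(server_id, volume_id)
        except lib_exc.Conflict:
            pass  # expected

        # Cleanup: unrescue and wait for ACTIVE. This is the step that fails
        # here -- the unrescued domain cannot reopen the Quobyte volume file,
        # the server goes to ERROR, and the waiter raises.
        servers_client.unrescue_server(server_id)
        waiters.wait_for_server_status(servers_client, server_id, 'ACTIVE')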

What are the errors in the n-cpu logs for that server when the unrescue fails?

Matt Riedemann (mriedem) wrote:

This is the failure in the compute during the unrescue:

http://paste.openstack.org/show/570328/

2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [req-2bd83902-baec-42cf-824d-d1c16aaedd4f tempest-ServerRescueNegativeTestJSON-727441625 tempest-ServerRescueNegativeTestJSON-727441625] [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] Setting instance vm_state to ERROR
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] Traceback (most recent call last):
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/compute/manager.py", line 6432, in _error_out_instance_on_exception
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] yield
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/compute/manager.py", line 3262, in unrescue_instance
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] network_info)
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2544, in unrescue
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] self._create_domain(xml, virt_dom)
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4650, in _create_domain
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] guest.launch(pause=pause)
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 142, in launch
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] self._encoded_xml, errors='ignore')
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] self.force_reraise()
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] six.reraise(self.type_, self.value, self.tb)
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 137, in launch
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35a2d53a-d80d-46b2-87a6-a7a82f9d5244] return self._domain.createWithFlags(flags)
2016-07-23 13:03:19.943 1365 ERROR nova.compute.manager [instance: 35...
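
The traceback bottoms out in Guest.launch() calling the libvirt domain's createWithFlags(), which fails because the volume's backing file has disappeared from the Quobyte mount (the "Failed to open file ... No such file or directory" in the API response above). A minimal, hedged standalone illustration of that final call (assumes libvirt-python and the qemu:///system URI; the UUID is the instance from the logs):

    # Hedged sketch of the call at the bottom of the traceback: libvirt
    # refuses to start a domain whose disk source file is gone.
    import libvirt

    conn = libvirt.open('qemu:///system')  # assumed connection URI
    dom = conn.lookupByUUIDString('35a2d53a-d80d-46b2-87a6-a7a82f9d5244')
    try:
        dom.createWithFlags(0)
    except libvirt.libvirtError as err:
        # e.g. "Failed to open file '/mnt/quobyte-volume/.../volume-...':
        #       No such file or directory"
        print(err)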


summary: - cannot detach_volume while VM is in error state
+ Fail to unrescue server with quobyte volume attached
Changed in nova:
status: New → Confirmed