test_create_server_from_volume_snapshot tests failing with Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating

Bug #1863750 reported by chandan kumar
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | Critical | Unassigned |

Bug Description

tripleo-ci-centos-7-scenario001-standalone uses validate-tempest to run tempest tests on stable/stein.

https://0e4946a809d3ba8ed039-2e4779f3cb2ed6b8482a8051dc8df3aa.ssl.cf1.rackcdn.com/703764/1/check/tripleo-ci-centos-7-scenario001-standalone/e78d66f/logs/undercloud/home/zuul/tempest.log

Currently tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_create_server_from_volume_snapshot ([1042.077890s] ... FAILED) is failing continuously in the gate job:
2020-02-16 16:56:04 |
2020-02-16 16:56:04 | Captured traceback-1:
2020-02-16 16:56:04 | ~~~~~~~~~~~~~~~~~~~~~
2020-02-16 16:56:04 | Traceback (most recent call last):
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/common/utils/test_utils.py", line 84, in call_and_ignore_notfound_exc
2020-02-16 16:56:04 | return func(*args, **kwargs)
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/services/volume/v3/volumes_client.py", line 125, in delete_volume
2020-02-16 16:56:04 | resp, body = self.delete(url)
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 314, in delete
2020-02-16 16:56:04 | return self.request('DELETE', url, extra_headers, headers, body)
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/services/volume/base_client.py", line 38, in request
2020-02-16 16:56:04 | method, url, extra_headers, headers, body, chunked)
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 679, in request
2020-02-16 16:56:04 | self._error_checker(resp, resp_body)
2020-02-16 16:56:04 | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 790, in _error_checker
2020-02-16 16:56:04 | raise exceptions.BadRequest(resp_body, resp=resp)
2020-02-16 16:56:04 | tempest.lib.exceptions.BadRequest: Bad request
2020-02-16 16:56:04 | Details: {u'message': u'Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer.', u'code': 400}

and during cleanup:
2020-02-16 16:43:12 | {0} tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_create_server_from_volume_snapshot [1042.077890s] ... FAILED
2020-02-16 16:43:12 |
2020-02-16 16:43:12 | Captured traceback-2:
2020-02-16 16:43:12 | ~~~~~~~~~~~~~~~~~~~~~
2020-02-16 16:43:12 | Traceback (most recent call last):
2020-02-16 16:43:12 | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 891, in wait_for_resource_deletion
2020-02-16 16:43:12 | raise exceptions.TimeoutException(message)
2020-02-16 16:43:12 | tempest.lib.exceptions.TimeoutException: Request timed out
2020-02-16 16:43:12 | Details: (TestVolumeBootPattern:_run_cleanups) Failed to delete volume eceb2182-a20d-46b5-aac8-eaa00757df7c within the required time (500 s).

The test fails because the volume could not be cleaned up within the required time.
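To make the timeout in traceback-2 concrete, here is a minimal sketch of a poll-until-deleted loop. It is not tempest's implementation of wait_for_resource_deletion; the names wait_for_deletion and DeletionTimeout, and the clock/sleep parameters, are hypothetical and only illustrate the general pattern. The 500 s figure in the usage note comes from the log above.

```python
import time


class DeletionTimeout(Exception):
    """Hypothetical exception: the resource still exists after the timeout."""


def wait_for_deletion(is_deleted, timeout, interval,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll is_deleted() until it returns True or `timeout` seconds elapse.

    Returns the number of seconds waited. `clock` and `sleep` are injectable
    so the loop can be exercised without real waiting.
    """
    start = clock()
    while True:
        if is_deleted():
            return clock() - start
        if clock() - start >= timeout:
            raise DeletionTimeout(
                "resource not deleted within the required time (%s s)" % timeout)
        sleep(interval)
```

With timeout=500, a volume whose delete request was rejected never disappears, so a loop like this can only end by raising the timeout, which matches the pair of errors seen in the job.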

We will add more details later. To unblock the CI, we are moving the test to the skip list.

From cinder volume logs:
https://0e4946a809d3ba8ed039-2e4779f3cb2ed6b8482a8051dc8df3aa.ssl.cf1.rackcdn.com/703764/1/check/tripleo-ci-centos-7-scenario001-standalone/e78d66f/logs/undercloud/var/log/containers/cinder/cinder-volume.log

2020-02-16 16:26:32.118 48 ERROR cinder.volume.manager [req-f8219c09-efdd-4b13-9b71-dd2d6dae20bc af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] Delete snapshot failed, due to snapshot busy.: SnapshotIsBusy: deleting snapshot snapshot-4f964506-a45c-4067-8698-4f5f85155dac that has dependent volumes

description: updated
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/708368

description: updated
wes hayutin (weshayutin)
Changed in tripleo:
status: Confirmed → Triaged
Alan Bishop (alan-bishop) wrote :

Using the cinder-volume logs, I traced the problem to a race condition in the cleanup tasks that execute after the test completes (BTW, the test itself passes!). Here are the relevant UUIDs:
eceb2182-a20d-46b5-aac8-eaa00757df7c : Base volume ID
4f964506-a45c-4067-8698-4f5f85155dac : Snapshot ID
49a168f2-593e-48ae-8156-def0b25e9052 : Bootable volume created from the snapshot
9acb28c5-0ad5-4d81-aabd-786b8022e6a2 : Nova server ID
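For anyone who wants to repeat this kind of tracing, the sketch below filters log lines by the UUIDs listed here and orders them by timestamp. It is only an illustration: the trace helper and the ROLES mapping are hypothetical names, and the timestamp pattern assumes the "YYYY-MM-DD HH:MM:SS.mmm" prefix seen in the cinder-volume lines quoted in this report.

```python
import re

# UUIDs and their roles, taken from the list in this comment.
ROLES = {
    "eceb2182-a20d-46b5-aac8-eaa00757df7c": "base volume",
    "4f964506-a45c-4067-8698-4f5f85155dac": "snapshot",
    "49a168f2-593e-48ae-8156-def0b25e9052": "bootable volume",
    "9acb28c5-0ad5-4d81-aabd-786b8022e6a2": "server",
}

# Leading timestamp as it appears in the quoted cinder-volume log lines.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def trace(lines, roles=ROLES):
    """Return (timestamp, roles mentioned, line) tuples in time order.

    Lines without a leading timestamp, or that mention none of the
    tracked UUIDs, are skipped.
    """
    events = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        mentioned = [role for uuid, role in roles.items() if uuid in line]
        if mentioned:
            events.append((match.group(1), mentioned, line))
    # The fixed-width timestamp format sorts correctly as a plain string.
    return sorted(events, key=lambda event: event[0])
```

Passing the lines of cinder-volume.log to trace() would give a per-resource timeline of the kind used to reconstruct the sequence in this comment.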
From the cinder-volume logs [1], the sequence is:
[1] https://0e4946a809d3ba8ed039-2e4779f3cb2ed6b8482a8051dc8df3aa.ssl.cf1.rackcdn.com/703764/1/check/tripleo-ci-centos-7-scenario001-standalone/e78d66f/logs/undercloud/var/log/containers/cinder/cinder-volume.log
1. Base volume is created
2020-02-16 16:26:00.180 48 INFO cinder.volume.flows.manager.create_volume [req-e2a16913-c477-4c2d-9be3-15f4967d8b40 af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] Volume volume-eceb2182-a20d-46b5-aac8-eaa00757df7c (eceb2182-a20d-46b5-aac8-eaa00757df7c): created successfully
2. Bootable volume is created from snapshot
2020-02-16 16:26:10.488 48 DEBUG cinder.volume.drivers.rbd [req-bf133539-6e24-4c13-b544-c9f5415199a4 af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] cloning volumes/volume-eceb2182-a20d-46b5-aac8-eaa00757df7c@snapshot-4f964506-a45c-4067-8698-4f5f85155dac to volume-49a168f2-593e-48ae-8156-def0b25e9052 _clone /usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py:918
3. Bootable volume is attached to the server
2020-02-16 16:26:14.988 48 DEBUG cinder.volume.drivers.rbd [req-b9bed9ae-f661-4a37-beec-2c6c75655bcb af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] connection data: {'driver_volume_type': 'rbd', 'data': {'secret_uuid': '4b5c8c0a-ff60-454b-a1b4-9747aa737d19', 'volume_id': u'49a168f2-593e-48ae-8156-def0b25e9052', 'auth_username': 'openstack', 'secret_type': 'ceph', 'name': u'volumes/volume-49a168f2-593e-48ae-8156-def0b25e9052', 'discard': True, 'keyring': None, 'cluster_name': 'ceph', 'hosts': [u'192.168.24.2'], 'auth_enabled': True, 'ports': [u'3300']}} initialize_connection /usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py:1404
4. After test finishes, cleanup detaches the volume
2020-02-16 16:26:30.341 48 INFO cinder.volume.manager [req-37847165-24e4-4d4a-87e9-4b7759406025 af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] Detaching volume 49a168f2-593e-48ae-8156-def0b25e9052 from instance 9acb28c5-0ad5-4d81-aabd-786b8022e6a2.
5. Request received to delete the bootable volume

2020-02-16 16:26:31.143 48 DEBUG cinder.volume.drivers.rbd [req-69f4084a-0fcb-4021-8a77-d0fd867608da af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] deleting rbd volume volume-49a168f2-593e-48ae-8156-def0b25e9052 delete_volume /usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py:1102

6. Request to delete the snapshot is rejected
2020-02-16 16:26:32.118 48 ERROR cinder.volume.manager [req-f8219c09-efdd-4b13-9b71-dd2d6dae20bc af3f6ce8580344baa1ec6b02468765eb 986a23f0740346c1b0706298010a7836 - default default] Delete snapshot failed, due to snapshot busy.: Sna...
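
The ordering problem in steps 5 and 6 can be reproduced with a toy in-memory model. This is a sketch under stated assumptions, not Cinder or tempest code: FakeBackend, racy_cleanup and ordered_cleanup are hypothetical names, volume deletion is modelled as asynchronous, and the rule that a volume with snapshots cannot be deleted is inferred from the "have snapshots" wording of the 400 error in the bug description.

```python
class SnapshotIsBusy(Exception):
    """Toy counterpart of the error logged in step 6."""


class InvalidVolume(Exception):
    """Toy counterpart of the 400 'Invalid volume' response."""


class FakeBackend:
    """In-memory stand-in for the volume service (hypothetical, not Cinder's API)."""

    def __init__(self):
        self.volumes = {}    # volume name -> parent snapshot name, or None
        self.snapshots = {}  # snapshot name -> source volume name
        self.pending = []    # deletions accepted but not yet completed

    def delete_volume(self, name):
        if name in self.snapshots.values():
            raise InvalidVolume("%s still has snapshots" % name)
        self.pending.append(name)  # accepted; completes asynchronously

    def finish_pending(self):
        """Complete every accepted volume deletion."""
        for name in self.pending:
            self.volumes.pop(name, None)
        self.pending = []

    def delete_snapshot(self, name):
        if name in self.volumes.values():
            raise SnapshotIsBusy("%s has dependent volumes" % name)
        del self.snapshots[name]


def make_backend():
    """Base volume -> snapshot -> bootable clone, as in this test."""
    backend = FakeBackend()
    backend.volumes = {"base": None, "clone": "snap"}
    backend.snapshots = {"snap": "base"}
    return backend


def racy_cleanup(backend):
    backend.delete_volume("clone")
    backend.delete_snapshot("snap")  # clone is not gone yet: SnapshotIsBusy
    backend.delete_volume("base")


def ordered_cleanup(backend):
    backend.delete_volume("clone")
    backend.finish_pending()         # stands in for waiting until the clone is gone
    backend.delete_snapshot("snap")
    backend.delete_volume("base")
    backend.finish_pending()
```

In this model racy_cleanup raises SnapshotIsBusy, and a later delete_volume("base") raises InvalidVolume because the snapshot is left behind, while ordered_cleanup removes everything. Waiting for the clone deletion is one possible ordering in the toy model only; the related change merged for this bug simply put the test back on the skip list.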


OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/708368
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=35c8399f5f4da72d1450b2cf2898ce28b0b5a803
Submitter: Zuul
Branch: master

commit 35c8399f5f4da72d1450b2cf2898ce28b0b5a803
Author: Soniya Vyas <email address hidden>
Date: Tue Feb 18 17:37:22 2020 +0530

    [Stein] Adding failing tests back to skiplist

    tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern
    class has 'test_create_server_from_volume_snapshot' test failing.

    So, just added above test back to skiplist
    Related-Bug: #1863750

    Signed-off by: Soniya Vyas<email address hidden>

    Change-Id: Ia5790b7a86249c4a09ad290f0f76adb4d5c4ec52

wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → Fix Released