Tempest Volume deleted twice causing error on second delete

Bug #1929550 reported by Sampat Ponnaganti
Affects: tempest
Status: Invalid
Importance: Undecided
Assigned to: Lukáš Piwowarski

Bug Description

Tempest volume is deleted twice, causing an error on the second delete.
The NotFound error should be ignored when the volume is deleted a second time.

For example in this test:
https://github.com/openstack/tempest/blob/master/tempest/api/volume/test_volumes_snapshots.py#L97

On snapshot create, the snapshot is added for class-level cleanup:
Create snapshot
https://github.com/openstack/tempest/blob/master/tempest/api/volume/test_volumes_snapshots.py#L103
addClassResourceCleanup is called
https://github.com/openstack/tempest/blob/3b56cdb083998d9162ed79db303bf8a469c3015a/tempest/api/volume/base.py#L134

Delete snapshot is then called in the test case (see the sketch below):
https://github.com/openstack/tempest/blob/master/tempest/api/volume/test_volumes_snapshots.py#L142
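A minimal, self-contained sketch of the pattern described above (toy classes, not the actual tempest code from the links):

    # Toy model of the flow: the delete is queued for class-level cleanup at
    # creation time, and the test body deletes the same snapshot again.
    class FakeSnapshotsClient:
        def __init__(self):
            self.existing = set()

        def create_snapshot(self, volume_id):
            snap_id = 'snap-for-%s' % volume_id
            self.existing.add(snap_id)
            return {'snapshot': {'id': snap_id}}

        def delete_snapshot(self, snap_id):
            if snap_id not in self.existing:
                raise LookupError('snapshot %s not found' % snap_id)  # stand-in for NotFound
            self.existing.remove(snap_id)

    client = FakeSnapshotsClient()
    class_cleanups = []

    # create_snapshot(): registers the delete for class-level cleanup right away
    snapshot = client.create_snapshot('vol-1')['snapshot']
    class_cleanups.append(lambda: client.delete_snapshot(snapshot['id']))

    # test body: deletes the same snapshot explicitly (first delete)
    client.delete_snapshot(snapshot['id'])

    # tearDownClass: the queued cleanup deletes it again (second delete) and,
    # unless the "not found" error is ignored, the teardown fails
    for cleanup in class_cleanups:
        try:
            cleanup()
        except LookupError as exc:
            print('cleanup failed:', exc)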

Class-level cleanup then throws an exception since the volume has already been deleted:
Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/home/aqua/tempest/tempest/test.py", line 236, in tearDownClass
    six.reraise(etype, value, trace)

      File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
    raise value

      File "/home/aqua/tempest/tempest/test.py", line 208, in tearDownClass
    teardown()

      File "/home/aqua/tempest/tempest/test.py", line 590, in resource_cleanup
    raise testtools.MultipleExceptions(*cleanup_errors)

    testtools.runtest.MultipleExceptions: (<class 'tempest.lib.exceptions.BadRequest'>, Bad request
Details: {'code': 400, 'message': 'Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer.'}, <traceback object at 0x7f37d0504b48>)

Proposed fix:
Remove the volume/snapshot delete call in the test case.

description: updated
description: updated
Changed in tempest:
assignee: nobody → Lukáš Piwowarski (piwowarl)
Revision history for this message
Lukáš Piwowarski (piwowarl) wrote :

I think that the deletion of the snapshot should be removed from the test you mention. However, after looking at the code, I think this particular error might be caused by something else. What leads me to this idea is that:
 - tempest waits for the successful deletion of each snapshot [1], and
 - tempest receives a NotFound error from the OpenStack API, which is ignored, when it
   tries to delete a snapshot that does not exist [2] (see the sketch below).

But I could be wrong. Can you please provide steps to reproduce the error, to make sure that the double deletion of snapshots is really the cause? (E.g., which tests did you run when this error occurred? Did you run only the test_snapshot_create_get_list_update_delete test you mention in the description?)

[1] https://github.com/openstack/tempest/blob/3e127640657b294c9d4a941b6a6bb287aa496539/tempest/api/volume/base.py#L182
[2] https://github.com/openstack/tempest/blob/3e127640657b294c9d4a941b6a6bb287aa496539/tempest/lib/common/utils/test_utils.py#L88
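A minimal, self-contained sketch of the NotFound-ignoring behaviour from [2] (the exception class and the delete function here are simplified stand-ins, not tempest's real ones):

    class NotFound(Exception):
        pass

    def call_and_ignore_notfound_exc(func, *args, **kwargs):
        # Call func and swallow NotFound, mirroring what tempest's helper does.
        try:
            return func(*args, **kwargs)
        except NotFound:
            pass

    deleted = set()

    def delete_snapshot(snapshot_id):
        if snapshot_id in deleted:
            raise NotFound(snapshot_id)  # the API reports it is already gone
        deleted.add(snapshot_id)

    delete_snapshot('snap-1')                                # delete in the test body
    call_and_ignore_notfound_exc(delete_snapshot, 'snap-1')  # class-level cleanup: no error

This is why double deletion on its own should not produce the BadRequest shown in the traceback.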

Revision history for this message
Sampat Ponnaganti (sp810x) wrote :

@Lukas Thanks for the comment. I get this error in our internal deployment and cannot reproduce it on a devstack.

I got this error again today for this test:
tempest.api.volume.test_volumes_extend.VolumesExtendTest.test_volume_extend[id-9a36df71-a257-43a5-9555-dc7c88e66e0e]

The volume never finished creating, so the delete failed as expected.

root@auk51r11o001:/home/sp810x# openstack volume list --project ec86b25d16aa4958a892c6fea4cdb433
+--------------------------------------+-----------------------------------------------------+----------+------+-------------+
| ID                                   | Name                                                | Status   | Size | Attached to |
+--------------------------------------+-----------------------------------------------------+----------+------+-------------+
| b2c2da00-6271-4c75-bff1-004455b1f526 | aqua-131877bd-VolumesGetTest-Volume-960283900       | creating |    1 |             |
| 798260e7-f1ca-46a8-8fa5-5b64df51ae7b | aqua-131877bd-VolumesExtendTest-Volume-1611627085   | creating |    1 |             |
| 5ea7d637-54fa-4f9d-8e14-2e4be129fd62 | aqua-131877bd-VolumesImageMetadata-Volume-272072738 | creating |    1 |             |
+--------------------------------------+-----------------------------------------------------+----------+------+-------------+

Shall I put in a patchset to show the volume status in the exception, so it is clear? I.e., include the current volume/snapshot/backup status in this message (a rough sketch of the idea follows below):

Details: {'code': 400, 'message': 'Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer.'}, <traceback object at 0x7f37d0504b48>)
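To make the idea concrete, a rough sketch of what such a patch could look like (this is an assumption, not an existing tempest change; delete_volume and show_volume are the standard tempest volumes client calls, while the wrapper itself is hypothetical):

    import logging

    from tempest.lib import exceptions as lib_exc

    LOG = logging.getLogger(__name__)

    def delete_volume_logging_status(volumes_client, volume_id):
        # Hypothetical cleanup wrapper: log the current status if the delete fails.
        try:
            volumes_client.delete_volume(volume_id)
        except lib_exc.BadRequest:
            # One extra API call so the BadRequest above becomes self-explanatory.
            status = volumes_client.show_volume(volume_id)['volume']['status']
            LOG.error('Could not delete volume %s; current status is %r',
                      volume_id, status)
            raise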

Revision history for this message
Lukáš Piwowarski (piwowarl) wrote :

I was trying to reproduce the error, but as you mention it is not reproducible on devstack, and neither is it on packstack.

Would it be possible for you to run the test_volume_extend test and check whether tempest.log contains a line similar to this one: "2021-06-03 08:38:13.023 10124 INFO tempest.common.waiters [-] volume e48f4c0e-a594-4f43-bd3e-7a8e3664a8d3 reached available after waiting for 7.023860 seconds"?

You can run the following commands:
1) tempest run --regex "tempest.api.volume.test_volumes_extend.VolumesExtendTest.test_volume_extend" # (Please provide output of this command)
2) grep "waiters" tempest.log # (Please provide output of this command)

I am wondering how it is possible that tempest passes through this function [1] successfully when the volume is not fully created (a simplified sketch of what the waiter does follows below).

[1] https://github.com/openstack/tempest/blob/165586d19061c9ecf36d9ef191043c6269579f87/tempest/common/waiters.py#L281
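For reference, a stripped-down sketch of what the waiter in [1] does (simplified; not the real implementation): it polls the resource until it reaches the expected status or the build timeout expires, then raises a timeout error like the one pasted in the next comment.

    import time

    def wait_for_volume_status(show_volume, volume_id, expected,
                               build_timeout=300, build_interval=1):
        # Simplified stand-in for tempest's wait_for_volume_resource_status.
        start = time.time()
        status = show_volume(volume_id)['volume']['status']
        while status != expected:
            if time.time() - start > build_timeout:
                raise TimeoutError(
                    'volume %s failed to reach %s status (current %s) '
                    'within the required time (%s s)'
                    % (volume_id, expected, status, build_timeout))
            time.sleep(build_interval)
            status = show_volume(volume_id)['volume']['status']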

Changed in tempest:
status: New → Incomplete
Revision history for this message
Sampat Ponnaganti (sp810x) wrote :

Hi Lukas,

Thanks for checking. I checked the tempest logs: the volume did not reach the available state within the build timeout of 300 s.
Tempest did not pass through this function successfully. Since there was an exception, the cleanup failed because a volume in creating status cannot be deleted.

It would also help if we could show the current status of the volume in the cleanup function.

{0} tempest.api.volume.test_volumes_extend.VolumesExtendTest.test_volume_extend [302.598503s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/home/aqua/tempest/tempest/api/volume/test_volumes_extend.py", line 37, in test_volume_extend
    volume = self.create_volume(imageRef=self.image_ref)

      File "/home/aqua/tempest/tempest/api/volume/base.py", line 130, in create_volume
    volume['id'], wait_until)

      File "/home/aqua/tempest/tempest/common/waiters.py", line 288, in wait_for_volume_resource_status
    raise lib_exc.TimeoutException(message)

    tempest.lib.exceptions.TimeoutException: Request timed out
Details: volume 798260e7-f1ca-46a8-8fa5-5b64df51ae7b failed to reach available status (current creating) within the required time (300 s).

Captured pythonlogging:

Revision history for this message
Lukas Piwowarski (lukas-piwowarski) wrote :

The error described in this bug report is not caused by double deletion, as Sampat explained in the previous comment, but rather by the fact that the volume created during testing did not reach "available" status. Later, when we try to delete the volume, which is still in "creating" status, we receive an error (volumes in "creating" status cannot be deleted). Therefore this is not a tempest issue.

I think that showing the status of the volume in the cleanup function would be unnecessary overhead: we would have to make another API call just to get the status. Also, it is fine that tempest throws an error here, because this really is an issue; tempest leaves an uncleaned volume on the system and there is nothing it can do about it.

I believe we can close this. Feel free to reopen it if you think I was wrong. :)

Changed in tempest:
status: Incomplete → Invalid