Unable to remove volume type until image cache has been cleared

Bug #1823880 reported by Lee Yarwood
Affects   Status         Importance   Assigned to   Milestone
Cinder    New            Low          Unassigned
tempest   Fix Released   Undecided    Lee Yarwood

Bug Description

I noticed while writing Tempest tests for bug #1803961 that Tempest wasn't able to reliably clean up volume types while volumes using the type were still present in the image volume cache:

Introduce an attached volume migration test
https://review.openstack.org/#/c/637527/

http://logs.openstack.org/27/637527/12/check/tempest-slow/25e335a/job-output.txt.gz#_2019-04-04_14_18_07_474786

http://logs.openstack.org/27/637527/12/check/tempest-slow/25e335a/controller/logs/screen-c-api.txt.gz?#_Apr_04_14_18_07_105846

Reviewing the cinder code in this area shows that these cache volumes need to be manually deleted by an admin in order for them to then be considered evicted from the cache and no longer in use:

https://github.com/openstack/cinder/blob/6912b5f246bf3a12cab77d323a64d13e40d99ab4/cinder/volume/api.py#L502-L505

Shouldn't we automatically remove these when the last volume associated with this cached volume is removed?
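To make the behaviour described above concrete, here is a minimal in-memory sketch. It is not Cinder code: `ToyBackend`, its methods and the `image-<id>` naming are hypothetical, and it assumes that a cached volume carries the same volume type as the volume that populated the cache and that the type-in-use check does not distinguish cached volumes from ordinary ones.

```python
from dataclasses import dataclass, field


class VolumeTypeInUse(Exception):
    """A volume type is still referenced by at least one volume."""


@dataclass
class ToyBackend:
    """Hypothetical stand-in for one backend with an image volume cache."""

    types: set = field(default_factory=set)
    volumes: dict = field(default_factory=dict)      # volume id -> volume type
    image_cache: dict = field(default_factory=dict)  # image id -> cached volume id

    def create_from_image(self, vol_id, image_id, vol_type):
        # Assumption: the first volume built from an image also leaves behind
        # a cached volume that carries the same volume type.
        if image_id not in self.image_cache:
            cached_id = f"image-{image_id}"
            self.volumes[cached_id] = vol_type
            self.image_cache[image_id] = cached_id
        self.volumes[vol_id] = vol_type

    def delete_volume(self, vol_id):
        del self.volumes[vol_id]
        # Only deleting the cached volume itself evicts the cache entry.
        self.image_cache = {
            image: cached for image, cached in self.image_cache.items()
            if cached != vol_id
        }

    def delete_type(self, vol_type):
        # Cached volumes count as users of the type, like any other volume.
        if vol_type in self.volumes.values():
            raise VolumeTypeInUse(vol_type)
        self.types.remove(vol_type)


backend = ToyBackend(types={"tempest-type"})
backend.create_from_image("vol-1", "img-1", "tempest-type")
backend.delete_volume("vol-1")            # last user volume is gone

try:
    backend.delete_type("tempest-type")
    blocked = False
except VolumeTypeInUse:
    blocked = True                        # the cached volume still pins the type

backend.delete_volume("image-img-1")      # the manual, admin-style eviction
backend.delete_type("tempest-type")       # now succeeds
```

In this toy model the type only becomes deletable once the cached volume has been explicitly deleted, which is the sequence an admin would have to follow by hand.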

Running the following Tempest scenario test locally shows this behaviour even while the overall test passes (I'm assuming Tempest is swallowing the failure somewhere?):

$ cd tempest
$ tempest run --regex tempest.scenario.test_volume_migrate_attached.TestVolumeMigrateRetypeAttached.test_volume_migrate_attached
[..]
{0} tempest.scenario.test_volume_migrate_attached.TestVolumeMigrateRetypeAttached.test_volume_migrate_attached [43.161687s] ... ok

$ openstack volume list --all
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| ID                                   | Name                                       | Status    | Size | Attached to |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+
| 2c48d377-4a3c-462d-9791-62df3635073f | image-4718064c-fa76-4ffa-80c1-44c8f1eef030 | available |    1 |             |
+--------------------------------------+--------------------------------------------+-----------+------+-------------+

$ openstack volume type list
+--------------------------------------+----------------------------------------------------------------+-----------+
| ID                                   | Name                                                           | Is Public |
+--------------------------------------+----------------------------------------------------------------+-----------+
| 92642ce1-0bb3-46e5-a731-4f8593d21e63 | tempest-volume-type-TestVolumeMigrateRetypeAttached-1047281271 | True      |
| 60827973-eae8-4c2d-9774-5013ad4b05c4 | lvmdriver-2                                                    | True      |
| 6809c94a-ac65-481e-ab2a-01b7fc2fbc00 | lvmdriver-1                                                    | True      |
+--------------------------------------+----------------------------------------------------------------+-----------+

$ openstack volume type delete tempest-volume-type-TestVolumeMigrateRetypeAttached-1047281271
Failed to delete volume type with name or ID 'tempest-volume-type-TestVolumeMigrateRetypeAttached-1047281271': Target volume type is still in use. (HTTP 400) (Request-ID: req-83ce605a-f9fc-4e66-924f-32fb8183d4d3)
1 of 1 volume types failed to delete.

If this is expected, then we can mark this as a bug against Tempest, which would need to remove any remaining volumes from the image volume cache before attempting to delete the volume type.

Sean McGinnis (sean-mcginnis) wrote :

I believe this is expected, but I will wait to assign it to tempest to give others a chance to add some input.

It's possible the last volume gets deleted, but it is just a temporary thing and more new volumes will be created after that. In that case, I would not want to flush the volume cache and cause an unnecessary delay on the next volume create.

Lee Yarwood (lyarwood) wrote :

ACK thanks Sean, I've jumped the gun slightly and posted a change in Tempest to ensure these are always cleaned up before we attempt to remove the volume type:

https://review.openstack.org/#/c/651238/

Happy to retract this if others want to work around this in Cinder, but for now this looks like the way to go.

Changed in tempest:
assignee: nobody → Lee Yarwood (lyarwood)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/651238
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=be64e1ae7cfaea2a208ba42bbbf66fe4a5dee0e9
Submitter: Zuul
Branch: master

commit be64e1ae7cfaea2a208ba42bbbf66fe4a5dee0e9
Author: Lee Yarwood <email address hidden>
Date: Tue Apr 9 14:02:12 2019 +0100

    Ensure all image cache volumes are removed before removing the volume type

    Previously the only clean up action registered when creating a new
    volume type was the direct removal of that type. However this request would
    silently fail in the attached volume migration scenario tests if the
    backends being used had their image volume cache enabled.

    This was due to the image volume cache still containing volumes
    associated to the given volume type when attempts were made to delete
    the volume type. To avoid this these image volume cache volumes must be
    manually removed by an admin user before deleting the volume type.

    Closes-Bug: #1823880
    Change-Id: Ib4d82586e91037729f9e846a6f0fac6d393ca475
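
The ordering that the commit message describes can be sketched as follows. This is not the merged Tempest change: the helper names are hypothetical, `contextlib.ExitStack` stands in for the test framework's cleanup stack, and the sketch assumes cleanups run in reverse registration order (as `unittest`'s `addCleanup` does) and that cache volumes can be recognised by the `image-` name prefix seen in the volume listing above.

```python
from contextlib import ExitStack


class VolumeTypeInUse(Exception):
    """A volume type is still referenced by at least one volume."""


def delete_type(volumes, types, name):
    # Mirrors the failure in this report: any remaining volume blocks deletion.
    if name in volumes.values():
        raise VolumeTypeInUse(name)
    types.remove(name)


def delete_cached_volumes(volumes, name):
    # Stand-in for an admin removing every image cache volume of the type.
    # Assumption: cache volumes are recognisable by an "image-" name prefix.
    for vol_id in [v for v, t in volumes.items()
                   if t == name and v.startswith("image-")]:
        del volumes[vol_id]


def run_scenario(volumes, types, name):
    with ExitStack() as cleanups:
        types.add(name)
        # Registered first, so it runs last.
        cleanups.callback(delete_type, volumes, types, name)
        # Registered second, so it runs before the type is deleted.
        cleanups.callback(delete_cached_volumes, volumes, name)

        # Test body: a volume built from an image leaves a cache volume behind.
        volumes["image-4718064c"] = name
        volumes["vol-1"] = name
        del volumes["vol-1"]


volumes, types = {}, set()
run_scenario(volumes, types, "tempest-volume-type")
```

Registering only the `delete_type` callback would reproduce the failure from the bug description, because the leftover cache volume would still reference the type when the cleanup runs.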

Changed in tempest:
status: In Progress → Fix Released
Gorka Eguileor (gorka) wrote :

I agree with Sean: we shouldn't delete the cached volume when the last volume associated with it is deleted.

In my opinion we should delete it when we request to delete the volume type, and cached volumes should not be taken into consideration at the API to determine if a volume type can be removed.

Solving this in Cinder may be somewhat complicated. Volume types don't have a status we can change to prevent races while we try to delete the cached volume. We cannot delete the type before the volume, and deleting the volume before the type could create a race condition where somebody else uses the type for a new volume and we can no longer delete the type...
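
The race described in this comment can be illustrated with a deterministic toy interleaving. Everything here is hypothetical (`ToyCinder`, `delete_type_with_cache`, the `between_steps` hook); it only models the two-step "delete cached volumes, then delete the type" approach with no status on the type to block concurrent use.

```python
class VolumeTypeInUse(Exception):
    """A volume type is still referenced by at least one volume."""


class ToyCinder:
    """Hypothetical in-memory model; volume types carry no status field."""

    def __init__(self):
        self.types = {"t"}
        self.volumes = {"image-abc": "t"}  # only a cached volume uses the type

    def create_volume(self, vol_id, vol_type):
        if vol_type not in self.types:
            raise KeyError(vol_type)
        self.volumes[vol_id] = vol_type

    def delete_volume(self, vol_id):
        del self.volumes[vol_id]

    def delete_type(self, vol_type):
        if vol_type in self.volumes.values():
            raise VolumeTypeInUse(vol_type)
        self.types.remove(vol_type)


def delete_type_with_cache(cinder, vol_type, between_steps=lambda: None):
    # Step 1: evict the cached volumes that use this type.
    cached = [v for v, t in cinder.volumes.items()
              if t == vol_type and v.startswith("image-")]
    for vol_id in cached:
        cinder.delete_volume(vol_id)
    # Window: nothing marks the type as "being deleted", so another
    # request can still create a volume with it at this point.
    between_steps()
    # Step 2: delete the type itself.
    cinder.delete_type(vol_type)


cinder = ToyCinder()
try:
    delete_type_with_cache(
        cinder, "t",
        between_steps=lambda: cinder.create_volume("vol-2", "t"),
    )
    lost_race = False
except VolumeTypeInUse:
    lost_race = True
```

In this interleaving the cached volume is gone, the type still exists, and the deletion request fails, so the cache has been flushed for nothing. Without the concurrent `create_volume` call the same function succeeds.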

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tempest 21.0.0

This issue was fixed in the openstack/tempest 21.0.0 release.

tags: added: cache image image-cache volume-type
Changed in cinder:
importance: Undecided → Low