Deleting VM with an attached volume during copy-volume-to-image causes the volume remains in-use state

Bug #1406703 reported by Tushar Patil
32
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
Medium
Abhijeet Malawade

Bug Description

Steps to reproduce:
1. Attach a volume to an instance.
2. Upload the volume to image.
3. Delete the instance.
4. After upload finished, volume state is still "in-use" and normal user cannot attach it to another instance. Only administrator can recover it using cinder reset-state.

Changed in cinder:
assignee: nobody → Rajesh Tailor (rajesh-tailor)
assignee: Rajesh Tailor (rajesh-tailor) → nobody
Changed in cinder:
assignee: nobody → Abhijeet Malawade (abhijeet-malawade)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/145143

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master)

Change abandoned by Abhijeet Malawade (<email address hidden>) on branch: master
Review: https://review.openstack.org/145143
Reason: Already another patch is submitted : https://review.openstack.org/#/c/144409/

Revision history for this message
Huang Zhiteng (zhiteng-huang) wrote :

What backend was used? Is it Ceph/RBD? Does step 2 always work for all Cinder backends? I suspect how many backends allows 'multi-attach' because the generic copy_volume_to_image() method requires attaching the volume to volume service node to work. If most iSCSI based drivers don't even work, I think proper fix would be to enforce status check, i.e. not allow user to upload volume to image when volume is attached.

Revision history for this message
Avishay Traeger (avishay-il) wrote :

Zhiteng - currently you can pass 'force' to upload an in-use volume. I don't understand why this is allowed - if the volume is being changed during upload, you will end up with a corrupt image.

The question related to this bug is, does this happen if the VM is deleted without upload? Does it happen with failed upload without attaching to a VM?

Revision history for this message
Abhijeet Malawade (abhijeet-malawade) wrote :

Hi,

I have used 'lvm' as a backend. Mostly all backends supports 'copy_volume_to_image'.

IMO 'copy_volume_to_image' is allowed while volume is in 'in-use' state because user can create VM from bootable volume,
and he can create image of VM while it is running. Also user can create VM from bootable volume with 'Delete on Terminate' flag set to True.

This issue is reproducible only if user deletes VM while upload volume to image is in progress and VM gets deleted before upload volume to image is completed.

Revision history for this message
Walt Boring (walter-boring) wrote :

I'm not sure it even makes sense to allow copy volume to image, while the volume is attached. It will lead to corrupt data in the image as i/o is ongoing in the volume.

Revision history for this message
Abhijeet Malawade (abhijeet-malawade) wrote :

Hi,

We need to stop instance (to which volume is attached or which is created from bootable volume) before coping volume to image, to do not corrupt data.
Also when we stop instance with volume attached, then volume still remains in 'in-use' state.

Jay Bryant (jsbryant)
Changed in cinder:
importance: Undecided → Medium
Mike Perez (thingee)
tags: added: kilo-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/144409
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=4d6ddcb94435a20476536f4956de88ce4ab15b66
Submitter: Jenkins
Branch: master

commit 4d6ddcb94435a20476536f4956de88ce4ab15b66
Author: Abhijeet Malawade <email address hidden>
Date: Mon Dec 22 05:50:28 2014 -0800

    Get volume from db again before updating it's status

    During uploading a volume which is attached to an instance to image service,
    if the instance is deleted then the volume remains in "in-use" status.
    So get the volume again from db before updating it's status.

    Added new db api called 'volume_update_status_based_on_attachment'.
    It gets volume from db and updates its status using the same session.

    Closes-Bug: #1406703
    Change-Id: I01d313438f021cd2fe0d6f5c23f2029279d4b55a

Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/173715

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/175740

Revision history for this message
John Griffith (john-griffith) wrote :

This is an invalid case, we don't allow multi-attach; Upload volume_to_image works by attaching the volume to the Cinder node and uploading via the Cinder node as a proxy/worker.

This won't even work for almost every driver as there's no multi-attach. The API shouldn't allow this and that's what should be changed IMO.

Revision history for this message
John Griffith (john-griffith) wrote :

Proper behavior is:

ubuntu@stack-1:/opt/stack/cinder$ cinder upload-to-image cb73f501-ae9f-4f0c-b427-a2aab9e44e07 foo
ERROR: Bad Request (HTTP 400) (Request-ID: req-cb6638c0-1eb7-4020-b4ae-be09b13e250f)

Resulting in the following clearly in the c-api logs:

2015-04-22 04:15:39.753 DEBUG cinder.api.openstack.wsgi [req-cb6638c0-1eb7-4020-b4ae-be09b13e250f demo] Action body: {"os-volume_upload_image": {"container_format": "bare", "force": false, "image_name": "foo", "disk_format": "raw"}} get_method /opt/stack/cinder/cinder/api/ope
2015-04-22 04:15:39.791 INFO cinder.volume.api [req-cb6638c0-1eb7-4020-b4ae-be09b13e250f demo] [volume-cb73f501-ae9f-4f0c-b427-a2aab9e44e07] Volume info retrieved successfully.
2015-04-22 04:15:39.793 INFO cinder.api.openstack.wsgi [req-cb6638c0-1eb7-4020-b4ae-be09b13e250f demo] HTTP exception thrown: Invalid volume: Volume status is in-use.

Sure, if using LVM it's a local disk so this can be done with a --force True, but my question is why? I know that I'm going to be overridden on this, but by fixing this like this we're maintaining a broken API. Even if the functionality that somebody is using is because of a bug doesn't mean we shouldn't fix the bug IMHO.

Revision history for this message
John Griffith (john-griffith) wrote :

My point here is that:
1. The behavior probably shouldn't be allowed but it is
2. Somebody may be exploiting this with certain devices, and that's *ok*
3. Given that there's a work-around albeit inconvenient (reset-state), and this is a use case that we don't recommend anyway, it doesn't seem to meet backport criteria.

I'll defer to thingee on the final judgement here of course, but wanted to make sure I expressed my concerns.

Mike Perez (thingee)
tags: removed: kilo-rc-potential
no longer affects: cinder/kilo
Thierry Carrez (ttx)
Changed in cinder:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/juno)

Change abandoned by Mike Perez (<email address hidden>) on branch: stable/juno
Review: https://review.openstack.org/173715
Reason: Over a month with no update.

Thierry Carrez (ttx)
Changed in cinder:
milestone: liberty-1 → 7.0.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/kilo)

Change abandoned by Sean McGinnis (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/175740
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.