Can attach_volume to multiple instances (race)

Bug #1096983 reported by Cory Stone
This bug affects 1 person
Affects                    Status        Importance  Assigned to     Milestone
Cinder                     Fix Released  Medium      John Griffith   2013.1
OpenStack Compute (nova)   Invalid       Undecided   Unassigned

Bug Description

This affects both nova & cinder.

In compute.api, the race between volume_api.get, check_attach, and reserve_volume can leave the volume physically attached to more than one instance.
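
The vulnerable check-then-act sequence looks roughly like the sketch below (simplified; the names are approximate, not the exact nova code):

    # Simplified sketch of the nova compute API attach path (names approximate)
    def attach_volume(self, context, instance, volume_id, device):
        volume = self.volume_api.get(context, volume_id)    # (1) read volume state
        self.volume_api.check_attach(context, volume)       # (2) verify status is 'available'
        self.volume_api.reserve_volume(context, volume)     # (3) mark it 'attaching'
        # Two concurrent requests can both pass (1) and (2) before either one
        # reaches (3), so both go on to cast the attach to the compute node
        # and the same volume ends up exported to two instances.
        self.compute_rpcapi.attach_volume(context, instance=instance,
                                          volume_id=volume_id, mountpoint=device)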

Revision history for this message
John Griffith (john-griffith) wrote :

I couldn't remember exactly which patch it was when we talked about this, but I think this one: https://review.openstack.org/#/c/18717/

"should" address this.

Revision history for this message
Ivan-Zhu (ivan-zhu) wrote :

I agree with John.

Chuck Short (zulcss)
Changed in cinder:
status: New → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
milestone: none → grizzly-2
Revision history for this message
Cory Stone (corystone) wrote :

Hi guys,

This patch doesn't address the issue. Nova's driver.attach_volume happens before the volume_api.attach is even called. Cinder's attach is just a notification.
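
(Roughly, the compute-side ordering being described here is the following; a simplified sketch, not the literal nova code:)

    # Rough sketch of the compute manager attach ordering (names approximate)
    def _attach_volume(self, context, instance, volume, connector, mountpoint):
        connection_info = self.volume_api.initialize_connection(
            context, volume, connector)
        # The hypervisor attach happens first ...
        self.driver.attach_volume(connection_info, instance, mountpoint)
        # ... and only afterwards is cinder told about it; this 'attach' call
        # just records the attachment, it cannot refuse or serialize it.
        self.volume_api.attach(context, volume, instance['uuid'], mountpoint)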

To fix this, we may need to make reserve_volume actually do something useful. Or is this something I have to handle in my driver?

It seems really easy to reproduce:

nova volume-attach <instanceid1> <volumeid> auto & nova volume-attach <instanceid2> <volumeid> auto &

root@nova:~# virsh list
 Id Name State
----------------------------------
  1 instance-00000001 running
  2 instance-00000002 running

root@nova:~# virsh domblklist 1
Target Source
------------------------------------------------
vda /etc/nova/state/instances/instance-00000001/disk
vdb /dev/disk/by-path/ip-10.127.0.166:3260-iscsi-iqn.2010-11.com.rackspace:9cba7916-f5e4-4edb-8bb6-7582d6609e9c-lun-0

root@nova:~# virsh domblklist 2
Target Source
------------------------------------------------
vda /etc/nova/state/instances/instance-00000002/disk
vdb /dev/disk/by-path/ip-10.127.0.166:3260-iscsi-iqn.2010-11.com.rackspace:9cba7916-f5e4-4edb-8bb6-7582d6609e9c-lun-0

Revision history for this message
Cory Stone (corystone) wrote :

It gets uglier if you try it again. The next BDM gets the wrong device name:

cory@cfsyn25:~/devstack$ nova volume-attach 68932196-8bf2-4d7d-ba01-eb4a8a3b20cc 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d auto & nova volume-attach f4f6b668-6936-49fe-9ff1-6d0751dfe681 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d auto &
[1] 28200
[2] 28201
cory@cfsyn25:~/devstack$ +----------+--------------------------------------+
| Property | Value |
+----------+--------------------------------------+
| device | /dev/vdc |
| id | 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d |
| serverId | 68932196-8bf2-4d7d-ba01-eb4a8a3b20cc |
| volumeId | 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d |
+----------+--------------------------------------+
+----------+--------------------------------------+
| Property | Value |
+----------+--------------------------------------+
| device | /dev/vdb |
| id | 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d |
| serverId | f4f6b668-6936-49fe-9ff1-6d0751dfe681 |
| volumeId | 2e911a75-08a1-4ca8-a1c5-10fd8d30c99d |
+----------+--------------------------------------+

^^^ /dev/vdb is already in use by the first double attach, so this one doesn't actually get attached by compute.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/20301

tags: added: folsom-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/20301
Committed: http://github.com/openstack/cinder/commit/2c84413c74e6481abe4af716ea12d8c75b405c25
Submitter: Jenkins
Branch: master

commit 2c84413c74e6481abe4af716ea12d8c75b405c25
Author: John Griffith <email address hidden>
Date: Wed Jan 23 04:15:40 2013 +0000

    Get updated vol status in volume.api.reserve.

    A race condition was discovered where a nova volume attach
    could easily be performed twice for the same volume. Although
    the cinder attach update would fail, this occurred after the BDM
    updates were made, so in essence the attach to the compute instance
    had already been issued.

    This change simply forces a get from the DB of the volume-ref in the
    reserve call and checks if an attach is already in progress.

    Fixes bug: 1096983

    Change-Id: Ie0e4156d691ee92b6981078ef0ba62f8c4cdf0c8
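
The merged change amounts to something like the following in cinder's volume API (a sketch of the approach, not the literal diff):

    # Sketch of the check added to reserve_volume in cinder's volume API
    def reserve_volume(self, context, volume):
        # Re-read the volume from the database instead of trusting the copy
        # the caller passed in, so a concurrent reserve becomes visible here.
        volume = self.get(context, volume['id'])
        if volume['status'] == 'attaching':
            # _ and exception stand in for cinder's i18n and exception modules
            msg = _("Volume status must be available to reserve")
            raise exception.InvalidVolume(reason=msg)
        self.update(context, volume, {"status": "attaching"})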

Changed in nova:
status: New → Invalid
Changed in cinder:
assignee: nobody → John Griffith (john-griffith)
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/folsom)

Fix proposed to branch: stable/folsom
Review: https://review.openstack.org/20975

Mark McLoughlin (markmc)
tags: removed: folsom-backport-potential
Thierry Carrez (ttx)
Changed in cinder:
milestone: grizzly-2 → 2013.1
Sean Dague (sdague)
no longer affects: nova/folsom