volume is not getting attached to nova instance

Bug #1311533 reported by shweta
This bug affects 3 people
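+-------------------------+--------------+------------+--------------+-----------+
| Affects                 | Status       | Importance | Assigned to  | Milestone |
+-------------------------+--------------+------------+--------------+-----------+
| OpenStack DBaaS (Trove) | Invalid      | High       | Sushil Kumar |           |
| OpenStack Heat          | Fix Released | Critical   | Sushil Kumar |           |
| Icehouse                | Fix Released | Undecided  | Unassigned   |           |
+-------------------------+--------------+------------+--------------+-----------+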

Bug Description

The volume device does not come up in a nova instance provisioned with heat.
The findings are as follows.

1. The heat stack-list output says the create is complete
ubuntu@test:~$ heat stack-list
+--------------------------------------+--------------------------------------------+-----------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+--------------------------------------------+-----------------+----------------------+
| a599ad03-0957-4500-a12b-42ee57114c98 | trove-dea20223-d9c1-4877-bb28-8aad6c0b2b84 | CREATE_COMPLETE | 2014-04-22T15:28:43Z |
+--------------------------------------+--------------------------------------------+-----------------+----------------------+

ubuntu@test:~$ heat resource-list trove-dea20223-d9c1-4877-bb28-8aad6c0b2b84
+-------------------+----------------------------+-----------------+----------------------+
| resource_name | resource_type | resource_status | updated_time |
+-------------------+----------------------------+-----------------+----------------------+
| DataVolume | AWS::EC2::Volume | CREATE_COMPLETE | 2014-04-22T15:28:43Z |
| DatabaseIPAddress | AWS::EC2::EIP | CREATE_COMPLETE | 2014-04-22T15:28:43Z |
| BaseInstance | AWS::EC2::Instance | CREATE_COMPLETE | 2014-04-22T15:28:44Z |
| VerticaDbaasSG | AWS::EC2::SecurityGroup | CREATE_COMPLETE | 2014-04-22T15:28:44Z |
| DatabaseIPAssoc | AWS::EC2::EIPAssociation | CREATE_COMPLETE | 2014-04-22T15:29:23Z |
| MountPoint | AWS::EC2::VolumeAttachment | CREATE_COMPLETE | 2014-04-22T15:29:24Z |
+-------------------+----------------------------+-----------------+----------------------+

2. As per cinder list, the volume is attached to the nova server

ubuntu@test:~$ cinder list
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+--------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+--------------------------------------+
| 8c0a270a-62cc-48d1-89ac-4d6c8080a407 | in-use | trove-dea20223-d9c1-4877-bb28-8aad6c0b2b84-DataVolume-6gh2issdccox | 1 | None | false | e671f8cd-8fed-401e-8366-b795bb4a0a3a |
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+--------------------------------------+
ubuntu@test:~$ nova list
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+------------------------------+
| e671f8cd-8fed-401e-8366-b795bb4a0a3a | tr-1-4877-bb28-8aad6c0b2b84-BaseInstance-idrmjksjkshr | ACTIVE | - | Running | private=10.1.0.2, 172.24.4.1 |
+--------------------------------------+-------------------------------------------------------+--------+------------+-------------+------------------------------+

3. The nova show output shows that no volume is attached
ubuntu@test:~$ nova show e671f8cd-8fed-401e-8366-b795bb4a0a3a
+--------------------------------------+------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2014-04-22T15:29:21.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2014-04-22T15:28:45Z |
| flavor | m1.rd-tiny (7) |
| hostId | e5136fb5e0dc9523a780c46f16ee962290721554a24ab180f9c577d9 |
| id | e671f8cd-8fed-401e-8366-b795bb4a0a3a |
| image | ubuntu_vertica (7374f58a-1760-4f12-ad87-ffbfd967d492) |
| key_name | - |
| metadata | {} |
| name | tr-1-4877-bb28-8aad6c0b2b84-BaseInstance-idrmjksjkshr |
| os-extended-volumes:volumes_attached | [] |
| private network | 10.1.0.2, 172.24.4.1 |
| progress | 0 |
| security_groups | trove-dea20223-d9c1-4877-bb28-8aad6c0b2b84-VerticaDbaasSG-sj7zrvvuuffw |
| status | ACTIVE |
| tenant_id | 6e0ee95a6f9c41edb8c8e25aa9f90895 |
| updated | 2014-04-22T15:29:22Z |
| user_id | 9c3537e5bf944da0bb06e232eb113b23 |
+--------------------------------------+------------------------------------------------------------------------+
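
The mismatch between findings 2 and 3 can be checked mechanically. The following is only a minimal sketch, assuming the cinder and nova listings have already been fetched as plain dicts: each volume is assumed to carry an "attachments" list with a "server_id" key, and each server is assumed to carry the "os-extended-volumes:volumes_attached" field seen in the nova show output above. The function name find_phantom_attachments is hypothetical and is not part of any OpenStack client.

```python
def find_phantom_attachments(cinder_volumes, nova_servers):
    """Return (volume_id, server_id) pairs that cinder reports as attached
    but that the corresponding nova server does not list."""
    # Build nova's view: server id -> set of volume ids it says are attached.
    nova_view = {}
    for server in nova_servers:
        attached = server.get("os-extended-volumes:volumes_attached", [])
        nova_view[server["id"]] = {entry["id"] for entry in attached}

    phantoms = []
    for volume in cinder_volumes:
        if volume.get("status") != "in-use":
            continue  # only volumes cinder claims are attached are of interest
        for attachment in volume.get("attachments", []):
            server_id = attachment.get("server_id")
            if volume["id"] not in nova_view.get(server_id, set()):
                phantoms.append((volume["id"], server_id))
    return phantoms


# Data shaped after the cinder list / nova show output in this report.
# The dict layout itself is an assumption made for this sketch.
cinder_volumes = [{
    "id": "8c0a270a-62cc-48d1-89ac-4d6c8080a407",
    "status": "in-use",
    "attachments": [{"server_id": "e671f8cd-8fed-401e-8366-b795bb4a0a3a"}],
}]
nova_servers = [{
    "id": "e671f8cd-8fed-401e-8366-b795bb4a0a3a",
    "os-extended-volumes:volumes_attached": [],
}]

print(find_phantom_attachments(cinder_volumes, nova_servers))
```

With the values from this report, the volume is flagged because cinder says "in-use" while nova's volumes_attached list is empty.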

shweta (shweta)
Changed in heat:
assignee: nobody → shweta (shweta)
Pavlo Shchelokovskyy (pshchelo) wrote :

could you show the template generated by Trove?

shweta (shweta) wrote :

our template looks like this
http://paste.openstack.org/show/76743/

shweta (shweta)
Changed in trove:
assignee: nobody → shweta (shweta)
Sushil Kumar (sushil-kumar2) wrote :

This bug is caused by a recent change in heat: the volume provisioning calls were shifted from nova to cinder in the patchset https://review.openstack.org/#/c/86638/

However, the problem is that the cinder APIs, which are only exposed programmatically, do not perform the actual attach/detach with the nova server.

The cinder calls only update metadata, which is why cinder list shows that the volume is attached but nova does not confirm it.

Sushil Kumar (sushil-kumar2) wrote :

I think we need to go back to the nova calls.
Proposing a change for the same.

Sushil Kumar (sushil-kumar2) wrote :

The cinder client (v2) 'detach' API call is a legal call ( https://github.com/openstack/python-cinderclient/blob/master/cinderclient/v2/volumes.py#L260 )

But the problem is that it only clears the volume attachment metadata. You can see this from the call that gets executed on the Cinder API server side ( https://github.com/openstack/cinder/blob/master/cinder/api/contrib/volume_actions.py#L122 )

The same holds for the attach call: it only creates metadata for cinder ( https://github.com/openstack/cinder/blob/master/cinder/api/contrib/volume_actions.py#L78 )
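
For illustration, an attach that goes through nova and then waits for cinder to report the result could look roughly like the sketch below. It is not the heat implementation. It assumes `nova` and `cinder` are already-authenticated client objects where `nova.volumes.create_server_volume(server_id, volume_id, device)` and `cinder.volumes.get(volume_id)` are available, as in python-novaclient and python-cinderclient of this era; the function name attach_via_nova, the timeout handling, and the set of "still in progress" statuses are assumptions made for the sketch.

```python
import time


def attach_via_nova(nova, cinder, server_id, volume_id, device,
                    timeout=60.0, interval=1.0):
    """Attach a volume through the nova API, then poll cinder until the
    volume reports 'in-use'. Returns the last fetched volume object."""
    # The nova call is the one that makes the device visible in the guest;
    # per this report, cinder's own attach call only records metadata.
    nova.volumes.create_server_volume(server_id, volume_id, device)

    deadline = time.monotonic() + timeout
    while True:
        volume = cinder.volumes.get(volume_id)
        if volume.status == "in-use":
            return volume
        # Assumption: these two statuses mean the attach is still pending.
        if volume.status not in ("available", "attaching"):
            raise RuntimeError(
                "volume %s entered unexpected status %r"
                % (volume_id, volume.status))
        if time.monotonic() >= deadline:
            raise RuntimeError(
                "timed out waiting for volume %s to attach" % volume_id)
        time.sleep(interval)
```

The clients are passed in rather than constructed inside the function, so the sketch stays independent of how authentication is set up in a given deployment.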

Changed in heat:
assignee: shweta (shweta) → Sushil Kumar (sushil-kumar2)
Sushil Kumar (sushil-kumar2) wrote :

I have pushed https://review.openstack.org/#/c/89796/ to just restore old nova calls.

Changed in heat:
status: New → In Progress
Steven Hardy (shardy) wrote :

Confirmed, I've been seeing the same, https://review.openstack.org/#/c/86638/ appears to have caused a regression where OS::Cinder::VolumeAttachment appears to work, but the volume is not really visible to the instance at all.

Using nova volume-attach on the CLI works fine, so reverting to the nova API seems like the best approach, provided we can avoid reintroducing the gate race that moving to the cinder calls was supposedly fixing (ref bug #1298350).

Can anyone provide any further details on why the cinder volume attach/detach API doesn't actually work?

I noticed it's not documented and only partially tested:

https://bugs.launchpad.net/cinder/+bug/1311733

FYI I'm currently working on a tempest test which when merged will cause us to fail the gate for a regression like this in future (I'll link the review here when I post it).

Changed in heat:
importance: Undecided → Critical
milestone: none → juno-1
Pavlo Shchelokovskyy (pshchelo) wrote :

Another possibility should also be investigated:
this bug concerns the attach operation, and bug #1298350 is actually about detach. The behavior of volume.detach should be tested (the Sahara team claims it works fine for them), and if detach really does its job, we might want to use nova when attaching and cinder when detaching.

Steven Hardy (shardy) wrote :

Here's the tempest test which should catch any regressions like this in future when it gets merged:

https://review.openstack.org/#/c/90143/

Sushil Kumar (sushil-kumar2) wrote :

Hey Steven

I have included the suggested change regarding the tests in the patchset.
Removed the duplicated calls.

Thanks for the information.

Openstack Gerrit (openstack-gerrit) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/89796
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=d1ffbd4bfde0cd6b7a82a48b9d4f59cc8b310bd8
Submitter: Jenkins
Branch: master

commit d1ffbd4bfde0cd6b7a82a48b9d4f59cc8b310bd8
Author: Sushil Kumar <email address hidden>
Date: Wed Apr 23 11:18:31 2014 +0000

    Restores Nova API for volume attach and detach

    Reasons:
     - Cinder's API attach-detach APIs do not actually work with
       Nova instances.
     - Cinder APIs only update metadata for attach/detach operations.

    Changes:
     - Replace cinder calls for attaching and detaching volumes
       with nova API calls.
     - Removed duplicate delete calls to prevent race condition.
     - Removed duplicate mocked calls from unit-test.

    Closes-Bug: #1311533
    Change-Id: I5f16c528652f12440160f03b92f41b76d1c9100c

Changed in heat:
status: In Progress → Fix Committed
Changed in trove:
status: New → Triaged
importance: Undecided → High
milestone: none → juno-1
tags: added: icehouse-backport-potential
Sushil Kumar (sushil-kumar2) wrote :

This problem, which was also seen in trove, occurred because the heat code for volume attach/detach was changed from the nova APIs to the cinder APIs.

It needed a fix in the heat code; as such, no fix is needed in the trove code.

Changed in trove:
assignee: shweta (shweta) → Sushil Kumar (sushil-kumar2)
status: Triaged → Fix Committed
status: Fix Committed → Won't Fix
Changed in trove:
status: Won't Fix → Invalid
milestone: juno-1 → none
Thierry Carrez (ttx)
Changed in heat:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to heat (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/113158

David Hill (david-hill-ubisoft) wrote :

It doesn't fully work because of this:
    @scheduler.wrappertask
    def _delete(self, backup=False):
        if self.resource_id is not None:
            try:
                vol = self.cinder().volumes.get(self.resource_id)

                if backup:
                    yield self._backup()
                    vol.get()

                if vol.status == 'in-use':
                    logger.warn(_('can not delete volume when in-use'))
                    raise exception.Error(_('Volume in use'))

                vol.delete()
                while True:
                    yield
                    vol.get()
            except clients.cinderclient.exceptions.NotFound:
                self.resource_id_set(None)

I get this:
2014-09-10 15:19:20,780 ERROR [heat.engine.resource] Delete Volume "DataVolume" [d508cf13-1cee-479b-969a-137a09349258] Stack "monitoring-stv-lab-ctrl02_" [498154dd-181e-4c5a-acb2-3c27b3b97949]
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/heat/engine/resource.py", line 707, in delete
    handle_data = self.handle_delete()
  File "/usr/lib/python2.6/site-packages/heat/engine/resources/volume.py", line 173, in handle_delete
    delete_task.start()
  File "/usr/lib/python2.6/site-packages/heat/engine/scheduler.py", line 161, in start
    self.step()
  File "/usr/lib/python2.6/site-packages/heat/engine/scheduler.py", line 189, in step
    next(self._runner)
  File "/usr/lib/python2.6/site-packages/heat/engine/scheduler.py", line 250, in wrapper
    subtask = next(parent)
  File "/usr/lib/python2.6/site-packages/heat/engine/resources/volume.py", line 155, in _delete
    vol.delete()
  File "/usr/lib/python2.6/site-packages/cinderclient/v1/volumes.py", line 35, in delete
    self.manager.delete(self)
  File "/usr/lib/python2.6/site-packages/cinderclient/v1/volumes.py", line 228, in delete
    self._delete("/volumes/%s" % base.getid(volume))
  File "/usr/lib/python2.6/site-packages/cinderclient/base.py", line 162, in _delete
    resp, body = self.api.client.delete(url)
  File "/usr/lib/python2.6/site-packages/cinderclient/client.py", line 229, in delete
    return self._cs_request(url, 'DELETE', **kwargs)
  File "/usr/lib/python2.6/site-packages/cinderclient/client.py", line 187, in _cs_request
    **kwargs)
  File "/usr/lib/python2.6/site-packages/cinderclient/client.py", line 170, in request
    raise exceptions.from_response(resp, body)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to heat (master)

Reviewed: https://review.openstack.org/113158
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=30bc841b0933602ed938df74881449bea2436abe
Submitter: Jenkins
Branch: master

commit 30bc841b0933602ed938df74881449bea2436abe
Author: Steve Baker <email address hidden>
Date: Mon Aug 11 14:39:00 2014 +1200

    Add volume backup/restore integration test

    Adds a more comprehensive test for the cinder volume resources:
    - Creates a stack with a volume, and writes data to it
    - Deletes the stack with the volume deletion policy set to
      "snapshot" (which really means backup) the volume
    - Create a new stack with a volume created from the backup
    - Prove the data written in the first stack is still present
    Note this test also aims to provide coverage of volume attachment
    resources, e.g so we would catch any bugs like bug #1311533 in
    future.

    Authored-By: Steve Hardy <email address hidden> based on tempest change
    I04ae0cf942d12c4504b2df504a8c940575b90b69

    Change-Id: I04ae0cf942d12c4504b2df504a8c940575b90b69
    Related-Bug: #1311533

Thierry Carrez (ttx)
Changed in heat:
milestone: juno-1 → 2014.2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/139972

OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (stable/icehouse)

Reviewed: https://review.openstack.org/139972
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=0b984b8c13338cf2ec3836e21e71c3c955a148b4
Submitter: Jenkins
Branch: stable/icehouse

commit 0b984b8c13338cf2ec3836e21e71c3c955a148b4
Author: Sushil Kumar <email address hidden>
Date: Wed Apr 23 11:18:31 2014 +0000

    Call server volume detach only once

    Changes:
     - Removed duplicate delete calls to prevent race condition.
     - Removed duplicate mocked calls from unit-test.

    Closes-Bug: #1311533
    Closes-Bug: #1298350

    Conflicts:
     heat/engine/resources/volume.py
     heat/tests/test_volume.py

    Change-Id: I5f16c528652f12440160f03b92f41b76d1c9100c
    (cherry picked from commit d1ffbd4bfde0cd6b7a82a48b9d4f59cc8b310bd8)

tags: added: in-stable-icehouse