Volume detach fails after unrescuing an instance if a detach was attempted while the instance was in rescue state

Bug #1158942 reported by Matthew Treinish
This bug affects 1 person
Affects                   | Status       | Importance | Assigned to | Milestone
OpenStack Compute (nova)  | Fix Released | Low        | Joe Gordon  |
tempest                   | Fix Released | Medium     | Joe Gordon  |

Bug Description

When an instance with a volume attached is put into rescue mode and a negative test tries to detach the volume, the request returns the expected error code. However, if you attempt to detach the volume again after unrescuing the VM, it does not detach properly. The volume state goes from detaching back to in-use, and the volume is never detached.

Steps to reproduce:

1. Create a server and wait for it to get to ACTIVE state
2. Create a volume and wait for it to get to AVAILABLE state
3. Attach the volume to the server and wait for the volume to get to "IN USE" state
4. Rescue the VM and wait for the VM to get to the "RESCUE" state
5. Detach the volume from the server, which should fail with 409
6. Unrescue the VM and wait for it to get to ACTIVE state
7. Try to detach the volume from the server again

Watching the status of the volume, you will see it go from detaching to in-use and remain there.
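
For anyone who wants to script the steps above, here is a minimal sketch. It assumes a hypothetical `client` object whose methods (`create_server`, `create_volume`, `attach_volume`, `rescue`, `unrescue`, `detach_volume`, `server_status`, `volume_status`) are illustrative names only, not the real nova or tempest API, and a hypothetical `Conflict` exception standing in for the 409 response. The lowercase volume status strings are also an assumption.

```python
import time


class Conflict(Exception):
    """Hypothetical stand-in for an HTTP 409 from the compute API."""


def wait_for(get_status, wanted, timeout=60.0, interval=1.0):
    """Poll get_status() until it returns `wanted` or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == wanted:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is {status!r}, wanted {wanted!r}")
        time.sleep(interval)


def reproduce(client, timeout=60.0, interval=1.0):
    """Run steps 1-7; return True if the final detach completes."""

    def wait_server(server, status):
        wait_for(lambda: client.server_status(server), status, timeout, interval)

    def wait_volume(volume, status):
        wait_for(lambda: client.volume_status(volume), status, timeout, interval)

    server = client.create_server()              # step 1
    wait_server(server, "ACTIVE")
    volume = client.create_volume()              # step 2
    wait_volume(volume, "available")
    client.attach_volume(server, volume)         # step 3
    wait_volume(volume, "in-use")
    client.rescue(server)                        # step 4
    wait_server(server, "RESCUE")
    try:                                         # step 5: must fail with 409
        client.detach_volume(server, volume)
    except Conflict:
        pass
    else:
        raise AssertionError("detach in RESCUE state should be rejected")
    client.unrescue(server)                      # step 6
    wait_server(server, "ACTIVE")
    client.detach_volume(server, volume)         # step 7
    try:
        wait_volume(volume, "available")
    except TimeoutError:
        return False                             # volume stuck at in-use
    return True
```

With the behaviour described in this report, `reproduce()` would be expected to return False, because the volume falls back to in-use instead of becoming available.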

Chuck Short (zulcss) wrote :

Which version is this with? Which distro? Which virt driver are you using? What volume backend are you using? Also, can you attach the log files? Thanks
chuck

Changed in nova:
status: New → Incomplete
Matthew Treinish (treinish) wrote :

It's against grizzly, on Ubuntu (I was hitting it with tempest in CI). It was using libvirt. I'm not sure what the default volume backend in CI is. The logs are located here:

http://logs.openstack.org/24930/4/check/gate-tempest-devstack-vm-full/9883/logs/

I verified the state transition (in-use -> detaching -> in-use) by adding prints to the tempest tests and running locally.

Changed in nova:
milestone: none → grizzly-rc2
Changed in nova:
status: Incomplete → New
Chuck Short (zulcss)
Changed in nova:
status: New → Triaged
Thierry Carrez (ttx)
tags: added: grizzly-rc-potential
Changed in nova:
milestone: grizzly-rc2 → none
Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
Thierry Carrez (ttx)
tags: added: grizzly-backport-potential
removed: grizzly-rc-potential
Changed in nova:
assignee: nobody → Joe Gordon (jogo)
milestone: none → havana-1
Joe Gordon (jogo) wrote :

Matthew, I was unable to manually reproduce this bug. Working on reproducing it with tempest.

Joe Gordon (jogo) wrote :

From the nova-compute logs:

Failed to detach volume 31592e81-70f7-4b7d-b1f8-0da8c578129b from /dev/vdf

It looks like the problem is the disk doesn't appear at /dev/vdf right away.

Joe Gordon (jogo)
affects: nova → tempest
Changed in tempest:
milestone: havana-1 → none
Changed in nova:
status: New → In Progress
assignee: nobody → Joe Gordon (jogo)
milestone: none → havana-1
Joe Gordon (jogo) wrote :

This is a two part bug:

1) the tempest test was running incorrectly https://review.openstack.org/#/c/26166

2) a VM should only be allowed to enter rescue state when the attached volumes are in 'in-use' state

Joe Gordon (jogo) wrote :

Until now, nova has not checked the state of volumes before performing any action. If a VM is put into rescue mode before the volume is fully mounted, the volume doesn't reappear properly after the VM is unrescued. So I think this warrants querying cinder before going into rescue mode.
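
To make the idea concrete, here is a rough sketch of the kind of pre-rescue check being proposed. It is not the code from the nova change. `check_volumes_before_rescue`, `VolumeNotReady`, and the `get_volume_status` callable (standing in for a cinder query) are hypothetical names used only for illustration.

```python
class VolumeNotReady(Exception):
    """Hypothetical error: an attached volume is not in the 'in-use' state."""


def check_volumes_before_rescue(volume_ids, get_volume_status):
    """Refuse to rescue unless every attached volume reports 'in-use'.

    `get_volume_status` is an assumed callable that returns the current
    status string for a volume ID, for example by asking the volume service.
    """
    for volume_id in volume_ids:
        status = get_volume_status(volume_id)
        if status != "in-use":
            raise VolumeNotReady(
                f"volume {volume_id} is {status!r}, expected 'in-use'"
            )
```

A caller would run this check first and only request the rescue if no exception is raised, so a volume that is still attaching blocks the rescue instead of getting lost across the rescue and unrescue cycle.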

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/26562

Joe Gordon (jogo)
Changed in tempest:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/26166
Committed: http://github.com/openstack/tempest/commit/e4f2b2ec7805275559def1a085b1ead0b565ef73
Submitter: Jenkins
Branch: master

commit e4f2b2ec7805275559def1a085b1ead0b565ef73
Author: Joe Gordon <email address hidden>
Date: Thu Apr 4 15:54:10 2013 -0700

    Re-enable detach volume from unrescued VM

    Fix syntax around calling wait_for_volume_status. If don't include
    parameters on the same line, the function doesn't get evaluated.

    Fix bug 1158942

    Change-Id: I3ad3b8ee07f79bc440d9b26489777510340af2f9
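
The pitfall described in the commit message above can be shown in isolation. This is only an illustration of the Python behaviour, not the actual tempest code. The `wait_for_volume_status` below is a dummy that records its calls, and the volume ID is made up.

```python
calls = []


def wait_for_volume_status(volume_id, status):
    """Dummy stand-in that only records that it was called."""
    calls.append((volume_id, status))


# Broken form: the first line is a bare name lookup and the second line is
# a separate tuple expression, so the function is never called.
wait_for_volume_status
("vol-1", "in-use")
broken_calls = list(calls)

# Fixed form: the arguments start on the same line as the function name.
wait_for_volume_status("vol-1", "in-use")
fixed_calls = list(calls)
```

After the broken form `calls` is still empty, so a test written that way never actually waits for the volume to reach the expected state before moving on.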

Changed in tempest:
status: In Progress → Fix Released
Changed in nova:
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/26562
Committed: http://github.com/openstack/nova/commit/5ada427935a0664f6c2534163f9988fb85d7b6ca
Submitter: Jenkins
Branch: master

commit 5ada427935a0664f6c2534163f9988fb85d7b6ca
Author: Joe Gordon <email address hidden>
Date: Wed Apr 10 01:05:42 2013 +0000

    Prevent rescuing a VM with a partially mounted volume.

    If a VM goes into rescue mode with a partially mounted volume, the
    volume won't re-appear after the VM is unrescued.

    Fix bug 1158942

    Change-Id: I1e104236c41c59e67a0f0e9ef26143c57f6e0094

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/grizzly)

Fix proposed to branch: stable/grizzly
Review: https://review.openstack.org/28116

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-1 → 2013.2
Alan Pevec (apevec)
tags: removed: grizzly-backport-potential