Add capability to detach root device volume of an instance, when in shutoff state

Bug #1396965 reported by swapnil
This bug affects 47 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Confirmed
Wishlist
Unassigned

Bug Description

Currently we cannot detach the root device volume, even if the instance is in shutoff state. The following error occurs:
+++
ERROR (Forbidden): Can't detach root device volume (HTTP 403) (Request-ID: req-57159c1c-5835-4a44-8e41-1b822b92127e)
+++

When the instance is in shutoff state, this operation should be allowed.

Tags: compute
swapnil (swap-kamble)
Changed in nova:
assignee: nobody → swapnil (swap-kamble)
Revision history for this message
Alex Xu (xuhj) wrote :

Not sure we can; the Forbidden error was added by https://bugs.launchpad.net/nova/+bug/1279300

melanie witt (melwitt)
Changed in nova:
importance: Undecided → Wishlist
status: New → Opinion
Revision history for this message
Robert Collins (lifeless) wrote :

I suspect we might want to reopen this: a shutoff VM has no reason to prevent root device changes (and see also bug 1391196, where a user is requesting we reinstate the ability to make changes like one could previously).

swapnil, can you give an example of where doing this makes sense? E.g., what do you want to accomplish by doing this?

Revision history for this message
lindis (t-openstack) wrote :

A change to allow detaching a root volume from an instance makes sense when the following happens:

- cinder snapshots are being created at a regular interval for a nova instance running Linux
- cinder creates a volume from a snapshot
- the volume needs to be attached to the nova instance that has the original volume on /dev/sda

Linux has a specific issue with /etc/udev/rules.d/70-persistent-net.rules, which stores MAC addresses.

If the restored volume is attached to a new nova instance, the new MAC address adds a second entry to the persistent-net rules, and the guest treats the interface like a second unconfigured device, thus breaking networking.
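
For illustration, the duplicate entry in the udev rules file looks roughly like this (the MAC addresses and interface names here are made up):

```
# /etc/udev/rules.d/70-persistent-net.rules (illustrative content)
# Original interface, recorded when the volume was first booted:
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="fa:16:3e:11:22:33", NAME="eth0"
# Added after attaching the restored volume to a new instance; the guest
# now expects its NIC on eth1, which has no network configuration:
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="fa:16:3e:44:55:66", NAME="eth1"
```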

The legitimate use case for this is local operational recovery, where a customer wants to roll back to a set of snapshots because they depend on the disks, having not yet written portable code or ephemeral applications.

Revision history for this message
apporc (appleorchard2000) wrote :

When the server the volume is attached to has failed, we may want to detach this root volume and delete it manually from the command line.

Revision history for this message
Max Krasilnikov (pseudo) wrote :

It would be a very useful feature when installing guest systems, like FreeBSD, on an instance, with no need to delete the instance after installing. My users prefer to install their systems by hand from their own ISO, not using my images.

Revision history for this message
Jörg Frede (frede-r) wrote :

This is very useful because it is the only way to reimage/rebuild a server: nova rebuild does not work if you are working with volumes. Sometimes you need to detach a volume to attach a new one. Please implement this.

Revision history for this message
melanie witt (melwitt) wrote :

I just marked bug 1391196 as a duplicate to group all the comments together. Also removing the assignee as this hasn't been updated in about 7 months.

Doing a little research about this: detaching the root device volume in shutoff state was disabled because it can damage the guest under certain conditions, rendering it unusable. From what I gathered in #openstack-nova, being able to detach the root device volume in shutoff state can be a useful ability, but adding it back should be done with careful checks around it, to prevent guest damage.

Changed in nova:
assignee: swapnil (swap-kamble) → nobody
Revision history for this message
Fabio Da Soghe (fabiodasoghe) wrote :

Happy to see someone paying attention to this issue.

Just to summarize here, for our use case the important things are:

1) no need to change/detach the root volume live, only in shutoff state or even a new "maintenance" state (if that could help with other problems)
2) destroying and creating another VM is not an option for us, because we have a Windows OS on board and changing the VM raises some serious problems: license loss and a cloudbase-init restart (the Windows equivalent of cloud-init), to name a few.

Revision history for this message
René Gallati (ren6) wrote :

I second the issue here. There are several use cases where having this ability is crucial for operation. Changing a MAC/IP address can be prohibitive in some environments. Consider setups where security is relevant and any change needs to go through a manual review process, or where you are using provider nets with interconnects to real existing devices (like an RSA token server) where you just can't have systems change their MAC and/or IP addresses.

You need the ability in such cases.

Here's a manual workaround (read: hack) to make it work. I am actually in the process of codifying this into our tooling as this is something we need and cannot wait another year until someone re-enables this.

I write this down here in the hope that it may help others stuck here. Note that perhaps not all these steps are necessary for you. They are for us, as we are booting from volumes generated from snapshotted images (on Ceph), i.e. nova show serverId shows "Attempt to boot from volume - no image supplied" in the image column.

- Detaching:
 first ensure the VM is stopped
 cinder reset-state --state available oldVolumeId
 mysql cinder -e 'delete from volume_attachment where volume_id = "oldVolumeId"'
 mysql cinder -e 'update volumes set attach_status = "detached" where id = "oldVolumeId"'
 mysql nova -e 'delete from block_device_mapping where not deleted and volume_id = "oldVolumeId"'
 nova volume-attach serverId newVolumeId
 cinder delete oldVolumeId
 mysql nova -e 'update block_device_mapping set device_name = "/dev/vda" where instance_uuid = "serverId"'

Now, very important: you need to execute this to regenerate the libvirt XML file (it would have bad disk references and would not start otherwise):

nova reboot --hard serverId

Now you should have the new volume mounted, the old one deleted (obviously omit that step if you need to keep it), and the VM booting with the new one.

Things that I noticed: prior to this process, nova volume-detach server volume fails with the known "root volume cannot be removed" message. After this procedure it works. For some reason nova thinks the volume is vdb (i.e. the second volume). You see this if you do a volume-detach then volume-attach, where it will always be displayed as vdb, and you will always need to update the device mapping to vda and run nova reboot --hard to make it start. If anyone knows where to kick nova so that it will accept this new volume as root / vda, I'll happily amend the procedure.

I hope this is useful information until this bug results in working volume-attach / detach for root volumes again.

Revision history for this message
René Gallati (ren6) wrote :

After working half the day on this issue on my local installation: if all you want to do is detach a root volume, two updates in MySQL are sufficient:

use cinder;
update volume_attachment set mountpoint='/dev/vdb' where volume_id = 'oldVolumeId' and deleted = 0;
use nova;
update block_device_mapping set device_name = '/dev/vdb', boot_index=1 where volume_id = 'oldVolumeId' and deleted = 0;

Basically, the mountpoint/device name must not be /dev/vda and boot_index must not be 0. Once that is true, you can successfully remove the volume using
nova volume-detach <instanceId> <volumeId>.

To attach another volume, do the same thing in reverse after having used nova volume-attach. That is, set the name back to /dev/vda and set boot_index to 0 for the new volume. Basically:

use cinder;
update volume_attachment set mountpoint='/dev/vda' where volume_id = 'shinyNewVolumeId' and deleted = 0;
use nova;
update block_device_mapping set device_name = '/dev/vda', boot_index=0 where volume_id = 'shinyNewVolumeId' and deleted = 0;

Then again use the magic incantation:

nova reboot --hard <instanceId>

to force a fresh and correct libvirt config to be created for the instance.

Revision history for this message
Jörg Frede (frede-r) wrote :

Yes, that may work. But dirty workarounds are not the way to go. There is no valid reason yet why a volume cannot be detached from an instance while it is shut down. This check was implemented to protect an instance from being damaged by someone detaching the volume while the instance is running.

Revision history for this message
Paul Murray (pmurray) wrote :

This is more of a feature than a bug. Adding the ability to detach and attach boot volumes was discussed in the Liberty design summit here: https://etherpad.openstack.org/p/YVR-nova-contributor-meetup#43

I took an action then to write up a spec but got delayed, so it was pushed out to address in Mitaka, see: https://bugs.launchpad.net/nova/+bug/1396965

The notes in the etherpad above are very brief but sum up some of the problems that will need to be addressed. The good news is that the volume attach and detach operations are already there and work for volumes attached as root devices. If you change the conditional in detach_volume() in the volumes API, you will be allowed to do it. Attach requires a slight change to the naming to make sure the volume goes back as the root device.

The things to address have to do with notifications (in case it affects billing etc.), making operations on the instance safe, and how to set the state of the instance and the volumes. We intend to address all these as part of the spec. Please review it and see if it meets your needs.

Thanks,
paul
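
The conditional Paul mentions can be sketched roughly like this. This is a minimal illustration under stated assumptions, not nova's actual code: the names `check_detach`, `allow_stopped_root_detach`, and the dict-shaped block device mapping are all hypothetical; only the `boot_index == 0` convention and the Forbidden response come from the discussion above.

```python
# Illustrative sketch (NOT nova's actual implementation) of the guard that
# rejects detaching a root device volume, and how it might be relaxed for
# stopped instances, as proposed in this bug.

class Forbidden(Exception):
    """Stands in for the HTTP 403 the API returns."""

def check_detach(bdm, vm_state, allow_stopped_root_detach=False):
    """bdm: block device mapping as a dict; vm_state: e.g. 'active', 'stopped'.

    boot_index == 0 marks the root device (same convention as the
    block_device_mapping table edited in the workarounds above).
    """
    is_root = bdm.get("boot_index") == 0
    if is_root and not (allow_stopped_root_detach and vm_state == "stopped"):
        # Current behavior: refuse unconditionally for the root device.
        raise Forbidden("Can't detach root device volume")

# Today: raises Forbidden even when the instance is stopped.
# Proposed: with allow_stopped_root_detach=True and vm_state == "stopped",
# the detach is permitted.
```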

melanie witt (melwitt)
Changed in nova:
status: Opinion → Confirmed
Revision history for this message
Gregor (chef-h) wrote :

Here's another use case: we use the cinder "retype" function to relocate existing volumes to another storage backend. As expected, it only works for detached volumes. But since we cannot detach root volumes even from a shut-down instance, we cannot retype root volumes at all. Since the majority of the affected instances have only one volume (the root volume), the retype function is basically useless for them.

Revision history for this message
Vinicius G Ferreira (vini-g-fer) wrote :

Any news on this?
We had the same problem recently. We followed René Gallati's hack and it works, but this is not something that can be implemented in code.

Revision history for this message
Christoph Fiehe (fiehe) wrote :

I am confronted with the same issue today on Mitaka. I will follow René Gallati's workaround.

Revision history for this message
Thiago Carreira (tcanascimento) wrote :

There is a simpler workaround: just make a clone of the volume. It should work even if the VM is running (the only problem is possible data loss, so you'd better stop the VM before doing it).

For example:

openstack volume create --size 5 --source sourceVolume newVolumeName

This should create a 5 GB copy of the source volume. Just note that the size should be at least equal to the source volume's size, and that the attached source volume can be referenced by its ID or name (the ID is better).

This creates an exact copy of the attached volume, without killing the VM or hacking the cinder and nova databases. :-D

Revision history for this message
Tim Bell (tim-bell) wrote :

Cloning the volume, re-creating the guest VM with the right combination of metadata/flavors and finding the right combination of hostname is something that many of our end users would find quite difficult.

Revision history for this message
Thiago Carreira (tcanascimento) wrote :

Hello Tim Bell, you are right; but I think it is still a better solution than changing database tables.

In fact, the best solution is fixing the bug. The volume detach method has a --force option (which should cover this case), but it does not work.

The hack I wrote above can be easily implemented (as a workaround) in a bash script, or even in Python: just read the user credentials and call the OpenStack CLI in a subprocess from Python code, passing the command line. Again, this is a hack, not the best solution.

Best regards.

Revision history for this message
Preston L. Bannister (preston-bannister) wrote : Re: [Bug 1396965] Re: Add capability to detach root device volume of an instance, when in shutoff state

This bug (or misfeature) needs to be addressed.

If you only need to do this once, and the volume is not especially large, workarounds are quite practical.

If your volumes are large, or if you need to do this across hundreds or thousands (or more) of instances, you are going to put massive load on your storage (for no reason).

Your storage vendor will happily sell you a lot more/higher-performance storage. You might not be fond of this approach.

On one side you have a relatively cheap fix to the implementation. On the other side you have a massive cost to each and every cloud.


Revision history for this message
Christian Kujau (christiank) wrote :

I couldn't find the spec mentioned in this bug, so here it goes:

> Detach and attach boot volumes
> https://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/detach-boot-volume.html

It summarizes the problem and the use cases quite nicely, and even points to this very bug. At the end it says:

 > Release Name Description
 > Mitaka Introduced

Mitaka was released 2016-04-07 (and has been EOL since 2017-04-10), so - has this really been implemented? Will it ever be? With a nova-9.1.1 client installed here, I wasn't able to detach the boot volume either.

Revision history for this message
Ameed Ashour (ameeda) wrote :

Hi Christian,

May I take this bug and try to solve it?

Regards
Ameed

Revision history for this message
Christian Kujau (christiank) wrote :

@Ameed: I don't understand - do you want to close this bug, or implement this feature so that boot volumes can be detached in the future? On an Ubuntu 17.10 installation, I still cannot detach the root volume:

$ dpkg -s python3-novaclient | grep Version
Version: 2:9.1.0-0ubuntu1

$ cinder show testvol | egrep boot\|\ id
| bootable | true |
| id | f825ecb1-6e78-4ae6-9995-00df9b8d08bd |

$ nova --debug volume-detach test f825ecb1-6e78-4ae6-9995-00df9b8d08bd
[...]
    return self.request(url, 'DELETE', **kwargs)
  File "/usr/lib/python3/dist-packages/novaclient/client.py", line 83, in request
    raise exceptions.from_response(resp, body, url, method)
novaclient.exceptions.Forbidden: Can't detach root device volume (HTTP 403) (Request-ID: req-f7dfc93d-563b-4bb0-b366-b1c3ae7baa1e)
ERROR (Forbidden): Can't detach root device volume (HTTP 403) (Request-ID: req-f7dfc93d-563b-4bb0-b366-b1c3ae7baa1e)

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is likely superseded by the blueprint to rebuild a volume-backed server with a new image now:

https://blueprints.launchpad.net/nova/+spec/volume-backed-server-rebuild

Revision history for this message
HT (h5t4) wrote :

#!/bin/bash
#migrate VM root and/or non-root gpfs to rbd-ec (ceph)
#tested on OS 'Train' release, use with extreme care (DB dump + Volume backups)
#https://bugs.launchpad.net/nova/+bug/1396965

#2020, High Performance Computing Center, University of Tartu, https://hpc.ut.ee

vm_uuid="${1}"
new_backend=rbd-ec

migrate_root() {
  echo "Prepare root volume for migration"
  dev="$(openstack volume show -c attachments -f value ${root_disk_uuid} | grep -P -m 1 -o "/dev/[s]?[v]?da")"
  cinder reset-state --state available ${root_disk_uuid}
  mysql -e "delete from cinder.volume_attachment where volume_id=\"${root_disk_uuid}\" limit 1;"
  mysql nova -e "delete from block_device_mapping where not deleted and volume_id=\"${root_disk_uuid}\";"

  echo "Migrate root volume ${dev}, ${root_disk_uuid}"
  cinder retype --migration-policy on-demand ${root_disk_uuid} ${new_backend}
  while [ "$(openstack volume show -c migration_status -f value ${root_disk_uuid})" != "success" ]; do sleep 5; echo -n "."; done
  nova volume-attach ${vm_uuid} ${root_disk_uuid} >/dev/null; sleep 15
  mysql nova -e "update block_device_mapping SET device_name=\"${dev}\" where volume_id=\"${root_disk_uuid}\";"
  mysql nova -e "update block_device_mapping set boot_index=\"0\" where volume_id=\"${root_disk_uuid}\";"
  mysql cinder -e "update volume_attachment set mountpoint=\"${dev}\" where volume_id=\"${root_disk_uuid}\";"
  src="$(mysql cinder -sNe "select id from volumes where display_description=\"migration src for ${root_disk_uuid}\";")"
  mysql cinder -e "UPDATE volume_attachment SET connector = JSON_SET(connector, '$.mountpoint', \"${dev}\") WHERE volume_id=\"${root_disk_uuid}\";"

  echo -e "\nManually delete leftover migration source (unknown OS bug)."
  cinder reset-state --state error "${src}"
  cinder reset-state --reset-migration-status "${src}"
  mysql cinder -sNe "update volumes set attach_status='detached' where id=\"${src}\";"
  cinder delete "${src}"
}

echo -e "Detect if any migration is needed at all."
vol_list=$(openstack server show -c volumes_attached -f value ${vm_uuid} | awk -F "'" '{print $2}')
if [ -z "$(for i in ${vol_list}; do echo -n "$(openstack volume show ${i} -c type -f value | grep -v "${new_backend}")"; done)" ]; then
  echo "No illplaced volume detected"; exit 0;
fi

echo "VM stop"
nova stop ${vm_uuid}
while [ "$(openstack server show ${vm_uuid} -c 'OS-EXT-STS:power_state' -f value)" != "Shutdown" ]; do sleep 1; echo -n "."; done

echo -e "\nDetach all non-root disks before migration."
for i in ${vol_list}; do nova volume-detach "${vm_uuid}" "${i}" 2>&1 | grep root >/dev/null && root_disk_uuid=${i}; done

echo "Detect if root disk migration is needed."
if [ -n "${root_disk_uuid}" ] && [ "$(openstack volume show ${root_disk_uuid} -c type -f value)" != "${new_backend}" ]; then
  migrate_root
fi

echo "Migrate all non-root volumes."
for i in ${vol_list}; do
  if [ "${i}" != "${root_disk_uuid}" ] && [ "$(openstack volume show ${i} -c type -f value)" != "${new_backend}" ]; then
    cinder retype --migration-policy on-demand ${i} "${new_backend}"
    while [ "$(openstack volume show -c migration_status -f value ${i})" != "success" ]; do sleep 5; echo -n "."; done
  fi
done

echo -e "\nRe-generate new ...


Revision history for this message
Yusuf Güngör (yusuf2) wrote :

Hi, most professional cloud solutions support root volume detach; I think nova should support this feature too.

Revision history for this message
Markus Moren (markus-moren) wrote :

Hi!
Adding to this since it seems closely related: live resize of the root volume, more specifically being able to extend any attached volume whether in use or not, is also something that should be supported.
I can't find this in any release notes, but it once was possible, I believe in the Train release using a cinder microversion; the behavior was patched out in Ussuri.

Revision history for this message
Mohammad Fatemipour (m-fatemipour) wrote :

Hi,
Is there any update on this bug/feature/misfeature?
We want to retype a volume because the new volume type changes the front-end QoS; cinder allows this only when the volume is detached, and when it comes to the root volume there is no well-defined solution!

Revision history for this message
Yusuf Güngör (yusuf2) wrote :

Hi, this situation prevents restoring root volume backups too. Restoring a volume backup requires the volume state to be available.
