Ceph volumes attached to local deleted instance could not be correctly handled

Bug #1672624 reported by Zhenyu Zheng
This bug affects 3 people
Affects                    Status   Importance  Assigned to  Milestone
Cinder                     New      Undecided   Unassigned
OpenStack Compute (nova)   Invalid  Undecided   Unassigned

Bug Description

How to reproduce:
1. Launch an instance.
2. Create a volume with the Ceph backend.
3. Attach the volume created in step 2 to the instance.
4. Kill nova-compute.
5. Delete the instance; this will go through local_delete.
6. Check the volume status with "cinder list"; the volume is in "available" status.
7. Try to delete the volume; it fails:
2017-03-14 11:40:41.050 DEBUG oslo_messaging._drivers.amqpdriver received message with unique_id: 061b4f9b52aa425d97811c066133b170 from (pid=480) __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:215
2017-03-14 11:40:41.056 DEBUG cinder.coordination req-774b4680-d861-4e16-bad4-a032ff0b3579 None Lock "7c7d03d9-3244-4923-b72e-459677ee48aa-delete_volume" acquired by "delete_volume" :: waited 0.000s from (pid=480) _synchronized /opt/stack/cinder/cinder/coordination.py:300
2017-03-14 11:40:41.155 DEBUG cinder.volume.drivers.rbd req-774b4680-d861-4e16-bad4-a032ff0b3579 admin None connecting to ceph (timeout=-1). from (pid=480) _connect_to_rados /opt/stack/cinder/cinder/volume/drivers/rbd.py:299
2017-03-14 11:40:42.376 DEBUG cinder.volume.drivers.rbd req-774b4680-d861-4e16-bad4-a032ff0b3579 None volume has no backup snaps from (pid=480) _delete_backup_snaps /opt/stack/cinder/cinder/volume/drivers/rbd.py:660
2017-03-14 11:40:42.377 DEBUG cinder.volume.drivers.rbd req-774b4680-d861-4e16-bad4-a032ff0b3579 admin None Volume volume-7c7d03d9-3244-4923-b72e-459677ee48aa is not a clone. from (pid=480) _get_clone_info /opt/stack/cinder/cinder/volume/drivers/rbd.py:683
2017-03-14 11:40:42.382 DEBUG cinder.volume.drivers.rbd req-774b4680-d861-4e16-bad4-a032ff0b3579 None deleting rbd volume volume-7c7d03d9-3244-4923-b72e-459677ee48aa from (pid=480) delete_volume /opt/stack/cinder/cinder/volume/drivers/rbd.py:781
2017-03-14 11:40:42.570 DEBUG cinder.utils req-774b4680-d861-4e16-bad4-a032ff0b3579 admin None Failed attempt 1 from (pid=480) _print_stop /opt/stack/cinder/cinder/utils.py:780
...
2017-03-14 11:41:12.950 WARNING cinder.volume.drivers.rbd req-774b4680-d861-4e16-bad4-a032ff0b3579 admin None ImageBusy error raised while deleting rbd volume. This may have been caused by a connection from a client that has crashed and, if so, may be resolved by retrying the delete after 30 seconds has elapsed.
2017-03-14 11:41:12.955 ERROR cinder.volume.manager req-774b4680-d861-4e16-bad4-a032ff0b3579 admin None Unable to delete busy volume.

Tags: cells
summary: - Volumes attached to local deleted ceph volume could not be correctly
+ Ceph volumes attached to local deleted instance could not be correctly
handled
description: updated
affects: nova → cinder
affects: cinder → nova
Revision history for this message
John Garbutt (johngarbutt) wrote :

Adding the cells tag, as the cells work has added lots of new code paths for local delete cases. We are seeing errors around inconsistent handling of quotas, notifications and other things that hit this code path.

tags: added: cells
Revision history for this message
melanie witt (melwitt) wrote :

I couldn't reproduce this issue; I found that the volumes were correctly detached during the local delete.

$ git rev-parse HEAD
edf51119fa59ff8a3337abb9107a06fa33d3c68f

$ nova boot --flavor m1.nano --block-device source=image,id=59f7eeb3-700a-456f-8d3f-9dfc4cce797b,dest=volume,size=1,shutdown=remove,bootindex=0 --poll hi

+--------------------------------------+-------------------------------------------------+
| Property | Value |
+--------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | - |
| OS-EXT-SRV-ATTR:hostname | hi |
| OS-EXT-SRV-ATTR:hypervisor_hostname | - |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-nk9ygaf0 |
| OS-EXT-SRV-ATTR:root_device_name | - |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | aiGbhNhGC7nB |
| config_drive | |
| created | 2017-03-15T04:27:47Z |
| description | - |
| flavor | m1.nano (42) |
| hostId | |
| host_status | |
| id | 957b1fa8-0b9c-483d-a049-499cd4545188 |
| image | Attempt to boot from volume - no image supplied |
| key_name ...


Changed in nova:
status: New → Incomplete
Revision history for this message
melanie witt (melwitt) wrote :

Just tried another case with shutdown=preserve and found I couldn't delete the volume from cinder, as you described. So far, it looks like a problem in cinder? The volume showed as "available" and not attached to anything in cinder, then the "cinder delete" failed to complete.

$ nova boot --flavor m1.nano --block-device source=image,id=59f7eeb3-700a-456f-8d3f-9dfc4cce797b,dest=volume,size=1,shutdown=preserve,bootindex=0 --poll hi

+--------------------------------------+-------------------------------------------------+
| Property | Value |
+--------------------------------------+-------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | - |
| OS-EXT-SRV-ATTR:hostname | hi |
| OS-EXT-SRV-ATTR:hypervisor_hostname | - |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-SRV-ATTR:kernel_id | |
| OS-EXT-SRV-ATTR:launch_index | 0 |
| OS-EXT-SRV-ATTR:ramdisk_id | |
| OS-EXT-SRV-ATTR:reservation_id | r-1j6d6cne |
| OS-EXT-SRV-ATTR:root_device_name | - |
| OS-EXT-SRV-ATTR:user_data | - |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | M6onRYNPbEJ4 |
| config_drive | |
| created | 2017-03-15T04:37:29Z |
| description | - |
| flavor | m1.nano (42) |
| hostId | |
| host_status | |
| id | 48c9f377-52d4-46b9-bd6b-d9de2d0dc540 |
| image ...

Revision history for this message
Zhenyu Zheng (zhengzhenyu) wrote :

@melanie witt, Hi, thanks for the test, but I think you might not be doing it the same way:
LVM volumes can be handled correctly, but Ceph volumes can't, and from looking at the code
it seems likely that only the LVM driver is handled correctly. I will ask my colleague to
add our test log to the report.

Changed in nova:
status: Incomplete → New
Revision history for this message
Zhenyu Zheng (zhengzhenyu) wrote :

@melanie witt, Ah, it looks like you are using the Ceph backend. Yes, what we tried was to
create a volume first and then attach it to an instance; that should be equivalent to the
shutdown=preserve scenario?

And it is a Nova-Cinder issue: nova didn't disconnect the volume because the compute service was down, and
currently cinder can only handle LVM volumes in this kind of scenario. So it is probably a common
issue for all volume backends other than LVM.

Revision history for this message
wangxiyuan (wangxiyuan) wrote :

LVM works well because the LVM/iSCSI target code removes the export and the iSCSI target by itself:

https://github.com/openstack/cinder/blob/master/cinder/volume/targets/iscsi.py#L220

But the Ceph (rbd) driver does nothing in these hooks:

https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1019
https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/rbd.py#L1043

So when nova-compute is down and nova deletes the VM with the local delete logic, the VM on the hypervisor is still alive and the volume is still attached to it. Only once the VM is destroyed can the volume be deleted successfully.

Looking at the cinder third-party drivers, most of them leave these two functions empty (except dell_emc, dothill, hpe and so on), so I guess most of the other drivers have this problem as well (see the sketch after the paste link below).

reproduce:
http://paste.openstack.org/show/602801/
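
For illustration, a paraphrased sketch of the difference (not the exact upstream code behind the links above; the iSCSI-side helper methods are abbreviated): the iSCSI target class tears its own export down, while the rbd driver's export hooks do nothing, so nothing on the cinder side releases the image that the still-running domain holds open.

# Paraphrased sketch, not the exact upstream cinder code.

class ISCSITarget(object):
    """LVM/iSCSI path: tears its own export down."""

    def remove_export(self, context, volume):
        # Look up the target/LUN backing this volume and delete the iSCSI
        # target, which drops any initiator connections to the export.
        iscsi_target, lun = self._get_target_and_lun(context, volume)
        self.remove_iscsi_target(iscsi_target, lun, volume['id'], volume['name'])

    # _get_target_and_lun() and remove_iscsi_target() exist on the real class
    # and are omitted here.


class RBDDriver(object):
    """Ceph/rbd path: the export hooks are effectively no-ops."""

    def create_export(self, context, volume, connector):
        # Nothing to do: clients talk to the Ceph cluster directly.
        pass

    def remove_export(self, context, volume):
        # Also nothing to do, so a client that was never torn down (here, the
        # libvirt domain left behind by the local delete) still holds the rbd
        # image open and delete_volume() later fails with ImageBusy.
        pass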

Revision history for this message
Lee Yarwood (lyarwood) wrote :

To be clear, the issue here is that the domain is still active on the compute node and accessing the rbd volume so any attempt to delete the rbd volume itself fails.

I can't see how that's anything other than a failure by the operator to ensure that the original compute node has been fenced before attempting a local delete of an active instance.

Revision history for this message
Matthew Booth (mbooth-9) wrote :

Well, the operator hasn't done a local delete, they've just done a delete. Nova did a local delete because nova compute wasn't running. We also wouldn't want to fence in this case because, in general, we don't want to kill running instances unless we really have to. Nova compute being down isn't 'really have to' under most circumstances because it can just be restarted: an HA response should focus on trying to get it back up again. I don't think we can reasonably blame the operator here.

As you say, though, this does mean that an attempt to delete the volume fails because ceph won't allow us to delete a volume that still has an active connection.

Presumably when nova compute eventually comes back up we will clean up the running instance which was deleted. At this point, presumably the ceph volume can also be deleted. IOW, if we wait a bit for normal maintenance to happen, this will resolve itself automatically. I think that's a cloudy 'working as expected', especially for a weird edge case like this.

Revision history for this message
melanie witt (melwitt) wrote :

Confirmed that the default cleanup periodic task in compute will destroy locally deleted domains when nova-compute is brought back up. The default action is "reap", which deletes orphaned domains, and the periodic task runs every 30 minutes by default. It doesn't run immediately when nova-compute starts, so it takes 30 minutes after nova-compute comes back up for the locally deleted instance to be reaped.

Afterward, the volume can be deleted from cinder.

The config options for cleaning up locally deleted instances are:

    cfg.StrOpt("running_deleted_instance_action",
        default="reap",
        choices=('noop', 'log', 'shutdown', 'reap'),
        help="""
The compute service periodically checks for instances that have been
deleted in the database but remain running on the compute node. The
above option enables action to be taken when such instances are
identified.

Possible values:

* reap: Powers down the instances and deletes them(default)
* log: Logs warning message about deletion of the resource
* shutdown: Powers down instances and marks them as non-
  bootable which can be later used for debugging/analysis
* noop: Takes no action
"""),

    cfg.IntOpt("running_deleted_instance_poll_interval",
        default=1800,
        help="""
Time interval in seconds to wait between runs for the clean up action.
If set to 0, above check will be disabled. If "running_deleted_instance
_action" is set to "log" or "reap", a value greater than 0 must be set.

Possible values:

* Any positive integer in seconds enables the option.
* 0: Disables the option.
* 1800: Default value.
"""),

    cfg.IntOpt("running_deleted_instance_timeout",
        default=0,
        help="""
Time interval in seconds to wait for the instances that have
been marked as deleted in database to be eligible for cleanup.

Possible values:

* Any positive integer in seconds(default is 0).
""")
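
Roughly, the periodic task applies the configured action like this (a simplified, illustrative sketch based on the option descriptions above, not the actual code in nova's compute manager):

# Simplified, illustrative sketch of the cleanup behaviour described above;
# names and flow are paraphrased, not copied from nova.compute.manager.
def cleanup_running_deleted_instances(driver, running_deleted_instances,
                                      action="reap"):
    for instance in running_deleted_instances:
        if action == "noop":
            continue
        elif action == "log":
            print("Instance %s is DELETED in the DB but still present on the "
                  "host" % instance)
        elif action == "shutdown":
            # Power the domain off but keep it around for debugging/analysis.
            driver.power_off(instance)
        elif action == "reap":
            # Destroy the orphaned domain; once it is gone nothing holds the
            # rbd image open and "cinder delete" succeeds.
            driver.destroy(instance)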

n-cpu.log:

2017-03-15 16:54:44.939 DEBUG oslo_service.periodic_task [req-24c2664b-fe22-44a8-8e72-0b809adf50f1 None None] Running periodic task ComputeManager._cleanup_running_deleted_instances from (pid=14524) run_periodic_tasks /usr/local/lib/python2.7/dist-packages/oslo_service/periodic_task.py:215
2017-03-15 16:54:44.947 DEBUG oslo_messaging._drivers.amqpdriver [req-24c2664b-fe22-44a8-8e72-0b809adf50f1 None None] CALL msg_id: b7adb6d039274979a8c970f83bbef78f exchange 'nova' topic 'conductor' from (pid=14524) _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:442
2017-03-15 16:54:44.991 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: b7adb6d039274979a8c970f83bbef78f from (pid=14524) __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:299
2017-03-15 16:54:44.995 INFO nova.compute.manager [req-24c2664b-fe22-44a8-8e72-0b809adf50f1 None None] [instance: 67244c6e-7e04-4bf1-8dc2-06d6dcfb9a89] Destroying instance with name label 'instance-00000002' which is marked as DELETED but still present on host.
2017-03-15 16:54:44.996 DEBUG oslo_messaging._drivers.amqpdriver [req-24c2664b-fe22-44a8-8e72-0b809adf50f1 None None] CALL msg_id: 8094e4320a41471b9e8d928c1fd23ef8 exchange 'nova' topic 'conductor' from (pid=14524) _send /usr/loca...


Revision history for this message
melanie witt (melwitt) wrote :

This behavior is expected in the scenario where nova-compute is brought down but the instance domain is still running and keeping the volume busy. Usually in practice, nova-compute should only be down in the case of a failed compute host or one that has been properly fenced by the operator. However, if nova-compute goes down temporarily and an instance is local deleted, the instance domain will be reaped by the nova-compute periodic task after nova-compute comes back up. After the domain is reaped, the volume can be successfully deleted from cinder.

Changed in nova:
status: New → Invalid
Revision history for this message
Zhenyu Zheng (zhengzhenyu) wrote :

This may also affect the "evacuate" operation.
