nova soft-delete leaves "Attached to None" volumes

Bug #1560300 reported by apporc on 2016-03-22
This bug affects 3 people
Affects: OpenStack Compute (nova)
Importance: Low
Assigned to: Anusha Unnam

Bug Description

1. version

`git log -1`
c4763d46fe76c524363a0cf55d1e8afe4bd23f53

This is the version I used to test on my devstack, but as far as I know this bug has existed since at least the Juno release.

2. Relevant log:

When an instance booted from a volume is soft-deleted, a line like this appears in nova-compute.log:

WARNING nova.compute.manager [req-7bbc1701-fbce-41bc-8182-b2cbb6e5ac93 None None] [instance: a3645529-6b11-437e-b1e4-773e87db7223] Ignoring EndpointNotFound: The service catalog is empty.

This is because nova-compute reclaims instances in a separate thread that uses an admin context. Since the admin context has no service_catalog, nova-compute raises EndpointNotFound when it tries to detach the volume.
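The failure mode described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not nova's actual code: `Context`, `lookup_endpoint`, `detach_volume`, `reclaim`, the catalog entry format, and the example URL are all hypothetical stand-ins, and `EndpointNotFound` here is a local class that merely borrows the name from the log line.

```python
class EndpointNotFound(Exception):
    """Local stand-in named after the exception in the log line."""


class Context:
    """Hypothetical minimal request context holding a service catalog."""

    def __init__(self, is_admin, service_catalog=None):
        self.is_admin = is_admin
        self.service_catalog = service_catalog or []


def lookup_endpoint(context, service_type):
    """Return the URL for service_type, or raise EndpointNotFound."""
    if not context.service_catalog:
        raise EndpointNotFound("The service catalog is empty.")
    for entry in context.service_catalog:
        if entry["type"] == service_type:
            return entry["url"]
    raise EndpointNotFound("no endpoint for %s" % service_type)


def detach_volume(context, volume):
    """Mark the volume available only if the endpoint can be resolved."""
    lookup_endpoint(context, "volume")
    volume["status"] = "available"


def reclaim(context, volume):
    """Mimic the reported behaviour: the failure is logged and ignored."""
    try:
        detach_volume(context, volume)
    except EndpointNotFound as exc:
        print("WARNING Ignoring EndpointNotFound: %s" % exc)


# A user request carries a catalog; the internally built admin context
# (assumed here to start with an empty catalog) does not.
user_ctx = Context(
    is_admin=False,
    service_catalog=[{"type": "volume", "url": "http://cinder.example/v2"}],
)
admin_ctx = Context(is_admin=True)

vol_a = {"id": "vol-a", "status": "in-use"}
vol_b = {"id": "vol-b", "status": "in-use"}
reclaim(user_ctx, vol_a)   # detach succeeds
reclaim(admin_ctx, vol_b)  # warning is printed, volume stays in-use
```

Because the exception is swallowed inside `reclaim`, the caller never learns that `vol_b` was left attached, which matches the symptom in this report.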

3. Reproduce steps:

(1) Set reclaim_instance_interval to a non-zero value in /etc/nova/nova.conf on both the nova-controller and nova-compute nodes,
e.g. reclaim_instance_interval=10. This enables the soft-delete feature.

(2) Create an instance with this:

nova boot --flavor xxx --block-device id=<image id>,source=image,dest=volume,size=<volume size>,bootindex=0 --nic net-id=<network uuid> test

(3) Delete the created instance:

nova delete test

(4) On the nova-compute node that hosted "test", a warning like this appears in nova-compute.log (note: wait until reclaim_instance_interval has elapsed; only then does nova-compute actually terminate the instance):

 "WARNING nova.compute.manager [req-7bbc1701-fbce-41bc-8182-b2cbb6e5ac93 None None] [instance: a3645529-6b11-437e-b1e4-773e87db7223] Ignoring EndpointNotFound: The service catalog is empty."

(5) If you list your volumes, one volume is still attached to the deleted "test" instance. On the dashboard, the volume info says "Attached to None".

(6) If you try to delete that volume with "cinder delete <volume id>", it reports that the volume cannot be deleted because it is in the attached status.
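The timing in steps (1) to (4) can be made concrete with a small Python sketch. It assumes that reclaim_instance_interval sits in the [DEFAULT] section and is measured in seconds, and that an instance becomes eligible for reclaim once that many seconds have passed since the delete request; the helper names `reclaim_interval` and `reclaim_due` are made up for illustration and are not nova functions.

```python
import configparser
from datetime import datetime, timedelta

# Fragment of a nova.conf as set up in step (1).
NOVA_CONF = """
[DEFAULT]
reclaim_instance_interval = 10
"""


def reclaim_interval(conf_text):
    """Read the interval in seconds; 0 means soft-delete is disabled."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getint("DEFAULT", "reclaim_instance_interval", fallback=0)


def reclaim_due(deleted_at, interval, now):
    """True once the interval has elapsed since the delete request."""
    if interval <= 0:
        return False
    return now - deleted_at >= timedelta(seconds=interval)


interval = reclaim_interval(NOVA_CONF)
deleted_at = datetime(2016, 3, 22, 12, 0, 0)  # arbitrary example timestamp

# Five seconds after "nova delete test" the instance is only soft-deleted.
print(reclaim_due(deleted_at, interval, deleted_at + timedelta(seconds=5)))
# After the interval, the reclaim job is expected to terminate it for real,
# which is the point where the warning in step (4) shows up.
print(reclaim_due(deleted_at, interval, deleted_at + timedelta(seconds=15)))
```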

4. Expected result:

The instance's volume is detached once the soft-deleted instance is reclaimed.

5. Actual result:

The volume is left attached and cannot be deleted.
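One way to spot volumes left in this state is to compare each in-use volume's attachments against the set of servers that still exist. The Python sketch below works on plain dictionaries; the record layout (a `status` field and an `attachments` list whose items carry a `server_id`) is an assumption made for illustration, as are the function name `stuck_volumes` and the sample IDs other than the instance UUID taken from the log line.

```python
def stuck_volumes(volumes, live_server_ids):
    """Return IDs of in-use volumes attached only to servers that are gone."""
    stuck = []
    for vol in volumes:
        if vol["status"] != "in-use":
            continue
        servers = {att["server_id"] for att in vol.get("attachments", [])}
        # Attached, but none of the referenced servers still exists.
        if servers and not (servers & live_server_ids):
            stuck.append(vol["id"])
    return stuck


# Hypothetical sample data: "vol-1" belonged to the reclaimed instance.
volumes = [
    {"id": "vol-1", "status": "in-use",
     "attachments": [{"server_id": "a3645529-6b11-437e-b1e4-773e87db7223"}]},
    {"id": "vol-2", "status": "in-use",
     "attachments": [{"server_id": "server-still-running"}]},
    {"id": "vol-3", "status": "available", "attachments": []},
]
live_servers = {"server-still-running"}

print(stuck_volumes(volumes, live_servers))
```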

Wenzhi Yu (yuywz) on 2016-03-22
Changed in nova:
assignee: nobody → Wenzhi Yu (yuywz)
Changed in nova:
assignee: Wenzhi Yu (yuywz) → apporc (appleorchard2000)
Wenzhi Yu (yuywz) wrote :

Hi apporc, are you working on this?

Yeah, I was working on this before I submitted this bug. Sorry, I forgot to
assign it to myself when I submitted it.

--
Regards,
apporc

Wenzhi Yu (yuywz) wrote :

Got it, that's OK.

Sylvain Bauza (sylvain-bauza) wrote :

That sounds like a design problem: should we detach the volume when soft-deleting the instance, or should we wait for the instance to be hard-deleted?

If we detach the volume, what happens if someone wants to resurrect their instance but the volume is attached to another instance?

Changed in nova:
status: New → Confirmed
importance: Undecided → Low
tags: added: volumes
tags: added: low-hanging-fruit
Wenzhi Yu (yuywz) wrote :

I agree with @Sylvain. I think it's more reasonable to detach the volume only once the instance has really been terminated, rather than when soft-deleting the instance.

apporc (appleorchard2000) wrote :

Currently nova already tries to detach the volume when the interval is up.
It is just that nova does not know where to send the detach request, as a result
of EndpointNotFound.

apporc (appleorchard2000) wrote :

I may need to clarify the scenario. This is not a design problem. Nova does not
try to detach the volume on soft-delete. It tries to do that when the reclaim
interval is up and the instance is really terminated. Before that, nova
leaves the volume attached to the instance, so if someone restores the
instance, there is no problem at all.

The behaviour of nova is just what you expected. But when nova
went to terminate this soft-deleted instance, it failed to detach the
volume. It ignored that exception, leaving the volume undeletable.

Fix proposed to branch: master
Review: https://review.openstack.org/296890

Changed in nova:
status: Confirmed → In Progress

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/296890
Reason: This code hasn't been updated in a long time, and is in merge conflict. I am going to abandon this review, but feel free to restore it if you're still working on this.

Anusha Unnam (anusha-unnam) wrote :

apporc,

Are you still working on this bug?

Actually no, you can assign it to yourself.


Changed in nova:
assignee: apporc (appleorchard2000) → Anusha Unnam (anusha-unnam)
Anusha Unnam (anusha-unnam) wrote :

A bug was already filed for this issue a year ago: https://bugs.launchpad.net/nova/+bug/1463856.
So I will mark this bug as a duplicate and submit a patch for that bug.
@apporc, I will continue the work you started here. Thanks.

apporc (appleorchard2000) wrote :

Good to hear that. You are welcome.

This bug has been marked a duplicate of bug 1463856:
Cinder volume isn't available after instance soft-deleted timer expired while volume is still attached
