Volume isn't detached after delete with reclaim_instance_interval

Bug #1555045 reported by Chung Chih, Hung
This bug affects 4 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Confirmed
Medium
hebin

Bug Description

Nova's reclaim_instance_interval option sets the interval, in seconds, for reclaiming soft-deleted instances.
If a user deletes a server that has attached volumes (or was booted from a volume), the server disappears from the server list.
Its volumes, however, remain in-use and still show "Attached to **** on /dev/vdb".
After the interval elapses, nova-compute begins to reclaim those servers, but it does not release their volumes.
As a result, the volumes end up showing "Attached to None on /dev/vdb".
Users have to reset the volume state before they can use the volumes again, but those volumes then show both "Attached to None on /dev/vdb" and "Attached to **** on /dev/vdb".

When reclaiming instances, we should not only delete them but also detach their volumes.

My environment was deployed by devstack which is upstream code.
devstack: 6fff3cc03589cb0fdf02b4bedf1c35bcb000f28d
nova-client: 3.3.0
nova: 3d7e403cc7a5d9ebcd9a011d6c2055bfbf56cb05
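The ordering the description asks for, detach first, then delete, can be shown as a minimal self-contained sketch. The `FakeVolumeAPI` and `reclaim_instance` names here are illustrative stand-ins, not Nova's actual API:

```python
class FakeVolumeAPI:
    """Stand-in for nova.volume.cinder.API that records detach calls."""

    def __init__(self):
        self.detached = []

    def detach(self, context, volume_id):
        self.detached.append(volume_id)


def reclaim_instance(context, instance, volume_api):
    """Detach every attached volume before deleting the instance,
    mirroring what the reclaim periodic task should do."""
    for bdm in instance.get('block_device_mapping', []):
        if bdm.get('volume_id'):
            volume_api.detach(context, bdm['volume_id'])
    instance['deleted'] = True
    return instance


api = FakeVolumeAPI()
inst = {'block_device_mapping': [{'volume_id': 'vol-1'},
                                 {'volume_id': None}]}
reclaim_instance(None, inst, api)
print(api.detached)  # volumes released before the delete
```

The point of the sketch is only the ordering: the volume API is called for each attached volume before the instance record is marked deleted, so no volume is left showing "Attached to None".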

Changed in nova:
assignee: nobody → Chung Chih, Hung (lyanchih)
Revision history for this message
Matt Riedemann (mriedem) wrote :

There are some patches floating around for forcing a detach of volumes, Andrea Rosa and Scott D'Angelo might be able to help find the latest there, but I think this is a bit different. As noted, the periodic task should detach volumes before deleting instances.

tags: added: volumes
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
tags: added: compute
Revision history for this message
Chung Chih, Hung (lyanchih) wrote :

I found that the reclaim periodic task does try to detach the volume.
But an exception is raised during the detach; the message is "Ignoring EndpointNotFound: The service catalog is empty."

description: updated
description: updated
Revision history for this message
Chung Chih, Hung (lyanchih) wrote :

Nova compute uses a periodic task to reclaim deleted instances.
The task tries to call the volume terminate API with an admin context.
The admin context does not contain a service catalog, so an exception is raised.
The volume detach operation is then interrupted.
The following sample reproduces it.

>>> from nova import context
>>> import nova.volume.cinder
>>> ctx = context.get_admin_context()
>>> nova.volume.cinder.cinderclient(ctx)
Traceback (most recent call last):
...
...
keystoneauth1.exceptions.catalog.EmptyCatalog: The service catalog is empty.

Revision history for this message
Sarafraj Singh (sarafraj-singh) wrote :

Lyanchih,
Are you working on the fix? Please change the status to In Progress if you are; otherwise change the assignee to nobody.

Changed in nova:
assignee: Chung Chih, Hung (lyanchih) → nobody
Revision history for this message
Enol Fernández (enolfc) wrote :

I'm also getting the "Ignoring EndpointNotFound: The service catalog is empty." message.
Has anyone found the reason for this?

Alvaro Lopez (aloga)
Changed in nova:
assignee: nobody → Alvaro Lopez (aloga)
Alvaro Lopez (aloga)
Changed in nova:
assignee: Alvaro Lopez (aloga) → nobody
Revision history for this message
Alvaro Lopez (aloga) wrote :

This bug is due to the delayed task being executed with an admin context, which does not have Keystone auth information, so the context does not contain a service catalog.

However, even when an operator sets the "CONF.cinder.endpoint_template" option, the context does not contain an auth token that could be used to interact with Cinder, so the delayed volume detach operation will fail with the following exception:

    Ignoring unknown exception for volume 4a8940a5-1d42-4f34-86e1-d949630d94fd: No valid authentication is available
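The failure mode can be reproduced without Nova at all: a context object built server-side carries no Keystone catalog, so any code that searches it for the block-storage endpoint finds nothing. This is a minimal sketch of that lookup; `find_cinder_endpoint`, `AdminContext`, and the catalog layout are illustrative, not the real cinderclient helper:

```python
class EmptyCatalog(Exception):
    """Mirrors keystoneauth1's EmptyCatalog error."""


def find_cinder_endpoint(context):
    """Look up a block-storage endpoint in the context's service
    catalog, as the cinderclient helper effectively does."""
    catalog = getattr(context, 'service_catalog', None) or []
    for entry in catalog:
        if entry.get('type') == 'volumev2':
            return entry['endpoints'][0]['publicURL']
    raise EmptyCatalog('The service catalog is empty.')


class AdminContext:
    # An admin context created inside a periodic task: there was no
    # user request, so no Keystone catalog was ever attached.
    service_catalog = []


try:
    find_cinder_endpoint(AdminContext())
except EmptyCatalog as exc:
    print(exc)  # The service catalog is empty.
```

A context that came from a real user request would carry the catalog and the lookup would succeed; the periodic task's admin context never does, which is why both the catalog-based lookup and, as noted above, the token-based auth fail.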

Alvaro Lopez (aloga)
Changed in nova:
assignee: nobody → Alvaro Lopez (aloga)
hebin (491309649-t)
Changed in nova:
assignee: Alvaro Lopez (aloga) → hebin (491309649-t)
status: Triaged → In Progress
hebin (491309649-t)
Changed in nova:
status: In Progress → Confirmed