Nova informs Placement too early upon Ironic instance deletion

Bug #1884217 reported by Arne Wiebalck
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Low
Assigned to: Unassigned

Bug Description

Description
===========

When an instance is deleted, it seems that Nova calls back into Placement and the corresponding resource provider becomes available again right away. For Ironic instances, however, the deletion is not instantaneous and the node is not available at this point, so creating a new instance on it will fail. This is fixed once the resource tracker comes along and corrects the information in placement, but with hundreds of nodes and the way the resource tracker handles them, this leaves a window of several minutes.
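
As a rough illustration of why that window can reach several minutes, assuming the resource tracker refreshes the nodes one after another (the node count and per-node cost below are made-up numbers, not measurements):

# Back-of-envelope estimate only; both numbers below are assumptions.
NODES = 300              # hypothetical number of ironic nodes on this compute
SECONDS_PER_NODE = 1.0   # assumed cost to refresh one node in placement
PERIODIC_INTERVAL = 60   # default interval of update_available_resource (s)

worst_case = PERIODIC_INTERVAL + NODES * SECONDS_PER_NODE
print("worst-case correction delay: %.0f s (~%.1f min)" % (worst_case, worst_case / 60))
# -> about 6 minutes with these assumed numbers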

Steps to reproduce
==================

Delete an instance and compare the state of the resource provider in placement with the provision state of the node (the same check can also be scripted; see the sketch after the output below):

# openstack baremetal node show --fit c55cb55d-46bb-404f-948e-f777fdef99ce --fields provision_state
+-----------------+------------+
| Field | Value |
+-----------------+------------+
| provision_state | clean wait |
+-----------------+------------+
# OS_PLACEMENT_API_VERSION=1.10 openstack allocation candidate list --resource CUSTOM_BAREMETAL_P1_xyz_S6045_C6_abc='1'
+---+------------------------------------------------+--------------------------------------+--------------------------------------------------+
| # | allocation | resource provider | inventory used/capacity |
+---+------------------------------------------------+--------------------------------------+--------------------------------------------------+
| 1 | CUSTOM_BAREMETAL_P1_xyz_S6045_C6_abc=1         | c55cb55d-46bb-404f-948e-f777fdef99ce | CUSTOM_BAREMETAL_P1_xyz_S6045_C6_abc=0/1 |
+---+------------------------------------------------+--------------------------------------+--------------------------------------------------+
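
A minimal openstacksdk sketch of that check (cloud name, node UUID and resource class are placeholders taken from the output above):

import openstack

# Sketch only: compare the ironic provision state with what placement
# offers for the same node. Cloud name, UUID and resource class are
# placeholders from the example output above.
conn = openstack.connect(cloud='mycloud')
node_uuid = 'c55cb55d-46bb-404f-948e-f777fdef99ce'
resource_class = 'CUSTOM_BAREMETAL_P1_xyz_S6045_C6_abc'

node = conn.baremetal.get_node(node_uuid)
resp = conn.session.get(
    '/allocation_candidates',
    endpoint_filter={'service_type': 'placement'},
    headers={'OpenStack-API-Version': 'placement 1.10'},
    params={'resources': '%s=1' % resource_class})
offered = node_uuid in resp.json().get('provider_summaries', {})

print('provision_state:', node.provision_state)   # e.g. "clean wait"
print('offered by placement:', offered)           # True right after the delete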

Expected result
===============

The resource provider should not become available in placement before the Ironic node has moved to provision state 'available'.

Actual result
=============

The resource provider is available in placement while the Ironic node is not yet in provision state 'available'.

Environment
===========

Ironic on Train, Nova on Stein

tags: added: ironic placement resource-tracker
Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

In theory, we first call _shutdown_instance() [1] before destroying the instance, which deallocates the resources [2].

When we call driver.destroy() in _shutdown_instance(), we asynchronously call the Ironic API to unprovision the node and hold until we are sure that the node is unprovisioned [3] (see the sketch after the links below).

[1] https://github.com/openstack/nova/blob/90777d7/nova/compute/manager.py#L2980
[2] https://github.com/openstack/nova/blob/90777d7/nova/compute/manager.py#L3013
[3] https://github.com/openstack/nova/blob/90777d7/nova/virt/ironic/driver.py#L1317
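
For reference, the ordering above looks roughly like this (a simplified paraphrase with illustrative helper names, not Nova's actual internals):

# Simplified paraphrase of [1]-[3]; helper names are illustrative only.
def delete_instance_flow(compute, context, instance, bdms):
    # [1] _shutdown_instance() -> driver.destroy(): for ironic this asks
    # the Ironic API to tear the node down and then holds until the
    # driver considers the node "unprovisioned" [3].
    compute.shutdown_instance(context, instance, bdms)
    # [2] only afterwards is the instance destroyed and its placement
    # allocation removed, which frees the resource provider -- so the
    # behaviour hinges on which states count as "unprovisioned" in [3].
    compute.complete_deletion(context, instance)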

Could you please check the compute logs and tell us whether the timings show that the instance was destroyed *before* the node was unprovisioned?

Changed in nova:
status: New → Incomplete
importance: Undecided → Low
Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

Thanks for having a look, Sylvain!

I have a question about the hold loop you mention in [3]: it looks to me like the driver only waits for the node to be in one of

 ironic_states.NOSTATE
 ironic_states.CLEANING
 ironic_states.CLEANWAIT
 ironic_states.CLEANFAIL
 ironic_states.AVAILABLE

to consider the node unprovisioned. Is my understanding correct?

If so, a node in CLEANWAIT, for instance, is not ready to be re-instantiated:
- cleaning could still take hours or days
- cleaning could fail
So, telling placement there is an allocation candidate for a node in this state is too early, I would think.
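
In code terms, I read the check roughly as follows (a paraphrase using ironic's state strings, not the actual driver code):

# Paraphrase of the wait predicate as I understand it -- not the real
# _wait_for_provision_state() code. State strings as in ironic_states.py.
UNPROVISIONED_STATES = {
    None,            # ironic_states.NOSTATE
    'cleaning',      # ironic_states.CLEANING
    'clean wait',    # ironic_states.CLEANWAIT
    'clean failed',  # ironic_states.CLEANFAIL
    'available',     # ironic_states.AVAILABLE
}

def looks_unprovisioned(node):
    # A node still in 'clean wait' (possibly for hours) or one that has
    # failed cleaning already counts as unprovisioned here, so its
    # resource provider is freed in placement right away.
    return node.provision_state in UNPROVISIONED_STATES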

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

The same is true for CLEANING, of course, as this is just the state before CLEANWAIT.
It is even worse for CLEANFAIL: the node will not become AVAILABLE any time soon.

But maybe I am just missing something :)

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Revision history for this message
James Bagwell (jimbagwell) wrote :

Hi, please re-open as this is still relevant in Train.

Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

I re-opened it by setting the status to New.

Sylvain: the bug was marked Incomplete, but I had added a question/comment related to your request for more information. Would you mind having another look and letting me know if you (still) need something from our side?

Changed in nova:
status: Expired → New
Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

I might not have the full context, but the code Sylvain linked to waits for the instance to be destroyed [1]. If we change the states we wait for in _unprovision(), we hold up the destruction of the instance object in the upper layers, thereby also holding up volume/network resources and instance quotas. Also, if the ironic node ends up in ironic_states.CLEANFAIL, the end-user-visible instance would stay in the DELETING state for a very long time.

I think that for ironic we need to decouple the "instance destroyed" state from the "resource is ready to be used again" state. E.g., _unprovision() could put a trait (e.g. CUSTOM_IRONIC_NEEDS_CLEANING) on the ironic node RP, and we could have a placement pre-filter that filters out nodes with that trait during scheduling.
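
Something along these lines, modelled on the existing pre-filters in nova/scheduler/request_filter.py (the trait name and this filter are hypothetical; they do not exist in Nova today):

# Hypothetical sketch only -- neither the trait nor this pre-filter
# exists in Nova; it mirrors how existing request filters add
# forbidden traits to the request spec.
NEEDS_CLEANING_TRAIT = 'CUSTOM_IRONIC_NEEDS_CLEANING'

def ironic_needs_cleaning_filter(ctxt, request_spec):
    # Tell placement to exclude any resource provider (ironic node RP)
    # that still carries the "needs cleaning" trait that _unprovision()
    # would set when it hands the node over to cleaning.
    request_spec.root_forbidden.add(NEEDS_CLEANING_TRAIT)
    return True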

I don't know how this happens:
"This is fixed once the resource tracker comes along and corrects the information in placement,"
but that logic might be extended to the _unprovision case to solve the issue.

Bottom line: I agree that this is possibly a bug (I haven't reproduced it myself), but the suggested solution needs further discussion.

I'll let others with more ironic knowledge confirm / triage this.

[1] https://github.com/openstack/nova/blob/90777d790d7c268f50851ac3e5b4e02617f5ae1c/nova/virt/ironic/driver.py#L1295-L1297

Revision history for this message
Chris Krelle (nobodycam) wrote :

We have encountered this in our production environment. It causes havoc with our event-driven scheduling system and leads to failed deployments daily.

Revision history for this message
Steve Baker (steve-stevebaker) wrote :

I'm going to propose that the destroy method doesn't return until the node state is AVAILABLE (also state CLEANFAIL will raise an exception). This means ironic CLEANING and CLEANWAIT will map to nova DELETING, which may last hours depending on the cleaning steps.

As far as I can tell from ironic/driver.py and nova/compute/manager.py, volume [1][2] and vif [3] cleanup happens before the state polling I intend to add (see the sketch after the links below), so we might be fine here unless Balazs is referring to other volume/network cleanup.

[1] https://opendev.org/openstack/nova/src/branch/master/nova/compute/manager.py#L3035
[2] https://opendev.org/openstack/nova/src/branch/master/nova/virt/ironic/driver.py#L507
[3] https://opendev.org/openstack/nova/src/branch/master/nova/virt/ironic/driver.py#L508
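
An illustrative sketch of the proposed behaviour (not the actual patch in the review; get_node stands in for the ironic client lookup):

import time

# Illustrative sketch of the proposed destroy()/_unprovision() behaviour,
# not the code under review. get_node is a placeholder for the ironic
# client call; state strings as in ironic_states.py.
def wait_until_available(get_node, node_uuid, poll_interval=15):
    while True:
        node = get_node(node_uuid)
        if node.provision_state == 'available':
            return  # the node can safely be offered by placement again
        if node.provision_state == 'clean failed':
            raise RuntimeError('cleaning failed for node %s' % node_uuid)
        # 'cleaning' / 'clean wait' keep the nova instance in DELETING,
        # possibly for hours, until the cleaning steps finish.
        time.sleep(poll_interval)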

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/813729

Changed in nova:
status: New → In Progress
Revision history for this message
melanie witt (melwitt) wrote :

> I don't know how this happens:
> "This is fixed once the resource tracker comes along and corrects the information in placement,"
> but that logic might be extended to the _unprovision case to solve the issue.

I think this is referring to how the resource tracker sets a node to reserved=True in update_provider_tree [1], during the update_available_resource periodic task in nova-compute, if the node info from ironic indicates the node is not available for provisioning (a paraphrase is sketched after the link below).

[1] https://github.com/openstack/nova/blob/d5b6412ef52b1e5ad797a49850c9c6701b0405db/nova/virt/ironic/driver.py#L876-L885
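
Paraphrased, the logic in [1] amounts to something like this (not the actual implementation; the readiness check is deliberately simplified):

# Paraphrase of the linked update_provider_tree() behaviour -- not the
# real code, and the readiness check below is deliberately simplified.
def build_node_inventory(node, resource_class):
    node_not_ready = node.provision_state != 'available'
    return {
        resource_class: {
            'total': 1,
            'reserved': 1 if node_not_ready else 0,  # hides the node from the scheduler
            'min_unit': 1,
            'max_unit': 1,
            'step_size': 1,
            'allocation_ratio': 1.0,
        },
    }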
