instance uuid not cleared in an ironic node

Bug #1596922 reported by Yushiro FURUKAWA
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: Hironori Shiina

Bug Description

Description
===========
In a nova-with-ironic environment, the instance UUID remains set on the ironic node if nova boot fails. As a result, the ironic node can neither be deleted through the API nor used to deploy another bare metal instance.

Steps to reproduce
==================
1. Create an ironic node.
2. Create an ironic port for the node.
3. Run nova boot for an ironic instance with the following illegal instance name (a client-side sketch of steps 1 and 2 follows the command):

   $ nova boot --flavor my-baremetal-agent --image $MY_IMAGE_UUID \
               --key-name default test3.141592 --nic net-id=$NET_UUID
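
A minimal python-ironicclient sketch of steps 1 and 2, assuming a devstack-style setup; the credentials, driver name, and MAC address below are placeholders, not values taken from this report:

    # Sketch only: register a node and a port with python-ironicclient.
    # Credentials, driver name and MAC address are placeholders.
    from ironicclient import client

    ironic = client.get_client(1,
                               os_username='admin',
                               os_password='secret',
                               os_tenant_name='admin',
                               os_auth_url='http://127.0.0.1:5000/v2.0')

    # Step 1: create the ironic node.
    node = ironic.node.create(driver='agent_ipmitool', name='test-node')

    # Step 2: create an ironic port (NIC) attached to that node.
    port = ironic.port.create(node_uuid=node.uuid,
                              address='52:54:00:12:34:56')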

Expected result
===============
Nova boot fails and the instance goes into the ERROR state. The tenant user then deletes the instance ($ nova delete <instance_uuid>). After that, the instance UUID of the ironic node should be cleared.

Actual result
=============
After $ nova delete <instance_uuid>, the instance UUID still remains in the ironic node database:

$ ironic node-delete 0ea7c2e3-e5be-4052-85ac-a7b1adf0f30b
Failed to delete node 0ea7c2e3-e5be-4052-85ac-a7b1adf0f30b: Node 0ea7c2e3-e5be-4052-85ac-a7b1adf0f30b is associated with instance 84911c1f-976b-4428-a509-8ab6cf04182a. (HTTP 409)
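
Until the fix below landed, recovering such a node meant clearing the stale association by hand. A rough sketch of that workaround with python-ironicclient, offered only as an illustration (whether maintenance mode is required depends on the ironic version and the node's provision state):

    # Hypothetical manual recovery: drop the stale instance_uuid so the
    # node can be deleted or deployed again.
    from ironicclient import client

    ironic = client.get_client(1,
                               os_username='admin',
                               os_password='secret',
                               os_tenant_name='admin',
                               os_auth_url='http://127.0.0.1:5000/v2.0')

    node_uuid = '0ea7c2e3-e5be-4052-85ac-a7b1adf0f30b'
    ironic.node.set_maintenance(node_uuid, 'true')
    # JSON-patch removing the leftover field set by nova.
    ironic.node.update(node_uuid, [{'op': 'remove', 'path': '/instance_uuid'}])
    ironic.node.set_maintenance(node_uuid, 'false')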

Environment
===========
* Devstack all-in-one
* Nova's source is from May 18, 2016, at the following commit:

commit fe8a119e8d80de35d7f99e0c1d9a9e5095840146
Merge: b56d861 6f2a46f
Author: Jenkins <email address hidden>
Date: Wed May 18 23:33:00 2016 +0000

    Merge "Remove unused base_options param from _get_image_defined_bdms"

* Hypervisor: ironic
* Networking: Neutron with the ML2 plugin (Open vSwitch and Linux bridge drivers)

Logs
====
2016-06-28 21:03:04.670 ERROR nova.scheduler.utils [req-7db0e0e5-5d18-4a5a-9293-52dd2e0f4351 admin admin] [instance: 84911c1f-976b-4428-a509-8ab6cf04182a] Error from last host: furukawa-dev-ironic (node 0ea7c2e3-e5be-4052-85ac-a7b1adf0f30b): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 1749, in _do_build_and_run_instance\n filter_properties)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 1939, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance 84911c1f-976b-4428-a509-8ab6cf04182a was re-scheduled: Invalid input for dns_name. Reason: 'test3.141592' not a valid PQDN or FQDN. Reason: TLD '141592' must not be all numeric.\n"]
2016-06-28 21:03:04.689 DEBUG oslo_messaging._drivers.amqpdriver [req-7db0e0e5-5d18-4a5a-9293-52dd2e0f4351 admin admin] sending reply msg_id: dea4917aad334ffda701e6cd23cf6a4c reply queue: reply_a163275e9821450cb6494257a3b9629f time elapsed: 0.0230762520805s from (pid=5506) _send_reply /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:74
2016-06-28 21:03:04.688 DEBUG oslo_messaging._drivers.amqpdriver [req-7db0e0e5-5d18-4a5a-9293-52dd2e0f4351 admin admin] CALL msg_id: 51218c1cd9ad403798f2d830d9d0abb3 exchange 'nova' topic 'scheduler' from (pid=5507) _send /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:450
2016-06-28 21:03:04.741 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 51218c1cd9ad403798f2d830d9d0abb3 from (pid=5507) __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:298
2016-06-28 21:03:04.743 WARNING nova.scheduler.utils [req-7db0e0e5-5d18-4a5a-9293-52dd2e0f4351 admin admin] Failed to compute_task_build_instances: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 200, in inner
    return func(*args, **kwargs)

  File "/opt/stack/nova/nova/scheduler/manager.py", line 104, in select_destinations
    dests = self.driver.select_destinations(ctxt, spec_obj)

  File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 74, in select_destinations
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.

Revision history for this message
Sean Dague (sdague) wrote :

It looks like the issue is solely that Ironic's cleanup should get called and do this here - https://github.com/openstack/nova/blob/4e62960722caaefd02f6fdc753176a7c117f6a18/nova/virt/ironic/driver.py#L838

Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Changed in nova:
assignee: nobody → Hironori Shiina (shiina-hironori)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/341253

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/341253
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0e24e9e2ec254364ffe029226b9ae5956002df54
Submitter: Jenkins
Branch: master

commit 0e24e9e2ec254364ffe029226b9ae5956002df54
Author: Hironori Shiina <email address hidden>
Date: Sun Jul 10 15:32:58 2016 +0900

    ironic: Cleanup instance information when spawn fails

    Instance information such as an instance_uuid set to an ironic node by
    _add_driver_fields() is not cleared when spawning is aborted by an
    exception raised before ironic starts deployment. Then, ironic node
    stays AVAILABLE state with instance_uuid set. This information is not
    cleared even if the instance is deleted. The ironic node cannot be
    removed nor deployed again because instance_uuid remains.

    This patch adds a method to remove the information. This method is
    called if ironic doesn't need unprovisioning when an instance is
    destroyed.

    Change-Id: Idf5191aa1c990552ca2340856d5d5b6ac03f7539
    Closes-Bug: 1596922
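
In outline, the change makes nova's ironic driver undo the fields it wrote onto the node when a spawn aborts before deployment begins. A simplified sketch of that cleanup using python-ironicclient directly; the function name and wiring are illustrative, not the exact code from the review:

    # Illustrative only: roughly the cleanup the merged change performs from
    # nova/virt/ironic/driver.py when spawn fails before ironic deploys.
    def remove_instance_info_from_node(ironic, node_uuid):
        """Undo the fields _add_driver_fields() set on the ironic node."""
        patch = [
            {'op': 'remove', 'path': '/instance_info'},
            {'op': 'remove', 'path': '/instance_uuid'},
        ]
        ironic.node.update(node_uuid, patch)

Per the commit message above, the real method is invoked from the destroy path when the node does not need unprovisioning.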

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/364369

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.0.0b3

This issue was fixed in the openstack/nova 14.0.0.0b3 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/mitaka)

Reviewed: https://review.openstack.org/364369
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=20ba099205d8c90334d830f53ff9bec254415265
Submitter: Jenkins
Branch: stable/mitaka

commit 20ba099205d8c90334d830f53ff9bec254415265
Author: Hironori Shiina <email address hidden>
Date: Sun Jul 10 15:32:58 2016 +0900

    ironic: Cleanup instance information when spawn fails

    Instance information such as an instance_uuid set to an ironic node by
    _add_driver_fields() is not cleared when spawning is aborted by an
    exception raised before ironic starts deployment. Then, ironic node
    stays AVAILABLE state with instance_uuid set. This information is not
    cleared even if the instance is deleted. The ironic node cannot be
    removed nor deployed again because instance_uuid remains.

    This patch adds a method to remove the information. This method is
    called if ironic doesn't need unprovisioning when an instance is
    destroyed.

    Change-Id: Idf5191aa1c990552ca2340856d5d5b6ac03f7539
    Closes-Bug: 1596922
    (cherry picked from commit 0e24e9e2ec254364ffe029226b9ae5956002df54)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 13.1.2

This issue was fixed in the openstack/nova 13.1.2 release.
