instances cannot be deleted if missing instance_info elements

Bug #1368984 reported by Adam Gandelman
This bug affects 2 people
Affects: Ironic
Status: Fix Released
Importance: Critical
Assigned to: Jim Rollenhagen

Bug Description

After migrating a running instance from nova-bm -> ironic, the instance cannot be deleted in nova. Doing so results in the following:

er': u'ironic', u'request_id': u'req-50ba4550-4df7-47dc-b34e-248856b24237', u'is_public_api': False, u'domain_id': u'default', u'tenant': u'service'} _safe_log /usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/common.py:177
2014-09-12 22:32:47.751 9395 DEBUG ironic.conductor.manager [-] RPC do_node_tear_down called for node 2700db7f-2b15-49b8-bb3a-2b29dae28a20. do_node_tear_down /usr/local/lib/python2.7/dist-packages/ironic/conductor/manager.py:537
2014-09-12 22:32:47.751 9395 DEBUG ironic.conductor.task_manager [-] Attempting to reserve node 2700db7f-2b15-49b8-bb3a-2b29dae28a20 reserve_node /usr/local/lib/python2.7/dist-packages/ironic/conductor/task_manager.py:179
2014-09-12 22:32:53.216 9395 DEBUG oslo.messaging._drivers.amqp [-] UNIQUE_ID is 223db7da5a20428680fcfd59bd13f3b1. _add_unique_id /usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqp.py:246
2014-09-12 22:32:53.217 9395 WARNING ironic.conductor.manager [-] Error in tear_down of node 2700db7f-2b15-49b8-bb3a-2b29dae28a20: Cannot validate iSCSI deploy. The following parameters were not passed to ironic: ['root_gb', 'image_source']
2014-09-12 22:32:53.221 9395 DEBUG oslo.messaging._drivers.amqp [-] UNIQUE_ID is 28b21ff6d18e4313aa89c8d2bdf37718. _add_unique_id /usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqp.py:246
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 455, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 212, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/ironic/conductor/manager.py", line 593, in _do_node_tear_down
    node.target_provision_state = states.NOSTATE
  File "/usr/local/lib/python2.7/dist-packages/oslo/utils/excutils.py", line 82, in __exit__
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/local/lib/python2.7/dist-packages/ironic/conductor/manager.py", line 584, in _do_node_tear_down
    task.driver.deploy.clean_up(task)
  File "/usr/local/lib/python2.7/dist-packages/ironic/drivers/modules/pxe.py", line 399, in clean_up
    pxe_info = _get_image_info(node, task.context)
  File "/usr/local/lib/python2.7/dist-packages/ironic/drivers/modules/pxe.py", line 223, in _get_image_info
    d_info = _parse_deploy_info(node)
  File "/usr/local/lib/python2.7/dist-packages/ironic/drivers/modules/pxe.py", line 140, in _parse_deploy_info
    info.update(iscsi_deploy.parse_instance_info(node))
  File "/usr/local/lib/python2.7/dist-packages/ironic/drivers/modules/iscsi_deploy.py", line 120, in parse_instance_info
    deploy_utils.check_for_missing_params(i_info, error_msg)
  File "/usr/local/lib/python2.7/dist-packages/ironic/drivers/modules/deploy_utils.py", line 408, in check_for_missing_params
    {'error_msg': error_msg, 'missing_info': missing_info})
MissingParameterValue: Cannot validate iSCSI deploy. The following parameters were not passed to ironic: ['root_gb', 'image_source']

Tags: pxe
summary: - intsances cannot be deleted after migration from nova-bm
+ instances cannot be deleted after migration from nova-bm, missing
+ instance_info elements
Adam Gandelman (gandelman-a) wrote : Re: instances cannot be deleted after migration from nova-bm, missing instance_info elements

Two issues here caused by missing or incorrect instance_info data after running the ironic-nova-bm-migrate script:

* nova-bm uses root_mb and ironic uses root_gb, but no conversion happens. The same issue applies to ephemeral_mb vs ephemeral_gb, though that wouldn't cause problems here.

* 'image_source' is missing entirely from the node's instance_info; it was never migrated to begin with. The nova-bm table contains no reference to the image associated with booted instances. To migrate this from nova along with the other data, the migration script would also need to query the nova database directly.
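
For illustration only, the unit conversion described in the first point might look something like the sketch below. The helper name convert_instance_info is hypothetical and is not taken from the ironic-nova-bm-migrate script; rounding up to the next whole GB is also an assumption, since the report does not say how partial gigabytes should be handled.

```python
import math


def convert_instance_info(bm_info):
    """Hypothetical helper: rename nova-bm *_mb sizes to ironic *_gb sizes.

    Assumption: partial gigabytes are rounded up so the converted
    value never describes less space than the original.
    """
    converted = dict(bm_info)  # leave the caller's dict untouched
    for mb_key, gb_key in (("root_mb", "root_gb"),
                           ("ephemeral_mb", "ephemeral_gb")):
        if mb_key in converted:
            size_mb = int(converted.pop(mb_key))
            converted[gb_key] = int(math.ceil(size_mb / 1024.0))
    return converted
```

Note that this only covers the size fields. It does nothing about the missing 'image_source', which, as described above, would have to come from the nova database.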

aeva black (tenbrae)
Changed in ironic:
importance: Undecided → High
milestone: none → juno-rc1
aeva black (tenbrae)
Changed in ironic:
status: New → Confirmed
aeva black (tenbrae) wrote :

Updated bug title to reflect duplicate "node-delete operation on failed node does not work"

tl;dr: the PXE driver cannot clean_up() if some parameters are missing from node.instance_info, leading to a failure in Nova. This situation can be the result of a failed launch or of a migration from nova "baremetal".
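
A rough sketch of the failure mode and one way around it, not the actual Ironic code: the exception class, the required-key tuple, and the return value of clean_up() below are stand-ins invented for illustration. Only the names MissingParameterValue, parse_instance_info, clean_up, root_gb and image_source come from the traceback above. The idea is that strict validation is right for deploy, while clean-up should proceed with whatever information is present.

```python
class MissingParameterValue(Exception):
    """Stand-in for the exception named in the traceback."""


# Assumption: the two keys listed in the error message are the required ones.
REQUIRED_KEYS = ("root_gb", "image_source")


def parse_instance_info(instance_info, required=REQUIRED_KEYS):
    """Strict parsing: raise if any required key is absent or None."""
    missing = [key for key in required
               if instance_info.get(key) is None]
    if missing:
        raise MissingParameterValue(
            "Cannot validate iSCSI deploy. The following parameters "
            "were not passed to ironic: %s" % missing)
    return dict(instance_info)


def clean_up(instance_info):
    """Hypothetical tolerant clean-up.

    Falls back to the raw instance_info when strict parsing fails,
    and returns the list of things it would remove.
    """
    try:
        info = parse_instance_info(instance_info)
    except MissingParameterValue:
        info = dict(instance_info)  # clean up whatever we can
    removed = []
    if info.get("image_source"):
        removed.append("image:%s" % info["image_source"])
    removed.append("pxe_config")  # always attempted
    return removed
```

With this shape, a node that failed to deploy or was migrated from nova "baremetal" can still be torn down: clean_up({}) skips the image step instead of raising.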

tags: added: pxe
summary: - instances cannot be deleted after migration from nova-bm, missing
- instance_info elements
+ instances cannot be deleted if missing instance_info elements
Changed in ironic:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/121615

Changed in ironic:
assignee: nobody → Jim Rollenhagen (jim-rollenhagen)
status: Confirmed → In Progress
Adam Gandelman (gandelman-a) wrote :

This also affects nodes with missing elements in driver_info, specifically pxe_deploy_kernel and pxe_deploy_ramdisk.

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/121615
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=985eba60edc339a1444e9206552eedb0e28758b6
Submitter: Jenkins
Branch: master

commit 985eba60edc339a1444e9206552eedb0e28758b6
Author: Jim Rollenhagen <email address hidden>
Date: Mon Sep 15 09:16:20 2014 -0700

    Allow clean_up with missing image ref

    This change allows clean_up to continue with missing image_source
    in Node.instance_info. This previously failed in cases where:

    * A node failed to deploy and so didn't have image_source
    * A node was migrated from nova-baremetal

    Change-Id: I03543adc74f9da83fff58e0cffe34f36a055e19a
    Closes-Bug: 1368984

Changed in ironic:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in ironic:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ironic:
milestone: juno-rc1 → 2014.2