failed to deploy overcloud node

Bug #1297063 reported by Robert Collins
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| Ironic | Invalid | Undecided | Unassigned | |

Bug Description

Hi, reporting a failure to deploy a node in tripleo devtest.

The undercloud deployed fine and was configured with three nodes, after which an overcloud deployment was started.

To reproduce
-------------------

pull these two reviews into your trees:
tie https://review.openstack.org/#/c/81627/
t-i https://review.openstack.org/#/c/81959/

and run the Ironic patch for a custom compute manager:

export DIB_REPOLOCATION_ironic=https://review.openstack.org/openstack/ironic
export DIB_REPOREF_ironic=refs/changes/37/82637/9

then run devtest.sh

basic state at the point of failure:
$ nova list
+--------------------------------------+-------------------------------------+--------+------------+-------------+---------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+-------------------------------------+--------+------------+-------------+---------------------+
| b4f40561-310e-48ef-bf92-be98176ecf00 | overcloud-NovaCompute0-isfjq7ko6z6k | ERROR | - | Shutdown | |
| ffe075d1-aff8-4a52-8989-661bfca083e0 | overcloud-NovaCompute1-7iupgrjzjf72 | ACTIVE | - | Running | ctlplane=192.0.2.22 |
| ec3114cc-4f47-4b33-a41e-3cab5705a5ff | overcloud-notCompute0-wh4cpdg3fuv3 | ACTIVE | - | Running | ctlplane=192.0.2.24 |
+--------------------------------------+-------------------------------------+--------+------------+-------------+---------------------+

$ nova show b4f40561-310e-48ef-bf92-be98176ecf00
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | undercloud |
| OS-EXT-SRV-ATTR:hypervisor_hostname | baffaf7c-7506-47a6-826b-01d1f126dddb |
| OS-EXT-SRV-ATTR:instance_name | instance-00000002 |
| OS-EXT-STS:power_state | 4 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | error |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2014-03-25T02:20:02Z |
| fault | {"message": "'HTTPInternalServerError' object has no attribute '__name__'", "code": 500, "details": " File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 296, in decorated_function |
| | return function(self, context, *args, **kwargs) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 2073, in run_instance |
| | do_run_instance() |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/openstack/common/lockutils.py\", line 249, in inner |
| | return f(*args, **kwargs) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 2072, in do_run_instance |
| | legacy_bdm_in_spec) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 1205, in _run_instance |
| | notify(\"error\", fault=e) # notify that build failed |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py\", line 68, in __exit__ |
| | six.reraise(self.type_, self.value, self.tb) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 1189, in _run_instance |
| | instance, image_meta, legacy_bdm_in_spec) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 1353, in _build_instance |
| | filter_properties, bdms, legacy_bdm_in_spec) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 1399, in _reschedule_or_error |
| | self._log_original_error(exc_info, instance_uuid) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py\", line 68, in __exit__ |
| | six.reraise(self.type_, self.value, self.tb) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 1394, in _reschedule_or_error |
| | bdms, requested_networks) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 2123, in _shutdown_instance |
| | requested_networks) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py\", line 68, in __exit__ |
| | six.reraise(self.type_, self.value, self.tb) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py\", line 2113, in _shutdown_instance |
| | block_device_info) |
| | File \"/opt/stack/venvs/nova/local/lib/python2.7/site-packages/ironic/nova/virt/ironic/driver.py\", line 503, in destroy |
| | if e.__name__ == 'InstanceDeployFailure': |
| | ", "created": "2014-03-25T02:20:33Z"} |
| flavor | baremetal (35ab9ef3-9126-413c-863f-5fba777a19f1) |
| hostId | c5a58c1e8e7c4d5a574376fa4760b8b40aeeb9e317565753def0e6db |
| id | b4f40561-310e-48ef-bf92-be98176ecf00 |
| image | overcloud-compute (a712ff92-a4c6-4678-bb2a-e53ee950657e) |
| key_name | default |
| metadata | {} |
| name | overcloud-NovaCompute0-isfjq7ko6z6k |
| os-extended-volumes:volumes_attached | [] |
| status | ERROR |
| tenant_id | 19a41ab4f8b941a68ef81a28281137bc |
| updated | 2014-03-25T02:29:38Z |
| user_id | 61cdc9080f6544ae9476dcd01f8f4309 |
+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(undercloud)(undercloud)robertc@lifelesswks:~/work$

$ ironic node-list
+--------------------------------------+--------------------------------------+-------------+--------------------+
| UUID | Instance UUID | Power State | Provisioning State |
+--------------------------------------+--------------------------------------+-------------+--------------------+
| baffaf7c-7506-47a6-826b-01d1f126dddb | b4f40561-310e-48ef-bf92-be98176ecf00 | power off | None |
| 9fb04171-8728-4941-bb91-944f761c6b50 | ec3114cc-4f47-4b33-a41e-3cab5705a5ff | power on | active |
| fedb1b9f-9584-472a-8a43-60655b067d85 | ffe075d1-aff8-4a52-8989-661bfca083e0 | power on | active |
+--------------------------------------+--------------------------------------+-------------+--------------------+

Robert Collins (lifeless) wrote :

Ironic API
192.0.2.2 - - [25/Mar/2014 02:20:27] "GET /v1/nodes/baffaf7c-7506-47a6-826b-01d1f126dddb/ports HTTP/1.1" 200 297
(wsme.api): 2014-03-25 02:20:27,865 WARNING Client-side error: Couldn't apply patch '[{'path': '/extra/vif_port_id', 'op': 'remove'}]'. Reason: u'vif_port_id'
192.0.2.2 - - [25/Mar/2014 02:20:27] "PATCH /v1/ports/db7222b9-474d-4b54-ad03-a6c0cf702da0 HTTP/1.1" 400 187
192.0.2.2 - - [25/Mar/2014 02:20:28] "GET /v1/nodes/baffaf7c-7506-47a6-826b-01d1f126dddb/ports HTTP/1.1" 200 297
192.0.2.2 - - [25/Mar/2014 02:20:28] "PATCH /v1/ports/db7222b9-474d-4b54-ad03-a6c0cf702da0 HTTP/1.1" 200 466
(wsme.api): 2014-03-25 02:20:28,883 ERROR Server-side error: "RPC do_node_deploy failed to validate deploy info. Error: Couldn't get the URL of the Ironic API service from the configuration file or keystone catalog.
Traceback (most recent call last):

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/common.py", line 423, in catch_client_exception
    return func(*args, **kwargs)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/conductor/manager.py", line 393, in do_node_deploy
    task.release_resources()

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/excutils.py", line 70, in __exit__
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/conductor/manager.py", line 378, in do_node_deploy
    "Error: %(msg)s") % {'msg': e})

InstanceDeployFailure: RPC do_node_deploy failed to validate deploy info. Error: Couldn't get the URL of the Ironic API service from the configuration file or keystone catalog.
". Detail:
Traceback (most recent call last):

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/wsmeext/pecan.py", line 77, in callfunction
    result = f(self, *args, **kwargs)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 231, in provision
    pecan.request.context, node_uuid, topic)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/conductor/rpcapi.py", line 185, in do_node_deploy
    topic=topic or self.topic)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/proxy.py", line 125, in call
    result = rpc.call(context, real_topic, msg, timeout)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/__init__.py", line 112, in call
    return _get_impl().call(CONF, context, topic, msg, timeout)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/impl_kombu.py", line 815, in call
    rpc_amqp.get_connection_pool(conf, Connection))

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/amqp.py", line 575, in call
    rv = list(rv)

  File "/opt/stack/venvs/ironic/local/lib/python2.7/site-packages/ironic/openstack/common/rpc/amqp.py", line 540, in __iter__
    raise result

InstanceDeployFailure_Remote: RPC do_node_deploy failed to validate deploy info. Error: Couldn't get the URL of the Ironic API ser...

description: updated
Adam Gandelman (gandelman-a) wrote :

Not sure what the cause of the bug is here, but the nova driver could use some better exception handling in destroy() to provide some useful debug pointers on the nova-compute side.
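
A minimal sketch of the attribute lookup behind the reported fault, assuming the driver only needs the exception's class name. The class HTTPInternalServerError below is a stand-in defined for illustration, and the helpers exception_name and is_deploy_failure are hypothetical; this is not the actual nova.virt.ironic driver code.

```python
class HTTPInternalServerError(Exception):
    """Stand-in for the client exception named in the fault (hypothetical)."""


def exception_name(exc):
    """Return the class name of an exception instance."""
    # __name__ lives on the class, not on the instance.
    return type(exc).__name__


def is_deploy_failure(exc):
    """Name-based check like the one in the traceback, read from the class."""
    return exception_name(exc) == 'InstanceDeployFailure'


err = HTTPInternalServerError("deploy failed")

# Reading __name__ from the instance raises AttributeError,
# which matches the message recorded in the instance fault.
try:
    err.__name__
except AttributeError as attr_err:
    print("AttributeError: %s" % attr_err)

print(exception_name(err))
print(is_deploy_failure(err))
```

With a lookup like this, an unexpected exception type would fall through the name comparison rather than raising a second error inside destroy(), so the original failure could still be logged on the nova-compute side.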

description: updated
aeva black (tenbrae) wrote :

New compute manager has landed in Ironic, and only one of the linked patch sets still remains:
  https://review.openstack.org/#/c/81959/

I can't tell from the bug description what the failure actually is, or where it's coming from. I agree with Adam that there needs to be better exception handling and logging in the nova.virt.ironic driver -- is that what this bug is really about? If so, please update the subject. If not, please clarify the description.

Thanks,

Changed in ironic:
status: New → Incomplete
Dmitry Tantsur (divius) wrote :

We cannot solve the issue you reported without more information. Could you please provide the requested information? In particular, "'HTTPInternalServerError' object has no attribute '__name__'" was already fixed.

Dmitry Tantsur (divius) wrote :

Hi! There has been no activity on this bug for a while, and we need more information to work on it. Please feel free to reopen it with more details if you still experience this problem.

Changed in ironic:
status: Incomplete → Invalid