Ugly output when registering nodes

Bug #1710685 reported by Ben Nemec
This bug affects 5 people
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Toure Dunnon
Milestone: (none listed)

Bug Description

I'm seeing the following behavior on master. It seems to be happening consistently when I register nodes, and while it doesn't seem to break anything it looks bad and could very easily cause users to think something went wrong.

$ openstack overcloud node import --provide instackenv.json
Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: b49129a9-18ad-443d-b326-7a1b3d3d53b4
Waiting for messages on queue '1c7ccb01-462b-40fb-b824-19f61a82b920' with no timeout.

Nodes set to managed.
Successfully registered node UUID 3cc25cf3-064a-4bb7-ae89-848f03b4dec2
Successfully registered node UUID 0040bc4d-f601-403d-933a-87473a4cf6f0
Started Mistral Workflow tripleo.baremetal.v1.provide. Execution ID: 263d89dd-cab8-4a84-b5c4-c466dd3d28f1
Waiting for messages on queue '1c7ccb01-462b-40fb-b824-19f61a82b920' with no timeout.
[{u'cpu_info': u'', u'manager': {u'api': {u'server_groups': None, u'keypairs': None, u'servers': None, u'server_external_events': None, u'server_migrations': None, u'agents': None, u'instance_action': None, u'glance': None, u'hypervisor_stats': None, u'virtual_interfaces': None, u'flavors': None, u'availability_zones': None, u'user_id': None, u'cloudpipe': None, u'os_cache': False, u'quotas': None, u'migrations': None, u'usage': None, u'logger': None, u'project_id': None, u'neutron': None, u'quota_classes': None, u'project_name': None, u'aggregates': None, u'flavor_access': None, u'services': None, u'list_extensions': None, u'limits': None, u'hypervisors': None, u'cells': None, u'versions': None, u'client': None, u'hosts': None, u'volumes': None, u'assisted_volume_snapshots': None, u'certs': None}}, u'free_disk_gb': 0, u'id': 1, u'service': {u'host': u'undercloud-test.localdomain', u'disabled_reason': None, u'id': 6}, u'local_gb_used': 0, u'memory_mb_used': 0, u'current_workload': 0, u'state': u'up', u'status': u'enabled', u'host_ip': u'12.0.0.3', u'hypervisor_hostname': u'3cc25cf3-064a-4bb7-ae89-848f03b4dec2', u'hypervisor_version': 1, u'disk_available_least': 0, u'local_gb': 0, u'free_ram_mb': 0, u'vcpus_used': 0, u'hypervisor_type': u'ironic', u'x_openstack_request_ids': [u'req-56331e62-bab4-4a0b-8c3c-12fe27e4d1a9'], u'memory_mb': 0, u'vcpus': 0, u'running_vms': 0, u'_info': {u'status': u'enabled', u'service': {u'host': u'undercloud-test.localdomain', u'disabled_reason': None, u'id': 6}, u'vcpus_used': 0, u'hypervisor_type': u'ironic', u'local_gb_used': 0, u'host_ip': u'12.0.0.3', u'hypervisor_hostname': u'3cc25cf3-064a-4bb7-ae89-848f03b4dec2', u'memory_mb_used': 0, u'memory_mb': 0, u'current_workload': 0, u'vcpus': 0, u'state': u'up', u'cpu_info': u'', u'running_vms': 0, u'free_disk_gb': 0, u'hypervisor_version': 1, u'disk_available_least': 0, u'local_gb': 0, u'free_ram_mb': 0, u'id': 1}, u'_loaded': True}, u"Failed to run action 
[action_ex_id=4b0a56dc-478c-4106-a333-7f6c82286d18, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'0040bc4d-f601-403d-933a-87473a4cf6f0'}']\n NovaAction.hypervisors.find failed: No Hypervisor matching {u'hypervisor_hostname': u'0040bc4d-f601-403d-933a-87473a4cf6f0'}. (HTTP 404)"]

Successfully set nodes state to available.

Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
milestone: pike-rc2 → queens-1
Dougal Matthews (d0ugal)
Changed in tripleo:
status: Triaged → Fix Released
Ben Nemec (bnemec) wrote :

I just hit this again on a fresh undercloud. What was the fix so I can see if my undercloud has it?

Current output:

Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: d87c3c6c-d39f-433d-835b-090de700575f
Waiting for messages on queue 'cd0d8c96-c559-4805-98e7-e4942c3288d0' with no timeout.

Nodes set to managed.
Successfully registered node UUID 6c872e14-2cdd-4032-a73b-f59608da42dd
Successfully registered node UUID 893925a8-6b65-492b-86b1-9e31a4753136
Successfully registered node UUID f4702345-cf56-46f7-b896-a4c156468102
Started Mistral Workflow tripleo.baremetal.v1.provide. Execution ID: 7caeb36a-8b6f-4446-9804-5f53c02bd2a5
Waiting for messages on queue 'cd0d8c96-c559-4805-98e7-e4942c3288d0' with no timeout.
[{u'cpu_info': u'', u'manager': {u'api': {u'server_groups': None, u'keypairs': None, u'servers': None, u'server_external_events': None, u'server_migrations': None, u'agents': None, u'instance_action': None, u'glance': None, u'hypervisor_stats': None, u'virtual_interfaces': None, u'flavors': None, u'availability_zones': None, u'user_id': None, u'cloudpipe': None, u'os_cache': False, u'quotas': None, u'migrations': None, u'usage': None, u'logger': None, u'project_id': None, u'neutron': None, u'quota_classes': None, u'project_name': None, u'aggregates': None, u'flavor_access': None, u'services': None, u'list_extensions': None, u'limits': None, u'hypervisors': None, u'cells': None, u'versions': None, u'client': None, u'hosts': None, u'volumes': None, u'assisted_volume_snapshots': None, u'certs': None}}, u'free_disk_gb': 0, u'id': 1, u'service': {u'host': u'undercloud-test.localdomain', u'disabled_reason': None, u'id': 6}, u'local_gb_used': 0, u'memory_mb_used': 0, u'current_workload': 0, u'state': u'up', u'status': u'enabled', u'host_ip': u'12.0.0.11', u'hypervisor_hostname': u'6c872e14-2cdd-4032-a73b-f59608da42dd', u'hypervisor_version': 1, u'disk_available_least': 0, u'local_gb': 0, u'free_ram_mb': 0, u'vcpus_used': 0, u'hypervisor_type': u'ironic', u'x_openstack_request_ids': [u'req-4f463f24-9cd1-49da-b66c-8fca5fcf2331'], u'memory_mb': 0, u'vcpus': 0, u'running_vms': 0, u'_info': {u'status': u'enabled', u'service': {u'host': u'undercloud-test.localdomain', u'disabled_reason': None, u'id': 6}, u'vcpus_used': 0, u'hypervisor_type': u'ironic', u'local_gb_used': 0, u'host_ip': u'12.0.0.11', u'hypervisor_hostname': u'6c872e14-2cdd-4032-a73b-f59608da42dd', u'memory_mb_used': 0, u'memory_mb': 0, u'current_workload': 0, u'vcpus': 0, u'state': u'up', u'cpu_info': u'', u'running_vms': 0, u'free_disk_gb': 0, u'hypervisor_version': 1, u'disk_available_least': 0, u'local_gb': 0, u'free_ram_mb': 0, u'id': 1}, u'_loaded': True}, u"Failed to run action 
[action_ex_id=68c5d70f-8ad6-4134-a5c1-a8a9705c7c90, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'893925a8-6b65-492b-86b1-9e31a4753136'}']\n NovaAction.hypervisors.find failed: No Hypervisor matching {u'hypervisor_hostname': u'893925a8-6b65-492b-86b1-9e31a4753136'}. (HTTP 404)", u"Failed to run action [action_ex_id=a9f3e777-de38-47b7-b483-2494d3b9d3c1, action_cls...


Changed in tripleo:
status: Fix Released → Triaged
Ben Nemec (bnemec)
Changed in tripleo:
milestone: queens-1 → queens-2
Changed in tripleo:
milestone: queens-2 → queens-3
Yolanda Robla (yolanda.robla) wrote :

In my case, I get it when executing openstack overcloud node provide --all-manageable. I see this error appearing constantly in the output, and it causes the command to hang. It throws that final error and then stops trying:

[u"Failed to run action [action_ex_id=cdbec835-efc5-4e80-9590-992ace0deb31, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'8e24a698-3956-4ad8-8723-f0bd7d083237'}']\n NovaAction.hypervisors.find failed: "]
[u"Failed to run action [action_ex_id=8f0de763-3644-4c48-8ae0-cddde09b2ed4, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'8e24a698-3956-4ad8-8723-f0bd7d083237'}']\n NovaAction.hypervisors.find failed: "]
{u'status': u'FAILED', u'message': [u"Failed to run action [action_ex_id=8f0de763-3644-4c48-8ae0-cddde09b2ed4, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'8e24a698-3956-4ad8-8723-f0bd7d083237'}']\n NovaAction.hypervisors.find failed: "], u'result': None}
[u"Failed to run action [action_ex_id=8034ad13-563c-46b4-ab6c-e43fe883794a, action_cls='<class 'mistral.actions.action_factory.NovaAction'>', attributes='{u'client_method_name': u'hypervisors.find'}', params='{u'hypervisor_hostname': u'8e24a698-3956-4ad8-8723-f0bd7d083237'}']\n NovaAction.hypervisors.find failed: "]

I could also see lots of failed Mistral executions:

mistral execution-list | grep "ERROR";
| ba40a872-9170-4b83-ae5b-c78d465051f3 | 87e20d75-4b86-438b-bb0d-7b574c41a4de | tripleo.validations.v1.copy_ssh_key | | <none> | ERROR | Failure caused by error i... | 2018-01-18 14:35:34 | 2018-01-18 14:35:36 |
| 1c0f01f1-0c73-4b4d-8779-c659dd8b9e1f | 17915115-e0b4-44ea-9b9b-c05e5dad5dfc | tripleo.baremetal.v1.provide_manageable_nodes | | <none> | ERROR | Failure caused by error i... | 2018-01-18 15:42:05 | 2018-01-18 15:58:21 |
| 41fc80fa-a6da-44e3-a66f-96325ee37755 | 2a29ba5b-eb15-4cae-9fad-da7149bbf54c | tripleo.baremetal.v1.provide | sub-workflow execution | f3e1bf76-ea9c-4d6a-a090-865f1eba9a58 | ERROR | None | 2018-01-18 15:42:06 | 2018-01-18 15:58:21 |
| b8aeb67f-6874-4721-a56b-56df59e17227 | b75ece3c-c0c9-40b4-98ec-dba1a19b8fe5 | tripleo.baremetal.v1.cellv2_discovery | sub-workflow execution | 04fcd7dc-cbf2-490a-975e-6b79b741cf0d | ERROR | None | 2018-01-18 15:42:09 | 2018-01-18 15:42:15 |
| 80bd4c22-9d4e-4fc2-b276-cee11876cdfa | b75ece3c-c0c9-40b4-98ec-dba1a19b8fe5 | tripleo.baremetal.v1.cellv2_discovery | sub-workflow execution | 04fcd7dc-cbf2-490a-975e-6b79b741cf0d | ERROR | None | 2018-01-18 15:42:44 | 2018-01-18 15:42:50 |
| 6a044171-4a60-4ab2-815a-19637c3971ae | b75ece...
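
A listing like the one above can be summarised per workflow with a short Python sketch. It assumes the pipe-delimited layout seen in the quoted rows, with the workflow name in the third column and the state in the sixth. The function name count_failed_workflows and the column positions are illustrative assumptions, not part of the mistral client.

```python
from collections import Counter

# Rows copied from the listing above; the last one is truncated in the report.
SAMPLE = """\
| ba40a872-9170-4b83-ae5b-c78d465051f3 | 87e20d75-4b86-438b-bb0d-7b574c41a4de | tripleo.validations.v1.copy_ssh_key | | <none> | ERROR | Failure caused by error i... | 2018-01-18 14:35:34 | 2018-01-18 14:35:36 |
| b8aeb67f-6874-4721-a56b-56df59e17227 | b75ece3c-c0c9-40b4-98ec-dba1a19b8fe5 | tripleo.baremetal.v1.cellv2_discovery | sub-workflow execution | 04fcd7dc-cbf2-490a-975e-6b79b741cf0d | ERROR | None | 2018-01-18 15:42:09 | 2018-01-18 15:42:15 |
| 80bd4c22-9d4e-4fc2-b276-cee11876cdfa | b75ece3c-c0c9-40b4-98ec-dba1a19b8fe5 | tripleo.baremetal.v1.cellv2_discovery | sub-workflow execution | 04fcd7dc-cbf2-490a-975e-6b79b741cf0d | ERROR | None | 2018-01-18 15:42:44 | 2018-01-18 15:42:50 |
| 6a044171-4a60-4ab2-815a-19637c3971ae | b75ece...
"""


def count_failed_workflows(table_text, state="ERROR"):
    """Count executions in the given state, keyed by workflow name."""
    counts = Counter()
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Assumed layout: index 2 is the workflow name, index 5 the state.
        # Truncated rows have too few cells and are ignored.
        if len(cells) > 5 and cells[5] == state:
            counts[cells[2]] += 1
    return counts


if __name__ == "__main__":
    for name, total in count_failed_workflows(SAMPLE).most_common():
        print(f"{total:3d}  {name}")
```

Grouping by workflow name makes it easier to see whether the failures come mostly from one internal sub-workflow rather than from the top-level command.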

Changed in tripleo:
milestone: queens-3 → queens-rc1
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Changed in tripleo:
milestone: rocky-1 → rocky-2
Toure Dunnon (toure)
Changed in tripleo:
assignee: nobody → Toure Dunnon (toure)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-tripleoclient (master)

Change abandoned by Toure Dunnon (<email address hidden>) on branch: master
Review: https://review.openstack.org/564585
Reason: the fix will go into the workbook.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/569811

Changed in tripleo:
milestone: rocky-2 → rocky-3
Harry Kominos (hkominos) wrote :

I don't know if this is only an output issue. The nodes just don't come up on aarch64.

Dougal Matthews (d0ugal) wrote :

Harry, that is likely a different bug - you might want to open a new one up?

Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
assignee: Toure Dunnon (toure) → Dougal Matthews (d0ugal)
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Dougal Matthews (d0ugal)
Changed in tripleo:
assignee: Dougal Matthews (d0ugal) → Toure Dunnon (toure)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by Alex Schultz (<email address hidden>) on branch: master
Review: https://review.openstack.org/569811
Reason: CI is broken and I asked for no approvals. I will restore when CI is fixed.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/569811
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=206dd5ddabdec5eefb9ed9e2efddaaea67fa14b5
Submitter: Zuul
Branch: master

commit 206dd5ddabdec5eefb9ed9e2efddaaea67fa14b5
Author: Toure Dunnon <email address hidden>
Date: Mon May 21 11:18:59 2018 -0400

    Clean up node registration output.

    The cell_v2_discover_hosts is an internal workflow
    and doesn't need to send messages. It is retried by
    the parent workflow as it is expected to fail, and
    when it does we don't want it to send error messages
    each time. which can be retried up to 30 times by
    the parent workflow.

    Change-Id: I4b8f8d153a8e8c2eabb9efadb52cf685e1f3759f
    Closes-Bug: #1710685
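
The idea in the commit message above (an inner step that is expected to fail and is retried by its parent should not report every failure) can be illustrated outside Mistral with a minimal Python sketch. This is not the workbook change itself: retry_quietly and make_flaky_discovery are hypothetical names, and the default of 30 attempts only mirrors the figure quoted in the commit message. Delays between attempts are omitted for brevity.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_quietly(action: Callable[[], T], max_attempts: int = 30,
                  report: Callable[[str], None] = print) -> T:
    """Run action until it succeeds, staying silent about intermediate failures."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return action()
        except Exception as exc:  # failure is expected on early attempts
            last_error = exc
    # Only the final, unrecovered failure is surfaced to the user.
    report(f"Giving up after {max_attempts} attempts: {last_error}")
    raise RuntimeError("action did not succeed") from last_error


def make_flaky_discovery(failures_before_success: int) -> Callable[[], str]:
    """Hypothetical stand-in for host discovery that fails a few times first."""
    state = {"calls": 0}

    def discover() -> str:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise LookupError("No Hypervisor matching node (HTTP 404)")
        return "discovered"

    return discover


if __name__ == "__main__":
    messages: List[str] = []
    result = retry_quietly(make_flaky_discovery(2), report=messages.append)
    print(result, messages)
```

With this shape, a run that succeeds on a later attempt produces no error output at all, and a run that never succeeds produces one message instead of one per attempt.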

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 10.0.0

This issue was fixed in the openstack/tripleo-common 10.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/644802

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/644804

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/rocky)

Reviewed: https://review.openstack.org/644802
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=e07dc084d6b52fb2ba430d0a7bbce51d2ff1ed79
Submitter: Zuul
Branch: stable/rocky

commit e07dc084d6b52fb2ba430d0a7bbce51d2ff1ed79
Author: Toure Dunnon <email address hidden>
Date: Mon May 21 11:18:59 2018 -0400

    Clean up node registration output.

    The cell_v2_discover_hosts is an internal workflow
    and doesn't need to send messages. It is retried by
    the parent workflow as it is expected to fail, and
    when it does we don't want it to send error messages
    each time. which can be retried up to 30 times by
    the parent workflow.

    Change-Id: I4b8f8d153a8e8c2eabb9efadb52cf685e1f3759f
    Closes-Bug: #1710685
    (cherry picked from commit 206dd5ddabdec5eefb9ed9e2efddaaea67fa14b5)

tags: added: in-stable-rocky
tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/queens)

Reviewed: https://review.openstack.org/644804
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=8315421b6d303aad31f2710b63c661d4b3b4f51c
Submitter: Zuul
Branch: stable/queens

commit 8315421b6d303aad31f2710b63c661d4b3b4f51c
Author: Toure Dunnon <email address hidden>
Date: Mon May 21 11:18:59 2018 -0400

    Clean up node registration output.

    The cell_v2_discover_hosts is an internal workflow
    and doesn't need to send messages. It is retried by
    the parent workflow as it is expected to fail, and
    when it does we don't want it to send error messages
    each time. which can be retried up to 30 times by
    the parent workflow.

    Conflicts:
     workbooks/baremetal.yaml

    Change-Id: I4b8f8d153a8e8c2eabb9efadb52cf685e1f3759f
    Closes-Bug: #1710685
    (cherry picked from commit 206dd5ddabdec5eefb9ed9e2efddaaea67fa14b5)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 9.6.0

This issue was fixed in the openstack/tripleo-common 9.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 8.7.0

This issue was fixed in the openstack/tripleo-common 8.7.0 release.
