[RFE] Provides conductor status and node association

Bug #1724474 reported by Kaifeng Wang
This bug affects 1 person
Affects   Status      Importance   Assigned to     Milestone
Ironic    Confirmed   Wishlist     Kaifeng Wang

Bug Description

When multiple ironic conductors are used in a deployment, nodes are distributed across the conductors by a hash ring, but there is no mechanism to find out which conductor is currently serving a node.

Sometimes we need to locate the specific conductor to diagnose the cause of a deploy failure, for example:
1. PXE boot failure
2. checking the deploy log sent by IPA

We also don't know how many conductors are alive. A workaround is to list drivers, but the information provided is from the drivers' perspective, not the conductors', and conductors that have not sent a heartbeat in time are filtered out.

The proposal is to add an API endpoint for conductor status, something like /v1/conductors and /v1/conductors/<hostname>.

We can get a full conductor list:

GET /v1/conductors

Host       Online  Last Seen
compute01  YES     2017-10-18 07:34:07
compute03  YES     2017-10-18 07:35:01
compute06  NO      2017-10-17 14:11:53

See which nodes are serviced by a given conductor:

GET /v1/conductors/compute01

online:    YES
last seen: 2017-10-18 07:34:07
nodes:     c5a379f1-12bf-4abd-9519-e874a9edbfdb
           095fcd72-5a97-4b82-abbc-2eea7fcc96a0
           e461db8a-5f82-4eb9-ac60-bbe4eb5c39f5

And add a conductor field to node show:

+-----------+--------------------------------------+
| Property  | Value                                |
+-----------+--------------------------------------+
...
| uuid      | eecc2a3e-4fa9-40fb-a84d-5a057f3df9e8 |
| conductor | compute01                            |
+-----------+--------------------------------------+
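
As a rough illustration of the proposed conductor list, the sketch below derives the Online column from each conductor's last heartbeat. It is only a sketch: the function name, the row layout, and the 120-second timeout are assumptions made for this example, not part of ironic, and the "current time" is fixed so the sample data above can be reused.

```python
from datetime import datetime, timedelta

# Hypothetical threshold for this example; a real deployment would take it
# from the conductor configuration.
HEARTBEAT_TIMEOUT = timedelta(seconds=120)


def conductor_list(heartbeats, now, timeout=HEARTBEAT_TIMEOUT):
    """Build one row per conductor, shaped like the proposed GET /v1/conductors."""
    rows = []
    for host, last_seen in sorted(heartbeats.items()):
        rows.append({
            "host": host,
            # A conductor counts as online if its last heartbeat is recent enough.
            "online": now - last_seen <= timeout,
            "last_seen": last_seen.isoformat(sep=" "),
        })
    return rows


# Sample heartbeat data taken from the listing in this report.
heartbeats = {
    "compute01": datetime(2017, 10, 18, 7, 34, 7),
    "compute03": datetime(2017, 10, 18, 7, 35, 1),
    "compute06": datetime(2017, 10, 17, 14, 11, 53),
}
now = datetime(2017, 10, 18, 7, 35, 30)  # assumed "current" time

rows = conductor_list(heartbeats, now)
for row in rows:
    online = "YES" if row["online"] else "NO"
    print(f'{row["host"]:<10} {online:<7} {row["last_seen"]}')
```

With the assumed timeout and time, compute01 and compute03 are reported as online and compute06 as offline, matching the example listing.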

Tags: needs-spec rfe
Ruby Loo (rloo) wrote :

Thanks for the RFE. This looks similar to "[RFE] Add service management API" [1], so I'm going to mark this as a duplicate. If it isn't similar, please unduplicate. If it is the same, please help by e.g. reviewing the proposed spec for that.

[1] https://bugs.launchpad.net/ironic/+bug/1526759

Changed in ironic:
importance: Undecided → Wishlist
Kaifeng Wang (kaifeng) wrote :

Hi Ruby, I took a peek at the spec you pointed to. It focuses on conductor service management, which is different from the proposal here.
I can help review it to see whether similar information can be provided by that spec; if so, this RFE would be covered.

Julia Kreger (juliaashleykreger) wrote :

Greetings. Looking at the specification for service management, this proposal seems to focus on providing insight into which node maps to which conductor. The addition seems to be the ability to provide insight into the heartbeat table, which is used for the hash ring. I don't necessarily see value in declaring whether a conductor is absolutely up or down. However, insight into whether the node is considered to be part of the hash ring at that moment, together with the resulting conductor-to-node mapping, would provide a great deal of information to assist in troubleshooting.

I think the simplest route would be to provide an API endpoint that returns the conductor-to-node mapping as calculated from the hash ring based on the current heartbeat data.
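
To make the idea concrete, here is a minimal sketch of computing a conductor-to-node mapping from a consistent hash ring built over the conductors that are currently heartbeating. This is not ironic's actual hash ring code: the function names, the SHA-256 hash, and the number of replicas per conductor are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    # Stable hash so the mapping is reproducible across processes.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


def build_ring(conductors, replicas=64):
    """Place `replicas` points per conductor on the ring, sorted by hash."""
    return sorted(
        (_hash(f"{host}-{i}"), host)
        for host in conductors
        for i in range(replicas)
    )


def conductor_for(ring, keys, node_uuid):
    """Pick the first conductor point at or after the node's hash, wrapping around."""
    idx = bisect.bisect_left(keys, _hash(node_uuid)) % len(ring)
    return ring[idx][1]


def map_nodes(online_conductors, node_uuids):
    """Return {conductor: [node uuids]} for conductors with a current heartbeat."""
    if not online_conductors:
        raise ValueError("no online conductors to build a hash ring from")
    ring = build_ring(online_conductors)
    keys = [point for point, _ in ring]
    mapping = {host: [] for host in online_conductors}
    for uuid in node_uuids:
        mapping[conductor_for(ring, keys, uuid)].append(uuid)
    return mapping


# Hosts and UUIDs reused from the examples in the bug description.
online = ["compute01", "compute03"]
nodes = [
    "c5a379f1-12bf-4abd-9519-e874a9edbfdb",
    "095fcd72-5a97-4b82-abbc-2eea7fcc96a0",
    "e461db8a-5f82-4eb9-ac60-bbe4eb5c39f5",
    "eecc2a3e-4fa9-40fb-a84d-5a057f3df9e8",
]
mapping = map_nodes(online, nodes)
for host, uuids in mapping.items():
    print(host, uuids)
```

Because only conductors with a recent heartbeat are placed on the ring, a conductor that stops heartbeating drops out of the mapping and its nodes are reassigned to the remaining conductors.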

As this would touch the API, we will need a specification. I don't think this ties in well with service management, since that would require a larger amount of work to provide uniformity and would be operationally contentious. Of course, it entirely depends on how it is implemented, which is why a spec would be helpful.

Ruby Loo (rloo)
tags: added: needs-spec
milan k (vetrisko)
Changed in ironic:
status: New → Confirmed
Kaifeng Wang (kaifeng)
Changed in ironic:
assignee: nobody → Kaifeng Wang (kaifeng)
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-specs (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/528158
