failed Ironic deploys can have incorrect hypervisor attribute in Nova

Bug #1341347 reported by Robert Collins
This bug affects 1 person
Affects                   Status     Importance  Assigned to  Milestone
Ironic                    Invalid    Undecided   Unassigned
OpenStack Compute (nova)  Won't Fix  Low         Unassigned

Bug Description

I just booted 46 nodes at once from a single all-in-one cloud (one Ironic conductor, Nova, Keystone, etc.).

After this, according to Ironic:

 - 1 node was in maintenance mode (see bug 1326279), 5 have instance_uuid None, and the rest are active.

But according to Nova:

 - 8 are in ERROR spawning:
(in nova) | eb0e1255-4da5-46cb-b8e4-d3e1059e1087 | hw-test-eb0e1255-4da5-46cb-b8e4-d3e1059e1087 | ERROR | spawning | NOSTATE | |
(in ironic) | ebd0e2c1-7630-4067-94c1-81771c1680b6 | eb0e1255-4da5-46cb-b8e4-d3e1059e1087 | power on | active | False |
(see bug 1341346)

 - 5 are in ERROR NOSTATE:
(nova)| c389bb7b-1760-4e69-a4ea-0aea07ccd4d8 | hw-test-c389bb7b-1760-4e69-a4ea-0aea07ccd4d8 | ERROR | - | NOSTATE | ctlplane=10.10.16.146 |
nova show reports that it has a hypervisor:
| OS-EXT-SRV-ATTR:hypervisor_hostname | 8bc4357a-6b32-47de-b3ee-cec5b41e72d2
but in Ironic that node has no instance UUID (nor a deployment dict):
| 8bc4357a-6b32-47de-b3ee-cec5b41e72d2 | None | power off | None | False |

This bug is about the Nova instance having a hypervisor attribute that is wrong :)
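
The mismatch described above can be detected mechanically. The following is a minimal sketch, not the actual Nova or Ironic code: it assumes the server and node listings have already been fetched and flattened into plain dicts, and the function name `find_stale_hypervisor_links` is hypothetical. The sample data is the single instance/node pair quoted in this report.

```python
def find_stale_hypervisor_links(nova_servers, ironic_nodes):
    """Return (server_id, hypervisor_hostname) pairs where Nova points at an
    Ironic node that does not list that server as its instance.

    Both arguments are plain lists of dicts (an assumption of this sketch),
    keyed the way the CLI output in this report names the fields.
    """
    nodes_by_uuid = {node["uuid"]: node for node in ironic_nodes}
    stale = []
    for server in nova_servers:
        host = server.get("OS-EXT-SRV-ATTR:hypervisor_hostname")
        if host is None:
            # No hypervisor recorded, so there is nothing to contradict.
            continue
        node = nodes_by_uuid.get(host)
        # Stale if the node is unknown or is not associated with this server.
        if node is None or node.get("instance_uuid") != server["id"]:
            stale.append((server["id"], host))
    return stale


# Sample data taken from the ERROR/NOSTATE case in this report.
nova_servers = [
    {
        "id": "c389bb7b-1760-4e69-a4ea-0aea07ccd4d8",
        "status": "ERROR",
        "OS-EXT-SRV-ATTR:hypervisor_hostname": "8bc4357a-6b32-47de-b3ee-cec5b41e72d2",
    },
]
ironic_nodes = [
    {
        "uuid": "8bc4357a-6b32-47de-b3ee-cec5b41e72d2",
        "instance_uuid": None,
        "power_state": "power off",
    },
]

for server_id, host in find_stale_hypervisor_links(nova_servers, ironic_nodes):
    print(f"server {server_id} claims hypervisor {host}, but Ironic disagrees")
```

Comparing in this direction (Nova's view checked against Ironic's) matches the framing of this bug, where the Nova attribute is the one considered wrong.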

I have logs for this copied inside the DC, but a) it's a production environment, so only tripleo-cd-admins can look (because I'm concerned about passwords being in the logs), and b) they are 2.6GB in size, so it's not all that feasible to attach them to the bug anyhow :).

Robert Collins (lifeless) wrote :

For tripleo-cd-admins: the logs are in the hp1 region undercloud, in /var/log/bug-1341346-and-1341347/

aeva black (tenbrae)
tags: added: nova-driver
Dmitry Tantsur (divius)
Changed in ironic:
status: New → Invalid
tags: added: ironic
Sean Dague (sdague)
Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Jim Rollenhagen (jim-rollenhagen) wrote :

I tend to think the instance should always be tagged with a "hypervisor" for a record of where it was built. In the past this could cause problems with the resource tracker, but those are long solved.

There's also the fact that the logs are likely gone by now, TripleO has changed its architecture, etc. This is likely to be hard to reproduce, even if we think it is a bug.

Going to close this as WONTFIX, feel free to reopen if you think I'm a terrible person :)

Changed in nova:
status: Confirmed → Won't Fix