Failed commissioning leaves machine completely cut off from MAAS
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Medium | Lee Trager |
Bug Description
By design, MAAS changes the maas user's password in the BMC to a randomly generated password every time a node is enlisted, commissioned or re-commissioned. A failed commissioning run can therefore leave the node completely cut off from MAAS, where the only options to recover the node are to either set a default user and password in the BMC and modify MAAS to use that combo instead of the default maas:$RANDOMPAS
What happens in this case is this:
MAAS powers the node on to commission
Commissioning ephemeral boots and begins doing its thing.
Ephemeral sets the new password for the maas user in the BMC
Ephemeral does other stuff
Something breaks
Ephemeral fails to update MAAS with new maas user password
MAAS marks the node as Failed Commissioning and shows Power Error because it no longer has the current BMC password.
At this point MAAS is no longer able to talk to the node at all. The only way to recover is, as stated above, manually set a password in the BMC and modify MAAS manually, or delete the node and start over from scratch.
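The sequence above can be reproduced with a toy model. This is only a sketch of the ordering problem, not MAAS code: ToyBMC, ToyMAAS, power_pass, commission and broken_step are hypothetical names made up for illustration, and a plain string comparison stands in for a real BMC login.

```python
import secrets


class ToyBMC:
    """Hypothetical stand-in for a BMC holding one password for the maas user."""

    def __init__(self, password):
        self.password = password

    def power_query(self, password):
        # A real BMC would reject the session; here we only compare strings.
        return password == self.password


class ToyMAAS:
    """Hypothetical stand-in for the power parameters MAAS has stored."""

    def __init__(self, password):
        self.power_pass = password


def commission(bmc, maas, other_steps):
    """Ordering described in this report: set password, other work, report last."""
    new_password = secrets.token_hex(8)
    bmc.password = new_password       # the BMC is changed first
    other_steps()                     # an exception here aborts the run
    maas.power_pass = new_password    # never reached when other_steps fails


def broken_step():
    raise RuntimeError("something breaks")


bmc = ToyBMC("initial")
maas = ToyMAAS("initial")
try:
    commission(bmc, maas, broken_step)
    status = "Ready"
except RuntimeError:
    status = "Failed Commissioning"

# MAAS still holds the old password, so it can no longer query power state.
can_control = bmc.power_query(maas.power_pass)
print(status, can_control)  # prints: Failed Commissioning False
```

The point of the model is that the BMC and the MAAS record are updated in two separate steps with fallible work in between, so any failure in that window leaves them out of sync.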
To fix this, either:
1: Change the behaviour to ONLY create passwords during enlistment and subsequently only on user demand, rather than re-create passwords every time a node is commissioned.
2: At least have the Ephemeral IMMEDIATELY update MAAS with the new maas user password BEFORE it attempts anything else.
Idea 1 fixes the problem; Idea 2 is more of a band-aid and could still leave the system in an uncontrollable state.
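Idea 2 amounts to a reordering, which a toy model can illustrate. This is a sketch under assumptions, not the change that was merged: ToyBMC, ToyMAAS, power_pass and commission_report_first are hypothetical names, and the "report" is a plain assignment rather than a call to the MAAS metadata service.

```python
import secrets


class ToyBMC:
    """Hypothetical stand-in for a BMC holding one password for the maas user."""

    def __init__(self, password):
        self.password = password

    def power_query(self, password):
        # A real BMC would reject the session; here we only compare strings.
        return password == self.password


class ToyMAAS:
    """Hypothetical stand-in for the power parameters MAAS has stored."""

    def __init__(self, password):
        self.power_pass = password


def commission_report_first(bmc, maas, other_steps):
    """Idea 2: report the new password before attempting anything else."""
    new_password = secrets.token_hex(8)
    bmc.password = new_password       # change the BMC
    maas.power_pass = new_password    # report immediately, before other work
    other_steps()                     # a failure here no longer strands the node


def failing_step():
    raise RuntimeError("something breaks")


toy_bmc = ToyBMC("initial")
toy_maas = ToyMAAS("initial")
try:
    commission_report_first(toy_bmc, toy_maas, failing_step)
    outcome = "Ready"
except RuntimeError:
    outcome = "Failed Commissioning"

# Commissioning still fails, but MAAS holds the current BMC password.
still_controllable = toy_bmc.power_query(toy_maas.power_pass)
print(outcome, still_controllable)  # prints: Failed Commissioning True
```

As noted above, this only narrows the window: if the report step itself fails after the BMC has been changed, the node is still cut off, which is why Idea 1 is the more complete fix.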
I discovered this issue while trying to root cause a bug with apt proxies during commissioning:
https:/
Related branches
- Alberto Donato (community): Approve
- MAAS Lander: Approve
- Diff: 24 lines (+5/-1), 1 file modified: src/metadataserver/user_data/templates/commissioning.template (+5/-1)
- MAAS Lander: Approve
- Blake Rouse (community): Approve
- Diff: 24 lines (+5/-1), 1 file modified: src/metadataserver/user_data/templates/commissioning.template (+5/-1)
description: updated
Changed in maas:
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Lee Trager (ltrager)
Changed in maas:
milestone: none → next
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Changed in maas:
milestone: next → none