failed pxe boot causes system to local boot old deployment - node gets marked deployed

Bug #1497991 reported by Blake Rouse
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
MAAS | Won't Fix | High | Unassigned |

Bug Description

During a deployment with MAAS, Juju, and Landscape, one of the nodes failed to PXE boot and the BIOS then chose to boot from the local disk. That disk held a previous deployment with the cloud-init datasource for MAAS already set up. The node then contacted the MAAS server with those credentials, which were valid for the previous deployment, and the node was marked deployed.

This is a big problem: the node did not deploy at all, and all the old data is still on that node. In this case the node should be marked "failed deployment", or even better, MAAS would restart the node and try again, then mark it "failed deployment" after a few tries.
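The retry behaviour requested above could be sketched roughly as follows. This is only an illustration, not MAAS code: the `Node` class, the `handle_failed_boot` function, the action strings, and the limit of 3 attempts are all hypothetical (the report only says "a few tries").

```python
from dataclasses import dataclass

# Hypothetical limit; the report only asks for "a few tries".
MAX_DEPLOY_ATTEMPTS = 3


@dataclass
class Node:
    """Hypothetical stand-in for a MAAS node record."""
    status: str = "DEPLOYING"
    attempts: int = 1  # the first deployment attempt is already under way


def handle_failed_boot(node: Node) -> str:
    """Decide what to do after an attempt that did not actually install.

    Returns the power action the caller should perform.
    """
    if node.attempts < MAX_DEPLOY_ATTEMPTS:
        # Try again: restart the node so it gets another chance to PXE boot.
        node.attempts += 1
        return "power_cycle"
    # Out of retries: give up and flag the node.
    node.status = "FAILED_DEPLOYMENT"
    return "power_off"
```

With this limit, a node that never PXE boots is restarted twice and is marked "failed deployment" on the third failure.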

If a node contacts the cloud-init metadata service *not the one used by curtin* before netboot_off=True, then the node should be marked "failed deployment". Such a request signals that curtin did not finish its installation process and that the node should never have reached the point of using cloud-init. The node should then be powered off so that no user or Juju will try to SSH into it, because it will still have the SSH keys from the previous deployment.
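A minimal sketch of the check proposed above, assuming a simplified node model. None of this is MAAS's actual implementation: `Status`, `Node`, `on_metadata_request`, and the `from_curtin` flag are hypothetical names, and `netboot` is used here as the inverse of the `netboot_off=True` state mentioned in the report.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DEPLOYING = "deploying"
    DEPLOYED = "deployed"
    FAILED_DEPLOYMENT = "failed deployment"


@dataclass
class Node:
    """Hypothetical stand-in for a MAAS node record."""
    status: Status = Status.DEPLOYING
    netboot: bool = True      # stays True until curtin finishes (netboot_off)
    powered_on: bool = True


def on_metadata_request(node: Node, from_curtin: bool) -> bool:
    """Handle a metadata request; return True if it should be served."""
    if from_curtin:
        # The installer's own metadata endpoint is always allowed.
        return True
    if node.status is Status.DEPLOYING and node.netboot:
        # cloud-init is calling before curtin finished: the node booted a
        # stale local disk. Flag it and power it off so that nobody logs
        # in with the previous deployment's SSH keys.
        node.status = Status.FAILED_DEPLOYMENT
        node.powered_on = False
        return False
    if node.status is Status.DEPLOYING:
        # Installation finished (netboot is off), so this is the real
        # first boot of the new deployment.
        node.status = Status.DEPLOYED
    return True
```

The key design point is that the decision depends on server-side state (`netboot`) rather than on the credentials the node presents, since the stale disk's credentials are exactly what let it be marked deployed in the first place.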

How to reproduce:
Deploy a node with MAAS. Wait for the deployment to finish successfully. Release the node and wait for it to go back to "Ready". Change the boot order on the node to boot from local disk first instead of PXE. *This needs to be done on a power type that doesn't change the boot order on power up, like virsh.* Deploy the node again. It will transition to "Deployed" very quickly, and it will be running the previous deployment, not the new one.

description: updated
Christian Reis (kiko)
Changed in maas:
importance: Critical → High
milestone: 1.9.0 → 1.9.1
Changed in maas:
milestone: 1.9.1 → 1.9.2
Changed in maas:
milestone: 1.9.2 → 1.9.3
Changed in maas:
milestone: 1.9.3 → 1.9.4
Changed in maas:
milestone: 1.9.4 → 1.9.5
Andres Rodriguez (andreserl) wrote :

We believe this is no longer an issue in the latest releases of MAAS. Please upgrade to the latest version of MAAS, and if you believe this issue is still present, please re-open this bug report or file a new one.

Changed in maas:
status: Triaged → Won't Fix