[2.4] Machines fail to deploy with {"current_testing_script_set": ["script set instance with id 5 does not exist."]}

Bug #1751946 reported by Andres Rodriguez on 2018-02-27
This bug affects 6 people
Affects    Importance    Assigned to
MAAS       Critical      Lee Trager
2.3        Critical      Lee Trager

Bug Description

Machines fail to deploy due to the following error:

The deploy action for 1 node failed with error: {"current_testing_script_set": ["script set instance with id 5 does not exist."]}

How I reproduced:

1. I added a KVM pod that had 3 VMs in it.
2. The machines transitioned to 'Commissioning' but never PXE booted, for unrelated reasons (I enabled DHCP, but it never actually got enabled, so I had to restart maas-rackd).
3. I aborted the 'Commissioning' and they went back to 'New'.
4. I commissioned 2 out of 3 VMs *without* running hardware tests. The machines transitioned to 'Ready'.
5. I attempted to deploy the 'Ready' machines, and the deployment failed.
6. I commissioned the remaining machine *with* hardware tests and attempted to deploy it, and it worked just fine.

So a few things:

1. If hardware tests were not run, why would it even prevent deployment of the machine?


Andres Rodriguez (andreserl) wrote :

I'm thinking this bug is also present in 2.3.

Changed in maas:
importance: Undecided → Critical
milestone: none → 2.4.0alpha2
status: New → Triaged
assignee: nobody → Lee Trager (ltrager)
description: updated
Andres Rodriguez (andreserl) wrote :

maasdb=# SELECT * FROM metadataserver_scriptset WHERE id = 3;
 id | last_ping | result_type | node_id | power_state_before_transition | requested_scripts
----+-----------+-------------+---------+-------------------------------+-------------------
  3 |           |           2 |       2 | off                           | {commissioning}
(1 row)
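
A single-row lookup like the one above can be generalized into a check for every node whose testing script set reference points at a row that no longer exists. The following is a minimal sketch against a toy in-memory SQLite schema, not the real MAAS database. Only the `metadataserver_scriptset` table name comes from the query output above; the `maasserver_node` table, its `current_testing_script_set_id` column, and the sample rows are assumptions inferred from the error message.

```python
import sqlite3


def find_dangling_script_sets(conn):
    """Return (node_id, script_set_id) pairs whose script set row is missing."""
    query = """
        SELECT n.id, n.current_testing_script_set_id
        FROM maasserver_node AS n
        LEFT JOIN metadataserver_scriptset AS s
            ON s.id = n.current_testing_script_set_id
        WHERE n.current_testing_script_set_id IS NOT NULL
          AND s.id IS NULL
        ORDER BY n.id
    """
    return conn.execute(query).fetchall()


def build_toy_db():
    """Build a toy database; table and column names are assumed, not verified."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE metadataserver_scriptset (
            id INTEGER PRIMARY KEY,
            node_id INTEGER
        );
        CREATE TABLE maasserver_node (
            id INTEGER PRIMARY KEY,
            current_testing_script_set_id INTEGER
        );
        -- Script sets 3 and 6 exist; script set 5 does not.
        INSERT INTO metadataserver_scriptset VALUES (3, 2);
        INSERT INTO metadataserver_scriptset VALUES (6, 3);
        -- Node 1 has no testing script set, node 2 holds a stale id,
        -- node 3 holds a valid id.
        INSERT INTO maasserver_node VALUES (1, NULL);
        INSERT INTO maasserver_node VALUES (2, 5);
        INSERT INTO maasserver_node VALUES (3, 6);
    """)
    return conn


if __name__ == "__main__":
    for node_id, script_set_id in find_dangling_script_sets(build_toy_db()):
        print(f"node {node_id} references missing script set {script_set_id}")
```

The `LEFT JOIN` with an `IS NULL` filter keeps only nodes whose reference has no matching script set row, so nodes with no testing script set at all are not reported.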

tags: added: hwtv2
Andres Rodriguez (andreserl) wrote :

OK, I'm able to reproduce this with commissioning as well.

What I noticed is that if I try to commission without tests selected, it fails. With tests selected, it succeeds.
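
The error text has the shape of a foreign-key validation failure: a node still holds a testing script set id, but no script set with that id exists. A minimal pure-Python sketch of that failure mode follows. `Node`, `ValidationError`, and `validate_node` are hypothetical names for illustration and are not MAAS code, and whether MAAS reaches this state by the path described in the comments is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


class ValidationError(Exception):
    """Carries a field -> messages mapping, like the JSON in the report."""

    def __init__(self, errors: Dict[str, List[str]]):
        super().__init__(errors)
        self.errors = errors


@dataclass
class Node:
    # Hypothetical stand-in for a machine record.
    hostname: str
    current_testing_script_set: Optional[int] = None


def validate_node(node: Node, existing_script_sets: Set[int]) -> None:
    """Reject a node whose script set id points at a missing script set."""
    ref = node.current_testing_script_set
    if ref is not None and ref not in existing_script_sets:
        raise ValidationError({
            "current_testing_script_set": [
                f"script set instance with id {ref} does not exist."
            ]
        })


# Assumed scenario: the testing script set was removed (for example
# because no tests were selected) but the node kept its old id.
script_sets = {3}
stale = Node("vm-1", current_testing_script_set=5)
clean = Node("vm-2", current_testing_script_set=None)

if __name__ == "__main__":
    validate_node(clean, script_sets)  # passes: no reference to check
    try:
        validate_node(stale, script_sets)
    except ValidationError as exc:
        print(exc.errors)
```

In this sketch, either clearing the stale id or pointing it at an existing script set makes the check pass, which would be consistent with commissioning succeeding when tests are selected.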

Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Edward Hope-Morley (hopem) wrote :

I think I just hit this with MAAS 2.3.5. I did indeed commission without any tests and got:

failed to start machine 13 (unexpected: ServerError: 400 BAD REQUEST ({"current_testing_script_set": ["script set instance with id 7 does not exist."]})), retrying in 10s (9 more attempts)

when I tried to deploy. I'll try commissioning again with tests to see if that resolves it, as per @andreserl's comment #4.

Mario Splivalo (mariosplivalo) wrote :

I am hitting this also, with MAAS 2.3.5.

Just one of the machines fails to deploy with this error:

Node failed to be deployed, because of the following error: {"current_testing_script_set": ["script set instance with id 2029 does not exist."]}

Also, commissioning yields a similar error:

Node failed to be commissioned, because of the following error: {"current_testing_script_set": ["script set instance with id 2029 does not exist."]}

I was able to 'fix' this with the suggestion from Andres: re-commissioning worked when I selected at least one test. After that, deploying that machine was fine.

I see that this particular machine was the only one that had the 'offending' script set id of 2029.

I *think* this has to do with me testing commissioning scripts in the past, and removing them recently. However, I don't (yet) understand why just this particular machine was affected.

I'm running MAAS 2.3.5-6511-gf466fdb-0ubuntu1.
