[2.2] If a single commissioning script times out, all scripts will enter "Timed out" state.

Bug #1679431 reported by Mike Pontillo
Affects: MAAS
Status: Fix Released
Importance: High
Assigned to: Lee Trager

Bug Description

In testing the commissioning script timeout branch[1], I noticed that if one of the scripts times out, *all* of them enter the "Timed out" state.

I think that's a little misleading. They should probably enter the "Aborted" state or some other state to indicate that they didn't run. Any script that *did* complete should not have its status modified. (I'm not sure whether this is already the case, since I tested a timeout with the very first script.)

[1]: https://code.launchpad.net/~ltrager/maas/commissioning_script_reaper/+merge/321423

Lee Trager (ltrager) wrote :

Script results are put into a timed out status in two ways: when the node status expires, and when any script runs past its own timeout. Should scripts which are canceled by MAAS always be put into an aborted status?

Andres Rodriguez (andreserl) wrote :

Why should scripts be cancelled if one of them times out? For example, why can't we move on and finish the other commissioning scripts?

For example, right now:

1. commissioning starts with a max running time of 20 mins
2. a commissioning script takes 10 seconds to run, but in one test it actually took 20.
3. all other commissioning scripts continue to run.
4. commissioning finishes successfully because it was completed under 20 mins.

What I understand is happening now:

1. commissioning starts
2. a script has a 10-second timeout, which causes it to time out at 10 seconds
3. all other scripts time out
4. commissioning finishes as timed out (i.e. failed commissioning).
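
The difference between the two sequences above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not MAAS code: the `Script` class, the `commissioning_outcome` function, and the `enforce_script_timeouts` flag are all hypothetical names, and script runtimes are simply supplied as numbers rather than measured.

```python
from dataclasses import dataclass


@dataclass
class Script:
    name: str
    runtime: int  # seconds the script actually takes (hypothetical input)
    timeout: int  # per-script timeout in seconds (hypothetical input)


def commissioning_outcome(scripts, overall_limit, enforce_script_timeouts):
    """Return "passed" or a failure reason for a sequential run."""
    elapsed = 0
    for script in scripts:
        # Second sequence: a per-script timeout fails the whole run.
        if enforce_script_timeouts and script.runtime > script.timeout:
            return "failed (script timed out)"
        elapsed += script.runtime
        # First sequence: only the overall commissioning limit matters.
        if elapsed > overall_limit:
            return "failed (commissioning timed out)"
    return "passed"


scripts = [Script("slow", runtime=20, timeout=10),
           Script("quick", runtime=5, timeout=10)]
print(commissioning_outcome(scripts, 20 * 60, enforce_script_timeouts=False))
print(commissioning_outcome(scripts, 20 * 60, enforce_script_timeouts=True))
```

With only the 20-minute overall limit the run passes; once the 10-second per-script timeout is enforced, the same run fails on the first script.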

Blake Rouse (blake-rouse) wrote :

If a script times out then commissioning cannot be successful. The script clearly had an issue with either its design or the hardware. I think that commissioning should go to failed commissioning as soon as a script times out, and all remaining tests should be marked Aborted, not Timed out. They never ran and MAAS decided not to run them, so they are aborted tests.

Lee,

I was under the impression that testing works this way. Why should commissioning be different?
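
The status handling proposed in this comment could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the actual MAAS implementation: the `Status` enum, its member names, and `resolve_after_timeout` are hypothetical, and script results are modelled as a plain dictionary.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical status names, chosen to mirror the wording in this bug.
    PENDING = "Pending"
    PASSED = "Passed"
    FAILED = "Failed"
    TIMED_OUT = "Timed out"
    ABORTED = "Aborted"


def resolve_after_timeout(statuses, timed_out_script):
    """Return new statuses after one script has timed out.

    Only the offending script becomes TIMED_OUT, scripts that never ran
    become ABORTED, and scripts that already finished keep their status.
    """
    resolved = {}
    for name, status in statuses.items():
        if name == timed_out_script:
            resolved[name] = Status.TIMED_OUT
        elif status is Status.PENDING:
            resolved[name] = Status.ABORTED
        else:
            resolved[name] = status
    return resolved


before = {"first": Status.PASSED, "second": Status.PENDING, "third": Status.PENDING}
after = resolve_after_timeout(before, "second")
for name, status in after.items():
    print(name, status.value)
```

Here the script that completed keeps "Passed", the one that exceeded its timeout is "Timed out", and the one that never started is "Aborted", which matches the distinction asked for in the bug description.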

Changed in maas:
milestone: none → 2.2.0rc2
importance: Undecided → High
status: New → Triaged
Changed in maas:
importance: High → Critical
Lee Trager (ltrager)
Changed in maas:
assignee: nobody → Lee Trager (ltrager)
status: Triaged → In Progress
importance: Critical → High
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released