worker/provisioner: handle stopped/suspended instances

Bug #1042717 reported by Dave Cheney
This bug affects 1 person
Affects    Status   Importance  Assigned to  Milestone
juju-core  Invalid  High        Unassigned

Bug Description

Definitions (apply as makes sense for the provider):

* shutdown - shutdown/suspended/stunned/temporarily unreachable/rebooting/paused/offline
* dead - terminated/deleted

The provisioner should be able to handle instances (physical/virtual) that are shut down outside Juju's control. These machines are not terminated and will come back at some point in the future. More importantly, they can return at any point without warning, so the environment has to be ready to accept them again without damage.

_Why_ someone would do such a thing is outside the scope of this ticket, but it is a use case that Juju should support.

Some things that fall out of this:

1. An instance returned from environs.AllInstances() may not be running; consumers of this data will need to cater for this.
2. environs.Instance needs to provide a way to indicate its status, 'starting/running' or 'paused/offline'; alternatively, this could be a filter passed to environs.AllInstances().
3. There is probably some work that charm writers need to do to cater for machines which have relations but are broken at the moment; this is outside what I can scope.
4. At some point in the future 'smart' charms may invoke add-unit to replace the capacity of their fallen comrade, but this is an addition, not a replacement. This is also outside the scope of this work.
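
A minimal Go sketch of points 1 and 2 above, to make the idea concrete. Everything here is an assumption for illustration: the Status type, the Status() method, the fakeInstance type and the FilterInstances helper are hypothetical names, and the Instance interface is a simplified stand-in, not the real environs.Instance.

```go
package main

import "fmt"

// Status is a hypothetical coarse instance state, following the
// definitions at the top of this report.
type Status int

const (
	Running  Status = iota // starting/running
	Shutdown               // paused/offline; may return without warning
	Dead                   // terminated/deleted
)

func (s Status) String() string {
	switch s {
	case Running:
		return "running"
	case Shutdown:
		return "shutdown"
	case Dead:
		return "dead"
	}
	return "unknown"
}

// Instance is a simplified, hypothetical stand-in for environs.Instance
// with a Status method added (point 2).
type Instance interface {
	Id() string
	Status() Status
}

// fakeInstance is a test double used only in this sketch.
type fakeInstance struct {
	id     string
	status Status
}

func (i fakeInstance) Id() string     { return i.id }
func (i fakeInstance) Status() Status { return i.status }

// FilterInstances returns the instances whose status matches one of
// wanted. With no statuses given it returns all instances, so callers
// must be prepared for instances that are not running (point 1).
func FilterInstances(all []Instance, wanted ...Status) []Instance {
	if len(wanted) == 0 {
		return all
	}
	var out []Instance
	for _, inst := range all {
		for _, w := range wanted {
			if inst.Status() == w {
				out = append(out, inst)
				break
			}
		}
	}
	return out
}

func main() {
	all := []Instance{
		fakeInstance{"i-1", Running},
		fakeInstance{"i-2", Shutdown},
		fakeInstance{"i-3", Running},
	}
	for _, inst := range FilterInstances(all, Running) {
		fmt.Println(inst.Id(), inst.Status())
	}
}
```

Whether the status lives on the instance or is passed as a filter is the open design question in point 2; the sketch shows both so they can be compared.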

Changed in juju-core:
milestone: none → 1.3
Changed in juju-core:
milestone: 1.3 → 1.5
assignee: nobody → Dave Cheney (dave-cheney)
importance: Undecided → High
Changed in juju-core:
milestone: 1.9.2 → 2.0
Changed in juju-core:
assignee: Dave Cheney (dave-cheney) → nobody
William Reade (fwereade) wrote :

Why did we not match the python behaviour in the first place? I'm not sure the above proposal is any improvement...

Dave Cheney (dave-cheney) wrote :

This is not my issue, it was reported by Gustavo, I just transposed it onto LP.

William Reade (fwereade) wrote :

Closing as invalid: what actually happens currently is that the environment only reports active instances, but the provisioner does not take "missing" instances into account and just assumes they're probably ok. The user can observe a machine agent to be "down" via juju status and take action from there, but we shouldn't be trying to be too smart here -- the Python feature went unremarked for months and months, and was only ever detected as a bug -- the user terminated an instance out-of-band, and was surprised that juju spun up a new machine to replace it (despite the machine not doing anything).
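
The behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the provisioner's actual code: the Machine type and the Plan function are hypothetical names, and the active set stands in for whatever the environment reports.

```go
package main

import "fmt"

// Machine is a hypothetical, simplified view of a machine in state.
// InstanceID is empty until an instance has been provisioned for it.
type Machine struct {
	ID         string
	InstanceID string
}

// Plan splits machines into those that need a new instance and those
// whose instance is not reported as active by the environment.
// Missing instances are only reported, never replaced: they are
// assumed to be probably ok and may come back.
func Plan(machines []Machine, active map[string]bool) (start, missing []string) {
	for _, m := range machines {
		switch {
		case m.InstanceID == "":
			start = append(start, m.ID)
		case !active[m.InstanceID]:
			missing = append(missing, m.ID)
		}
	}
	return start, missing
}

func main() {
	machines := []Machine{
		{ID: "0", InstanceID: "i-0"},
		{ID: "1", InstanceID: "i-1"}, // stopped or terminated out-of-band
		{ID: "2"},                    // never provisioned
	}
	active := map[string]bool{"i-0": true}

	start, missing := Plan(machines, active)
	fmt.Println("start:", start)
	fmt.Println("missing, left alone:", missing)
}
```

In this sketch machine 1 is left alone, and the user is expected to notice the "down" agent via juju status and decide what to do.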

Changed in juju-core:
status: New → Invalid
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0 → none