ensure-availability can add more state servers while ones it just started haven't come up yet

Bug #1307736 reported by Roger Peppe
This bug affects 1 person
Affects: juju-core
Status: Fix Released
Importance: High
Assigned to: Wayne Witzel III
Milestone: 1.19.4

Bug Description

I was just playing with HA and saw the following behaviour.
Before running the commands, I had an environment with
3 environment managers (machines 0, 3 and 4), and I had destroyed
the instance for machine 0. I expected ensure-availability to remove
the EnvironManager status from machine 0, but definitely did not expect
it to add two more machines (6 and 7).

http://paste.ubuntu.com/7252280/

(Note that I'd tricked out juju status with the ability to show voting status)

Tags: ha

Andrew Wilkins (axwalk) wrote :

I don't think there's a bug here; note that machine agents 3 and 4 are down after the ensure-availability call, and that their wants-vote=false.

Unless there really is a bug, those machine agents were down when you called ensure-availability, and it did the right thing and took away their voting rights and created two replacement machines.

Having said that, this isn't ideal. Machines 3 and 4 were down temporarily; it would be good to (a) have ensure-availability say what it has done (and why), and (b) improve the availability heuristic. Perhaps we need an uptime for agents as well as current availability.

Curtis Hovey (sinzui)
tags: added: ha
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.20.0
John A Meinel (jameinel) wrote :

So I think we do have a bug here that needs addressing. If you do:

juju bootstrap
...
juju ensure-availability && juju ensure-availability

The first one will try to spin up 2 new machines to become state servers. Immediately running ensure-availability again will say "I want to have 3 available state servers" and will notice that 2 of them are down.

I think we need a way to say "I just kicked off 2 machines and I'm expecting them to come up soon, don't spin up 2 more until these have been given a chance".
I don't know what that timeout should be, but without one we just give users a big gun to shoot themselves in the foot if they run "juju ensure-availability" too many times in a row.

summary: - ensure-availability can add too many state servers
+ ensure-availability can add more state servers while ones it just
+ started haven't come up yet
Changed in juju-core:
milestone: 1.20.0 → next-stable
Changed in juju-core:
assignee: nobody → Wayne Witzel III (wwitzel3)
status: Triaged → In Progress
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: next-stable → 1.19.4
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released