juju-core

Bug #1339866
Comment #1

Michael Foord (mfoord) wrote on 2014-07-10:

After further experimentation, and verification of the *actual* specified behaviour, I can confirm that juju does behave correctly when mongo on the primary HA state server (or on a secondary) dies.

What led us to believe otherwise was that the machine's agent.conf was not rewritten and the now-dead machine was still listed as an API server. However, this is actually the expected behaviour. When mongo goes down, jujud stays up; if that machine was the master, jujud shuts down all the relevant jobs and workers (verified from the machine log), and the mongo primary fails over to a new machine, which becomes the juju master. The old machine is left in the mongo replica set and is still listed as a valid API server, as it *may* come back. Running "juju ensure-availability" again will remove its entry (and, I believe, also shut down the instance it runs on).

Clients and machine agents hold a list of all API servers; if contacting one fails (e.g. our down machine), they automatically try the other entries in the list. So this behaviour is "as specified" and not a problem.
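
To illustrate the fallback described above, here is a minimal sketch in Python. It is not juju's actual code: the names connect_to_any and fake_dial, the addresses, and the port are all made up for the example, and the "dial" step is faked so the snippet runs without a network. It only shows the general idea of walking a list of API servers until one answers.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def connect_to_any(addresses: Iterable[str], dial: Callable[[str], T]) -> Tuple[str, T]:
    """Try each API server address in order and return the first that answers.

    Hypothetical helper for illustration; juju's real client logic may differ.
    """
    errors = []
    for addr in addresses:
        try:
            return addr, dial(addr)
        except OSError as exc:
            # This server is unreachable; record why and try the next entry.
            errors.append(f"{addr}: {exc}")
    raise ConnectionError("no API server reachable: " + "; ".join(errors))


def fake_dial(addr: str) -> str:
    # Stand-in for a real connection attempt. It pretends the first machine
    # (the one whose mongo died) refuses connections.
    if addr == "10.0.0.1:17070":
        raise ConnectionRefusedError("connection refused")
    return f"connected to {addr}"


# Illustrative addresses only; the dead machine is still in the list.
api_servers = ["10.0.0.1:17070", "10.0.0.2:17070", "10.0.0.3:17070"]
addr, conn = connect_to_any(api_servers, fake_dial)
print(addr, "->", conn)
```

With a list like this, a stale entry for the dead machine costs one failed attempt and nothing more, which is why leaving it listed is harmless.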