[RFE] Add leader request command

Bug #2025724 reported by Pedro Guimarães
This bug affects 2 people
Affects: Canonical Juju
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

There is no command with which a charm can request a change of leader for its own application.

That means Juju leadership cannot be extended beyond charm leadership (e.g. to also track the leader of a database or the holder of a VIP).

It would be interesting to add a "leader-request <unit-number>" command that allows charms to request leadership for a unit.

This would mean implementing a method similar to: https://github.com/juju/juju/blob/68360b1badab4234061f1195e5f10ce469e228c4/worker/uniter/runner/context/leader.go#L103-L113

but without the initial check:

 if ctx.isMinion {
  return errIsMinion
 }

In that case, I'd recommend exposing the value of LeadershipGuarantee as a model-config option:
https://github.com/juju/juju/blob/68360b1badab4234061f1195e5f10ce469e228c4/worker/deployer/unit_agent.go#L193
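To make the difference concrete, here is a minimal sketch of a claim method with and without the minion short-circuit. Only errIsMinion and isMinion come from the snippet above; the claimer interface, the leadershipContext fields, and the names claimIfNotMinion, requestLeader and grantAll are hypothetical and are not Juju's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errIsMinion mirrors the error name quoted above. Everything else in this
// sketch is hypothetical and does not reproduce Juju's implementation.
var errIsMinion = errors.New("not the leader")

// claimer is a hypothetical stand-in for whatever component grants leadership.
type claimer interface {
	// Claim reports whether the unit obtained leadership.
	Claim(unit string) bool
}

type leadershipContext struct {
	unit     string
	isMinion bool
	claimer  claimer
}

// claimIfNotMinion keeps the initial check: a unit that already knows it is
// a minion returns early and never attempts a claim.
func (ctx *leadershipContext) claimIfNotMinion() error {
	if ctx.isMinion {
		return errIsMinion
	}
	return ctx.claim()
}

// requestLeader is the proposed variant: same flow, but without the
// isMinion short-circuit, so a minion may still attempt a claim.
func (ctx *leadershipContext) requestLeader() error {
	return ctx.claim()
}

// claim attempts the claim and records the outcome on the context.
func (ctx *leadershipContext) claim() error {
	if !ctx.claimer.Claim(ctx.unit) {
		ctx.isMinion = true
		return errIsMinion
	}
	ctx.isMinion = false
	return nil
}

// grantAll is a toy claimer that behaves as if the lease were free.
type grantAll struct{}

func (grantAll) Claim(string) bool { return true }

func main() {
	ctx := &leadershipContext{unit: "app/1", isMinion: true, claimer: grantAll{}}
	fmt.Println(ctx.claimIfNotMinion()) // prints "not the leader"
	fmt.Println(ctx.requestLeader())    // prints "<nil>"
}
```

Whether a claim made this way can ever succeed while another unit holds a valid lease depends on the lease mechanism, which this sketch deliberately leaves out.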

Joseph Phillips (manadart) wrote :

Can you add some context here for the use-case that you're proposing this will solve? There might be a better way...

Changed in juju:
status: New → Triaged
importance: Undecided → Wishlist
Pedro Guimarães (pguimaraes) wrote :

Hi @manadart, yes. We've been discussing how to use Juju's leadership to actually manage the workload leadership.

I think one example is managing the VIP.
Originally, we'd use pacemaker + corosync. That means the charm leader will often be a different unit than the VIP owner.
Another example would be the DB primaries.

The goal is to give the charms more control, so they can request charm leadership when they detect that something has changed or needs to change on the workload side.
So, in this case, if the VIP needs to move (e.g. restarting the workload that owns the VIP will cause that unit to lose it), then we'd also move the charm leader to the same unit.

That way, at a quick glance, juju status would tell you who the leader is.

John A Meinel (jameinel) wrote :

So we've actually had the stated policy that "you should not use Charm leadership to manage application leadership", for a few reasons:

1) The actual policy from Charm leadership is that "there will not be >1 leader at any given time". Which is different from "there is always guaranteed to be 1 leader".

2) The actual mechanism we use is "you have a lease for X time, and you should renew that lease in X/2 time". (where we have chosen 60s as the length of lease, and 30s as the renewal).
Most applications would not be happy to be down for 60s while juju notices that someone else needs to take over, while management of the app is usually perfectly fine at that interval.

3) There is a big difference between "my application is not responding" and "my charm is not responding". The Juju Unit agent being alive and responsive doesn't mean that your running application hasn't stopped accepting requests.
You *really* want a health check on the actual application to be the thing that maintains the primary of the application, and charm leadership is a *very* weak proxy for that.

If you are ok with that delay, I would be ok with 'you can request leadership, but it won't take effect until the current lease expires'. Implementation is also a little bit tricky, as we don't have any way to disallow other units from getting the lease (all units sit in a 'block until the lease expires' wait, which in a healthy state never returns, and then make a claim which might succeed).

There are lots of potential problems. For example, you request that unit/1 become the next leader, but then unit/1 dies, and now you've blocked out /0 and /2 from becoming leader in its absence. We could layer *another* lease on top of it: "I want leadership for the next 2 min".
And if you have buggy code and each unit decides that it wants to request the next leader, who wins?

tags: added: canonical-data-platform-eng