Comment 5 for bug 1810331

John A Meinel (jameinel) wrote : Re: [Bug 1810331] Re: Mid-hook lost leadership issues

I think Christian and I worked this out today. Specifically,

a) Raft keeps an FSM which tracks who the current leader is.
b) When the leader changes, Raft writes the identity of the current lease
holder to the leaseholders collection.
c) When making a change to leader-settings content, we ask Raft to Check
that we are currently the leader
d) We then create a transaction that asserts the holder (matching b).

However, it turns out that (b) can fail (due to mongo contention, timeout,
etc) and thus we never actually complete (b). Raft can't roll back an FSM
change, so we end up inconsistent.

The reason you get "state changing too quickly" is because the check in (c)
is against memory, while the assert in (d) is against the database.
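
To make the mismatch concrete, here is a minimal sketch of the failure mode.
It is not Juju's code: the leases type, its fsm and db maps, and the claim,
check and assertHolder methods are all hypothetical stand-ins for steps (a)
through (d), and errTimeout stands in for the Mongo i/o timeout.

```go
package main

import "errors"

// errTimeout stands in for the Mongo "i/o timeout" error (hypothetical).
var errTimeout = errors.New("i/o timeout")

// leases models the two copies of lease state described above.
// These are illustrative names, not Juju's real types.
type leases struct {
	fsm map[string]string // in-memory Raft FSM: lease -> holder
	db  map[string]string // leaseholders collection: lease -> holder
}

// claim applies the change to the FSM first (step a), then mirrors it to the
// database (step b). dbErr simulates a failed database write; when it is
// non-nil the FSM change is NOT rolled back.
func (l *leases) claim(lease, holder string, dbErr error) error {
	l.fsm[lease] = holder
	if dbErr != nil {
		return dbErr // FSM and database now disagree
	}
	l.db[lease] = holder
	return nil
}

// check is step (c): the leadership check, answered from memory.
func (l *leases) check(lease, holder string) bool {
	return l.fsm[lease] == holder
}

// assertHolder is step (d): the transaction assert, answered from the database.
func (l *leases) assertHolder(lease, holder string) bool {
	return l.db[lease] == holder
}
```

If unit/0 claims the lease cleanly and unit/1 then claims it with a failed
database write, check reports unit/1 as leader while assertHolder for unit/1
fails, so every transaction attempt from the real leader aborts.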

However, the Raft FSM is intended to be the one-true-source-of-all-truth
about leaders. It just happens that it couldn't update the database copy.
So during (c) we can check whether the database is consistent with memory,
and if not, go update the database.

We're reasonably confident about the source of the errors because looking
in controller logs we can see:
./machine-2.2.log:775239:2019-01-23 05:48:14 ERROR
juju.worker.raft.raftforwarder target.go:168 couldn't claim lease
"e39da954-406c-4e8d-8da8-4cfd8e979895:application-leadership#landscape-client#"
for "landscape-client/0": read tcp 127.0.0.1:34748->127.0.0.1:37017: i/o
timeout

And that is exactly the message you get when Raft fails to update Mongo.

As a performance optimization, step (c) can just do the in-memory check on
attempt=1, and only go reread and update the database if the first attempt
gets aborted. (It's what we do in about 90% of the cases anyway: start with
in-memory state; if the txn fails, reread from the DB and try again.)

I believe that Christian is going to be working on this during his morning
tomorrow, if we want to have Joe work on something else. Or he can just
finish the work before Christian starts, and then Christian can work on
some of the other things. (Like not having ClaimLeadership(onetoken)
build a map of *all* leaders just to answer that question.)

On Fri, Jan 25, 2019 at 4:51 PM Richard Harding <email address hidden>
wrote:

> ** Changed in: juju
> Assignee: (unassigned) => Joseph Phillips (manadart)
>
> ** Changed in: juju
> Status: Triaged => In Progress