Comment 13 for bug 1728111

John A Meinel (jameinel) wrote :

Side note: we potentially have a serious issue with how responding to relation data interacts with leadership coordination. Our guarantee that there is never more than 1 leader at any given time doesn't work well with arbitrary hooks that fire in response to relation data changes.
Here is an example timeline:

 0s mysql/0 => becomes the leader (goes unresponsive for a bit)
 20s rabbit/0 => joins the relation with mysql and sets data in the relation bucket that only the leader can handle
 35s mysql/1 sees rabbit's data but is not the leader
 35s mysql/2 sees rabbit's data but is not the leader
 60s mysql/0 demoted, mysql/1 is now the leader
 65s mysql/0 sees the relation data from rabbit but is no longer the leader
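
To make the gap concrete, here is a small replay of that timeline. It is only a sketch, not Juju code: the `replay` function and the event tuples are hypothetical, and it assumes the unit that was unresponsive (mysql/0) is the one that finally runs its hook at 65s.

```python
# Hypothetical model of the timeline above; none of this is a Juju API.

def replay(events):
    """Replay (time, kind, unit) events in order.

    kind == "elect":   the unit becomes the leader.
    kind == "changed": the unit runs its relation-changed hook.

    Returns the units that ran relation-changed while they were the leader.
    """
    leader = None
    handled_by = []
    for _time, kind, unit in events:
        if kind == "elect":
            leader = unit
        elif kind == "changed" and unit == leader:
            handled_by.append(unit)
    return handled_by


timeline = [
    (0, "elect", "mysql/0"),     # leader, but unresponsive
    (35, "changed", "mysql/1"),  # not the leader
    (35, "changed", "mysql/2"),  # not the leader
    (60, "elect", "mysql/1"),    # mysql/0 demoted
    (65, "changed", "mysql/0"),  # no longer the leader
]

handled_by = replay(timeline)
print(handled_by)  # [] : every unit saw the data, none saw it as leader
```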

There is no guarantee that any unit sees the relation change data while it is the leader.
The one backstop would be 'leader-elected', which could go through and re-evaluate whether there is anything that the previous leader missed: look at your existing relations and see if there is something you didn't handle earlier because you weren't the leader, and that the last leader also failed to handle.
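
A rough sketch of what such a 'leader-elected' backstop could look like follows. Everything here is an assumption for illustration: the function names, the shape of `relation_data`, and the `handled` set are hypothetical. In a real charm the relation data would presumably be gathered with the `relation-ids`, `relation-list` and `relation-get` hook tools, and the record of what has been handled could be kept in leader settings via `leader-get` / `leader-set` (serialized as strings).

```python
# Hypothetical sketch of a leader-elected backstop; plain data stands in
# for what the hook tools would return.

def unhandled_requests(relation_data, handled):
    """Return (relation_id, remote_unit) pairs that no leader has handled.

    relation_data: {relation_id: {remote_unit: settings_dict}}
    handled:       set of (relation_id, remote_unit) already processed
    """
    pending = []
    for rel_id in sorted(relation_data):
        for unit in sorted(relation_data[rel_id]):
            settings = relation_data[rel_id][unit]
            if settings and (rel_id, unit) not in handled:
                pending.append((rel_id, unit))
    return pending


def on_leader_elected(relation_data, handled, handle):
    """Re-scan all relations and handle whatever earlier leaders missed."""
    for rel_id, unit in unhandled_requests(relation_data, handled):
        handle(rel_id, unit, relation_data[rel_id][unit])
        handled.add((rel_id, unit))  # remember it for the next leader
    return handled


# Usage: rabbit/0 set a request that no leader ever processed.
relation_data = {"db:1": {"rabbit/0": {"request": "create-user"}}}
handled = set()
calls = []
on_leader_elected(relation_data, handled,
                  lambda rel, unit, settings: calls.append((rel, unit)))
```

Running the hook a second time with the same `handled` set does nothing, which is the property the new leader needs: it can always re-scan safely.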

All of the above is possible even with nothing wrong with our leader election process. All it takes is for the machine where the leader is currently running to be so busy with other hooks (colocated workloads) that the unit that was the leader takes too long to actually respond to a relation change.

I'd like us to figure out what charmers actually need in order to handle this case. Should there be an idea of "if I become the leader, this is what I would want to do", which gets set aside as context and presented again during leader-elected?
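
As a strawman for that idea, each unit could queue the context it could not act on and replay it if it is later elected. This is purely illustrative: `DeferredWork` and its methods are hypothetical, nothing like this exists in Juju today, and a real version would need to persist the queue between hook invocations rather than hold it in memory.

```python
# Hypothetical "if I become the leader, do this" queue; not a Juju API.

class DeferredWork:
    """Per-unit record of actions the unit would take if it were leader."""

    def __init__(self):
        self._pending = []

    def on_relation_changed(self, is_leader, context, act):
        """Act immediately as leader; otherwise set the context aside."""
        if is_leader:
            act(context)
        else:
            self._pending.append(context)

    def on_leader_elected(self, act):
        """Replay everything that was set aside; return how many ran."""
        pending, self._pending = self._pending, []
        for context in pending:
            act(context)
        return len(pending)


# Usage, following the timeline: mysql/1 sees rabbit's data at 35s as a
# non-leader, then replays it when elected at 60s.
done = []
mysql1 = DeferredWork()
mysql1.on_relation_changed(False, {"from": "rabbit/0"}, done.append)
replayed = mysql1.on_leader_elected(done.append)
```

Since a previous leader may already have handled some of the queued contexts, the action passed as `act` would need to be idempotent.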