Comment 4 for bug 966679

Alex Yurchenko (ayurchen) wrote :


The actual Galera contract is that it will never commit a transaction that won't be committed on other members of the cluster unless they fail (in other words, it guarantees eventual consistency), and is thus more relaxed than what you think. In fact, the guarantee that you mention is impossible to enforce without some sort of global mutex that would synchronise Galera IO ops and state changes, and at best the performance of that thing would be abysmal (if it is possible at all).

Technically, you may try to commit to a Galera node in any state, and as soon as the transaction gets replicated and certified, it gets committed. States like SYNCED, JOINED, etc. are only our best effort to provide temporal synchrony from virtual synchrony. As an example, consider the situation when the node loses the primary component. Until this event propagates to the top layer, the node is still in SYNCED state and allows writes and reads. What Galera guarantees is that in this case COMMIT will fail - or maybe not - if by the time it is processed the node is back in the primary component.
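
To illustrate the point from the client side, here is a minimal sketch in Python. It assumes the application can read the node's wsrep status variables into a dict and that its driver raises some error when COMMIT fails. The names `CommitFailed`, `looks_writable` and `run_transaction` are hypothetical and not part of Galera or any driver; only the `wsrep_cluster_status` and `wsrep_local_state_comment` variable names are real. The reported state is treated as a hint only, and the outcome of COMMIT as the final word.

```python
import time


class CommitFailed(Exception):
    """Hypothetical stand-in for the error a driver raises when COMMIT fails."""


def looks_writable(status):
    """Best-effort hint: the reported state may lag behind the real one."""
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


def run_transaction(read_status, do_work_and_commit, attempts=3, delay=0.0):
    """Run a transaction, trusting only the COMMIT result.

    read_status        -- callable returning a dict of wsrep status variables
    do_work_and_commit -- callable that runs the transaction and commits,
                          raising CommitFailed if the commit is rejected
    """
    for _ in range(attempts):
        if looks_writable(read_status()):
            try:
                # The node may already have lost the primary component even
                # though it still reports Synced, so COMMIT can fail here.
                return do_work_and_commit()
            except CommitFailed:
                pass  # fall through and try again
        time.sleep(delay)
    raise CommitFailed("transaction did not commit after %d attempts" % attempts)
```

The status check only saves a round trip in the obvious cases. It cannot replace handling the commit error, because the state can change between the check and the commit.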

This is not to say that we can't guarantee 100% synchronous reads. But in this case I doubt that this is a bug. After all, when you're performing RSU (rolling schema upgrade) you're supposed to know what you're doing - and you remove all load from that node. Of course it is cool when the software is totally unbreakable - but at what cost? Personally, at the moment I'd rather spend effort on improving Galera functionality than on protecting it from deliberately unreasonable user actions.