Recoverable galera.cache after node crash/shutdown
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | New | Undecided | Unassigned |
Bug Description
Hi, in the following scenario:
Full cluster shutdown ...
Node1 Seq = 100002000
Node2 Seq = 100001000
Node3 Seq = 100000000
I use grastate.dat or --wsrep_recover=1 to find the most up-to-date node (Node1 here), and bootstrap it.
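To illustrate the selection step, here is a minimal Python sketch that reads the `seqno:` field out of grastate.dat text and picks the node with the highest value. The sample file layout, the placeholder uuid, and the helper names `parse_seqno` and `pick_bootstrap_node` are assumptions made for this example; they are not part of Galera or PXC. A seqno of -1 (typical after a crash) is treated as unknown, in which case the position has to come from --wsrep_recover instead.

```python
import re

# Assumed sample of a grastate.dat file; the uuid is a placeholder.
SAMPLE_GRASTATE = """# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000001
seqno:   100002000
safe_to_bootstrap: 0
"""


def parse_seqno(text):
    """Return the integer after 'seqno:' in grastate.dat text, or None if absent."""
    match = re.search(r"^seqno:\s*(-?\d+)\s*$", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pick_bootstrap_node(seqnos):
    """Return the name of the node with the highest known seqno.

    `seqnos` maps a node name to its seqno. None or a negative value
    means the position is unknown and the node is skipped.
    """
    known = {name: s for name, s in seqnos.items() if s is not None and s >= 0}
    if not known:
        raise ValueError("no node has a known seqno; recover positions first")
    return max(known, key=known.get)


# Sequence numbers from the scenario in this report.
cluster = {"Node1": 100002000, "Node2": 100001000, "Node3": 100000000}
print(pick_bootstrap_node(cluster))
```

With the three sequence numbers from this report, the sketch selects Node1.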
The other nodes can also recover their positions, and will attempt IST to fetch the 1000 (Node2) or 2000 (Node3) write-sets they are missing.
Unfortunately, because Node1 has just been started, it has discarded its gcache content and cannot provide those write-sets.
Since the data is still physically present on disk, wouldn't it be possible for the node being bootstrapped to recover the write-sets from its own gcache, and then provide them (assuming they are available) to joining nodes that require IST?
That would avoid an SST of every other node after a complete cluster restart in which the nodes stopped at different sequence numbers.
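The difference between the current and the requested behaviour can be sketched with a simplified model: a donor can serve IST only if its gcache still holds every write-set after the joiner's position. The function `ist_possible` and its parameters are hypothetical names for this illustration, and the real Galera decision involves more checks (for example, matching state UUIDs), so treat this as a rough sketch rather than the actual logic.

```python
def ist_possible(joiner_seqno, donor_seqno, donor_cached_from):
    """Simplified check of whether a donor can serve IST to a joiner.

    donor_cached_from is the lowest seqno still held in the donor's
    gcache, or None when the gcache is empty (for example, right after
    a restart that discarded its content).
    """
    if joiner_seqno >= donor_seqno:
        return True  # joiner is already up to date, nothing to transfer
    if donor_cached_from is None:
        return False  # empty gcache: no write-sets to send, SST is needed
    # The joiner needs write-sets joiner_seqno + 1 .. donor_seqno.
    return donor_cached_from <= joiner_seqno + 1


donor = 100002000
joiners = {"Node2": 100001000, "Node3": 100000000}

# Current behaviour: Node1 starts with an empty gcache.
for name, seqno in joiners.items():
    print(name, "IST possible (empty gcache):", ist_possible(seqno, donor, None))

# Requested behaviour, assuming the recovered gcache still reaches back
# to seqno 100000001 (an assumption made purely for this example).
for name, seqno in joiners.items():
    print(name, "IST possible (recovered gcache):",
          ist_possible(seqno, donor, 100000001))
```

Under these assumptions, both joiners fall back to SST with an empty gcache, and both could use IST if the gcache content were recovered at bootstrap.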
Galera github issue: https:/