wsrep_sst_xtrabackup-v2 fails to clean a stale .sst temporary directory
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Invalid | Undecided | Unassigned | |
Bug Description
If an SST previously failed partway through and left stale files in the "${DATA}/.sst" directory, all future SST attempts fail until an administrator removes these files.
As I understand it, xtrabackup-v2 supports a "cpat" option, a regex listing the files that should be preserved when cleaning the datadir, and ".sst" is explicitly included in it. If ".sst" is removed from that regex, SST fails when trying to write into a non-existent ".sst" directory (although the stale directory is then cleaned).
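To illustrate why a stale ".sst" survives the cleanup, here is a minimal sketch of the prune-or-delete pattern in a throwaway directory. The shortened `cpat` value and the file names are assumptions for demonstration only (the script's real default pattern is longer), and the `\|` alternation relies on GNU find's default regex syntax.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical, shortened pattern for illustration; not the script's real default.
cpat='.*galera\.cache$\|.*\.sst$\|.*grastate\.dat$'

# Throwaway datadir with a stale .sst left over from a "failed" transfer.
DATA=$(mktemp -d)
mkdir -p "$DATA/.sst" "$DATA/mysql"
touch "$DATA/.sst/stale.part" "$DATA/galera.cache" "$DATA/ibdata1" "$DATA/mysql/user.frm"

# Same shape as the find call in the SST script: anything matching cpat is
# pruned (kept, and not descended into); everything else is removed.
find "$DATA" -mindepth 1 -regex "$cpat" -prune -o -exec rm -rf {} \+

# List what survived, relative to the datadir.
remaining=$(cd "$DATA" && find . -mindepth 1 | LC_ALL=C sort)
echo "$remaining"
```

Because ".sst" matches the pattern, the directory is pruned as a whole, so the stale file inside it is never visited and stays in place.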
I am not sure if this behavior is intentional, but an option for xtrabackup-v2 to do the cleaning itself would be more convenient than forcing administrative action after a failure. I suppose the existing behavior is in place for troubleshooting or for avoiding certain data loss scenarios.
My initial thought for a fix is to allow the cpat variable to be overridden via configuration so that it excludes .sst, and to simply ensure this directory is (re)created before proceeding. Perhaps a change like this:
--- a/wsrep_
+++ b/wsrep_
@@ -861,15 +861,16 @@ then
rm -rf ${DATA}/.sst
fi
- mkdir -p ${DATA}/.sst
- (recv_joiner $DATA/.sst "${stagemsg}-SST" 0 0) &
- jpid=$!
- wsrep_log_info "Proceeding with SST"
find $ib_home_dir $ib_log_dir $ib_undo_dir $DATA -mindepth 1 -regex $cpat -prune -o -exec rm -rfv {} 1>&2 \+
+ mkdir -p ${DATA}/.sst
+ (recv_joiner $DATA/.sst "${stagemsg}-SST" 0 0) &
+ jpid=$!
+ wsrep_log_info "Proceeding with SST"
+
if [[ -n ${tempdir:-} ]];then
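The effect of the reordering can be tried outside the SST script. The following is only a sketch under assumptions: the `cpat` value is a hypothetical override with the ".sst" entry dropped, the file names are made up, and the receiver (`recv_joiner`) is left out entirely. It again relies on GNU find regex syntax.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical override: the preserve pattern no longer mentions .sst.
cpat='.*galera\.cache$\|.*grastate\.dat$'

DATA=$(mktemp -d)
mkdir -p "$DATA/.sst"
touch "$DATA/.sst/stale.part" "$DATA/galera.cache" "$DATA/ibdata1"

# Step 1: clean the datadir. The stale .sst is now removed like any other file.
find "$DATA" -mindepth 1 -regex "$cpat" -prune -o -exec rm -rf {} \+

# Step 2: only now (re)create .sst, so the receiver would start from an
# empty directory that is guaranteed to exist.
mkdir -p "$DATA/.sst"

sst_contents=$(ls -A "$DATA/.sst")
echo "entries left in .sst: '${sst_contents}'"
```

If the `mkdir` ran before the `find`, as in the original order, the freshly created directory would be deleted again, which matches the "non-existent .sst directory" failure described above.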
Alternatively, maybe just cleaning out the contents of .sst (perhaps via some opt-in option) when a stale .sst is detected would be somewhat cleaner.
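A rough sketch of that opt-in variant follows. The function name `prepare_sst_dir` and its opt-in flag are hypothetical and do not exist in wsrep_sst_xtrabackup-v2; how the flag would be read from configuration is left open. Without the opt-in, the sketch leaves stale files untouched and returns non-zero, on the assumption that the current behavior should stay the default.

```shell
#!/usr/bin/env bash
set -euo pipefail

# prepare_sst_dir DIR [OPT_IN]
# Hypothetical helper: empties a stale DIR only when OPT_IN is 1,
# otherwise reports the stale files and returns non-zero.
prepare_sst_dir() {
    local sst_dir=$1
    local opt_in=${2:-0}

    if [[ -d $sst_dir && -n $(ls -A "$sst_dir") ]]; then
        if [[ $opt_in -eq 1 ]]; then
            echo "Removing stale contents of $sst_dir" >&2
            # Delete the contents but keep the directory itself.
            find "$sst_dir" -mindepth 1 -delete
        else
            echo "Stale files found in $sst_dir; leaving them in place" >&2
            return 1
        fi
    fi
    mkdir -p "$sst_dir"
}

# Demonstration in a throwaway datadir.
DATA=$(mktemp -d)
mkdir -p "$DATA/.sst/sub"
touch "$DATA/.sst/xtrabackup_checkpoints" "$DATA/.sst/sub/partial.ibd"

# Default: stale files are preserved and the caller is told to stop.
if prepare_sst_dir "$DATA/.sst" 0; then default_result=proceeded; else default_result=refused; fi

# Opt-in: the stale contents are removed and the directory is ready for reuse.
prepare_sst_dir "$DATA/.sst" 1
leftover=$(ls -A "$DATA/.sst")
```

Keeping the cleanup behind a flag preserves the stale files for troubleshooting unless the administrator has explicitly asked for automatic cleaning.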
I have marked a duplicate here, although I'm not sure whether the ./ibdata1 path is related to the .sst dir.