Orphaned xtrabackup_pid File Breaks Cluster SST
Suppose an orphaned xtrabackup_pid file is left inside tmpdir for some reason, e.g. by a manual backup that was killed and did not clean up properly, and the MySQL user is not able to delete it from wsrep_sst_xtrabackup. In that case the SST fails, and it does so only after the prepare phase on the JOINER is complete. On the DONOR you can get this message:
130502 22:19:31 [Note] WSREP: Provider paused at 16e4deca-
130502 22:19:42 [Note] WSREP: Provider resumed.
rm: cannot remove `/tmp/xtrabacku
130502 22:19:44 [ERROR] WSREP: Failed to read from: wsrep_sst_
130502 22:19:44 [ERROR] WSREP: Process completed with error: wsrep_sst_
130502 22:19:44 [Warning] WSREP: 0 (uxdbc01): State transfer to 1 (uxdbc02) failed: -1 (Operation not permitted)
I filed this bug here because I think innobackupex should do this as a pre-flight check: if a pid file already exists and cannot be deleted, it should bail out immediately, rather than after the prepare stage, by which point the user may already have lost a significant amount of time.
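
To illustrate the kind of pre-flight check I have in mind, here is a minimal sketch in Python. It is not how innobackupex or wsrep_sst_xtrabackup actually work; the function name check_pid_file and its return values are hypothetical, and it assumes a POSIX system where deleting a file depends on the permissions of the containing directory and on the sticky bit (as is typical for /tmp).

```python
import os
import stat


def check_pid_file(path):
    """Classify a pid file as 'absent', 'removable', or 'blocked'.

    Hypothetical helper: a non-destructive guess at whether the current
    user could delete the file, made before any backup work starts.
    """
    try:
        file_info = os.lstat(path)
    except FileNotFoundError:
        return "absent"

    parent = os.path.dirname(os.path.abspath(path))
    # Unlinking needs write and search permission on the containing
    # directory. Note: os.access() checks the real uid by default.
    if not os.access(parent, os.W_OK | os.X_OK):
        return "blocked"

    dir_info = os.stat(parent)
    euid = os.geteuid()
    # With the sticky bit set, only root, the file owner, or the
    # directory owner may unlink the file.
    if dir_info.st_mode & stat.S_ISVTX and euid != 0:
        if euid not in (file_info.st_uid, dir_info.st_uid):
            return "blocked"
    return "removable"
```

A caller would run this against the xtrabackup_pid path inside tmpdir before starting the transfer and abort straight away on "blocked", instead of discovering the problem after the prepare stage. The check is only advisory, since permissions can change between the check and the actual delete.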