WSREP_SST: [ERROR] with low data rate between DONOR and JOINER
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MySQL patches by Codership | New | Undecided | Unassigned | | |
Bug Description
Galera version: 5.5.37
Xtrabackup version: 1.5.1
In a Galera cluster with 3 nodes, one is down and is restarted.
The bandwidth between the JOINER (node 1) and the DONOR (node 2) is low (less than 10 MB/sec).
The other node (node 3) is SYNCED and there is high bandwidth between nodes 2 and 3.
There is DML traffic running on nodes 2 and 3 while node 1 tries to join the cluster.
The SST provider is xtrabackup-v2.
When bandwidth between the JOINER (node 1) and the DONOR (node 2) is higher, there is no issue.
SST fails and the mysql daemon then stops on node 1.
The logs on JOINER show:
WSREP_SST: [INFO] Evaluating nc -dl 4444 | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20140626 10:09:56.988)
140626 10:22:46 [Note] WSREP: Created page /var/broadworks
WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/broadwork
WSREP_SST: [INFO] Contents of datadir (20140626 10:25:59.238)
WSREP_SST: [INFO] -rw-rw---- 1 myadmin bwadmin 191 Jun 26 10:10 /var/broadworks
-rw-rw---- 1 myadmin bwadmin 134219048 Jun 26 10:22 /var/broadworks
-rw------- 1 myadmin bwadmin 134217728 Jun 26 10:25 /var/broadworks
-rw-rw---- 1 myadmin bwadmin 109 Jun 26 10:09 /var/broadworks
-rw-rw---- 1 myadmin bwadmin 104857600 Jun 26 10:10 /var/broadworks
-rw-rw---- 1 myadmin bwadmin 60468 Jun 26 09:39 /var/broadworks
-rw-rw---- 1 myadmin bwadmin 0 Jun 26 10:09 /var/broadworks
/var/broadworks
total 25325756
-rw-rw---- 1 myadmin bwadmin 71303168 Jun 26 10:14 as_user_test2.ibd
-rw-rw---- 1 myadmin bwadmin 71303168 Jun 26 10:14 as_user_test3.ibd
-rw-rw---- 1 myadmin bwadmin 50331648 Jun 26 10:14 as_user_test.ibd
-rw-rw---- 1 myadmin bwadmin 9806282752 Jun 26 10:19 users_test2.ibd
-rw-rw---- 1 myadmin bwadmin 9667870720 Jun 26 10:25 users_test3.ibd
-rw-rw---- 1 myadmin bwadmin 6241124352 Jun 26 10:14 users_test.ibd (20140626 10:25:59.241)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20140626 10:25:59.242)
WSREP_SST: [INFO] Removing the sst_in_progress file (20140626 10:25:59.243)
140626 10:25:59 [ERROR] WSREP: Process completed with error: wsrep_sst_
140626 10:25:59 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
140626 10:25:59 [ERROR] WSREP: SST failed: 32 (Broken pipe)
140626 10:25:59 [ERROR] Aborting
140626 10:25:59 [Warning] WSREP: 0.0 (sun92-x4170): State transfer to 1.0 (lin16-hs22) failed: -22 (Invalid argument)
140626 10:25:59 [ERROR] WSREP: gcs/src/
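The numeric statuses in the JOINER log (32 and -22) are printed next to their messages, which match the standard errno names. A minimal sketch for decoding them follows. It assumes the status really is an errno value, as the log text suggests; the helper name `describe_status` is hypothetical and not part of any WSREP tooling.

```python
import errno
import os


def describe_status(status: int) -> str:
    """Interpret a status as an errno value and return 'NAME (message)'.

    Assumption: the status is an errno code. The sign is dropped because
    the log prints the state-transfer failure as a negative number (-22).
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # Values taken from the JOINER log above.
    for status in (32, -22):
        print(status, "->", describe_status(status))
```

On Linux this maps 32 to EPIPE and 22 to EINVAL, consistent with the "Broken pipe" and "Invalid argument" text in the log. A broken pipe on the JOINER only says that the receiving pipeline lost its sender; it does not say why the sender stopped.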
The DONOR logs show:
WSREP_SST: [INFO] Sleeping before data transfer for SST (20140626 10:09:50.790)
WSREP_SST: [INFO] Streaming the backup to joiner at lin16-hs22 4444 (20140626 10:10:00.793)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-
140626 10:25:23 [Note] WSREP: Provider paused at 2b928f66-
140626 10:25:53 [Note] WSREP: resuming provider at 356321
140626 10:25:53 [Note] WSREP: Provider resumed.
WSREP_SST: [ERROR] innobackupex finished with error: 9. Check /var/broadworks
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20140626 10:25:54.122)
140626 10:25:54 [ERROR] WSREP: Failed to read from: wsrep_sst_
140626 10:25:54 [ERROR] WSREP: Process completed with error: wsrep_sst_
140626 10:25:54 [ERROR] WSREP: Command did not run: wsrep_sst_
140626 10:25:54 [Warning] WSREP: 0.0 (sun92-x4170): State transfer to 1.0 (lin16-hs22) failed: -22 (Invalid argument)
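To see how long the transfer ran before it failed, and which side stopped first, the parenthesised timestamps on the WSREP_SST lines can be compared. The sketch below is only an illustration: the helper names `extract_stamp` and `elapsed_seconds` are hypothetical, and the timestamp layout is assumed from the lines quoted in this report.

```python
import re
from datetime import datetime
from typing import Optional

# Matches the trailing "(YYYYMMDD HH:MM:SS.mmm)" stamp on WSREP_SST lines.
STAMP_RE = re.compile(r"\((\d{8} \d{2}:\d{2}:\d{2}\.\d{3})\)\s*$")


def extract_stamp(line: str) -> Optional[str]:
    """Return the trailing timestamp of a WSREP_SST line, or None if absent."""
    match = STAMP_RE.search(line)
    return match.group(1) if match else None


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two stamps in the 'YYYYMMDD HH:MM:SS.mmm' layout."""
    fmt = "%Y%m%d %H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Stamps copied from the JOINER and DONOR logs in this report.
joiner_listen = "20140626 10:09:56.988"   # JOINER starts nc | xbstream
joiner_fail = "20140626 10:25:59.238"     # JOINER lists the datadir after the error
donor_stream = "20140626 10:10:00.793"    # DONOR starts streaming
donor_cleanup = "20140626 10:25:54.122"   # DONOR cleanup after exit

if __name__ == "__main__":
    print(f"joiner waited:   {elapsed_seconds(joiner_listen, joiner_fail):.3f} s")
    print(f"donor streamed:  {elapsed_seconds(donor_stream, donor_cleanup):.3f} s")
    print(f"donor -> joiner: {elapsed_seconds(donor_cleanup, joiner_fail):.3f} s")
```

With the stamps above, the DONOR ran for a little under 16 minutes and reached its cleanup about 5 seconds before the JOINER reported its own error, which fits the failure starting on the DONOR side.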
So xtrabackup failed on the donor side. It must have left a log file in the MySQL data directory. Could you post it?
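If it is not obvious which file that is, one way to narrow it down is to list the most recently modified log files in the data directory on the DONOR. This is a rough sketch only: the function name `recent_logs` is hypothetical, the data directory path has to be supplied by you, and it assumes the file name ends in `.log`, which this report does not confirm.

```python
from pathlib import Path
from typing import List


def recent_logs(datadir: str, limit: int = 5) -> List[Path]:
    """Return up to `limit` '*.log' files in `datadir`, newest first.

    Assumption: the backup log sits directly in the data directory and has
    a '.log' suffix. Adjust the glob pattern if that is not the case.
    """
    logs = [p for p in Path(datadir).glob("*.log") if p.is_file()]
    logs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return logs[:limit]


if __name__ == "__main__":
    import sys

    # Usage: python recent_logs.py <path-to-mysql-datadir>
    for path in recent_logs(sys.argv[1]):
        print(path)
```

A file modified around 10:25 on the DONOR would be the most likely candidate, since that is when innobackupex reported its error.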