Activity log for bug #1696273

Date Who What changed Old value New value Message
2017-06-06 23:36:56 Marcelo Altmann bug added bug
2017-06-06 23:37:39 Marcelo Altmann nominated for series percona-xtradb-cluster/5.7
2017-06-06 23:37:39 Marcelo Altmann bug task added percona-xtradb-cluster/5.7
2017-06-06 23:47:30 Marcelo Altmann tags i192112
2017-06-06 23:48:47 Marcelo Altmann description

Old value:

SST will fail if donor has to send keyring. Looks like the donor is trying to send the file while socat is still opening port 4444 on joiner:

20170606 09:00:15.294 WSREP_SST: [INFO] Streaming GTID file before SST
20170606 09:00:18.368 WSREP_SST: [INFO] Streaming donor-keyring file before SST
2017/06/06 09:00:18 socat[15464] E connect(4, AF=2 10.131.17.240:4444, 16): Connection refused
20170606 09:00:18.376 WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
20170606 09:00:18.377 WSREP_SST: [ERROR] Error while sending data to joiner node: exit codes: 0 1
20170606 09:00:18.379 WSREP_SST: [ERROR] ******************************************************
20170606 09:00:18.380 WSREP_SST: [ERROR] Cleanup after exit with status:32

Donor is showing connection refused, but port 4444 is open on joiner.

How to repeat.

1) Create a PXC cluster with 2 nodes using 5.7
2) Stop node1 and add below config to my.cnf:

in [mysqld] section:
ssl-ca=/var/lib/mysql-files/ca.pem
ssl-cert=/var/lib/mysql-files/server-cert.pem
ssl-key=/var/lib/mysql-files/server-key.pem
early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql-keyring/keyring

in [sst] section:
streamfmt = xbstream
encrypt=4
ssl-ca=/var/lib/mysql-files/ca.pem
ssl-cert=/var/lib/mysql-files/server-cert.pem
ssl-key=/var/lib/mysql-files/server-key.pem

in [xtrabackup] section:
keyring-file-data=/var/lib/mysql-keyring/keyring

3) move *.pem files from /var/lib/mysql to /var/lib/mysql-files
4) scp /var/lib/mysql-files/* to node2
5) Start node1 and wait until it joins the cluster
6) repeat step 2 on node2
7) force sst from node2, it will fail on above error

How to fix:

I was able to fix it by increasing the time the donor waits on joiner to receive the file. Edit /usr/bin/wsrep_sst_xtrabackup-v2 around line 1286:

++++++++++++++++++++++++++++++++++++++++
1285 # joiner need to wait to receive the file.
1286 sleep 3
1287
1288 cp $keyring $KEYRING_DIR/$XB_DONOR_KEYRING_FILE
1289
1290 wsrep_log_info "Streaming donor-keyring file before SST"
1291 keyringbackupopt=" --keyring-file-data=${KEYRING_DIR}/${XB_DONOR_KEYRING_FILE} --server-id=$keyringsid "
1292 FILE_TO_STREAM=$XB_DONOR_KEYRING_FILE
1293 send_data_from_donor_to_joiner "${KEYRING_DIR}" "${stagemsg}-keyring"
++++++++++++++++++++++++++++++++++++++++

Increase sleep 3 to sleep 10

New value:

SST will fail if donor has to send keyring. Looks like the donor is trying to send the file while socat is still opening port 4444 on joiner:

20170606 09:00:15.294 WSREP_SST: [INFO] Streaming GTID file before SST
20170606 09:00:18.368 WSREP_SST: [INFO] Streaming donor-keyring file before SST
2017/06/06 09:00:18 socat[15464] E connect(4, AF=2 10.131.17.240:4444, 16): Connection refused
20170606 09:00:18.376 WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
20170606 09:00:18.377 WSREP_SST: [ERROR] Error while sending data to joiner node: exit codes: 0 1
20170606 09:00:18.379 WSREP_SST: [ERROR] ******************************************************
20170606 09:00:18.380 WSREP_SST: [ERROR] Cleanup after exit with status:32

Donor is showing connection refused, but port 4444 is open on joiner.

How to repeat.

1) Create a PXC cluster with 2 nodes using 5.7
2) Stop node1 and add below config to my.cnf:

in [mysqld] section:
ssl-ca=/var/lib/mysql-files/ca.pem
ssl-cert=/var/lib/mysql-files/server-cert.pem
ssl-key=/var/lib/mysql-files/server-key.pem
early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql-keyring/keyring

in [sst] section:
streamfmt = xbstream
encrypt=4
ssl-ca=/var/lib/mysql-files/ca.pem
ssl-cert=/var/lib/mysql-files/server-cert.pem
ssl-key=/var/lib/mysql-files/server-key.pem

in [xtrabackup] section:
keyring-file-data=/var/lib/mysql-keyring/keyring

3) move *.pem files from /var/lib/mysql to /var/lib/mysql-files
4) scp /var/lib/mysql-files/* to node2
5) Start node1 and wait until it joins the cluster
6) repeat step 2 on node2
7) force sst from node2, it will fail on above error

How to fix:

I was able to fix it by increasing the time the donor waits on joiner to receive the file. Edit /usr/bin/wsrep_sst_xtrabackup-v2 around line 1286:

++++++++++++++++++++++++++++++++++++++++
1285 # joiner need to wait to receive the file.
1286 sleep 3
1287
1288 cp $keyring $KEYRING_DIR/$XB_DONOR_KEYRING_FILE
1289
1290 wsrep_log_info "Streaming donor-keyring file before SST"
1291 keyringbackupopt=" --keyring-file-data=${KEYRING_DIR}/${XB_DONOR_KEYRING_FILE} --server-id=$keyringsid "
1292 FILE_TO_STREAM=$XB_DONOR_KEYRING_FILE
1293 send_data_from_donor_to_joiner "${KEYRING_DIR}" "${stagemsg}-keyring"
++++++++++++++++++++++++++++++++++++++++

Increase sleep 3 to sleep 10

Tested on 5.7.18-15-57 Percona XtraDB Cluster (GPL), Release rel15, Revision 7693d6e, WSREP version 29.20, wsrep_29.20
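
The workaround in the description lengthens a fixed sleep, which still depends on the joiner's listener being up within a guessed interval. As an illustration only, the same wait could be expressed as a bounded retry around the send step. This is a minimal sketch: retry_with_delay is a hypothetical helper that does not exist in wsrep_sst_xtrabackup-v2, and nothing in this log indicates it is how the committed fix works.

```shell
#!/bin/bash
# Hypothetical helper, NOT part of wsrep_sst_xtrabackup-v2.
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# sleeping DELAY_SECONDS between failed attempts.
# Usage: retry_with_delay MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_with_delay() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1

    while true; do
        # Run the command in the current shell; success ends the loop.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt} failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

A call such as `retry_with_delay 5 2 some_send_command` would try the send up to five times, two seconds apart, where some_send_command stands in for whatever actually streams the keyring. Retrying the real send, rather than probing port 4444 with a throwaway connection, avoids using up a listener that may accept only one connection; whether the joiner's socat is configured that way is not stated in the report.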
2017-06-07 23:02:49 Marcelo Altmann percona-xtradb-cluster/5.7: status New Confirmed
2017-06-07 23:02:53 Marcelo Altmann percona-xtradb-cluster: status New Confirmed
2017-06-15 04:12:40 Krunal Bauskar percona-xtradb-cluster/5.7: assignee Kenn Takara (kenn-takara)
2017-06-21 02:31:23 Kenn Takara percona-xtradb-cluster/5.7: status Confirmed Fix Committed
2017-06-21 02:32:18 Kenn Takara percona-xtradb-cluster: status Confirmed Fix Committed