Comment 4 for bug 1288528

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: SST Resumes Even When Donor Was Already Detected as SYNCED State

a) Why is wsrep_desync used on donor since it goes to DONOR/Desynced anyways? If it is to reproduce the issue then fine.

b) I tried to reproduce this.

Yes, if desync=ON, the joiner hangs, but after it is set to OFF,
it goes forward fine.

Also, the second "wsrep_desync=OFF;" should be idempotent now
where it gives:

=====================
MySQL [test]> set global wsrep_desync=OFF;
ERROR 1231 (42000): Variable 'wsrep_desync' can't be set to the value of 'OFF'
=======================

or should it be " 5. donor> SET GLOBAL wsrep_desync=ON;" ?

Here are my logs:

Donor:

140308 20:47:35 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 92107)
140308 20:47:50 [Note] WSREP: declaring cfb2ec81-a6d4-11e3-98d8-1ef20a3c488f stable
140308 20:47:50 [Note] WSREP: Node 049f90ce-a6d1-11e3-bb55-3bbb14bffdcd state prim
140308 20:47:50 [Note] WSREP: view(view_id(PRIM,049f90ce-a6d1-11e3-bb55-3bbb14bffdcd,6) memb {
        049f90ce-a6d1-11e3-bb55-3bbb14bffdcd,
        cfb2ec81-a6d4-11e3-98d8-1ef20a3c488f,
} joined {
} left {
} partitioned {
})
140308 20:47:50 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
140308 20:47:50 [Note] WSREP: STATE_EXCHANGE: sent state UUID: cffff748-a6d4-11e3-96a3-bbee5342f5db
140308 20:47:50 [Note] WSREP: STATE EXCHANGE: sent state msg: cffff748-a6d4-11e3-96a3-bbee5342f5db
140308 20:47:50 [Note] WSREP: STATE EXCHANGE: got state msg: cffff748-a6d4-11e3-96a3-bbee5342f5db from 0 (Arch1)
140308 20:47:50 [Note] WSREP: STATE EXCHANGE: got state msg: cffff748-a6d4-11e3-96a3-bbee5342f5db from 1 (Arch2)
140308 20:47:50 [Note] WSREP: Quorum results:
        version = 2,
        component = PRIMARY,
        conf_id = 5,
        members = 1/2 (joined/total),
        act_id = 92107,
        last_appl. = 0,
        protocols = 0/4/2 (gcs/repl/appl),
        group UUID = d4bea759-94bb-11e3-b9c3-be3eb9eca9c6
140308 20:47:50 [Note] WSREP: Flow-control interval: [23, 23]
140308 20:47:50 [Note] WSREP: New cluster view: global state: d4bea759-94bb-11e3-b9c3-be3eb9eca9c6:92107, view# 6: Primary, number of nodes: 2, my index: 0, protocol version 2
140308 20:47:50 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140308 20:47:50 [Note] WSREP: Assign initial position for certification: 92107, protocol version: 2
140308 20:48:25 [Note] WSREP: Node 0 (Arch1) resyncs itself to group
140308 20:48:25 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 92107)
140308 20:48:25 [Note] WSREP: Member 0 (Arch1) synced with group.
140308 20:48:25 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 92107)
140308 20:48:25 [Note] WSREP: Synchronized with group, ready for connections
140308 20:48:26 [Note] WSREP: Node 1 (Arch2) requested state transfer from '*any*'. Selected 0 (Arch1)(SYNCED) as donor.
140308 20:48:26 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 92107)
140308 20:48:26 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140308 20:48:26 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:15001/xtrabackup_sst' --auth '(null)' --socket '/pxc/datadir/pxc.sock' --datadir '/pxc/datadir/' --defaults-file '/pxc/etc/my.cnf.local' --gtid 'd4bea759-94bb-11e3-b9c3-be3eb9eca9c6:92107''
140308 20:48:26 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20140308 20:48:26.981)
WSREP_SST: [INFO] Using socat as streamer (20140308 20:48:26.985)
WSREP_SST: [INFO] Streaming GTID file before SST (20140308 20:48:27.271)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:127.0.0.1:15001; RC=( ${PIPESTATUS[@]} ) (20140308 20:48:27.380)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20140308 20:48:27.389)
140308 20:48:29 [ERROR] WSREP: Node desync failed.: 11 (Resource temporarily unavailable)
         at galera/src/replicator_smm.cpp:desync():1544
140308 20:48:29 [Warning] WSREP: SET desync failed 3 for set global wsrep_desync=ON
140308 20:48:37 [Note] WSREP: 0 (Arch1): State transfer to 1 (Arch2) complete.
140308 20:48:37 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 92107)
140308 20:48:37 [Note] WSREP: Member 0 (Arch1) synced with group.
140308 20:48:37 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 92107)
140308 20:48:37 [Note] WSREP: Synchronized with group, ready for connections
140308 20:48:37 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
WSREP_SST: [INFO] Streaming the backup to joiner at 127.0.0.1 15001 (20140308 20:48:37.395)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/pxc/etc/my.cnf.local --kill-long-query-type=select --parallel=8 --rebuild-threads=4 --no-version-check $INNOEXTRA --galera-info --stream=$sfmt ${TMPDIR} 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:127.0.0.1:15001; RC=( ${PIPESTATUS[@]} ) (20140308 20:48:37.399)
140308 20:48:43 [Note] WSREP: Provider paused at d4bea759-94bb-11e3-b9c3-be3eb9eca9c6:92107 (28)
140308 20:48:46 [Note] WSREP: resuming provider at 28
140308 20:48:46 [Note] WSREP: Provider resumed.
140308 20:48:46 [ERROR] WSREP: sst sent called when not SST donor, state SYNCED
WSREP_SST: [INFO] Total time on donor: 0 seconds (20140308 20:48:46.595)
140308 20:49:03 [Note] WSREP: 1 (Arch2): State transfer from 0 (Arch1) complete.
140308 20:49:03 [Note] WSREP: Member 1 (Arch2) synced with group.

Joiner:
==============================

140308 20:47:53 [Note] WSREP: Requesting state transfer failed: -11(Resource temporarily unavailable). Will keep retrying every 1 second(s)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:15001,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20140308 20:47:53.512)
140308 20:48:25 [Warning] WSREP: Could not find peer:
140308 20:48:25 [Note] WSREP: 0 (Arch1): State transfer to -1 (left the group) complete.
140308 20:48:25 [Note] WSREP: Member 0 (Arch1) synced with group.
140308 20:48:26 [Note] WSREP: Node 1 (Arch2) requested state transfer from '*any*'. Selected 0 (Arch1)(SYNCED) as donor.
140308 20:48:26 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 92107)
140308 20:48:26 [Note] WSREP: Requesting state transfer: success after 34 tries, donor: 0
WSREP_SST: [INFO] Proceeding with SST (20140308 20:48:27.411)
WSREP_SST: [INFO] Cleaning the existing datadir (20140308 20:48:27.413)
removed ‘/pxc/datadir2/backup-my.cnf’
removed ‘/pxc/datadir2/test/db.opt’
..................
removed ‘/pxc/datadir2/xtrabackup_checkpoints’
removed ‘/pxc/datadir2/ibdata1’
removed ‘/pxc/datadir2/xtrabackup_binlog_pos_innodb’
removed ‘/pxc/datadir2/ib_logfile0’
removed ‘/pxc/datadir2/ib_logfile1’
removed ‘/pxc/datadir2/mysql_bin.index’
removed ‘/pxc/datadir2/mysql_bin.000001’
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:15001,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20140308 20:48:27.540)
140308 20:48:37 [Note] WSREP: 0 (Arch1): State transfer to 1 (Arch2) complete.
140308 20:48:37 [Note] WSREP: Member 0 (Arch1) synced with group.
WSREP_SST: [INFO] Compressed qpress files found (20140308 20:48:46.613)
WSREP_SST: [INFO] Decompression with 8 threads (20140308 20:48:46.618)
WSREP_SST: [INFO] Evaluating find /pxc/datadir2/ -type f -name '*.qp' -printf '%p\n%h\n' | xargs -n 2 qpress -T8d (20140308 20:48:46.621)
WSREP_SST: [INFO] Removing qpress files after decompression (20140308 20:48:50.612)
WSREP_SST: [INFO] Preparing the backup at /pxc/datadir2/ (20140308 20:48:50.640)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check --kill-long-query-type=select --throttle=30 --use-memory=2G --tmpdir=/dev/shm --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20140308 20:48:50.645)
WSREP_SST: [INFO] Total time on joiner: 0 seconds (20140308 20:49:02.051)
WSREP_SST: [INFO] Removing the sst_in_progress file (20140308 20:49:02.053)
140308 20:49:02 [Note] WSREP: SST complete, seqno: 92107
140308 20:49:02 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
140308 20:49:02 InnoDB: The InnoDB memory heap is disabled
140308 20:49:02 InnoDB: Mutexes and rw_locks use GCC atomic builtins
140308 20:49:02 InnoDB: Compressed tables use zlib 1.2.8
140308 20:49:02 InnoDB: Using Linux native AIO
140308 20:49:02 InnoDB: Initializing buffer pool, size = 500.0M
140308 20:49:02 InnoDB: Completed initialization of buffer pool
140308 20:49:02 InnoDB: highest supported file format is Barracuda.
140308 20:49:02 InnoDB: Waiting for the background threads to start
140308 20:49:03 Percona XtraDB (http://www.percona.com) 5.5.36-34.0-29.3 started; log sequence number 6196028940
140308 20:49:03 [Note] WSREP: Initial TC log open: binlog
140308 20:49:03 [Note] Event Scheduler: Loaded 0 events
140308 20:49:03 [Note] WSREP: Signalling provider to continue.
140308 20:49:03 [Note] WSREP: SST received: d4bea759-94bb-11e3-b9c3-be3eb9eca9c6:92107
140308 20:49:03 [Note] WSREP: 1 (Arch2): State transfer from 0 (Arch1) complete.
140308 20:49:03 [Note] WSREP: Shifting JOINER -> JOINED (TO: 92107)
140308 20:49:03 [Note] WSREP: Member 1 (Arch2) synced with group.
140308 20:49:03 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 92107)
140308 20:49:03 [Note] WSREP: Synchronized with group, ready for connections
140308 20:49:03 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140308 20:49:03 [Note] WSREP: Nobody is waiting for SST.
140308 20:49:03 [Note] /pxcd/bin/mysqld: ready for connections.
Version: '5.5.36-34.0-debug-log' socket: '/pxc/datadir2/pxc.sock' port: 15000 MySQL Community Server (GPL), wsrep_25.10.r666