Joiner hangs after failed SST

Bug #1707633 reported by Sveta Smirnova on 2017-07-31
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC), status tracked in 5.6

Series  Status         Importance  Assigned to
5.5     Won't Fix      Undecided   Unassigned
5.6     Incomplete     Undecided   Kenn Takara
5.7     Fix Committed  Undecided   Kenn Takara

Bug Description

How to repeat:

Start a 3-node cluster.

export PXC=/bigdisk/sveta/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl100
export GALERA=$PXC/lib/libgalera_smm.so

export PATH=$PXC:$PATH

./bin/mysqld --basedir=. --datadir=./data1 --socket=/tmp/pxc1.sock --port=3371 --log-error --gtid_mode=ON --log-slave-updates --enforce-gtid-consistency --log-bin --server-id=1 --binlog-format=row --wsrep_sst_method=xtrabackup-v2 --wsrep_sst_auth=root: --wsrep_cluster_address='gcomm://127.0.0.1:3381,127.0.0.1:3382' --wsrep-provider=$GALERA --wsrep_cluster_name=cluster1 --wsrep_node_name=pxc1 --default_storage_engine=InnoDB --innodb_autoinc_lock_mode=2 --wsrep_provider_options='base_host=127.0.0.1;base_port=3381' --wsrep_sst_receive_address=127.0.0.1:3391 --wsrep_node_address=127.0.0.1 --wsrep-new-cluster --innodb-temp-data-file-path=../it/ibtmp1:12M:autoextend &

./bin/mysqld --basedir=. --datadir=./data2 --socket=/tmp/pxc2.sock --port=3372 --log-error --gtid_mode=ON --log-slave-updates --enforce-gtid-consistency --log-bin --server-id=2 --binlog-format=row --wsrep_sst_method=xtrabackup-v2 --wsrep_sst_auth=root: --wsrep_cluster_address='gcomm://127.0.0.1:3381,127.0.0.1:3382' --wsrep-provider=$GALERA --wsrep_cluster_name=cluster1 --wsrep_node_name=pxc2 --default_storage_engine=InnoDB --innodb_autoinc_lock_mode=2 --wsrep_provider_options='base_host=127.0.0.1;base_port=3382;evs.suspect_timeout=PT10S;evs.inactive_timeout=PT30S;evs.install_timeout=PT15S' --wsrep_sst_receive_address=127.0.0.1:3392 --wsrep_node_incoming_address=127.0.0.1:3372 --wsrep_node_address=127.0.0.1 --wsrep_on=ON --innodb-temp-data-file-path=../it1/ibtmp1:12M:autoextend &

./bin/mysqld --basedir=. --datadir=./data3 --socket=/tmp/pxc3.sock --port=3373 --log-error --gtid_mode=ON --log-slave-updates --enforce-gtid-consistency --log-bin --server-id=3 --binlog-format=row --wsrep_sst_method=xtrabackup-v2 --wsrep_sst_auth=root: --wsrep_cluster_address='gcomm://127.0.0.1:3381,127.0.0.1:3382' --wsrep-provider=$GALERA --wsrep_cluster_name=cluster1 --wsrep_node_name=pxc3 --default_storage_engine=InnoDB --innodb_autoinc_lock_mode=2 --wsrep_provider_options='base_host=127.0.0.1;base_port=3383;evs.suspect_timeout=PT10S;evs.inactive_timeout=PT30S;evs.install_timeout=PT15S' --wsrep_sst_receive_address=127.0.0.1:3393 --wsrep_node_incoming_address=127.0.0.1:3373 --wsrep_node_address=127.0.0.1 --wsrep_on=ON --innodb-temp-data-file-path=../it3/ibtmp1:12M:autoextend &

Make sure the cluster works and data is replicated.
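To check cluster health, you can query the Galera status variable wsrep_cluster_size on any node and confirm it reports 3. The mysql invocation below is shown in a comment because it needs a running node; the helper function is only an illustrative sketch for validating the reported value:

```shell
# On a live node (socket paths from the commands above) you would run:
#   mysql --socket=/tmp/pxc1.sock -Nse "SHOW STATUS LIKE 'wsrep_cluster_size'"
# and expect the second column to be 3.
# Illustrative helper to validate the reported size:
cluster_complete() { [ "$1" -eq 3 ]; }
cluster_complete 3 && echo "all 3 nodes joined"
```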

Stop node 1.

Create a table and load a lot of data into it:

USE test;

DROP TABLE IF EXISTS `joinit`;

CREATE TABLE `joinit` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `s` varchar(64) DEFAULT NULL,
  `t` time NOT NULL,
  `g` int(11) NOT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO joinit VALUES (NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )));
INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit;
INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit;
INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit;
...
INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit;

mysql> select count(*) from joinit;
+----------+
| count(*) |
+----------+
| 67108864 |
+----------+
1 row in set (28.93 sec)
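Each INSERT ... SELECT above doubles the table, so the row count grows as powers of two. A quick sanity check in plain shell arithmetic (no server needed) confirms that 26 doublings of the single seeded row reach the 67,108,864 rows reported above:

```shell
# 1 seed row doubled 26 times = 2^26 = 67,108,864 rows
rows=1
for i in $(seq 1 26); do
  rows=$((rows * 2))
done
echo "$rows"    # prints 67108864
```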

On one of the nodes (node 2), run OPTIMIZE TABLE joinit;

Wait a few seconds, then start the earlier-stopped node.

This node will try to initiate an SST from node 3, which will fail with the following error:

2017-07-29T10:26:12.201338Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:3391/xtrabackup_sst//1' --socket '/tmp/pxc3.sock' --datadir '/bigdisk/sveta/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl100/data3/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --binlog 'bm-support01-bin' --gtid 'd66a0e04-6b2c-11e7-98ef-c27e1d637690:300039': 22 (Invalid argument)
2017-07-29T10:26:12.201591Z 0 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:3391/xtrabackup_sst//1' --socket '/tmp/pxc3.sock' --datadir '/bigdisk/sveta/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl100/data3/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --binlog 'bm-support01-bin' --gtid 'd66a0e04-6b2c-11e7-98ef-c27e1d637690:300039'
2017-07-29T10:26:12.202033Z 0 [Warning] WSREP: Could not find peer: a24ab218-7447-11e7-ac8a-f3686cc44f71
2017-07-29T10:26:12.202109Z 0 [Warning] WSREP: 0.0 (pxc2): State transfer to -1.-1 (left the group) failed: -22 (Invalid argument)
2017-07-29T10:26:12.202142Z 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 300040)
2017-07-29T10:26:12.202455Z 0 [Note] WSREP: Member 0.0 (pxc2) synced with group.
2017-07-29T10:26:12.202489Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 300040)

$ tail -n 45 data3/innobackup.backup.log
170729 06:26:04 >> log scanned up to (8909890959)
170729 06:26:05 >> log scanned up to (8909891005)
170729 06:26:06 >> log scanned up to (8909891051)
170729 06:26:07 >> log scanned up to (8909891113)
170729 06:26:08 >> log scanned up to (8909891159)
170729 06:26:09 >> log scanned up to (8909891205)
InnoDB: Last flushed lsn: 8901930471 load_index lsn 8909891205
[FATAL] InnoDB: An optimized(without redo logging) DDLoperation has been performed. All modified pages may not have been flushed to the disk yet.
    PXB will not be able take a consistent backup. Retry the backup operation
2017-07-29 06:26:10 0x7fc17076e700 InnoDB: Assertion failure in thread 140468792256256 in file ut0ut.cc line 916
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
10:26:10 UTC - xtrabackup got signal 6 ;
This could be because you hit a bug or data is corrupted.
This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x10000
xtrabackup(my_print_stacktrace+0x35)[0xc81ce5]
xtrabackup(handle_fatal_signal+0x273)[0xb1cb53]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fc17381a330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fc171b6bc37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fc171b6f028]
xtrabackup[0x75dd4f]
xtrabackup(_ZN2ib5fatalD1Ev+0xb3)[0x80aa43]
xtrabackup[0x9a35a1]
xtrabackup(_Z19recv_parse_log_recsm7store_tb+0x4df)[0x9a61ef]
xtrabackup[0x766e60]
xtrabackup[0x767476]
xtrabackup[0x767823]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fc173812184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc171c2f37d]

Please report a bug at https://bugs.launchpad.net/percona-xtrabackup

Node 1 will not retry the SST; instead, it stalls:

$ tail -n20 data1/bm-support01.err
        be949bc4,0
        }
)
2017-07-29T10:26:12.185200Z 2 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2017-07-29T10:26:12.185558Z 2 [Note] WSREP: gcomm: closed
2017-07-29T10:26:12.185595Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2017-07-29T10:26:12.185632Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2017-07-29T10:26:12.185642Z 0 [Note] WSREP: Received NON-PRIMARY.
2017-07-29T10:26:12.185651Z 0 [Note] WSREP: Shifting JOINER -> OPEN (TO: 300040)
2017-07-29T10:26:12.185667Z 0 [Note] WSREP: Received self-leave message.
2017-07-29T10:26:12.185678Z 0 [Note] WSREP: Flow-control interval: [0, 0]
2017-07-29T10:26:12.185686Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2017-07-29T10:26:12.185693Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 300040)
2017-07-29T10:26:12.185703Z 0 [Note] WSREP: RECV thread exiting 0: Success
2017-07-29T10:26:12.185733Z 2 [Note] WSREP: recv_thread() joined.
2017-07-29T10:26:12.185746Z 2 [Note] WSREP: Closing replication queue.
2017-07-29T10:26:12.185754Z 2 [Note] WSREP: Closing slave action queue.
2017-07-29T10:26:14.184075Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2017-07-29T10:26:14.184137Z 1 [Note] WSREP: rollbacker thread exiting

But the mysqld process is still running:

$ ps -ef | grep sveta | grep mysqld | grep pxc1
sveta.s+ 46217 26371 0 Jul29 pts/206 00:00:00 ./bin/mysqld --basedir=. --datadir=./data1 --socket=/tmp/pxc1.sock --port=3371 --log-error --gtid_mode=ON --log-slave-updates --enforce-gtid-consistency --log-bin --server-id=1 --binlog-format=row --wsrep_sst_method=xtrabackup-v2 --wsrep_sst_auth=root: --wsrep_cluster_address=gcomm://127.0.0.1:3381,127.0.0.1:3382 --wsrep-provider=/bigdisk/sveta/Percona-XtraDB-Cluster-5.7.18-rel15-29.20.1.Linux.x86_64.ssl100/lib/libgalera_smm.so --wsrep_cluster_name=cluster1 --wsrep_node_name=pxc1 --default_storage_engine=InnoDB --innodb_autoinc_lock_mode=2 --wsrep_provider_options=base_host=127.0.0.1;base_port=3381 --wsrep_sst_receive_address=127.0.0.1:3391 --wsrep_node_address=127.0.0.1 --innodb-temp-data-file-path=../it/ibtmp1:12M:autoextend

Suggested fix: retry the SST after a failure, or at least shut down the joiner node so that a failover script can restart it.
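Until such a fix lands, a failover wrapper along these lines could implement the workaround. This is only a hedged sketch: the helper name and the restart step are illustrative, not part of PXC; the wsrep_ready status variable itself is standard Galera:

```shell
# Hypothetical watchdog sketch: restart the joiner if it has dropped out
# of the cluster. On a live node the status value would come from, e.g.:
#   mysql --socket=/tmp/pxc1.sock -Nse "SHOW STATUS LIKE 'wsrep_ready'" | awk '{print $2}'
needs_restart() {
  # $1 is the wsrep_ready value: "ON" when synced, anything else means trouble
  [ "$1" != "ON" ]
}

status="OFF"   # stand-in for the value read from the server
if needs_restart "$status"; then
  echo "joiner is not ready, restarting"
  # kill and restart mysqld here, e.g. via the failover script
fi
```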

Kenn Takara (kenn-takara) wrote :

I have verified this with PXC 5.7.18.

However, the steps (at the end) are slightly different:

Start node 1, and as it begins the SST, run 'OPTIMIZE TABLE joinit' on node 2.
This will also cause PXB to crash. (I also reproduced it with 16M rows.)

Kenn Takara (kenn-takara) wrote :

The bug doesn't happen with PXC 5.6.

Krunal Bauskar (krunal-bauskar) wrote :

commit 6ea7d1786122ebb7832611aa96996a68332a29c7
Author: Kenn Takara <email address hidden>
Date: Thu Aug 10 13:49:09 2017 -1000

    Fix for PXC-810 PXC-843
    PXC-810: PXC 5.7: segfault in GCommConn::close in galera_var_node_address
    PXC-843: Joiner hangs after failed SST

    Issue:
    If an SST fails due to a DML (such as OPTIMIZE or ALTER TABLE), which is run as the SST
    is running, then the joiner node can hang (PXC-843). This is due to the applier
    thread picking a transaction off of the recv_q and attempting to apply it, even though
    the thread has been killed.

    This exposed PXC-810, which would then happen while testing for PXC-843.

    Solution:
    For PXC-843, if the SST has failed, then we just drop any transactions that have
    been received. If the SST has failed, the node should be shutting down anyway.

    For PXC-810, we do an additional check within the critical section to avoid the
    null pointer dereference.

Percona now uses JIRA for bug reports, so this report has been migrated to: https://jira.percona.com/browse/PXC-843
