Galera SST Broken by XtraBackup 2.4.5

Bug #1647425 reported by Mike Kelly
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB
New
Undecided
Unassigned

Bug Description

There appears to be a regression with Galera SST on MariaDB 10.1, after upgrading to Percona XtraBackup 2.4.5, at least in my setup.

On the donor:

2016-10-21 15:26:42 140224017458944 [Note] WSREP: Member 0.0 (REDACTED) requested state transfer from '*any*'. Selected 1.0 (REDACTED)(SYNCED) as donor. 2016-10-21 15:26:42 140224017458944 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 4995) 2016-10-21 15:26:42 140224327415552 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2016-10-21 15:26:42 140223165884160 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.99:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' '' --gtid '23ee37e0-97b2-11e6-9da6-5e052758fb77:4995' --gtid-domain-id '0''
2016-10-21 15:26:42 140224327415552 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] Streaming with xbstream (20161021 15:26:43.235)
WSREP_SST: [INFO] Using socat as streamer (20161021 15:26:43.237)
WSREP_SST: [INFO] Using openssl based encryption with socat: with key and crt (20161021 15:26:43.243)
WSREP_SST: [INFO] Encrypting with cert=/etc/pki/tls/certs/REDACTED_mariadb.pem, key=/etc/pki/tls/private/REDACTED_mariadb.pem, cafile=/etc/pki/tls/cert.pem (20161021 15:26:43.245) WSREP_SST: [INFO] Using /tmp/tmp.V5pN9RYusP as innobackupex temporary directory (20161021 15:26:43.259) WSREP_SST: [INFO] Streaming GTID file before SST (20161021 15:26:43.261) WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio openssl-connect:REDACTED:4444,cert=/etc/pki/tls/certs/REDACTED_mariadb.pem,key=/etc/pki/tls/private/REDACTED_mariadb.pem,cafile=/etc/pki/tls/cert.pem; RC=( ${PIPESTATUS[@]} ) (20161021 15:26:43.263)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20161021 15:26:43.366) 2016-10-21 15:26:44 140224025851648 [Note] WSREP: (5a4486be, 'ssl://0.0.0.0:4567') turning message relay requesting off WSREP_SST: [INFO] Streaming the backup to joiner at 10.1.0.99 4444 (20161021 15:26:53.388) WSREP_SST: [INFO] Evaluating innobackupex --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio openssl-connect:REDACTED:4444,cert=/etc/pki/tls/certs/REDACTED_mariadb.pem,key=/etc/pki/tls/private/REDACTED_mariadb.pem,cafile=/etc/pki/tls/cert.pem; RC=( ${PIPESTATUS[@]} ) (20161021 15:26:53.412) WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20161021 15:26:53.546) WSREP_SST: [ERROR] Cleanup after exit with status:22 (20161021 15:26:53.569) WSREP_SST: [INFO] Cleaning up temporary directories (20161021 15:26:53.603) 2016-10-21 15:26:53 140223165884160 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.99:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' '' --gtid '23ee37e0-97b2-11e6-9da6-5e052758fb77:4995' --gtid-domain-id '0' 2016-10-21 15:26:53 140223165884160 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.99:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' '' --gtid '23ee37e0-97b2-11e6-9da6-5e052758fb77:4995' --gtid-domain-id '0': 22 (Invalid argument) 2016-10-21 15:26:53 140223165884160 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.1.0.99:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' '' --gtid '23ee37e0-97b2-11e6-9da6-5e052758fb77:4995' --gtid-domain-id '0'
2016-10-21 15:26:53 140224017458944 [Warning] WSREP: 1.0 (REDACTED): State transfer to 0.0 (REDACTED) failed: -22 (Invalid argument)
2016-10-21 15:26:53 140224017458944 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 4998)
2016-10-21 15:26:53 140224017458944 [Note] WSREP: Member 1.0 (REDACTED) synced with group.
2016-10-21 15:26:53 140224017458944 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 4998)

On the recipient:

2016-11-30 14:12:56 140312227866368 [Note] WSREP: Member 1.0 (REDACTED) requested state transfer from '*any*'. Selected 0.0 (REDACTED)(SYNCED) as donor.
2016-11-30 14:12:56 140312227866368 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 985581)
2016-11-30 14:12:56 140312546125568 [Note] WSREP: Requesting state transfer: success, donor: 0
WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/lib/mysql//.sst from previous state transfer. Removing (20161130 14:12:57.094)
WSREP_SST: [INFO] Proceeding with SST (20161130 14:12:57.166)
WSREP_SST: [INFO] Evaluating socat -u openssl-listen:4444,reuseaddr,cert=/etc/pki/tls/certs/REDACTED_mariadb.pem,key=/etc/pki/tls/private/REDACTED_mariadb.pem,cafile=/etc/pki/tls/cert.pem stdio | xbstream -x;
 RC=( ${PIPESTATUS[@]} ) (20161130 14:12:57.172)
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20161130 14:12:57.175)
[...]
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20161130 14:12:57.270)
2016-11-30 14:12:58 140312236259072 [Note] WSREP: (fefd883a, 'ssl://0.0.0.0:4567') turning message relay requesting off
xb_stream_read_chunk(): wrong chunk magic at offset 0x1ed7767a.
2016-11-30 14:13:21 140312227866368 [Note] WSREP: 0.0 (REDACTED): State transfer to 1.0 (REDACTED) complete.
2016-11-30 14:13:21 140312227866368 [Note] WSREP: Member 0.0 (REDACTED) synced with group.
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 1 (20161130 14:13:21.297)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20161130 14:13:21.303)
2016-11-30 14:13:21 140312219473664 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.1.0.99' --datadir '/var/lib/mysql/' --parent '18548' '' : 32 (Broken pipe)
2016-11-30 14:13:21 140312219473664 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2016-11-30 14:13:21 140312546400448 [ERROR] WSREP: SST failed: 32 (Broken pipe)
2016-11-30 14:13:21 140312546400448 [ERROR] Aborting

Error in my_thread_global_end(): 1 threads didn't exit
2016-11-30 14:21:22 139868311013568 [Note] WSREP: Read nil XID from storage engines, skipping position init
2016-11-30 14:21:22 139868311013568 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
2016-11-30 14:21:22 139868311013568 [Note] WSREP: wsrep_load(): Galera 25.3.18(r3632) by Codership Oy <email address hidden> loaded successfully.
2016-11-30 14:21:22 139868311013568 [ERROR] WSREP: getaddrinfo() failed on '*': -2 (Name or service not known)

Revision history for this message
Mike Kelly (pioto) wrote :

With the same configuration, I see no problem with SST after downgrading back to XtraBackup 2.4.4.

Revision history for this message
Mike Kelly (pioto) wrote :
Download full text (3.9 KiB)

Relevant my.cnf configuration:

[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
basedir = /usr bind-address = * binlog_format = ROW datadir = /var/lib/mysql default-storage-engine = innodb expire_logs_days = 10 innodb_autoinc_lock_mode = 2 innodb_flush_log_at_trx_commit = 0 key_buffer_size = 16M log-error = /var/log/mariadb/mariadb.log
max_allowed_packet = 16M
max_binlog_size = 100M
max_connections = 151
pid-file = /var/run/mariadb/mariadb.pid
port = 3306
query_cache_limit = 1M
query_cache_size = 16M
skip-external-locking
socket = /var/lib/mysql/mysql.sock
ssl = true
ssl-ca = /etc/pki/tls/cert.pem
ssl-cert = /etc/pki/tls/certs/REDACTED_mariadb.pem
ssl-cipher = ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES12...

Read more...

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :

Please see if 1646429 is relevant to you

Revision history for this message
Mike Kelly (pioto) wrote :

I've tried adding this to my configuration, but it doesn't seem to have an effect:

[sst]
ssl = 0

The joiner has:

xb_stream_read_chunk(): wrong chunk magic at offset 0x1ed77c7b.

Revision history for this message
Mike Kelly (pioto) wrote :

Also, it seems the issue is specific to the Donor node.

If I downgrade XtraBackup back to 2.4.4 on the Joiner only, that is not enough to resolve the issue.

If I downgrade only on the Donor, and not on the Joiner, then things work.

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :

See also bug 1647340. In 2.4 you need to add ssl_mode=DISABLED

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.