Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC

SST xtrabackup in 5.5.31 is not clobbering JOINER datadir

Bug #1213073 reported by Jay Janssen on 2013-08-16

Affects	Status	Importance	Assigned to	Milestone
MySQL patches by Codership	New	Undecided	Unassigned
Percona XtraDB Cluster	Fix Released	High	Raghavendra D Prabhu	5.5.33-23.7.6

Bug Description

3 node cluster with table 'test.sbtest1'.

All nodes report:

[root@ip-10-224-135-229 ~]# ls -la /var/lib/mysql/test/sbtest1.*
-rw-rw----. 1 mysql mysql 8632 Aug 16 12:40 /var/lib/mysql/test/sbtest1.frm
-rw-rw----. 1 mysql mysql 214958080 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

[root@ip-10-32-231-30 ~]# ls -la /var/lib/mysql/test/sbtest1.*
-rw-rw----. 1 mysql mysql 8632 Aug 16 12:40 /var/lib/mysql/test/sbtest1.frm
-rw-rw----. 1 mysql mysql 243269632 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

[root@ip-10-40-197-24 ~]# ls -la /var/lib/mysql/test/sbtest1.*
-rw-rw----. 1 mysql mysql 8632 Aug 16 12:40 /var/lib/mysql/test/sbtest1.frm
-rw-rw----. 1 mysql mysql 243269632 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

Now, shutdown last node:

[root@ip-10-40-197-24 mysql]# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster)..... SUCCESS!

Drop the table on one of the other nodes:

ip-10-32-231-30 mysql> drop table test.sbtest1;
Query OK, 0 rows affected (0.10 sec)

I can see table is gone on running nodes, still present on offline node (as expected):

[root@ip-10-224-135-229 ~]# ls -la /var/lib/mysql/test/sbtest1.*
ls: cannot access /var/lib/mysql/test/sbtest1.*: No such file or directory

[root@ip-10-32-231-30 ~]# ls -la /var/lib/mysql/test/sbtest1.*
ls: cannot access /var/lib/mysql/test/sbtest1.*: No such file or directory

[root@ip-10-40-197-24 ~]# ls -la /var/lib/mysql/test/sbtest1.*
-rw-rw----. 1 mysql mysql 8632 Aug 16 12:40 /var/lib/mysql/test/sbtest1.frm
-rw-rw----. 1 mysql mysql 289406976 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

So, I remove grastate.dat to force SST and restart last node:

[root@ip-10-40-197-24 ~]# rm /var/lib/mysql/grastate.dat
rm: remove regular file `/var/lib/mysql/grastate.dat'? y
[root@ip-10-40-197-24 ~]# service mysql start
Starting MySQL (Percona XtraDB Cluster)....SST in progress, setting sleep higher....

I can see innobackupex running, so this is using xtrabackup and is doing a full SST. The node comes up and joins the cluster, and yet I still see the tablespace from sbtest1:

[root@ip-10-224-135-229 ~]# ls -la /var/lib/mysql/test/sbtest*
ls: cannot access /var/lib/mysql/test/sbtest*: No such file or directory

[root@ip-10-32-231-30 ~]# ls -la /var/lib/mysql/test/sbtest*
ls: cannot access /var/lib/mysql/test/sbtest*: No such file or directory

[root@ip-10-40-197-24 ~]# ls -la /var/lib/mysql/test/sbtest*
-rw-rw----. 1 mysql mysql 8632 Aug 16 12:40 /var/lib/mysql/test/sbtest1.frm
-rw-rw----. 1 mysql mysql 289406976 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

ip-10-40-197-24 mysql> show global status like 'wsrep%';
+----------------------------+---------------------------------------------------------+
| Variable_name | Value |
+----------------------------+---------------------------------------------------------+
| wsrep_local_state_uuid | 5ed71b23-066a-11e3-aee0-32646929a736 |
| wsrep_protocol_version | 4 |
| wsrep_last_committed | 1746813 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_received | 3 |
| wsrep_received_bytes | 297 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.333333 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 0 |
| wsrep_causal_reads | 0 |
| wsrep_incoming_addresses | 10.32.231.30:3306,10.224.135.229:3306,10.40.197.24:3306 |
| wsrep_cluster_conf_id | 21 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 5ed71b23-066a-11e3-aee0-32646929a736 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_index | 2 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <email address hidden> |
| wsrep_provider_version | 2.6(r152) |
| wsrep_ready | ON |
+----------------------------+---------------------------------------------------------+
40 rows in set (0.00 sec)

My configuration is thus:

[mysqld]
datadir = /var/lib/mysql
log_error = error.log

key_buffer_size = 128M

binlog_format = ROW

innodb_buffer_pool_size = 10G
innodb_buffer_pool_instances = 4
innodb_log_file_size = 1G
innodb_flush_method = O_DIRECT
innodb_file_per_table
innodb_flush_log_at_trx_commit = 2

wsrep_cluster_name = mycluster
wsrep_cluster_address = gcomm://10.224.135.229,10.32.231.30,10.40.197.24
wsrep_node_name = ip-10-40-197-24

wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_provider_options = "gcs.fc_limit=1024; evs.user_send_window=32; evs.send_window=32; gcache.mem_size=1G"

wsrep_sst_method = xtrabackup
wsrep_sst_auth = sst:secret

wsrep_replicate_myisam = 1
wsrep_slave_threads = 16
wsrep_auto_increment_control = OFF

innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2

[mysql]
prompt = "ip-10-40-197-24 mysql> "

[client]
user = root

I have not modified /usr/bin/wsrep_sst_xtrabackup at all. The last node stays in the cluster until I try to re-create the table.
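
The per-node `ls` comparison above could be automated. What follows is only a rough sketch: it assumes two local directory trees stand in for a healthy node's datadir and the rejoined node's datadir (in practice the listings would have to be gathered from each node), and the function name `list_extra_files` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print files that exist under CANDIDATE but not under REFERENCE.
# The directories used below are throwaway temp dirs, not real datadirs.
set -eu

# list_extra_files REFERENCE CANDIDATE
# Prints relative paths of regular files found only in CANDIDATE.
list_extra_files() {
    ref=$1; cand=$2
    ref_list=$(mktemp); cand_list=$(mktemp)
    # Sort both listings in the C locale so comm sees a consistent order.
    (cd "$ref" && find . -type f | LC_ALL=C sort) > "$ref_list"
    (cd "$cand" && find . -type f | LC_ALL=C sort) > "$cand_list"
    # -13 suppresses "only in REFERENCE" and "in both", leaving CANDIDATE-only.
    LC_ALL=C comm -13 "$ref_list" "$cand_list"
    rm -f "$ref_list" "$cand_list"
}

work=$(mktemp -d)
mkdir -p "$work/healthy/test" "$work/rejoined/test"
touch "$work/healthy/ibdata1" "$work/rejoined/ibdata1"
# The rejoined node still carries the dropped table's files.
touch "$work/rejoined/test/sbtest1.frm" "$work/rejoined/test/sbtest1.ibd"

extra=$(list_extra_files "$work/healthy" "$work/rejoined")
printf '%s\n' "$extra"
```

Any path printed is a file the rejoined node has that the reference node does not, which is the symptom reported here.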

Related branches

lp:percona-xtradb-cluster/percona-xtradb-cluster-5.5

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-08-24:

Is the SST fully complete when it is checked with ls?

Need the innobackupex logs and error logs of donor and joiner.

Since the default streaming is done with tar - assuming same configuration on donor and joiner - then the clobbering should be fine.

Few other things:

a) Note that the timestamp is identical before and after on that node.

" -rw-rw----. 1 mysql mysql 243269632 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd"

b) The file sizes are different. (and with a large difference)

-rw-rw----. 1 mysql mysql 214958080 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

on first node

v/s

-rw-rw----. 1 mysql mysql 243269632 Aug 16 12:41 /var/lib/mysql/test/sbtest1.ibd

on others.

This indicates inconsistency beforehand.

c) Because of point b and/or due to others, SST may not have occurred correctly. Hence the logs are required.

Jay Janssen (jay-janssen) wrote on 2013-08-26: Re: [Bug 1213073] SST xtrabackup in 5.5.31 (no tweaking) is not clobbering JOINER datadir

On Aug 24, 2013, at 2:17 PM, Raghavendra D Prabhu <email address hidden> wrote:

> Is the SST fully complete when it is checked with ls?

Yes. The node starts and is in Primary state.

> c) Because of point b and/or due to others, SST may not have occurred
> correctly. Hence the logs are required.
>

I'm not sure what the logs will show you. I've never seen any indication of the tar unpacking in any JOINER logs, ever. My simulation was a reasonable situation:

node goes down
table gets dropped
node rejoins with SST
tablespace incorrectly still in place, even after SST and node joins

Reading wsrep_sst_xtrabackup, I don't see any reason why you would think this would even work right. What is in the script that would ensure the JOINER's datadir would get clobbered? The default tar command is simply:

strmcmd="tar xfi - -C ${DATA}"

and -C doesn't stand for "clobber".

Jay Janssen, MySQL Consulting Lead, Percona
http://about.me/jay.janssen
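
The point about `-C` can be reproduced without a cluster. The following is a minimal sketch, assuming throwaway temporary directories in place of the donor and joiner datadirs. It uses the same `-C` extraction as the quoted command, but leaves out the `i` flag, which only tells GNU tar to ignore zero blocks in the stream.

```shell
#!/bin/sh
# Sketch: extracting a tar stream into a non-empty directory does not
# remove files that are absent from the archive.
# All paths are temp directories standing in for real datadirs.
set -eu

work=$(mktemp -d)
donor="$work/donor"      # stands in for the donor's datadir
joiner="$work/joiner"    # stands in for the joiner's datadir
mkdir -p "$donor/test" "$joiner/test"

# The donor has already dropped sbtest1; the joiner still has its files.
touch "$donor/ibdata1"
touch "$joiner/ibdata1" "$joiner/test/sbtest1.frm" "$joiner/test/sbtest1.ibd"

# Stream the donor's contents into the joiner's directory.
tar -cf - -C "$donor" . | tar -xf - -C "$joiner"

# The stale files are still listed after extraction.
ls "$joiner/test"
```

tar overwrites files that are in the archive and leaves everything else in the target directory alone, so the dropped table's files survive.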

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-05

Changed in percona-xtradb-cluster:
milestone:	none → 5.5.33-23.7.6

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-05

Changed in percona-xtradb-cluster:
status:	New → In Progress
assignee:	nobody → Raghavendra D Prabhu (raghavendra-prabhu)
importance:	Undecided → High

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-05

Changed in percona-xtradb-cluster:
status:	In Progress → Fix Committed

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-05

Changed in percona-xtradb-cluster:
status:	Fix Committed → In Progress

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-09-05: Re: SST xtrabackup in 5.5.31 (no tweaking) is not clobbering JOINER datadir

Rsync SST shouldn't be affected since rsync is used with --delete.

For Xtrabackup SST:
Fixed in https://bazaar.launchpad.net/~percona-core/percona-xtradb-cluster/5.5/revision/477

This still leaves out inconsistency caused by 'drop database', since with the above fix an invalid directory under ${DATA} will still remain.
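
For illustration only, here is one way a joiner could avoid stale content: empty the target directory before unpacking. This is a hypothetical sketch, not the code from revision 477. The function name `clean_and_extract` and the directory names are assumptions, and a real SST script would have to decide which files, if any, must be preserved.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the committed fix: remove everything under the
# target directory, then unpack a tar stream into it.
set -eu

# clean_and_extract DIR: delete all files and subdirectories under DIR
# (DIR itself is kept), then extract a tar stream from stdin into DIR.
# Relies on find's -mindepth and -delete (available in GNU and BSD find).
clean_and_extract() {
    target=$1
    [ -d "$target" ] || { echo "not a directory: $target" >&2; return 1; }
    find "$target" -mindepth 1 -delete
    tar -xf - -C "$target"
}

work=$(mktemp -d)
donor="$work/donor"; joiner="$work/joiner"
mkdir -p "$donor/live_db" "$joiner/live_db" "$joiner/dropped_db"
touch "$donor/live_db/t1.ibd"
# Joiner holds a stale table file and a whole stale database directory.
touch "$joiner/live_db/t1.ibd" "$joiner/live_db/sbtest1.ibd" "$joiner/dropped_db/t2.ibd"

tar -cf - -C "$donor" . | clean_and_extract "$joiner"
ls -R "$joiner"
```

Because this sketch deletes directories as well as files, it would also cover the 'drop database' case, at the cost of discarding anything else kept in the datadir.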

Changed in percona-xtradb-cluster:
status:	In Progress → Fix Committed

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-09-07:

For further fixing of this bug (and related issues) check lp:1222122

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-08

summary:

- SST xtrabackup in 5.5.31 (no tweaking) is not clobbering JOINER datadir
+ SST xtrabackup in 5.5.31 is not clobbering JOINER datadir

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-09-23

Changed in percona-xtradb-cluster:
status:	Fix Committed → Fix Released

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-18:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-974
