SST donor node has status Joined after successful sync

Bug #1608680 reported by Dan Urist
This bug affects 3 people

Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

The SST donor node has status "Joined" rather than "Synced" following successful SST.

To replicate:

1) The existing cluster is down. Bootstrap the first node by adding "wsrep_cluster_address = gcomm://" to my.cnf and starting the mysql server with "/etc/init.d/mysql bootstrap-pxc" (a consolidated shell sketch of these steps follows step 5)

2) Check status of bootstrapped node:

mysql> SHOW STATUS LIKE 'wsrep_local_state%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 7cb7d33a-8d28-11e3-a157-fb52b065702a |
| wsrep_local_state         | 4                                    |
| wsrep_local_state_comment | Synced                               |
+---------------------------+--------------------------------------+

3) Start second node (force SST by deleting grastate.dat if it exists)

/etc/init.d/mysql start

4) Check status on second node

mysql> SHOW STATUS LIKE 'wsrep_local_state%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 7cb7d33a-8d28-11e3-a157-fb52b065702a |
| wsrep_local_state         | 4                                    |
| wsrep_local_state_comment | Synced                               |
+---------------------------+--------------------------------------+

5) Check status again on first node:

mysql> SHOW STATUS LIKE 'wsrep_local_state%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 7cb7d33a-8d28-11e3-a157-fb52b065702a |
| wsrep_local_state         | 3                                    |
| wsrep_local_state_comment | Joined                               |
+---------------------------+--------------------------------------+
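
For convenience, here are steps 1-3 condensed into a shell sketch. The datadir path, init scripts, and node layout are assumptions based on a default Debian install (this cluster's datadir may differ); adjust to your environment.

# On node 1 -- bootstrap the cluster
#   (my.cnf contains: wsrep_cluster_address = gcomm://)
/etc/init.d/mysql bootstrap-pxc

# On node 2 -- force a full SST by removing the saved state, then start
rm -f /var/lib/mysql/grastate.dat
/etc/init.d/mysql start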

Status on first node is stuck on "Joined" and never becomes "Synced"; thus it's unavailable as a donor. If the second node crashes, no new nodes can join the cluster.

If I restart the first node, it shows status "Synced".

If I then stop the first node and force a sync from the second node, the donor node ends up with status "Joined".
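
A quick way to watch both nodes while reproducing this (a sketch; the hostnames and credentials are placeholders, and the wsrep_local_state numbers map to 1 = Joining, 2 = Donor/Desynced, 3 = Joined, 4 = Synced):

# Poll the wsrep state on each node every few seconds
while true; do
    for h in node1.example.com node2.example.com; do
        echo "== $h =="
        mysql -h "$h" -u monitor -pXXXXX \
              -e "SHOW STATUS LIKE 'wsrep_local_state%'"
    done
    sleep 5
done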

Environment:

OS: Debian 8 (jessie)
percona-nagios-plugins 1.1.6-1
percona-toolkit 2.2.18-1
percona-xtrabackup 2.3.5-1.jessie
percona-xtradb-cluster-56 5.6.30-25.16-1.jessie
percona-xtradb-cluster-client-5.6 5.6.30-25.16-1.jessie
percona-xtradb-cluster-common-5.6 5.6.30-25.16-1.jessie
percona-xtradb-cluster-galera-3 3.16-1.jessie
percona-xtradb-cluster-galera-3.x 3.16-1.jessie
percona-xtradb-cluster-server-5.6 5.6.30-25.16-1.jessie

Revision history for this message
Dan Urist (durist-ucar) wrote :

Here are the log entries from the donor, from the end of the SST:

2016-08-01 14:50:23 19837 [Note] WSREP: 0.0 (dev-drupalsql2.ucar.edu): State transfer to 1.0 (dev-drupalsqlmaster.ucar.edu) complete.
2016-08-01 14:50:23 19837 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 3208942)
2016-08-01 14:50:23 19837 [Note] WSREP: SYNC message ignored as node 0.0 (dev-drupalsql2.ucar.edu) was re-transitioned to DONOR mode before it synced.
2016-08-01 14:50:23 19837 [ERROR] WSREP: sst sent called when not SST donor, state JOINED
WSREP_SST: [INFO] Total time on donor: 0 seconds (20160801 14:50:23.764)
WSREP_SST: [INFO] Cleaning up temporary directories (20160801 14:50:23.776)
2016-08-01 14:50:37 19837 [Note] WSREP: 1.0 (dev-drupalsqlmaster.ucar.edu): State transfer from 0.0 (dev-drupalsql2.ucar.edu) complete.
2016-08-01 14:50:37 19837 [Note] WSREP: Member 1.0 (dev-drupalsqlmaster.ucar.edu) synced with group.
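
To pull just these state-transition lines out of a longer donor log, a grep along these lines works (the log path is a placeholder; use whatever error log the node is configured to write):

grep -E 'Shifting|SYNC message ignored|sst sent called' /var/log/mysql/error.log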

Revision history for this message
Dan Urist (durist-ucar) wrote :

Configuration:

[mysqld]
basedir = /usr
binlog_format = ROW
default-storage-engine = INNODB
expire_logs_days = 10
innodb_additional_mem_pool_size = 16M
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_size = 512M
innodb_log_file_size = 256M
key_buffer_size = 512M
max_allowed_packet = 64M
max_binlog_size = 100M
max_connections = 200
max_user_connections = 20
open_files_limit = 50000
query_cache_limit = 1M
query_cache_size = 0
query_cache_type = 0
read_buffer_size = 1M
skip-external-locking
socket = /var/run/mysqld/mysqld.sock
sort_buffer_size = 1M
table_open_cache = 64
thread_cache_size = 9000
thread_stack = 512K
tmpdir = /tmp
user = mysql
wsrep_cluster_name = dev_drupalsql
wsrep_notify_cmd = /usr/local/bin/galera_notify
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_provider_options = 'gcache.size=1G'
wsrep_slave_threads = 8
wsrep_sst_auth = XXXXX:XXXXXXXX
wsrep_sst_method = xtrabackup-v2

[sst]
encrypt = 2
inno-apply-opts = '--use-memory=1G'
inno-backup-opts = '--no-backup-locks'
sockopt = ',verify=0'
tca = /etc/mysql/server.crt
tcert = /etc/mysql/server.pem

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Dan,

Thanks for the report and log file snippet.

The log-file snippet gives us some clue of what may be going on, and we have a few further questions:

a. While the node was acting as donor, was there any active workload running against it?

b. What is interesting is that the node transitions from DONOR -> JOINED, and the SYNC message after that is ignored because the node was re-transitioned to DONOR. We are wondering whether some action is forcing it back into the DONOR state (an RSU or FLUSH TABLES action?).

(If possible, can you share the complete log file, or at least up to the point where the incident took place?)

Also, if this is easily repeatable at your end, perhaps you can share the steps.
Of course, we don't see it in the normal sequence.
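
For context on question (b), these are the kinds of actions that move a node to DONOR/DESYNCED outside of an SST (illustrative only, not taken from this report; to my understanding all of them have this effect in PXC 5.6):

mysql> SET GLOBAL wsrep_desync = ON;          # explicit desync
mysql> FLUSH TABLES WITH READ LOCK;           # FTWRL, e.g. taken by a backup tool
mysql> SET GLOBAL wsrep_OSU_method = 'RSU';   # DDL run afterwards desyncs the node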

Revision history for this message
Dan Urist (durist-ucar) wrote : Re: [Bug 1608680] Re: SST donor node has status Joined after successful sync

On Tue, Aug 2, 2016 at 7:56 PM, Krunal Bauskar <email address hidden>
wrote:

> Dan,
>
> Thanks for the report and log file snippet.
>
> The log-file snippet gives us some clue of what may be going on, and we
> have a few further questions:
>
> a. While the node was acting as donor, was there any active workload
> running against it?
>

Very little, if any. This database just backs two development Drupal sites
that are regularly hit only by monitoring software.

>
> b. What is interesting is that the node transitions from DONOR -> JOINED,
> and the SYNC message after that is ignored because the node was
> re-transitioned to DONOR. We are wondering whether some action is forcing
> it back into the DONOR state (an RSU or FLUSH TABLES action?).
>
> (If possible, can you share the complete log file, or at least up to the
> point where the incident took place?)
>

Oddly, when I got up this morning, both nodes were in Donor/Desynced status.
I've attached log files from each.

>
> Also, if this is easily repeatable at your end, perhaps you can share the
> steps.
> Of course, we don't see it in the normal sequence.
>

Yes, I've been able to replicate the behavior on the development cluster
with the steps detailed in the bug report.

Also, in addition to the development cluster, I have a test cluster and a
production cluster. The problem first appeared in the production cluster,
which right now is running in a degraded state with only two nodes up (so
this is fairly urgent for me). The test cluster hasn't (yet) shown this
behavior, but it hasn't done an SST since the upgrade from the previous
version of XtraDB. If it's useful, I can send along logs from the
production cluster as well.

As far as I can tell, the issue started following the last XtraDB upgrade;
we've been running XtraDB clusters for a couple of years without issue.

Thanks very much for your help.

Revision history for this message
Dan Urist (durist-ucar) wrote :

Here are the wsrep status variables on each server after a sync; the one
stuck in "Joined" state shows wsrep_desync_count = 1

mysql> SHOW STATUS LIKE 'wsrep_%';
+------------------------------+-------------------------------------------+
| Variable_name | Value |
+------------------------------+-------------------------------------------+
| wsrep_local_state_uuid | 7cb7d33a-8d28-11e3-a157-fb52b065702a |
| wsrep_protocol_version | 7 |
| wsrep_last_committed | 3208942 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_received | 3 |
| wsrep_received_bytes | 265 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 1 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_max | 1 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_local_cached_downto | 0 |
| wsrep_flow_control_paused_ns | 0 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 0 ...
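
To home in on the desync specifically, a quick check on each node would be (a sketch; wsrep_desync_count is the status counter mentioned above, and wsrep_desync is the corresponding global variable):

mysql> SHOW STATUS LIKE 'wsrep_desync_count';
mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_desync';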

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Hi Dan,

From the log we discovered that the use of backup locks is suppressed, causing XtraBackup to use FLUSH TABLES WITH READ LOCK. We have found an issue wherein SST and FLUSH TABLES WITH READ LOCK conflict due to the updated FTWRL semantics. We are working on this issue, but in parallel you can simply enable the use of backup locks and things will work as expected (without the error).

------------
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --defaults-group=mysqld --no-version-check --no-backup-locks $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | pv -f -i 10 -N donor -F '%N => Rate:%r Avg:%a Elapsed:%t %e Bytes: %b %p' -s 927236096 2>>/mysql-data/dev_drupalsql/log/sst.log | socat -u stdio openssl-connect:128.117.224.166:4444,cert=/etc/mysql/server.pem,cafile=/etc/mysql/server.crt,verify=0; RC=( ${PIPESTATUS[@]} ) (20160803 06:32:36.371)
------------

The --no-backup-locks limitation was added to PXC a few years back when backup locks had issues, but they are stable now and are also the default way of using XtraBackup.

You would have one of these two settings (FORCE_FTWRL, or inno-backup-opts in the [sst] section) suppressing the use of backup locks. You can comment it out for now.
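
For example, based on the [sst] section shown earlier in this report, the workaround amounts to commenting out the inno-backup-opts line (a sketch; leave the rest of the section as it is):

[sst]
encrypt = 2
inno-apply-opts = '--use-memory=1G'
# inno-backup-opts = '--no-backup-locks'   # commented out so SST uses backup locks again
sockopt = ',verify=0'
tca = /etc/mysql/server.crt
tcert = /etc/mysql/server.pem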

<snippet of limitation which is no more valid with newer version of PXC-5.6>
Backup locks used during SST or with Xtrabackup can crash, on donor, either use inno-backup-opts='--no-backup-locks' under [sst] in my.cnf or set FORCE_FTWRL=1 in /etc/sysconfig/mysql (or /etc/sysconfig/mysql.%i for corresponding unit/service) for CentOS/RHEL or /etc/default/mysql in debian/ubuntu. You can also use rsync as the alternate SST method if those don't work.
</snippet>

Regards,
Krunal

Revision history for this message
Joel Daves (jdaves-ucar) wrote :

Disabling --no-backup-locks has returned both the development and production clusters to normal operations.

Thank you.

Revision history for this message
Dan Urist (durist-ucar) wrote :

Yay!!! Thanks for doing that.

Changed in percona-xtradb-cluster:
status: New → Confirmed
status: Confirmed → Fix Committed
Revision history for this message
Evan (eskinner) wrote :

Hello,

In comment #7 you mentioned that you were working on the actual problem this bug was about in parallel:

-----
We have found an issue wherein SST and FLUSH TABLES WITH READ LOCK conflict due to the updated FTWRL semantics. We are working on this issue, but in parallel you can simply enable the use of backup locks and things will work as expected (without the error)
-----

Can you please link that bug here? This bug is also affecting me, and I cannot use the workaround because I sometimes end up with a deadlock on the donor when I use backup locks (I will raise a different bug about that).

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Hi Evan,

Sorry, I didn't get you. Are you looking for the bug that we are trying to resolve?
We are using this bug to track the problem.

Regards,
Krunal

Revision history for this message
Evan (eskinner) wrote :

Aaaah. Sorry, my confusion. So this bug will still track the actual fix, then.

I was just a bit confused because the state of this bug is "Fix Committed" but I couldn't find a link to the actual code fix from here. I'm considering rebuilding from source if there is a fix for this bug already in the code but not yet released.

Revision history for this message
Evan (eskinner) wrote :

I think I found the actual fix code in the GitHub repos, committed just a few days ago.

https://github.com/percona/galera/pull/80/commits/c51109c44cb39c94cfa9644334a810d3aeee66d2
https://github.com/percona/percona-xtradb-cluster/pull/259/commits

I'm all good now. Thank you.

Changed in percona-xtradb-cluster:
milestone: none → 5.6.32-25.17
status: Fix Committed → Fix Released
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-666

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-665
