Comment 13 for bug 1396601

Jay Janssen (jay-janssen) wrote : Re: [Bug 1396601] Freshly started joiner gets stuck in joining state

That is the entire my.cnf, yes.

> On Dec 2, 2014, at 8:25 PM, Alex Yurchenko <email address hidden> wrote:
>
> *** This bug is a duplicate of bug 1373796 ***
> https://bugs.launchpad.net/bugs/1373796
>
> Huh, is that the entire my.cnf? Only one slave thread? Any errors in the error log between JOINED and SYNCED?
> Suspicion: two subsequent monitor drains with the same seqno.
>
> Title:
> Freshly started joiner gets stuck in joining state
>
> Status in Percona XtraDB Cluster - HA scalable solution for MySQL:
> New
>
> Bug description:
> I have an intermittent problem with freshly installed nodes (nodes
> with a clean datadir and fresh package install via vagrant) that get
> wedged in a 'Joining' state.
>
>
> [root@pxc3 mysql]# mysql -e "show global status like 'wsrep%'"
> +------------------------------+-------------------------------------------------------+
> | Variable_name | Value |
> +------------------------------+-------------------------------------------------------+
> | wsrep_local_state_uuid | cabef144-756b-11e4-ae70-326c53cc04d5 |
> | wsrep_protocol_version | 6 |
> | wsrep_last_committed | 12 |
> | wsrep_replicated | 0 |
> | wsrep_replicated_bytes | 0 |
> | wsrep_repl_keys | 0 |
> | wsrep_repl_keys_bytes | 0 |
> | wsrep_repl_data_bytes | 0 |
> | wsrep_repl_other_bytes | 0 |
> | wsrep_received | 2 |
> | wsrep_received_bytes | 276 |
> | wsrep_local_commits | 0 |
> | wsrep_local_cert_failures | 0 |
> | wsrep_local_replays | 0 |
> | wsrep_local_send_queue | 0 |
> | wsrep_local_send_queue_max | 1 |
> | wsrep_local_send_queue_min | 0 |
> | wsrep_local_send_queue_avg | 0.000000 |
> | wsrep_local_recv_queue | 392 |
> | wsrep_local_recv_queue_max | 392 |
> | wsrep_local_recv_queue_min | 0 |
> | wsrep_local_recv_queue_avg | 194.507614 |
> | wsrep_local_cached_downto | 18446744073709551615 |
> | wsrep_flow_control_paused_ns | 0 |
> | wsrep_flow_control_paused | 0.000000 |
> | wsrep_flow_control_sent | 0 |
> | wsrep_flow_control_recv | 0 |
> | wsrep_cert_deps_distance | 0.000000 |
> | wsrep_apply_oooe | 0.000000 |
> | wsrep_apply_oool | 0.000000 |
> | wsrep_apply_window | 0.000000 |
> | wsrep_commit_oooe | 0.000000 |
> | wsrep_commit_oool | 0.000000 |
> | wsrep_commit_window | 0.000000 |
> | wsrep_local_state | 1 |
> | wsrep_local_state_comment | Joining |
> | wsrep_cert_index_size | 0 |
> | wsrep_causal_reads | 0 |
> | wsrep_cert_interval | 0.000000 |
> | wsrep_incoming_addresses | 172.28.128.4:3306,172.28.128.7:3306,172.28.128.3:3306 |
> | wsrep_evs_delayed | |
> | wsrep_evs_evict_list | |
> | wsrep_evs_repl_latency | 0/0/0/0/0 |
> | wsrep_evs_state | OPERATIONAL |
> | wsrep_gcomm_uuid | 67850d43-756d-11e4-93cc-82163e0cb5e1 |
> | wsrep_cluster_conf_id | 7 |
> | wsrep_cluster_size | 3 |
> | wsrep_cluster_state_uuid | cabef144-756b-11e4-ae70-326c53cc04d5 |
> | wsrep_cluster_status | Primary |
> | wsrep_connected | ON |
> | wsrep_local_bf_aborts | 0 |
> | wsrep_local_index | 1 |
> | wsrep_provider_name | Galera |
> | wsrep_provider_vendor | Codership Oy <email address hidden> |
> | wsrep_provider_version | 3.8(rf6147dd) |
> | wsrep_ready | OFF |
> +------------------------------+-------------------------------------------------------+
>
>
> The node is stuck and I have to kill it. After restarting mysqld, it joins the cluster normally. I cannot reproduce the 'Joining' wedge after that first start unless I rebuild the entire node from scratch.
>
> Oddly, the other nodes see it as a member of the cluster and it
> receives replication (wsrep_local_recv_queue grows).
>
>
> [root@pxc3 mysql]# rpm -qa | grep -i percona
> percona-toolkit-2.2.11-1.noarch
> percona-xtrabackup-2.2.6-5042.el7.x86_64
> Percona-XtraDB-Cluster-devel-56-5.6.21-25.8.938.el7.x86_64
> Percona-XtraDB-Cluster-shared-56-5.6.21-25.8.938.el7.x86_64
> Percona-XtraDB-Cluster-client-56-5.6.21-25.8.938.el7.x86_64
> Percona-XtraDB-Cluster-galera-3-3.8-1.3390.rhel7.x86_64
> Percona-XtraDB-Cluster-server-56-5.6.21-25.8.938.el7.x86_64
>
>
> I will attach the error log.

Jay Janssen, Managing Consultant, Percona
http://about.me/jay.janssen
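
The symptom in the quoted status output (wsrep_local_state_comment stays at Joining, wsrep_ready is OFF, and wsrep_local_recv_queue keeps growing) could be checked for mechanically. Below is a minimal sketch in Python, not anything from the bug report itself: the function names parse_status_table and looks_wedged are hypothetical, and the rule "two snapshots, both Joining, queue not shrinking" is only an assumed heuristic, since a healthy joiner also passes through Joining briefly.

```python
def parse_status_table(text):
    """Turn the table printed by mysql -e "show global status ..." into a dict.

    Hypothetical helper: it only understands the two-column
    | Variable_name | Value | layout shown in the report.
    """
    status = {}
    for line in text.splitlines():
        # Drop an email quote prefix ("> ") if the output was pasted from a reply.
        line = line.lstrip("> ").strip()
        if not line.startswith("|"):
            continue  # border rows ("+-----+") and anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue  # header row or an unexpected shape
        status[cells[0]] = cells[1]
    return status


def looks_wedged(earlier, later):
    """Assumed heuristic for the stuck joiner described in this bug.

    Both snapshots must report Joining with wsrep_ready OFF, and the receive
    queue must be non-empty and must not have shrunk between them.
    """
    for snap in (earlier, later):
        if snap.get("wsrep_local_state_comment") != "Joining":
            return False
        if snap.get("wsrep_ready") != "OFF":
            return False
    queue_before = int(earlier.get("wsrep_local_recv_queue", "0"))
    queue_after = int(later.get("wsrep_local_recv_queue", "0"))
    return queue_after > 0 and queue_after >= queue_before
```

To use it, capture the status output twice, some seconds apart, parse each capture with parse_status_table, and pass the two dicts to looks_wedged. How long to wait between captures is a judgment call that depends on how long a state transfer normally takes on the cluster in question.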