First connection after changing wsrep_cluster_address hangs
Affects | Status | Importance | Assigned to | Milestone | ||
---|---|---|---|---|---|---|
MySQL patches by Codership |
Fix Released
|
Medium
|
Seppo Jaakola | |||
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | |||||
5.5 |
Fix Released
|
Undecided
|
Unassigned | |||
5.6 |
Fix Released
|
Undecided
|
Unassigned |
Bug Description
As reported by Patrick Zoblisein:
I've been doing some more testing of the `SET GLOBAL wsrep_cluster_
Performing the dynamic change in my lab environment works - no issues as Seppo notes above. It's not until I make a connection to the node, "post-set-
[root@lab1 pxc]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.5.20-beta23.4-log Percona XtraDB Cluster (GPL) 5.5.20-beta23.4, Revision 3724, wsrep_23.4.r3724
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+-
| 1 | system user | | NULL | Sleep | 13 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 2 | system user | | NULL | Sleep | 13 | NULL | NULL | 0 | 0 | 1 |
| 3 | event_scheduler | localhost | NULL | Daemon | 13 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 4 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 1 |
+----+-
4 rows in set (0.00 sec)
mysql> show variables like 'wsrep_
+------
| Variable_name | Value |
+------
| wsrep_cluster_
+------
1 row in set (0.00 sec)
mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+-
| 1 | system user | | NULL | Sleep | 38 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 2 | system user | | NULL | Sleep | 38 | NULL | NULL | 0 | 0 | 1 |
| 3 | event_scheduler | localhost | NULL | Daemon | 38 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 4 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 2 |
+----+-
4 rows in set (0.00 sec)
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (2.01 sec)
mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+-
| 3 | event_scheduler | localhost | NULL | Daemon | 67 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 4 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 2 |
| 5 | system user | | NULL | Sleep | 4 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 6 | system user | | NULL | Sleep | 4 | NULL | NULL | 0 | 0 | 1 |
+----+-
4 rows in set (0.00 sec)
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (3.00 sec)
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (2.00 sec)
mysql> show variables like 'wsrep_
+------
| Variable_name | Value |
+------
| wsrep_cluster_
+------
1 row in set (0.01 sec)
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (3.01 sec)
mysql> show variables like 'wsrep_
+------
| Variable_name | Value |
+------
| wsrep_cluster_
+------
1 row in set (0.00 sec)
mysql> set global wsrep_cluster_
Query OK, 0 rows affected (2.00 sec)
Now, I try to make another connection - which just sits there:
[root@lab1 workarea]# mysql -u root -p
Enter password:
Flip over to my working connection and I see a new connection stuck in the "login" state:
mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+-
| 3 | event_scheduler | localhost | NULL | Daemon | 810 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 4 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 2 |
| 13 | system user | | NULL | Sleep | 430 | NULL | NULL | 0 | 0 | 1 |
| 14 | system user | | NULL | Sleep | 430 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 15 | unauthenticated user | connecting host | NULL | Connect | NULL | login | NULL | 0 | 0 | 1 |
+----+-
5 rows in set (0.00 sec)
NOTE - this does not block all new connections by the way - open another window and make another connection and we're OK.
[root@lab1 ~]# mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 16
Server version: 5.5.20-beta23.4-log Percona XtraDB Cluster (GPL) 5.5.20-beta23.4 , Revision 3724, wsrep_23.4.r3724
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> show processlist;
+----+-
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+----+-
| 3 | event_scheduler | localhost | NULL | Daemon | 1016 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 4 | root | localhost | NULL | Sleep | 206 | | NULL | 0 | 0 | 2 |
| 13 | system user | | NULL | Sleep | 636 | NULL | NULL | 0 | 0 | 1 |
| 14 | system user | | NULL | Sleep | 636 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 15 | unauthenticated user | connecting host | NULL | Connect | NULL | login | NULL | 0 | 0 | 1 |
| 16 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 1 |
+----+-
6 rows in set (0.00 sec)
The issue then becomes not being able to do another dynamic setting of the cluster_address:
mysql> set global wsrep_cluster_
(wsrep_debug is equal to 1 in the cnf file)
120329 23:48:27 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120329 23:48:27 [Note] WSREP: Assign initial position for certification: 150026, protocol version: 2
120329 23:48:27 [Note] WSREP: Synchronized with group, ready for connections
120329 23:48:27 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120330 0:00:03 [Note] WSREP: Stop replication
120330 0:00:03 [Note] WSREP: Provider disconnect
120330 0:00:03 [Note] WSREP: Closing send monitor...
120330 0:00:03 [Note] WSREP: Closed send monitor.
120330 0:00:03 [Note] WSREP: gcomm: terminating thread
120330 0:00:03 [Note] WSREP: gcomm: joining thread
120330 0:00:03 [Note] WSREP: gcomm: closing backend
120330 0:00:03 [Note] WSREP: GMCast:
120330 0:00:03 [Note] WSREP: Received self-leave message.
120330 0:00:03 [Note] WSREP: gcomm: closed
120330 0:00:03 [Note] WSREP: Flow-control interval: [0, 0]
120330 0:00:03 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120330 0:00:03 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 150026)
120330 0:00:03 [Note] WSREP: RECV thread exiting 0: Success
120330 0:00:03 [Note] WSREP: New cluster view: global state: b2a15cb4-
120330 0:00:03 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120330 0:00:03 [Note] WSREP: recv_thread() joined.
120330 0:00:03 [Note] WSREP: Closing slave action queue.
120330 0:00:03 [Note] WSREP: closing connection 15
120330 0:00:03 [Note] WSREP: closing connection 4
120330 0:00:03 [Note] WSREP: applier thread exiting (code:0)
120330 0:00:03 [Note] WSREP: closing applier 13
120330 0:00:05 [Note] WSREP: SST kill local trx: 15
120330 0:00:05 [Note] WSREP: SST kill local trx: 4
120330 0:00:05 [Note] WSREP: waiting for client connections to close: 5
My once working connection is no longer:
mysql> show processlist;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Performing a shutdown of the mysql service hangs and eventually times out - only option is to kill -9.
Is there anything I can do to better troubleshoot the hung user connection?
Thanks
Patrick
summary: |
- First connection after changeing wsrep_cluster_address hangs + First connection after changing wsrep_cluster_address hangs |
Changed in codership-mysql: | |
milestone: | 5.5.28-23.7 → 5.5.28-23.8 |
Changed in percona-xtradb-cluster: | |
status: | New → Confirmed |
Changed in codership-mysql: | |
milestone: | 5.5.30-24.8 → 5.5.31-23.7.4 |
Changed in percona-xtradb-cluster: | |
milestone: | none → 5.5.31-25 |
Changed in codership-mysql: | |
milestone: | 5.5.31-23.7.5 → 5.5.32-23.7.6 |
Changed in percona-xtradb-cluster: | |
milestone: | 5.5.33-23.7.6 → future-5.5 |
Changed in codership-mysql: | |
milestone: | 5.5.33-23.7.6 → 5.5.34-24.9 |
Changed in codership-mysql: | |
milestone: | 5.5.34-25.9 → 5.5.34-25.10 |
Changed in codership-mysql: | |
importance: | High → Medium |
This concerns only the very first connection. Subsequent connections go through.