Node stuck when performing FLUSH TABLES WITH READ LOCK; and DDL action.

Bug #1265656 reported by Marco Tusa on 2014-01-02
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
High
Seppo Jaakola
5.6
High
Seppo Jaakola
Percona XtraDB Cluster
Status tracked in 5.6
5.5
Undecided
Unassigned
5.6
Undecided
Unassigned

Bug Description

Node stuck when performing FLUSH TABLES WITH READ LOCK; and DDL action.
We had an issue while performing a simple operation that include FLUSH TABLES WITH READ LOCK; CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
Our setup is a simple 3 node using Percona PXC 5.5.34 OR 5.6.14.
The three nodes are new and not doing any insert.

We run the command as above on node1 in TRX A.
FLUSH TABLES WITH READ LOCK; CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
The server remain in :
 2014-01-02 15:51:30 32587 [Note] WSREP: thread holds MDL locks at TI begin: CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB 1604
 2014-01-02 15:51:30 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
not able to complete the DDL command:
(root@localhost:pm) [mysql]>show processlist;
+------+-------------+-----------+-------+---------+------+----------------------------+--------------------------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+------+-------------+-----------+-------+---------+------+----------------------------+--------------------------------------------------------------+-----------+---------------+
| 1 | system user | | NULL | Sleep | 1967 | NULL | NULL | 0 | 0 |
| 2 | system user | | NULL | Sleep | 1967 | wsrep aborter idle | NULL | 0 | 0 |
| 3 | system user | | NULL | Sleep | 1966 | NULL | NULL | 0 | 0 |
| 4 | system user | | NULL | Sleep | 1966 | NULL | NULL | 0 | 0 |
| 5 | system user | | NULL | Sleep | 1966 | NULL | NULL | 0 | 0 |
| 1303 | root | localhost | mysql | Query | 0 | init | show processlist | 0 | 0 |
| 1604 | root | localhost | mysql | Query | 694 | Preparing for TO isolation | CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB | 0 | 0 | <-----------------------
+------+-------------+-----------+-------+---------+------+----------------------------+--------------------------------------------------------------+-----------+---------------+
7 rows in set (0.00 sec)

We then try to unlock the tables using a different connection on the same node.
So on node1 TRX B we run:
 (root@localhost:pm) [mysql]>UNLOCK TABLES;
 Query OK, 0 rows affected (0.00 sec)

 Nothing happened:
 | 1604 | root | localhost | mysql | Query | 807 | Preparing for TO isolation | CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB | 0 | 0 |

Also the error log is not reporting anything new.

So we kill the operation IN MySQL from TRX B:
 KILL 1604;

 At this point TRX A report:
 ERROR 2013 (HY000): Lost connection to MySQL server during query
and the operation is mark as killed:
1604 | root | localhost | mysql | Killed | 1175 | Preparing for TO isolation | CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB

On TRX A we try to run only the DDL statement CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
Again in the log we have:
2014-01-02 16:12:49 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB

And the processlist report:
3161 | root | localhost | mysql | Query | 14 | Preparing for TO isolation | CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB

At this point the only way to unlock the server is to restart it KILLING the process given it remain in pending state when try to stop it with mysqladministrator.

Doing the operation in a different way, using two connection it succeed.
In this test we will use 2 connections TRX A and B all on the same node Node1.

 So on TRX A we issue:
(root@localhost:pm) [test]>FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.00 sec)

And in the log we have:
 2014-01-02 16:35:30 13420 [Note] WSREP: Provider paused at b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:34

Then on TRX B we run:
  CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
in the log we have:
2014-01-02 16:37:38 13420 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB

And processlist report the action pending:
 734 | root | localhost | mysql | Query | 72 | Preparing for TO isolation | CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB |

Issueing the UNLOCK on TRX A:
UNLOCK TABLES;

On TRX B:
Query OK, 0 rows affected (2 min 5.79 sec)

and log report:
2014-01-02 16:39:43 13420 [Note] WSREP: Provider resumed.
2014-01-02 16:39:43 13420 [Note] WSREP: TO BEGIN: 35, 2
2014-01-02 16:39:43 13420 [Note] WSREP: TO END: 35, 2 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
2014-01-02 16:39:43 13420 [Note] WSREP: TO END: 35

So all is executed fine.

This issue is then present only when you do FLUSH TABLE WITH READ LOCK; and DDL action in the connection.

 How to replicate the issue
 --------------------------------
 1) ON NODE 1, connection 1 execute FLUSH TABLES WITH READ LOCK; CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
 2) ON NODE 2, try to perform UNLOCK TABLES; KILL etc

 LOG

FIRST test that doesn not work

140102 15:30:17 mysqld_safe Starting mysqld daemon with databases from /home/mysql/instances/galera1-56/data
140102 15:30:17 mysqld_safe Skipping wsrep-recover for b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27 pair
140102 15:30:17 mysqld_safe Assigning b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27 to wsrep_start_position
2014-01-02 15:30:17 0 [Note] WSREP: wsrep_start_position var submitted: 'b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27'
2014-01-02 15:30:17 32587 [Warning] You need to use --log-bin to make --log-slave-updates work.
2014-01-02 15:30:17 32587 [Note] WSREP: Setting wsrep_ready to 0
2014-01-02 15:30:17 32587 [Note] WSREP: Read nil XID from storage engines, skipping position init
2014-01-02 15:30:17 32587 [Note] WSREP: wsrep_load(): loading provider library '/home/mysql/templates/PCX-56/lib/libgalera_smm.so'
2014-01-02 15:30:17 32587 [Note] WSREP: wsrep_load(): Galera 3.1(r169) by Codership Oy <email address hidden> loaded successfully.
2014-01-02 15:30:17 32587 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2014-01-02 15:30:17 32587 [Note] WSREP: Found saved state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 15:30:17 32587 [Note] WSREP: Passing config to GCS: base_host = 192.168.0.35; base_port = 4567; cert.log_conflicts = no; evs.send_window = 1024; evs.user_send_window = 1024; gcache.dir = /home/mysql/instances/galera1-56/data; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /home/mysql/instances/galera1-56/data/galera.cache; gcache.page_size = 1024M; gcache.size = 1024M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 30; gcs.fc_master_slave = NO; gcs.max_packet_size = 129000; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; pc.npvo = true; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5; socket.ssl_compression = no
2014-01-02 15:30:17 32587 [Note] WSREP: Assign initial position for certification: 27, protocol version: -1
2014-01-02 15:30:17 32587 [Note] WSREP: wsrep_sst_grab()
2014-01-02 15:30:17 32587 [Note] WSREP: Start replication
2014-01-02 15:30:17 32587 [Note] WSREP: Setting initial position to b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 15:30:17 32587 [Note] WSREP: protonet asio version 0
2014-01-02 15:30:17 32587 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
2014-01-02 15:30:17 32587 [Note] WSREP: backend: asio
2014-01-02 15:30:17 32587 [Note] WSREP: GMCast version 0
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2014-01-02 15:30:17 32587 [Note] WSREP: EVS version 0
2014-01-02 15:30:17 32587 [Note] WSREP: PC version 0
2014-01-02 15:30:17 32587 [Note] WSREP: gcomm: connecting to group 'pxctest', peer '192.168.0.35:,192.168.0.7:,192.168.0.3:'
2014-01-02 15:30:17 32587 [Warning] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' points to own listening address, blacklisting
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b110cbac-73ec-11e3-9752-73b9c8d41973 is blacklisted, skipping
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b110cbac-73ec-11e3-9752-73b9c8d41973 is blacklisted, skipping
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b110cbac-73ec-11e3-9752-73b9c8d41973 is blacklisted, skipping
2014-01-02 15:30:17 32587 [Note] WSREP: (b110cbac-73ec-11e3-9752-73b9c8d41973, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b110cbac-73ec-11e3-9752-73b9c8d41973 is blacklisted, skipping
2014-01-02 15:30:17 32587 [Note] WSREP: declaring b82e00ed-73e5-11e3-b0ac-d32a19f46857 stable
2014-01-02 15:30:17 32587 [Note] WSREP: declaring fb2a1879-73e4-11e3-bf3e-c77638b17d76 stable
2014-01-02 15:30:17 32587 [Note] WSREP: Node b82e00ed-73e5-11e3-b0ac-d32a19f46857 state prim
2014-01-02 15:30:17 32587 [Note] WSREP: view(view_id(PRIM,b110cbac-73ec-11e3-9752-73b9c8d41973,18) memb {
 b110cbac-73ec-11e3-9752-73b9c8d41973,0
 b82e00ed-73e5-11e3-b0ac-d32a19f46857,0
 fb2a1879-73e4-11e3-bf3e-c77638b17d76,0
} joined {
} left {
} partitioned {
})
2014-01-02 15:30:17 32587 [Note] WSREP: gcomm: connected
2014-01-02 15:30:17 32587 [Note] WSREP: Changing maximum packet size to 129000, resulting msg size: 32636
2014-01-02 15:30:17 32587 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2014-01-02 15:30:17 32587 [Note] WSREP: Opened channel 'pxctest'
2014-01-02 15:30:17 32587 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2014-01-02 15:30:17 32587 [Note] WSREP: Waiting for SST to complete.
2014-01-02 15:30:17 32587 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b15ea193-73ec-11e3-85dd-b348a04a85ad
2014-01-02 15:30:17 32587 [Note] WSREP: STATE EXCHANGE: sent state msg: b15ea193-73ec-11e3-85dd-b348a04a85ad
2014-01-02 15:30:17 32587 [Note] WSREP: STATE EXCHANGE: got state msg: b15ea193-73ec-11e3-85dd-b348a04a85ad from 0 (node1)
2014-01-02 15:30:17 32587 [Note] WSREP: STATE EXCHANGE: got state msg: b15ea193-73ec-11e3-85dd-b348a04a85ad from 1 (node1)
2014-01-02 15:30:17 32587 [Note] WSREP: STATE EXCHANGE: got state msg: b15ea193-73ec-11e3-85dd-b348a04a85ad from 2 (node1)
2014-01-02 15:30:17 32587 [Note] WSREP: Quorum results:
 version = 3,
 component = PRIMARY,
 conf_id = 17,
 members = 3/3 (joined/total),
 act_id = 27,
 last_appl. = -1,
 protocols = 0/5/2 (gcs/repl/appl),
 group UUID = b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc
2014-01-02 15:30:17 32587 [Note] WSREP: Flow-control interval: [52, 52]
2014-01-02 15:30:17 32587 [Note] WSREP: Restored state OPEN -> JOINED (27)
2014-01-02 15:30:17 32587 [Note] WSREP: New cluster view: global state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27, view# 18: Primary, number of nodes: 3, my index: 0, protocol version 2
2014-01-02 15:30:17 32587 [Note] WSREP: SST complete, seqno: 27
2014-01-02 15:30:17 32587 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2014-01-02 15:30:17 32587 [Note] Plugin 'FEDERATED' is disabled.
2014-01-02 15:30:17 32587 [Note] WSREP: Member 0 (node1) synced with group.
2014-01-02 15:30:17 32587 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 27)
2014-01-02 15:30:17 32587 [Note] InnoDB: The InnoDB memory heap is disabled
2014-01-02 15:30:17 32587 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2014-01-02 15:30:17 32587 [Note] InnoDB: Compressed tables use zlib 1.2.3
2014-01-02 15:30:17 32587 [Note] InnoDB: Using Linux native AIO
2014-01-02 15:30:17 32587 [Note] InnoDB: Not using CPU crc32 instructions
2014-01-02 15:30:17 32587 [Note] InnoDB: Initializing buffer pool, size = 200.0M
2014-01-02 15:30:17 32587 [Note] InnoDB: Completed initialization of buffer pool
2014-01-02 15:30:17 32587 [Note] InnoDB: Highest supported file format is Barracuda.
2014-01-02 15:30:17 32587 [Note] InnoDB: 128 rollback segment(s) are active.
2014-01-02 15:30:17 32587 [Note] InnoDB: Waiting for purge to start
2014-01-02 15:30:18 32587 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.14-rel61.1 started; log sequence number 1665787
2014-01-02 15:30:18 32587 [Note] WSREP: Initial TC log open: dummy
2014-01-02 15:30:18 32587 [Note] RSA private key file not found: /home/mysql/instances/galera1-56/data//private_key.pem. Some authentication plugins will not work.
2014-01-02 15:30:18 32587 [Note] RSA public key file not found: /home/mysql/instances/galera1-56/data//public_key.pem. Some authentication plugins will not work.
2014-01-02 15:30:18 32587 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
2014-01-02 15:30:18 32587 [Note] - '0.0.0.0' resolves to '0.0.0.0';
2014-01-02 15:30:18 32587 [Note] Server socket created on IP: '0.0.0.0'.
2014-01-02 15:30:18 32587 [Warning] 'user' entry 'root@tusacentral01' ignored in --skip-name-resolve mode.
2014-01-02 15:30:18 32587 [Warning] 'proxies_priv' entry '@ root@tusacentral01' ignored in --skip-name-resolve mode.
2014-01-02 15:30:18 32587 [Note] Event Scheduler: Loaded 0 events
2014-01-02 15:30:18 32587 [Note] WSREP: Set WSREPXid for InnoDB: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 15:30:18 32587 [Note] WSREP: inited wsrep sidno 1
2014-01-02 15:30:18 32587 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-01-02 15:30:18 32587 [Note] /home/mysql/templates/PCX-56/bin/mysqld: ready for connections.
Version: '5.6.14-25.1' socket: '/home/mysql/instances/galera1-56/mysql.sock' port: 3306 Percona XtraDB Cluster (GPL) 5.6.14-25.1, Revision 566, wsrep_25.1.r4019
2014-01-02 15:30:18 32587 [Note] WSREP: REPL Protocols: 5 (3, 1)
2014-01-02 15:30:18 32587 [Note] WSREP: Assign initial position for certification: 27, protocol version: 3
2014-01-02 15:30:18 32587 [Note] WSREP: Service thread queue flushed.
2014-01-02 15:30:18 32587 [Note] WSREP: Synchronized with group, ready for connections
2014-01-02 15:30:18 32587 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-01-02 15:51:30 32587 [Note] WSREP: Provider paused at b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 15:51:30 32587 [Note] WSREP: thread holds MDL locks at TI begin: CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB 1604
2014-01-02 15:51:30 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
2014-01-02 16:12:49 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
2014-01-02 16:14:09 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest2 (a INT) ENGINE=INNODB
2014-01-02 16:15:02 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest2 (a INT) ENGINE=INNODB
2014-01-02 16:15:20 32587 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest2 (a INT) ENGINE=INNODB

Second TEST that works

140102 16:27:40 mysqld_safe Starting mysqld daemon with databases from /home/mysql/instances/galera1-56/data
140102 16:27:40 mysqld_safe Skipping wsrep-recover for b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27 pair
140102 16:27:40 mysqld_safe Assigning b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27 to wsrep_start_position
2014-01-02 16:27:40 0 [Note] WSREP: wsrep_start_position var submitted: 'b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27'
2014-01-02 16:27:40 13420 [Warning] You need to use --log-bin to make --log-slave-updates work.
2014-01-02 16:27:40 13420 [Note] WSREP: Setting wsrep_ready to 0
2014-01-02 16:27:40 13420 [Note] WSREP: Read nil XID from storage engines, skipping position init
2014-01-02 16:27:40 13420 [Note] WSREP: wsrep_load(): loading provider library '/home/mysql/templates/PCX-56/lib/libgalera_smm.so'
2014-01-02 16:27:40 13420 [Note] WSREP: wsrep_load(): Galera 3.1(r169) by Codership Oy <email address hidden> loaded successfully.
2014-01-02 16:27:40 13420 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2014-01-02 16:27:40 13420 [Note] WSREP: Found saved state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 16:27:40 13420 [Note] WSREP: Passing config to GCS: base_host = 192.168.0.35; base_port = 4567; cert.log_conflicts = no; evs.send_window = 1024; evs.user_send_window = 1024; gcache.dir = /home/mysql/instances/galera1-56/data; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /home/mysql/instances/galera1-56/data/galera.cache; gcache.page_size = 1024M; gcache.size = 1024M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 30; gcs.fc_master_slave = NO; gcs.max_packet_size = 129000; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; pc.npvo = true; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.proto_max = 5; socket.ssl_compression = no
2014-01-02 16:27:40 13420 [Note] WSREP: Assign initial position for certification: 27, protocol version: -1
2014-01-02 16:27:40 13420 [Note] WSREP: wsrep_sst_grab()
2014-01-02 16:27:40 13420 [Note] WSREP: Start replication
2014-01-02 16:27:40 13420 [Note] WSREP: Setting initial position to b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 16:27:40 13420 [Note] WSREP: protonet asio version 0
2014-01-02 16:27:40 13420 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
2014-01-02 16:27:40 13420 [Note] WSREP: backend: asio
2014-01-02 16:27:40 13420 [Note] WSREP: GMCast version 0
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2014-01-02 16:27:40 13420 [Note] WSREP: EVS version 0
2014-01-02 16:27:40 13420 [Note] WSREP: PC version 0
2014-01-02 16:27:40 13420 [Note] WSREP: gcomm: connecting to group 'pxctest', peer '192.168.0.35:,192.168.0.7:,192.168.0.3:'
2014-01-02 16:27:40 13420 [Warning] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' points to own listening address, blacklisting
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b53d8c69-73f4-11e3-9631-a39552f86a14 is blacklisted, skipping
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b53d8c69-73f4-11e3-9631-a39552f86a14 is blacklisted, skipping
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b53d8c69-73f4-11e3-9631-a39552f86a14 is blacklisted, skipping
2014-01-02 16:27:40 13420 [Note] WSREP: (b53d8c69-73f4-11e3-9631-a39552f86a14, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.35:4567' pointing to uuid b53d8c69-73f4-11e3-9631-a39552f86a14 is blacklisted, skipping
2014-01-02 16:27:40 13420 [Note] WSREP: declaring b82e00ed-73e5-11e3-b0ac-d32a19f46857 stable
2014-01-02 16:27:40 13420 [Note] WSREP: declaring fb2a1879-73e4-11e3-bf3e-c77638b17d76 stable
2014-01-02 16:27:40 13420 [Note] WSREP: Node b82e00ed-73e5-11e3-b0ac-d32a19f46857 state prim
2014-01-02 16:27:40 13420 [Note] WSREP: view(view_id(PRIM,b53d8c69-73f4-11e3-9631-a39552f86a14,20) memb {
 b53d8c69-73f4-11e3-9631-a39552f86a14,0
 b82e00ed-73e5-11e3-b0ac-d32a19f46857,0
 fb2a1879-73e4-11e3-bf3e-c77638b17d76,0
} joined {
} left {
} partitioned {
})
2014-01-02 16:27:40 13420 [Note] WSREP: gcomm: connected
2014-01-02 16:27:40 13420 [Note] WSREP: Changing maximum packet size to 129000, resulting msg size: 32636
2014-01-02 16:27:40 13420 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2014-01-02 16:27:40 13420 [Note] WSREP: Opened channel 'pxctest'
2014-01-02 16:27:40 13420 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2014-01-02 16:27:40 13420 [Note] WSREP: Waiting for SST to complete.
2014-01-02 16:27:40 13420 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b58b7387-73f4-11e3-923e-dfa43ab9965c
2014-01-02 16:27:40 13420 [Note] WSREP: STATE EXCHANGE: sent state msg: b58b7387-73f4-11e3-923e-dfa43ab9965c
2014-01-02 16:27:40 13420 [Note] WSREP: STATE EXCHANGE: got state msg: b58b7387-73f4-11e3-923e-dfa43ab9965c from 0 (node1)
2014-01-02 16:27:40 13420 [Note] WSREP: STATE EXCHANGE: got state msg: b58b7387-73f4-11e3-923e-dfa43ab9965c from 1 (node1)
2014-01-02 16:27:40 13420 [Note] WSREP: STATE EXCHANGE: got state msg: b58b7387-73f4-11e3-923e-dfa43ab9965c from 2 (node1)
2014-01-02 16:27:40 13420 [Note] WSREP: Quorum results:
 version = 3,
 component = PRIMARY,
 conf_id = 19,
 members = 2/3 (joined/total),
 act_id = 32,
 last_appl. = -1,
 protocols = 0/5/2 (gcs/repl/appl),
 group UUID = b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc
2014-01-02 16:27:40 13420 [Note] WSREP: Flow-control interval: [52, 52]
2014-01-02 16:27:40 13420 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 32)
2014-01-02 16:27:40 13420 [Note] WSREP: State transfer required:
 Group state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:32
 Local state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 16:27:40 13420 [Note] WSREP: New cluster view: global state: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:32, view# 20: Primary, number of nodes: 3, my index: 0, protocol version 2
2014-01-02 16:27:40 13420 [Warning] WSREP: Gap in state sequence. Need state transfer.
2014-01-02 16:27:40 13420 [Note] WSREP: Setting wsrep_ready to 0
2014-01-02 16:27:40 13420 [Note] WSREP: [debug]: closing client connections for PRIM
2014-01-02 16:27:42 13420 [Note] WSREP: waiting for client connections to close: 2
2014-01-02 16:27:42 13420 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '192.168.0.35' --auth 'sstuser:mysqlsst' --datadir '/home/mysql/instances/galera1-56/data/' --defaults-file '/home/mysql/instances/galera1-56/my.cnf' --parent '13420' '' '
WSREP_SST: [INFO] Streaming with tar (20140102 16:27:42.792)
WSREP_SST: [INFO] Using socat as streamer (20140102 16:27:42.795)
2014-01-02 16:27:42 13420 [Note] WSREP: Prepared SST request: xtrabackup|192.168.0.35:4444/xtrabackup_sst
2014-01-02 16:27:42 13420 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-01-02 16:27:42 13420 [Note] WSREP: REPL Protocols: 5 (3, 1)
2014-01-02 16:27:42 13420 [Note] WSREP: Assign initial position for certification: 32, protocol version: 3
2014-01-02 16:27:42 13420 [Note] WSREP: Service thread queue flushed.
2014-01-02 16:27:42 13420 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.0.35:4568
2014-01-02 16:27:42 13420 [Note] WSREP: Node 0.0 (node1) requested state transfer from '*any*'. Selected 1.0 (node1)(SYNCED) as donor.
2014-01-02 16:27:42 13420 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 32)
2014-01-02 16:27:42 13420 [Note] WSREP: Requesting state transfer: success, donor: 1
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | tar xfi - --recursive-unlink -h; RC=( ${PIPESTATUS[@]} ) (20140102 16:27:42.813)
tar: xtrabackup_galera_info: time stamp 2014-01-02 16:28:13 is 29.894023605 s in the future
tar: xtrabackup_ist: time stamp 2014-01-02 16:28:13 is 29.893577406 s in the future
WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20140102 16:27:43.130)
WSREP_SST: [INFO] Total time on joiner: 0 seconds (20140102 16:27:43.133)
WSREP_SST: [INFO] Removing the sst_in_progress file (20140102 16:27:43.135)
2014-01-02 16:27:43 13420 [Note] WSREP: SST complete, seqno: 27
2014-01-02 16:27:43 13420 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2014-01-02 16:27:43 13420 [Note] Plugin 'FEDERATED' is disabled.
2014-01-02 16:27:43 13420 [Note] InnoDB: The InnoDB memory heap is disabled
2014-01-02 16:27:43 13420 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2014-01-02 16:27:43 13420 [Note] InnoDB: Compressed tables use zlib 1.2.3
2014-01-02 16:27:43 13420 [Note] InnoDB: Using Linux native AIO
2014-01-02 16:27:43 13420 [Note] InnoDB: Not using CPU crc32 instructions
2014-01-02 16:27:43 13420 [Note] InnoDB: Initializing buffer pool, size = 200.0M
2014-01-02 16:27:43 13420 [Note] InnoDB: Completed initialization of buffer pool
2014-01-02 16:27:43 13420 [Note] InnoDB: Highest supported file format is Barracuda.
2014-01-02 16:27:43 13420 [Note] InnoDB: The log sequence numbers 1665787 and 1665787 in ibdata files do not match the log sequence number 1665951 in the ib_logfiles!
2014-01-02 16:27:43 13420 [Note] InnoDB: Database was not shutdown normally!
2014-01-02 16:27:43 13420 [Note] InnoDB: Starting crash recovery.
2014-01-02 16:27:43 13420 [Note] InnoDB: Reading tablespace information from the .ibd files...
2014-01-02 16:27:43 13420 [Note] InnoDB: Restoring possible half-written data pages
2014-01-02 16:27:43 13420 [Note] InnoDB: from the doublewrite buffer...
2014-01-02 16:27:43 13420 [Note] InnoDB: 128 rollback segment(s) are active.
2014-01-02 16:27:43 13420 [Note] InnoDB: Waiting for purge to start
2014-01-02 16:27:43 13420 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.14-rel61.1 started; log sequence number 1665951
2014-01-02 16:27:43 13420 [Note] WSREP: Initial TC log open: dummy
2014-01-02 16:27:43 13420 [Note] RSA private key file not found: /home/mysql/instances/galera1-56/data//private_key.pem. Some authentication plugins will not work.
2014-01-02 16:27:43 13420 [Note] RSA public key file not found: /home/mysql/instances/galera1-56/data//public_key.pem. Some authentication plugins will not work.
2014-01-02 16:27:43 13420 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
2014-01-02 16:27:43 13420 [Note] - '0.0.0.0' resolves to '0.0.0.0';
2014-01-02 16:27:43 13420 [Note] Server socket created on IP: '0.0.0.0'.
2014-01-02 16:27:43 13420 [Warning] 'user' entry 'root@tusacentral01' ignored in --skip-name-resolve mode.
2014-01-02 16:27:43 13420 [Warning] 'proxies_priv' entry '@ root@tusacentral01' ignored in --skip-name-resolve mode.
2014-01-02 16:27:43 13420 [Note] Event Scheduler: Loaded 0 events
2014-01-02 16:27:43 13420 [Note] WSREP: Signalling provider to continue.
2014-01-02 16:27:43 13420 [Note] WSREP: inited wsrep sidno 1
2014-01-02 16:27:43 13420 [Note] WSREP: SST received: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:27
2014-01-02 16:27:43 13420 [Note] WSREP: Receiving IST: 5 writesets, seqnos 27-32
2014-01-02 16:27:43 13420 [Note] /home/mysql/templates/PCX-56/bin/mysqld: ready for connections.
Version: '5.6.14-25.1' socket: '/home/mysql/instances/galera1-56/mysql.sock' port: 3306 Percona XtraDB Cluster (GPL) 5.6.14-25.1, Revision 566, wsrep_25.1.r4019
2014-01-02 16:27:44 13420 [Note] WSREP: 1.0 (node1): State transfer to 0.0 (node1) complete.
2014-01-02 16:27:44 13420 [Note] WSREP: Member 1 (node1) synced with group.
2014-01-02 16:27:44 13420 [Note] WSREP: IST received: b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:32
2014-01-02 16:27:44 13420 [Note] WSREP: 0.0 (node1): State transfer from 1.0 (node1) complete.
2014-01-02 16:27:44 13420 [Note] WSREP: Shifting JOINER -> JOINED (TO: 32)
2014-01-02 16:27:44 13420 [Note] WSREP: Member 0 (node1) synced with group.
2014-01-02 16:27:44 13420 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 32)
2014-01-02 16:27:44 13420 [Note] WSREP: Synchronized with group, ready for connections
2014-01-02 16:27:44 13420 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

2014-01-02 16:35:30 13420 [Note] WSREP: Provider paused at b8aedb8b-73c7-11e3-bb4f-6796f0fb29fc:34
2014-01-02 16:37:38 13420 [Note] WSREP: TO BEGIN: -1, 0 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
2014-01-02 16:39:43 13420 [Note] WSREP: Provider resumed.
2014-01-02 16:39:43 13420 [Note] WSREP: TO BEGIN: 35, 2
2014-01-02 16:39:43 13420 [Note] WSREP: TO END: 35, 2 : CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB
2014-01-02 16:39:43 13420 [Note] WSREP: TO END: 35

Related branches

Alex Yurchenko (ayurchen) wrote :

Hi Marco,

At first glance this is an expected behavior. The same will happen with any write you want to replicate.
The way it is implemented right now FTWRL pretty much makes the node read-only. Is it a problem?

Alex,
I don't think it is expected behavior

Please read carefully the tests we have run and you will see that the expected behavior is the second test.

To summarize the tests:
Both against a cluster of 3 nodes running fine.

Test 1
Two connection against the same node.
Connection A run FLUSH TABLES WITH READ LOCK; and DDL
Connection A remain in a BLOCK state and nothing can go ahead.

Connection B run UNLOCK tables; nothing happened;
Connection B kill the Connection A;
Node remain unusable and do not apply changes from the other node.
The only way to UNLOCK the node is to restart it.

Expected behavior happen in
TEST 2.
There we have 2 connection against the same node.
Connection A run FLUSH TABLES WITH READ LOCK;
Connection B run the DDL, and remain pending;
Connection A UNLOCK the tables;
Connection B execute the DDL.

------------------------------------------------
Marco Tusa,
Mobile: +1-613-282-7337
do something for your planet!
http://www.tusacentral.net/
------------------------------------------------
Save a tree... please don't print this e-mail unless you really need to

On 04/01/2014 12:58, Alex Yurchenko wrote:
> Hi Marco,
>
> At first glance this is an expected behavior. The same will happen with any write you want to replicate.
> The way it is implemented right now FTWRL pretty much makes the node read-only. Is it a problem?
>

Alex Yurchenko (ayurchen) wrote :

Marco, sorry I didn't read the description carefully. And ability to deadlock a node is a bug anyways. So, what we have now:
1) Attempt to execute DML under FTWRL is rejected with "incompatible lock" error.
2) Killing unblocked FTWRL session results in lock removal.
Would it then be a sufficient fix to simply reject DDLs the way DMLs are rejected?

Marco Tusa (marcotusa) wrote :

Yes I suppose it is.

------------------------------------------------
Marco Tusa,
Mobile: +1-613-282-7337
do something for your planet!
http://www.tusacentral.net/
------------------------------------------------
Save a tree... please don't print this e-mail unless you really need to

On 05/01/2014 06:28, Alex Yurchenko wrote:
> Marco, sorry I didn't read the description carefully. And ability to deadlock a node is a bug anyways. So, what we have now:
> 1) Attempt to execute DML under FTWRL is rejected with "incompatible lock" error.
> 2) Killing unblocked FTWRL session results in lock removal.
> Would it then be a sufficient fix to simply reject DDLs the way DMLs are rejected?
>

Marco Tusa (marcotusa) wrote :

Alex, Yes I assume it will be enough for now to have an error rejecting it, at least we can code something around it.

Rather than galera this should be fixed in mysqld for following
reasons:

a) Only the connection/thread which issued FTWRL should be prevented from
issuing other DDL/DML and not other threads. I am not sure if
Galera can make that distinction.

b) To conform to vanilla mysql behavior:

MySQL [test]> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock

So, I have fixed this in wsrep_to_isolation_begin.

After the fix:

MySQL [test]> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.00 sec)

MySQL [test]> CREATE TABLE IF NOT EXISTS test.mytest (a INT) ENGINE=INNODB;
ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock
MySQL [test]> unlock tables;
Query OK, 0 rows affected (0.00 sec)

Alex Yurchenko (ayurchen) wrote :

Raghu, you're absolutely right there about MySQL

no longer affects: galera
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers