"Certification failed for TO isolated action" when frequent truncate table

Bug #1737731 reported by Przemek
This bug affects 1 person
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status tracked in 5.6

Series  Status         Importance  Assigned to  Milestone
5.6     Fix Committed  Undecided   Unassigned   (none)
5.7     Fix Committed  Undecided   Unassigned   (none)

Bug Description

On a workload where one PXC node receives DML writes, a second node receives frequent TRUNCATE TABLE commands (on unrelated tables), and a third node restarts repeatedly, the whole cluster fails with a "Certification failed for TO isolated action" error.

Reproduced on PXC 5.6.37 with a fairly basic configuration:
[mysqld]
binlog_format = ROW
innodb_buffer_pool_size = 100M
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
datadir = /var/lib/mysql
innodb_autoinc_lock_mode = 2
wsrep_cluster_address = gcomm://172.28.128.3,172.28.128.4,172.28.128.5
wsrep_provider = /usr/lib64/galera3/libgalera_smm.so
wsrep_slave_threads = 1
wsrep_cluster_name = Cluster
wsrep_node_name = Node1
wsrep_node_address = 172.28.128.3
wsrep_sst_auth = "root:"

* How to reproduce *

Prepare simple tables:

Node1 > show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) DEFAULT NULL,
  `a` char(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Node1 > show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) DEFAULT NULL,
  `a` char(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
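For convenience, the two tables can be recreated from a small shell helper. This is only a sketch: the column definitions are copied from the SHOW CREATE TABLE output above, while the function name print_schema and the IF NOT EXISTS clause are my own additions, and the `test` database is assumed to exist already.

```shell
#!/bin/sh
# Print DDL for the two test tables. Column definitions are copied from the
# SHOW CREATE TABLE output in this report; IF NOT EXISTS is an addition so
# the script can be re-run safely.
print_schema() {
  for t in t1 t2; do
    cat <<EOF
CREATE TABLE IF NOT EXISTS \`$t\` (
  \`id\` int(11) DEFAULT NULL,
  \`a\` char(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
EOF
  done
}

# Usage (assumes a reachable node and an existing "test" database):
#   print_schema | mysql test
```

Running it against a single node should be enough, since DDL is replicated to the other nodes.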

Run the following in parallel, each on its respective node:

-- node1:

for i in {1..10000}; do mysql test -e "TRUNCATE table t2"; sleep 0.1; done

-- node2:

./mysql_random_data_load_linux_amd64 -h172.28.128.4 -uroot --max-threads=10 --bulk-size=5 --max-retries=100 test t1 1000000

-- node3:

for i in {1..40}; do systemctl restart mysql; sleep 2; done

The resulting logs are in the attachment. Basically, nodes 1 and 2 fail with:

2017-12-12 11:36:15 11545 [Note] WSREP: Assign initial position for certification: 767581, protocol version: 3
2017-12-12 11:36:15 11545 [Note] WSREP: Service thread queue flushed.
2017-12-12 11:36:15 11545 [ERROR] WSREP: Certification failed for TO isolated action: source: f55d4498-df2f-11e7-83f2-8f43083d5103 version: 3 local: 0 state: CERTIFYING flags: 65 conn_id: 1058 trx_id: -1 seqnos (l: 170264, g: 767592, s: 767571, d: -1, ts: 336496926390836)
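When going through several error logs, it may help to pull the sequence numbers out of each failure line so they can be compared with the "Assign initial position for certification" value logged just before it. The helper below is a rough sketch; the function name cert_failure_seqnos is hypothetical, and it relies only on the line format shown above.

```shell
#!/bin/sh
# Read a MySQL error log on stdin and, for every
# "Certification failed for TO isolated action" line, print the
# l/g/s/d values found inside "seqnos (...)". Other lines are ignored.
cert_failure_seqnos() {
  sed -nE 's/.*Certification failed for TO isolated action:.*seqnos \(l: (-?[0-9]+), g: (-?[0-9]+), s: (-?[0-9]+), d: (-?[0-9]+),.*/l=\1 g=\2 s=\3 d=\4/p'
}

# Usage (the log path is an assumption; adjust to the node's log_error):
#   cert_failure_seqnos < /var/log/mysqld.log
```

For the 5.6.37 line above this prints `l=170264 g=767592 s=767571 d=-1`.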

Tags: i213972
Przemek (pmalkowski) wrote :
  • Logs (34.7 KiB, application/x-tar)
Przemek (pmalkowski) wrote :

Also reproduced on PXC 5.7.19-17-57-log:

2017-12-12T12:39:47.150706Z 5 [Note] WSREP: New cluster view: global state: ea51527b-df36-11e7-a529-3e7b80537600:92030, view# 8: Primary, number of nodes: 2, my index: 1, protocol version 3
2017-12-12T12:39:47.150714Z 5 [Note] WSREP: Setting wsrep_ready to true
2017-12-12T12:39:47.159516Z 5 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-12-12T12:39:47.159996Z 5 [Note] WSREP: Assign initial position for certification: 92030, protocol version: 3
2017-12-12T12:39:47.161030Z 0 [Note] WSREP: Service thread queue flushed.
2017-12-12T12:39:47.161814Z 5 [ERROR] WSREP: Certification failed for TO isolated action: source: 1b19a14d-d689-11e7-8bab-9ebe5b9af4cc version: 3 local: 0 state: CERTIFYING flags: 65 conn_id: 416 trx_id: -1 seqnos (l: 93906, g: 92031, s: 92020, d: -1, ts: 1151558439465456)
2017-12-12T12:39:47.161848Z 5 [Note] WSREP: Closing send monitor...
2017-12-12T12:39:47.161854Z 5 [Note] WSREP: Closed send monitor.
2017-12-12T12:39:47.161861Z 5 [Note] WSREP: gcomm: terminating thread
2017-12-12T12:39:47.161870Z 5 [Note] WSREP: gcomm: joining thread
2017-12-12T12:39:47.162019Z 5 [Note] WSREP: gcomm: closing backend

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXC-906
