Error "no such a transition EXECUTING -> COMMITTED" on the master node, table with foreign key
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MySQL patches by Codership | Fix Released | Undecided | Seppo Jaakola |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | |
5.5 | Fix Released | Undecided | Unassigned |
5.6 | Fix Released | Undecided | Unassigned |
Bug Description
We have a 3-node setup with huge tables under heavy load. We experienced frequent shutdowns with the error "no such a transition EXECUTING -> COMMITTED". After some experiments we found quite a simple setup that reproduces it: 3 nodes, where only the first node is used for writing and the two other nodes are just "hot-swap" standbys to handle requests if the first node fails. The error happened more and more frequently until the cluster could not survive 5 minutes without going down. Writing to the "user_best_times" table causes the master node (first node) to fail. Repeating the same write on the second node brings the second node down as well. The third node then hangs in the "Unknown command" state (desync or whatever) or (sometimes) survives.
We intensively write to this table:
CREATE TABLE `user_best_times` (
`user` int(10) unsigned NOT NULL,
`week_day` tinyint(3) unsigned NOT NULL,
`hour` tinyint(3) unsigned NOT NULL,
`hits` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`user`
KEY `fk_user_
CONSTRAINT `fk_user_
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Writes are like this:
INSERT INTO user_best_times (hits, hour, week_day, user)
The table is quite large (87,000,000+ rows), and under heavy concurrency (4-50 requests) Percona goes down within a minute. The linked "users" table is updated less frequently; it has 1,868,675 rows.
Finally it looks like I've found a workaround: I've removed FOREIGN KEY fk_user_
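A minimal sketch of how the workaround statement could be derived, assuming the constraint name is read from SHOW CREATE TABLE output. The sample DDL, the constraint name `fk_example` and the referenced column `users.id` are hypothetical placeholders, because the real constraint name is cut off in this report. `drop_fk_statements` is an illustrative helper, not part of any MySQL or Percona tooling.

```python
import re

# Matches DDL fragments such as: CONSTRAINT `name` FOREIGN KEY (...)
_FK_PATTERN = re.compile(r"CONSTRAINT\s+`([^`]+)`\s+FOREIGN\s+KEY", re.IGNORECASE)


def drop_fk_statements(table, create_table_sql):
    """Return one ALTER TABLE ... DROP FOREIGN KEY statement per constraint found."""
    names = _FK_PATTERN.findall(create_table_sql)
    return [
        "ALTER TABLE `{}` DROP FOREIGN KEY `{}`;".format(table, name)
        for name in names
    ]


# Hypothetical DDL: the constraint name and referenced column are placeholders,
# not the ones from the affected table.
SAMPLE_DDL = """
CREATE TABLE `user_best_times` (
  `user` int(10) unsigned NOT NULL,
  CONSTRAINT `fk_example` FOREIGN KEY (`user`) REFERENCES `users` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
"""

if __name__ == "__main__":
    for stmt in drop_fk_statements("user_best_times", SAMPLE_DDL):
        print(stmt)
```

The helper only builds the statements; reviewing them before running them against a node is left to the operator, since dropping the constraint also removes the referential integrity check.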
my.cnf is in the attachment.
I can send logs (wsrep_debug=1) covering the time of the crash via e-mail (I don't want to publish them for everyone).
We use Percona cluster 5.5.29, Galera 23.2.2rc2(r136):
mysql> \s
--------------
/usr/local/
Connection id: 400
Current database: schema_lock
Current user: fbhub@192.
SSL: Not in use
Current pager: less
Using outfile: ''
Using delimiter: ;
Server version: 5.5.29 Source distribution, wsrep_23.7.1.rXXXX
Protocol version: 10
Connection: dbp01 via TCP/IP
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
TCP port: 8661
Uptime: 40 min 22 sec
Threads: 415 Questions: 257794 Slow queries: 2 Opens: 991 Flush tables: 3 Open tables: 182 Queries per second avg: 106.438
--------------
mysql> show status like 'wsrep%';
+------
| Variable_name | Value |
+------
| wsrep_local_
| wsrep_protocol_
| wsrep_last_
| wsrep_replicated | 66873 |
| wsrep_replicate
| wsrep_received | 7362 |
| wsrep_received_
| wsrep_local_commits | 66208 |
| wsrep_local_
| wsrep_local_
| wsrep_local_replays | 0 |
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_local_
| wsrep_flow_
| wsrep_flow_
| wsrep_flow_
| wsrep_cert_
| wsrep_apply_oooe | 0.007636 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 1.007636 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.000000 |
| wsrep_local_state | 4 |
| wsrep_local_
| wsrep_cert_
| wsrep_causal_reads | 0 |
| wsrep_incoming_
| wsrep_cluster_
| wsrep_cluster_size | 3 |
| wsrep_cluster_
| wsrep_cluster_
| wsrep_connected | ON |
| wsrep_local_index | 0 |
| wsrep_provider_name | Galera |
| wsrep_provider_
| wsrep_provider_
| wsrep_ready | ON |
+------
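A rough sketch of how status output like the table above could be checked automatically on each node. It assumes the usual mysql client table layout (`| name | value |`) and treats wsrep_local_state 4 as the synced state; the helper names `parse_status_table` and `node_problems` and the `expected_size` parameter are illustrative, not part of Galera or Percona tooling. The sample rows reuse only values that are visible in this report.

```python
def parse_status_table(text):
    """Parse mysql client table output ('| name | value |' rows) into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders such as +------+ and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Variable_name":
            continue  # skip the header row and truncated or malformed rows
        status[cells[0]] = cells[1]
    return status


def node_problems(status, expected_size=3):
    """Return a list of human-readable problems; an empty list means no finding."""
    problems = []
    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready is not ON")
    if status.get("wsrep_connected") != "ON":
        problems.append("wsrep_connected is not ON")
    if status.get("wsrep_local_state") != "4":
        problems.append("wsrep_local_state is not 4 (synced)")
    if status.get("wsrep_cluster_size") != str(expected_size):
        problems.append("wsrep_cluster_size differs from expected size")
    return problems


# Rows copied from the status output in this report (complete rows only).
SAMPLE = """
| Variable_name | Value |
| wsrep_local_state | 4 |
| wsrep_cluster_size | 3 |
| wsrep_connected | ON |
| wsrep_ready | ON |
"""

if __name__ == "__main__":
    print(node_problems(parse_status_table(SAMPLE)))
```

Truncated rows are skipped rather than guessed at, so a missing variable shows up as a reported problem instead of a silent pass.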
Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
tags: added: foreign-keys removed: foreign-key
tags: added: i34695
Changed in percona-xtradb-cluster:
status: Confirmed → In Progress
Changed in percona-xtradb-cluster:
milestone: 5.5.33-23.7.6 → future-5.5
@Dmitry,
Can you provide your error log (containing the backtrace etc.) so we can
check whether this is a known issue, lp:1110561?