node does not abort/exit when trx replaying fails
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Galera | Fix Released | Critical | Teemu Ollakka | |
Bug Description
For example, the following was caught in the error log:
110313 19:28:12 [Note] WSREP: Member 2 (ip-10-226-70-143) synced with group.
110313 19:34:53 [ERROR] WSREP: invalid state APPLYING (FATAL)
at galera/
110313 19:34:53 [Warning] WSREP: cancel commit bad exit: 6 9470094
110313 19:44:32 [ERROR] Slave SQL: Could not execute Update_rows event on table test.comm04; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_
110313 19:44:32 [Warning] WSREP: RBR event 30 apply warning: 149, 6393734
110313 19:44:32 [Warning] WSREP trx_replay failed for: 7
110313 20:28:03 [Warning] WSREP attempting net_end_statement while replaying
110313 20:28:03 InnoDB: Error: MySQL is freeing a thd
InnoDB: though trx->n_
InnoDB: and trx->mysql_
TRANSACTION 9948B6, not started, process no 31374, OS thread id 46920119482112
mysql tables in use 1, locked 1
MySQL thread id 6858, query id 6314006 ip-10-227-
The bug is twofold:
1) the node is in an inconsistent state but does not abort
2) the node hangs yet still appears live to the group, which stalls the whole cluster
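Until a node aborts on its own in this situation, an operator could watch the error log for the markers quoted above. The following is only a rough sketch of that idea, not the fix that was released: the function names `find_fatal_lines` and `node_should_have_aborted` are hypothetical, and treating these two log messages as "the node should have aborted" signals is an assumption drawn from this one log excerpt.

```python
import re

# Markers copied from the log excerpt in this report. Treating them as
# "node is inconsistent and should have aborted" is an assumption of
# this sketch, not documented Galera behaviour.
FATAL_PATTERNS = [
    re.compile(r"\[ERROR\] WSREP: .*\(FATAL\)"),
    re.compile(r"WSREP trx_replay failed"),
]


def find_fatal_lines(log_lines):
    """Return the log lines that match any of the fatal markers."""
    return [
        line
        for line in log_lines
        if any(pattern.search(line) for pattern in FATAL_PATTERNS)
    ]


def node_should_have_aborted(log_lines):
    """True if the log contains at least one fatal marker."""
    return bool(find_fatal_lines(log_lines))
```

A monitoring job could feed the tail of the MySQL error log to `node_should_have_aborted` and take the node out of the cluster by hand when it returns true.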
Changed in galera:
milestone: none → 0.8.0
assignee: nobody → Teemu Ollakka (teemu-ollakka)
importance: Undecided → Critical
status: New → Confirmed
Changed in galera:
status: Confirmed → Fix Released