Comment 13 for bug 1243156

Przemek (pmalkowski) wrote :

First, let me apologize: I mixed up INSERT IGNORE with INSERT DELAYED, which was the case in lp:1236378, so please disregard my last comment.

From the wsrep status on this node it is not clear why it is stuck. Can you enable wsrep_log_conflicts and wsrep_debug to see if anything interesting gets logged?
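
For example, something along these lines should turn both on. This is only a sketch: wsrep_debug is assumed to be a plain ON/OFF switch here, and whether wsrep_log_conflicts can be changed at runtime depends on the server version, so it may have to go into my.cnf instead.

```sql
-- Turn on extra wsrep logging; the messages go to the MySQL error log.
SET GLOBAL wsrep_debug = ON;

-- Assumption: this variable is dynamic in your build. If the server
-- rejects the statement as read-only, set wsrep_log_conflicts=ON under
-- [mysqld] in my.cnf and restart the node instead.
SET GLOBAL wsrep_log_conflicts = ON;

-- Confirm the current values.
SHOW GLOBAL VARIABLES LIKE 'wsrep_debug';
SHOW GLOBAL VARIABLES LIKE 'wsrep_log_conflicts';
```

Both settings make the error log noticeably more verbose, so it is worth switching them off again once the problem has been captured.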

Also, the longest-running transaction is interesting:
---TRANSACTION 9DC96209B, ACTIVE 148 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 937 lock struct(s), heap size 129464, 15880 row lock(s), undo log entries 4078
MySQL thread id 2175, OS thread handle 0x7f50f510d700, query id 4179022 192.168.0.215 fbhub update
INSERT IGNORE INTO arts_deploy VALUES('119704','312','2013-01-01 00:00:00','0.99847')
------- TRX HAS BEEN WAITING 12 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12856 page no 2176 n bits 520 index `PRIMARY` of table `lib_area_100`.`arts_deploy` trx id 9DC96209B lock mode S locks rec but not gap waiting
------------------
TABLE LOCK table `lib_area_100`.`tasty_arts` trx id 9DC96209B lock mode IX
RECORD LOCKS space id 474 page no 1523461 n bits 384 index `uaf` of table `lib_area_100`.`tasty_arts` trx id 9DC96209B lock_mode X locks rec but not gap
TABLE LOCK table `lib_area_100`.`spam_hits` trx id 9DC96209B lock mode IX
TABLE LOCK table `lib_area_100`.`spam_packs` trx id 9DC96209B lock mode IS
RECORD LOCKS space id 52 page no 4234 n bits 208 index `PRIMARY` of table `lib_area_100`.`spam_packs` trx id 9DC96209B lock mode S locks rec but not gap
TABLE LOCK table `lib_area_100`.`users` trx id 9DC96209B lock mode IS
RECORD LOCKS space id 12854 page no 5861 n bits 112 index `PRIMARY` of table `lib_area_100`.`users` trx id 9DC96209B lock mode S locks rec but not gap
TABLE LOCK table `lib_area_100`.`spam_hits_pj` trx id 9DC96209B lock mode IX
TABLE LOCK table `lib_area_100`.`spam_tickets` trx id 9DC96209B lock mode IX
RECORD LOCKS space id 10 page no 256253 n bits 296 index `PRIMARY` of table `lib_area_100`.`spam_tickets` trx id 9DC96209B lock_mode X locks rec but not gap

Can you outline what operations were performed in this transaction prior to this stuck INSERT IGNORE? And can you provide the definitions of all the tables involved?
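
The table definitions can be collected as sketched below. The table names are taken from the lock output above; any other tables the transaction touched would need to be added.

```sql
-- Dump the full definition (columns, indexes, foreign keys, engine)
-- of every table that appears in the transaction's lock list.
SHOW CREATE TABLE lib_area_100.arts_deploy;
SHOW CREATE TABLE lib_area_100.tasty_arts;
SHOW CREATE TABLE lib_area_100.spam_hits;
SHOW CREATE TABLE lib_area_100.spam_packs;
SHOW CREATE TABLE lib_area_100.users;
SHOW CREATE TABLE lib_area_100.spam_hits_pj;
SHOW CREATE TABLE lib_area_100.spam_tickets;
```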

Also, I would like to see the output of "show status like 'ws%';" from at least one other node while the problem is happening.
So you are saying the whole cluster gets stuck, but restarting just one node restores it? How do you know which node to restart then?
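
To capture the status requested above, the following could be run on each node while the cluster is stuck. The second statement is just a narrower view of a few counters I would look at first; the choice of counters is my assumption, not a complete list.

```sql
-- Full wsrep status, as requested.
SHOW GLOBAL STATUS LIKE 'ws%';

-- Optional: only the node state, cluster state, queue lengths and
-- flow control pause, which are often the first things to compare
-- between nodes.
SHOW GLOBAL STATUS WHERE Variable_name IN (
  'wsrep_local_state_comment',
  'wsrep_cluster_status',
  'wsrep_local_recv_queue',
  'wsrep_local_send_queue',
  'wsrep_flow_control_paused'
);
```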