Cluster stalls while resolving statements conflict

Bug #1243156 reported by Dmitry Gribov
This bug affects 2 people
Affects / Status / Importance / Assigned to / Milestone
MySQL patches by Codership (status tracked in 5.6)
  5.5: New / Undecided / Unassigned
  5.6: New / Undecided / Unassigned
Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
  Confirmed / High / Unassigned

Bug Description

In some cases the cluster stalls when one node starts resolving two statements with the same ID and hangs on this forever. Restarting that node restores the cluster.

Error log fragment attached.

Revision history for this message
Dmitry Gribov (grib-d) wrote :
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Dmitry,

Can you provide your my.cnf? Specifically, do you have lock_wait_timeout set?

Revision history for this message
Dmitry Gribov (grib-d) wrote :

The file is in the attachment.

lock_wait_timeout = 120

Revision history for this message
Dmitry Gribov (grib-d) wrote :

Sorry, I meant innodb_lock_wait_timeout = 120. lock_wait_timeout is not set explicitly, and it contains something strange:

mysql> show variables like "%lock_w%";
+--------------------------+----------+
| Variable_name | Value |
+--------------------------+----------+
| innodb_lock_wait_timeout | 120 |
| lock_wait_timeout | 31536000 |
+--------------------------+----------+

Revision history for this message
Przemek (pmalkowski) wrote :

Dmitry, lock_wait_timeout = 31536000 is just the default value since MySQL 5.5+; nothing strange here (http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_lock_wait_timeout).

Is this issue maybe related to another bug report of yours: https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1197771 ?

What do you mean by "resolving two statements with the same ID"? Is this about updating rows with the same PK/UK id?

Does this problem still happen with the latest PXC 5.5.37 or 5.6.19? If so, can you also attach the full processlist and InnoDB engine status outputs from the time the node or cluster is stalled? Also "show status like 'ws%';" from this node and at least one other node.
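
(For reference, a minimal sketch of the standard statements that collect those diagnostics; run them on the stalled node and on at least one other node while the stall is ongoing:)

mysql> SHOW FULL PROCESSLIST;
mysql> SHOW ENGINE INNODB STATUS\G
mysql> SHOW STATUS LIKE 'ws%';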

Changed in percona-xtradb-cluster:
status: New → Incomplete
Revision history for this message
Dmitry Gribov (grib-d) wrote :

I doubt this is related to 1197771 in general; it all looks quite different, but at the level of the code I can't say.
"Resolving two statements with the same ID" - I mean "updating rows with the same PK/UK id", but this is only a hypothesis, and it is definitely not the only requirement; there is something more to it.
With 5.5.37 we still have the issue. I was thinking of trying the latest dev release (I can't see when it is going to be released).

Collecting the processlist seemed impossible last time we tried it once "WSREP: BF lock wait long" appeared; we will try once again.
And, by the way, now we have only ONE "WSREP: BF lock wait long" in the log; when the bug was first encountered they came in series (as you can see in the attachment).

Revision history for this message
Dmitry Gribov (grib-d) wrote :

But this may be related to #1280896. Is there a test version with 1280896 fixed? I always have a hard time trying to get a test release of PXC... I'd expect something like 5.5.39-25.11 there:
http://www.percona.com/downloads/TESTING/Percona-XtraDB-Cluster/
but no luck. Any ideas? We would test this issue against 5.5.39-25.11, for example.

Revision history for this message
Dmitry Gribov (grib-d) wrote :

Nope. We have tested with this:
Percona XtraDB Cluster 5.5.39-25.11 + Galera 2.11.2675

Variable_name Value
wsrep_local_state_uuid 2db76c9f-1717-11e3-b9d2-2f0c3a53a9e5
wsrep_protocol_version 4
wsrep_last_committed 12010943836
wsrep_replicated 49401
wsrep_replicated_bytes 1095127796
wsrep_received 1250506
wsrep_received_bytes 8274408205
wsrep_local_commits 48527
wsrep_local_cert_failures 869
wsrep_local_replays 0
wsrep_local_send_queue 0
wsrep_local_send_queue_avg 0
wsrep_local_recv_queue 0
wsrep_local_recv_queue_avg 0.318141
wsrep_flow_control_paused 0
wsrep_flow_control_sent 0
wsrep_flow_control_recv 0
wsrep_cert_deps_distance 57.492063
wsrep_apply_oooe 0.218834
wsrep_apply_oool 0
wsrep_apply_window 3.387444
wsrep_commit_oooe 0
wsrep_commit_oool 0
wsrep_commit_window 2.859617
wsrep_local_state 4
wsrep_local_state_comment Synced
wsrep_cert_index_size 66
wsrep_causal_reads 0
wsrep_incoming_addresses 192.168.0.212:8661,192.168.1.72:8661,192.168.0.214:8661,192.168.1.69:8661
wsrep_cluster_conf_id 30
wsrep_cluster_size 4
wsrep_cluster_state_uuid 2db76c9f-1717-11e3-b9d2-2f0c3a53a9e5
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_local_bf_aborts 138
wsrep_local_index 0
wsrep_provider_name Galera
wsrep_provider_vendor Codership Oy <email address hidden>
wsrep_provider_version 2.11(r318911d)
wsrep_ready ON
wsrep_thread_count 49

It still hangs with "WSREP: BF lock wait long" (see attachment).

Revision history for this message
Dmitry Gribov (grib-d) wrote :

+INNODB MONITOR OUTPUT

Revision history for this message
Dmitry Gribov (grib-d) wrote :
Revision history for this message
Przemek (pmalkowski) wrote :

Thank you Dmitry for the details.
I can see a different problem here. You are using INSERT IGNORE statements, like:
INSERT IGNORE INTO arts_deploy ...
INSERT IGNORE INTO users ...
Can you confirm whether any of these tables is MyISAM? If yes, then there is bug lp:1236378, which shows a similar problem: basically, after any INSERT IGNORE the node becomes completely blocked.
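
(A quick way to check is a generic information_schema query; just an illustrative sketch, nothing PXC-specific:)

mysql> SELECT table_schema, table_name, engine
         FROM information_schema.tables
        WHERE engine = 'MyISAM'
          AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');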

Revision history for this message
Dmitry Gribov (grib-d) wrote :

Nope, we have no MyISAM tables at all.
And I'm getting a STRONG feeling that the stalling queries always (or at least often) look like
insert ... on duplicate key update
where there IS a duplicate and the update path is taken.
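
Roughly this shape (an illustrative sketch only, with made-up values against the arts_deploy table; not an exact query from our logs):

mysql> INSERT INTO arts_deploy (art, face, added, is_hot)
       VALUES (119704, 312, '2013-01-01 00:00:00', 0.99)
       ON DUPLICATE KEY UPDATE added = VALUES(added), is_hot = VALUES(is_hot);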

Revision history for this message
Przemek (pmalkowski) wrote :

First, let me apologize: I mixed up INSERT IGNORE with INSERT DELAYED, which was the case in lp:1236378, so please discard my last comment.

From the wsrep status on this node it is not clear why it is stuck. Can you enable wsrep_log_conflicts and wsrep_debug to see if anything interesting gets logged?
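
(If these variables are dynamic in your build, something like the sketch below should work; otherwise set the equivalent lines in my.cnf and restart the node:)

mysql> SET GLOBAL wsrep_log_conflicts = ON;
mysql> SET GLOBAL wsrep_debug = ON;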

Also the longest transaction is interesting:
---TRANSACTION 9DC96209B, ACTIVE 148 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 937 lock struct(s), heap size 129464, 15880 row lock(s), undo log entries 4078
MySQL thread id 2175, OS thread handle 0x7f50f510d700, query id 4179022 192.168.0.215 fbhub update
INSERT IGNORE INTO arts_deploy VALUES('119704','312','2013-01-01 00:00:00','0.99847')
------- TRX HAS BEEN WAITING 12 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12856 page no 2176 n bits 520 index `PRIMARY` of table `lib_area_100`.`arts_deploy` trx id 9DC96209B lock mode S locks rec but not gap waiting
------------------
TABLE LOCK table `lib_area_100`.`tasty_arts` trx id 9DC96209B lock mode IX
RECORD LOCKS space id 474 page no 1523461 n bits 384 index `uaf` of table `lib_area_100`.`tasty_arts` trx id 9DC96209B lock_mode X locks rec but not gap
TABLE LOCK table `lib_area_100`.`spam_hits` trx id 9DC96209B lock mode IX
TABLE LOCK table `lib_area_100`.`spam_packs` trx id 9DC96209B lock mode IS
RECORD LOCKS space id 52 page no 4234 n bits 208 index `PRIMARY` of table `lib_area_100`.`spam_packs` trx id 9DC96209B lock mode S locks rec but not gap
TABLE LOCK table `lib_area_100`.`users` trx id 9DC96209B lock mode IS
RECORD LOCKS space id 12854 page no 5861 n bits 112 index `PRIMARY` of table `lib_area_100`.`users` trx id 9DC96209B lock mode S locks rec but not gap
TABLE LOCK table `lib_area_100`.`spam_hits_pj` trx id 9DC96209B lock mode IX
TABLE LOCK table `lib_area_100`.`spam_tickets` trx id 9DC96209B lock mode IX
RECORD LOCKS space id 10 page no 256253 n bits 296 index `PRIMARY` of table `lib_area_100`.`spam_tickets` trx id 9DC96209B lock_mode X locks rec but not gap

Can you outline what operations were performed in this transaction prior to this stuck INSERT IGNORE? And can you give the definitions of all the involved tables?

Also, I would like to see "show status like 'ws%';" from at least one other node while the problem is happening.
So you are saying the whole cluster gets stuck, but restarting just one node restores it? How do you know which node to restart, then?

Revision history for this message
Dmitry Gribov (grib-d) wrote :

CREATE TABLE `arts_deploy` (
  `art` int(10) unsigned NOT NULL,
  `face` smallint(5) unsigned NOT NULL,
  `added` datetime NOT NULL,
  `is_hot` decimal(6,2) unsigned NOT NULL DEFAULT '0.00',
  PRIMARY KEY (`art`,`face`),
  KEY `fi` (`face`,`is_hot`),
  KEY `fa` (`face`,`added`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tasty_arts` (
  `user` int(10) unsigned NOT NULL,
  `art` int(10) unsigned NOT NULL,
  `face` smallint(5) unsigned NOT NULL,
  `updated` datetime NOT NULL,
  `rate` float unsigned NOT NULL DEFAULT '0',
  UNIQUE KEY `uaf` (`user`,`art`,`face`),
  KEY `ufra` (`user`,`face`,`rate`,`art`),
  KEY `userupd` (`user`,`face`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `spam_hits` (
  `user` int(10) unsigned NOT NULL,
  `spam` int(10) unsigned NOT NULL,
  `time` datetime NOT NULL,
  `type` tinyint(3) unsigned NOT NULL,
  PRIMARY KEY (`time`,`spam`,`user`,`type`),
  KEY `spam_type` (`spam`,`type`,`user`),
  KEY `user_type` (`user`,`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

 CREATE TABLE `spam_packs` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `send_start` datetime DEFAULT NULL,
  `send_end` datetime DEFAULT NULL,
  `keyparam` varchar(50) DEFAULT NULL,
  `s_desc` varchar(1000) DEFAULT NULL,
  `type` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index2` (`keyparam`,`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `users` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `login` varchar(100) DEFAULT NULL,
  `pwd` varchar(100) DEFAULT NULL,
  `s_mail` varchar(50) DEFAULT NULL,
  `s_www` varchar(255) DEFAULT NULL,
  `s_inn` varchar(50) DEFAULT NULL,
  `s_descr` text,
  `s_phone` varchar(100) DEFAULT NULL,
  `offert_accepted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `s_full_name` varchar(255) DEFAULT NULL,
  `s_first_name` varchar(100) DEFAULT NULL,
  `s_middle_name` varchar(100) DEFAULT NULL,
  `s_last_name` varchar(100) DEFAULT NULL,
  `s_city` varchar(100) DEFAULT NULL,
  `s_address` varchar(255) DEFAULT NULL,
  `last_used` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `last_host_id` smallint(5) unsigned DEFAULT NULL,
  `mail_confirmed` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `msisdn_confirmed` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `msisdn_confirm_code` varchar(4) DEFAULT NULL,
  `msisdn_req_time` datetime DEFAULT NULL,
  `java_phone_model` varchar(30) DEFAULT NULL,
  `java_font` varchar(30) DEFAULT NULL,
  `java_safe_mode` tinyint(4) NOT NULL DEFAULT '0',
  `user_pic` varchar(16) DEFAULT NULL,
  `denied_libs` varchar(255) DEFAULT NULL,
  `recenser_type` int(11) DEFAULT NULL,
  `partner_id` int(11) DEFAULT NULL,
  `creat_date` datetime DEFAULT NULL,
  `partner` int(10) unsigned DEFAULT NULL,
  `partner_valid_till` date DEFAULT NULL,
  `partner_pin` int(10) unsigned DEFAULT NULL,
  `account` decimal(9,2) NOT NULL DEFAULT '0.00',
  `abonement_start` datetime DEFAULT NULL,
  `abonement_expires` date NOT NULL DEFAULT '2006-01-01',
  `abonement_period` smallint(6) DEFAULT NULL,
  `abonement_delay` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `abonement_max_price` decimal(6,2) DEFAULT NULL,
  `abonement_downloads` tinyint(3) unsigned NOT NULL DEFAULT '0',
 ...


Przemek (pmalkowski)
Changed in percona-xtradb-cluster:
status: Incomplete → New
Revision history for this message
Dmitry Gribov (grib-d) wrote :

Here is the detailed data. Node dbp07 was affected by "BF lock wait long"; logs from its neighbors are provided (starting 2 minutes before the incident).

Revision history for this message
Dmitry Gribov (grib-d) wrote :

One more clean example: the dbh06 hang.

Revision history for this message
Dmitry Gribov (grib-d) wrote :

We noted that mass "delete" statements increase the "BF lock wait long" rate dramatically.

Revision history for this message
Dmitry Gribov (grib-d) wrote :

We are testing the current 5.6 build; it is finally working without core dumps, and it seems not to have the issue. We have run the "killing" load against it, and not a single "WSREP: BF lock wait long" so far.

Revision history for this message
Dmitry Gribov (grib-d) wrote :

Funny, we have no "WSREP: BF lock wait long" in the log and locks are seldom. But we have already had a cluster stall twice with quite similar symptoms: all nodes have many "wsrep in pre-commit stage" threads, then the whole cluster stalls. If you restart the "bad" node, the cluster unlocks. Only two differences: it happens less often, and there is no way to identify the "bad" node. It looks like the "BF lock wait long" logging was disabled or removed while the problem is still there.
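
(To count such threads we basically look at the processlist; an illustrative query, assuming information_schema.PROCESSLIST is available on your build:)

mysql> SELECT COUNT(*) FROM information_schema.PROCESSLIST
        WHERE STATE LIKE 'wsrep in pre-commit%';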

Revision history for this message
Dmitry Gribov (grib-d) wrote :

Caught "WSREP: BF lock wait long" in exactly the old form today, so it is seldom but it is still here.

Changed in percona-xtradb-cluster:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Rodrigo (rodri-bernardo) wrote :

We are experiencing this issue with 5.6.24-72.2-56-log. Is it fixed in later versions?

Revision history for this message
Dmitry Gribov (grib-d) wrote :

It looks like the issue was fixed; we haven't seen it for quite a long time.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-978
