Bug #1516933 “innodb_print_all_deadlocks doesn't work” : Bugs : Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC

Revision history for this message

Ramesh Sivaraman (rameshvs02) wrote on 2015-11-20:

#1

Could not reproduce the issue.

mysql> select version();
+-----------------------+
| version() |
+-----------------------+
| 5.6.26-74.0-25.12-log |
+-----------------------+
1 row in set (0.00 sec)

mysql> \! grep innodb_print_all_deadlocks /etc/my.cnf
innodb_print_all_deadlocks=1
mysql>

Error log

2015-11-20 04:52:56 8436 [Note] WSREP: (44f4637b, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-11-20 05:15:39 7f6976732700InnoDB: transactions deadlock detected, dumping detailed information.
2015-11-20 05:15:39 7f6976732700
*** (1) TRANSACTION:
TRANSACTION 2385, ACTIVE 20 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f6976773700, query id 88 localhost root updating
UPDATE t2 SET i=99 WHERE i=9
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2385 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 2386, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 6, OS thread handle 0x7f6976732700, query id 89 localhost root updating
UPDATE t1 SET i=11 WHERE i=1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2386 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t1` trx id 2386 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)

mysql>

Could not reproduce the issue.

mysql> select version();
+-----------------------+
| version()             |
+-----------------------+
| 5.6.26-74.0-25.12-log |
+-----------------------+
1 row in set (0.00 sec)

mysql> \! grep innodb_print_all_deadlocks /etc/my.cnf
innodb_print_all_deadlocks=1
mysql>

Error log

2015-11-20 04:52:56 8436 [Note] WSREP: (44f4637b, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-11-20 05:15:39 7f6976732700InnoDB: transactions deadlock detected, dumping detailed information.
2015-11-20 05:15:39 7f6976732700
*** (1) TRANSACTION:
TRANSACTION 2385, ACTIVE 20 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f6976773700, query id 88 localhost root updating
UPDATE t2 SET i=99 WHERE i=9
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2385 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 2386, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 6, OS thread handle 0x7f6976732700, query id 89 localhost root updating
UPDATE t1 SET i=11 WHERE i=1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2386 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t1` trx id 2386 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)

mysql>

Changed in percona-xtradb-cluster:
status:	New → Invalid

Revision history for this message

Dom (launchpal) wrote on 2015-11-20:

#2

Can you confirm you are using the latest 5.6.26 PXC release? Others are reporting similar issues:
https://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/43147-deadlocks

Revision history for this message

Krunal Bauskar (krunal-bauskar) wrote on 2015-11-20:

#3

I tried the use-case on ubuntu-15.04 using binary tarball freshly downloaded from "https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/Percona-XtraDB-Cluster-5.6.26-25.12/binary/tarball/Percona-XtraDB-Cluster-5.6.26-rel74.0-25.12.1.Linux.x86_64.tar.gz" to be consistent.

I couldn't reproduce the issue.

| wsrep_provider_version | 3.12.2(rf3e626d)

mysql>
mysql> select version();
+-------------------+
| version() |
+-------------------+
| 5.6.26-74.0-25.12 |
+-------------------+
1 row in set (0.00 sec)

Do you have any other custom setting in my.cnf that can cause this ?

Revision history for this message

Ramesh Sivaraman (rameshvs02) wrote on 2015-11-20:

#4

Used latest PXC version

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

I tried following steps to reproduce the issue

1) Download latest binary tar ball from https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/LATEST/
2) Start PXC nodes with innodb_print_all_deadlocks=1
3) Simulate following deadlock tc to reproduce the issue

CREATE TABLE t1 (i int);
CREATE TABLE t2 (i int);

INSERT t1 SELECT 1;
INSERT t2 SELECT 9;

#session1
START TRANSACTION;
UPDATE t1 SET i=11 WHERE i=1;
select SLEEP(15);
UPDATE t2 SET i=99 WHERE i=9;
COMMIT;

#session 2
START TRANSACTION;
UPDATE t2 SET i=99 WHERE i=9;
select SLEEP(15);
UPDATE t1 SET i=11 WHERE i=1;
COMMIT;

4) Check deadlock status and error log

Can you re-check (probably with the steps I mentioned to be doubly sure)

Revision history for this message

Dom (launchpal) wrote on 2015-11-21:

#5

Download full text (4.7 KiB)

So I've just invoked deadlock following guide here:
https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

Here you can see the counter doesn't increase:

mysql> update autoinc set j='node1' where i=2
-> ;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_deadlocks | 0 |
+------------------+-------+
1 row in set (0.00 sec)

Nothing is logged to error log.

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

I've since completely revamped my.cnf to inline with percona online wizard and galera docs, here is current my.cnf:

[mysqld]

######################
# General options
######################

datadir = /var/lib/mysql
tmpdir = /tmp
user = mysql
socket = /var/lib/mysql/mysql.sock
pid-file = mysqld.pid
port = 3306
sysdate-is-now = 1
memlock = 0

# MyISAM
key-buffer-size = 24M
myisam-recover = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZER
O,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
innodb = FORCE
innodb-strict-mode = 1

# DATA STORAGE #
datadir = /var/lib/mysql/

# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535...

So I've just invoked deadlock following guide here:
 https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

Here you can see the counter doesn't increase:

mysql> update autoinc set j='node1' where i=2
    -> ;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Innodb_deadlocks | 0     |
+------------------+-------+
1 row in set (0.00 sec)

Nothing is logged to error log.

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version          | @@version_comment                                                                           |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

I've since completely revamped my.cnf to inline with percona online wizard and galera docs, here is current my.cnf:

[mysqld]

######################
# General options
######################

datadir                      = /var/lib/mysql
tmpdir                       = /tmp
user                         = mysql
socket                       = /var/lib/mysql/mysql.sock
pid-file                     = mysqld.pid
port                         = 3306
sysdate-is-now               = 1
memlock                      = 0

# MyISAM
key-buffer-size 			= 24M
myisam-recover 				= FORCE,BACKUP

# SAFETY #
max-allowed-packet 			= 16M
max-connect-errors 			= 1000000
skip-name-resolve
sql-mode 					= STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZER
O,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
innodb                      = FORCE
innodb-strict-mode          = 1

# DATA STORAGE #
datadir                        = /var/lib/mysql/

# CACHES AND LIMITS #
tmp-table-size                 = 32M
max-heap-table-size            = 32M
query-cache-type               = 0
query-cache-size               = 0
max-connections                = 500
thread-cache-size              = 50
open-files-limit               = 65535
table-definition-cache         = 4096
table-open-cache               = 4096

# INNODB # 
innodb-flush-method = O_DIRECT 
innodb-log-files-in-group = 2 
innodb-log-file-size = 200M 
innodb-flush-log-at-trx-commit = 2 
innodb-file-per-table = 1 
innodb-buffer-pool-size = 2G 
#innodb-stats-on-metadata=0
default-storage-engine=innodb
innodb_print_all_deadlocks = 1

# LOGGING # 
#log_output 						= table
log-error = /var/lib/mysql/mysql-error.log 
log-queries-not-using-indexes = 1 
slow-query-log = 1 
slow-query-log-file = /var/lib/mysql/mysql-slow.log

######################
# PXC/Galera options
######################

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
default_storage_engine=InnoDB

binlog_row_image=minimal # 5.6 only

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# galera settings
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name=phoenixdev-sqlcluster
wsrep_cluster_address=gcomm://xxx1,xxx2,xxx3?pc.wait_prim=no
wsrep_node_address=xxx1
wsrep_provider_options="gcs.fc_limit=500; gcs.fc_master_slave=YES; gcs.fc_factor=1.0; gcache.size=128M; evs.keepalive_period=PT3S; evs.in
active_check_period=PT15S; evs.suspect_timeout=PT60S; evs.inactive_timeout=PT2M; evs.install_timeout=PT2M; evs.send_window=1024; evs.user
_send_window=512;"
wsrep_sst_method=xtrabackup-v2 # can also be mysqldump, xtrabackup, xtrabackup-v2
wsrep_sst_auth="sstuser:sstpasswordshouldbesecret"
#wsrep_retry_autocommit=5 # this is to auto-retry deadlocks
wsrep_certify_nonPK=1
wsrep_replicate_myisam=1

Revision history for this message

Dom (launchpal) wrote on 2015-11-21:

#6

Guess I should also change my sst secret now :)

Revision history for this message

Krunal Bauskar (krunal-bauskar) wrote on 2015-11-23:

#7

Dom,

I am not sure why you expect a innodb_deadlock count to increase with the given use-case.

Let me get some background/semantics:

a. innodb_deadlock is used to track the deadlock traced by the innodb engine on the given node.

b. blog use-case is not inducing a deadlock on node level but at cluster level. innodb_deadlock is not meant to capture cluster level deadlock. From the node perspective there wasn't any deadlock or can't be as node-1 has only one active session.
node-1 trx failed due to certification failure at cluster level.

So let's not mixup use of innodb_deadlock to monitor cluster deadlocks. They are 2 different semantics.

Also, note that if the same use-case is tried on same node InnoDB locks will not allow the use-case to proceed but cluster is allowing it due to pessimistic locking protocol (this is even mentioned in blog)

------------

Hope this clarifies the issue.

Revision history for this message

Dom (launchpal) wrote on 2015-11-23:

#8

Hi Krunal

Ah! It could be made much clearer that deadlock counters/logging is only relevent at the local innodb level. I couldn't use Ramesh's steps to invoke deadlock - I just get Lock wait timeout exceeded, but instead I invoked a local deadlock using innotop (commands D, w). And I do now see the deadlock counter increase and the deadlock logs to error log.

So, is there a way to log the cluster deadlocks? This could be made much clearer in the pxc docs/release notes, I'm not the only one who has made this confusion.

Revision history for this message

Krunal Bauskar (krunal-bauskar) wrote on 2015-11-23:

#9

Good to hear you got it.

Sure, I will ask the doc-team to make it more clear to avoid such confusion in future.

Revision history for this message

Dom (launchpal) wrote on 2015-11-23:

#10

Is there a way to print cluster deadlocks to file? Otherwise how do we identify and fix them?

Revision history for this message

Krunal Bauskar (krunal-bauskar) wrote on 2015-11-23:

#11

Dom,

Please check --wsrep_log_conflicts=1

For example when I tried this for the use-case mentioned in Jay's blog I got following information

-------------------------

*** Priority TRANSACTION:
TRANSACTION 2318, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
1 lock struct(s), heap size 376, 0 row lock(s)
MySQL thread id 1, OS thread handle 0x7f6071b97700, query id 24 System lock

*** Victim TRANSACTION:
TRANSACTION 2317, ACTIVE 11 sec
2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f60704ce700, query id 22 localhost root cleaning up
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `PRIMARY` of table `test`.`autoinc` trx id 2317 lock_mode X locks rec but not gap
2015-11-23 16:03:33 28537 [Note] WSREP: cluster conflict due to high priority abort for threads:
2015-11-23 16:03:33 28537 [Note] WSREP: Winning thread:
   THD: 1, mode: applier, state: executing, conflict: no conflict, seqno: 5
   SQL: (null)
2015-11-23 16:03:33 28537 [Note] WSREP: Victim thread:
   THD: 4, mode: local, state: idle, conflict: no conflict, seqno: -1
   SQL: (null)

---------------

I assume this will surely help solve your purpose.

Revision history for this message

Dom (launchpal) wrote on 2015-11-24:

#12

Yes, wsrep_log_conflicts works perfectly. Thanks!

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-18:

#13

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1866

Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC

innodb_print_all_deadlocks doesn't work

Bug Description

Other bug subscribers

Remote bug watches