innodb_print_all_deadlocks doesn't work

Bug #1516933 reported by Dom
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Invalid
Undecided
Unassigned

Bug Description

Including innodb_print_all_deadlocks with value 'on' or '1' doesn't enable deadlock logging, and the value doesn't show up in 'show global status'.

Percona-XtraDB-Cluster-server-56-5.6.26-25.12.1.el7.x86_64
Percona-XtraDB-Cluster-galera-3-3.12.2-1.rhel7.x86_64

Revision history for this message
Ramesh Sivaraman (rameshvs02) wrote :

Could not reproduce the issue.

mysql> show variables like '%innodb_print_all_deadlocks%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_print_all_deadlocks | ON |
+----------------------------+-------+
1 row in set (0.00 sec)

mysql> select version();
+-----------------------+
| version() |
+-----------------------+
| 5.6.26-74.0-25.12-log |
+-----------------------+
1 row in set (0.00 sec)

mysql> \! grep innodb_print_all_deadlocks /etc/my.cnf
innodb_print_all_deadlocks=1
mysql>

Error log

2015-11-20 04:52:56 8436 [Note] WSREP: (44f4637b, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-11-20 05:15:39 7f6976732700InnoDB: transactions deadlock detected, dumping detailed information.
2015-11-20 05:15:39 7f6976732700
*** (1) TRANSACTION:
TRANSACTION 2385, ACTIVE 20 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f6976773700, query id 88 localhost root updating
UPDATE t2 SET i=99 WHERE i=9
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2385 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 2386, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 6, OS thread handle 0x7f6976732700, query id 89 localhost root updating
UPDATE t1 SET i=11 WHERE i=1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2386 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t1` trx id 2386 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)

mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_deadlocks | 1 |
+------------------+-------+
1 row in set (0.00 sec)

mysql>

Changed in percona-xtradb-cluster:
status: New → Invalid
Revision history for this message
Dom (launchpal) wrote :

Can you confirm you are using the latest 5.6.26 PXC release? Others are reporting similar issues:
https://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/43147-deadlocks

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

I tried the use-case on ubuntu-15.04 using binary tarball freshly downloaded from "https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/Percona-XtraDB-Cluster-5.6.26-25.12/binary/tarball/Percona-XtraDB-Cluster-5.6.26-rel74.0-25.12.1.Linux.x86_64.tar.gz" to be consistent.

I couldn't reproduce the issue.

| wsrep_provider_version | 3.12.2(rf3e626d)

mysql>
mysql> select version();
+-------------------+
| version() |
+-------------------+
| 5.6.26-74.0-25.12 |
+-------------------+
1 row in set (0.00 sec)

Do you have any other custom setting in my.cnf that can cause this ?

Revision history for this message
Ramesh Sivaraman (rameshvs02) wrote :

Used latest PXC version

mysql> show status like 'wsrep_provider_version';
+------------------------+------------------+
| Variable_name | Value |
+------------------------+------------------+
| wsrep_provider_version | 3.12.2(rf3e626d) |
+------------------------+------------------+
1 row in set (0.00 sec)

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

I tried following steps to reproduce the issue

1) Download latest binary tar ball from https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/LATEST/
2) Start PXC nodes with innodb_print_all_deadlocks=1
3) Simulate following deadlock tc to reproduce the issue

CREATE TABLE t1 (i int);
CREATE TABLE t2 (i int);

INSERT t1 SELECT 1;
INSERT t2 SELECT 9;

#session1
START TRANSACTION;
UPDATE t1 SET i=11 WHERE i=1;
select SLEEP(15);
UPDATE t2 SET i=99 WHERE i=9;
COMMIT;

#session 2
START TRANSACTION;
UPDATE t2 SET i=99 WHERE i=9;
select SLEEP(15);
UPDATE t1 SET i=11 WHERE i=1;
COMMIT;

4) Check deadlock status and error log

Can you re-check (probably with the steps I mentioned to be doubly sure)

Revision history for this message
Dom (launchpal) wrote :
Download full text (4.7 KiB)

So I've just invoked deadlock following guide here:
 https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

Here you can see the counter doesn't increase:

mysql> update autoinc set j='node1' where i=2
    -> ;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_deadlocks | 0 |
+------------------+-------+
1 row in set (0.00 sec)

Nothing is logged to error log.

mysql> show status like 'wsrep_provider_version';
+------------------------+------------------+
| Variable_name | Value |
+------------------------+------------------+
| wsrep_provider_version | 3.12.2(rf3e626d) |
+------------------------+------------------+
1 row in set (0.00 sec)

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show variables like '%innodb_print_all_deadlocks%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_print_all_deadlocks | ON |
+----------------------------+-------+
1 row in set (0.00 sec)

I've since completely revamped my.cnf to inline with percona online wizard and galera docs, here is current my.cnf:

[mysqld]

######################
# General options
######################

datadir = /var/lib/mysql
tmpdir = /tmp
user = mysql
socket = /var/lib/mysql/mysql.sock
pid-file = mysqld.pid
port = 3306
sysdate-is-now = 1
memlock = 0

# MyISAM
key-buffer-size = 24M
myisam-recover = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZER
O,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
innodb = FORCE
innodb-strict-mode = 1

# DATA STORAGE #
datadir = /var/lib/mysql/

# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535...

Read more...

Revision history for this message
Dom (launchpal) wrote :

Guess I should also change my sst secret now :)

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Dom,

I am not sure why you expect a innodb_deadlock count to increase with the given use-case.

Let me get some background/semantics:

a. innodb_deadlock is used to track the deadlock traced by the innodb engine on the given node.

b. blog use-case is not inducing a deadlock on node level but at cluster level. innodb_deadlock is not meant to capture cluster level deadlock. From the node perspective there wasn't any deadlock or can't be as node-1 has only one active session.
node-1 trx failed due to certification failure at cluster level.

So let's not mixup use of innodb_deadlock to monitor cluster deadlocks. They are 2 different semantics.

Also, note that if the same use-case is tried on same node InnoDB locks will not allow the use-case to proceed but cluster is allowing it due to pessimistic locking protocol (this is even mentioned in blog)

------------

Hope this clarifies the issue.

Revision history for this message
Dom (launchpal) wrote :

Hi Krunal

Ah! It could be made much clearer that deadlock counters/logging is only relevent at the local innodb level. I couldn't use Ramesh's steps to invoke deadlock - I just get Lock wait timeout exceeded, but instead I invoked a local deadlock using innotop (commands D, w). And I do now see the deadlock counter increase and the deadlock logs to error log.

So, is there a way to log the cluster deadlocks? This could be made much clearer in the pxc docs/release notes, I'm not the only one who has made this confusion.

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Good to hear you got it.

Sure, I will ask the doc-team to make it more clear to avoid such confusion in future.

Revision history for this message
Dom (launchpal) wrote :

Is there a way to print cluster deadlocks to file? Otherwise how do we identify and fix them?

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

Dom,

Please check --wsrep_log_conflicts=1

For example when I tried this for the use-case mentioned in Jay's blog I got following information

-------------------------

*** Priority TRANSACTION:
TRANSACTION 2318, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
1 lock struct(s), heap size 376, 0 row lock(s)
MySQL thread id 1, OS thread handle 0x7f6071b97700, query id 24 System lock

*** Victim TRANSACTION:
TRANSACTION 2317, ACTIVE 11 sec
2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f60704ce700, query id 22 localhost root cleaning up
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `PRIMARY` of table `test`.`autoinc` trx id 2317 lock_mode X locks rec but not gap
2015-11-23 16:03:33 28537 [Note] WSREP: cluster conflict due to high priority abort for threads:
2015-11-23 16:03:33 28537 [Note] WSREP: Winning thread:
   THD: 1, mode: applier, state: executing, conflict: no conflict, seqno: 5
   SQL: (null)
2015-11-23 16:03:33 28537 [Note] WSREP: Victim thread:
   THD: 4, mode: local, state: idle, conflict: no conflict, seqno: -1
   SQL: (null)

---------------

I assume this will surely help solve your purpose.

Revision history for this message
Dom (launchpal) wrote :

Yes, wsrep_log_conflicts works perfectly. Thanks!

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1866

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.