innodb_print_all_deadlocks doesn't work

Bug #1516933 reported by Dom on 2015-11-17
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Invalid
Undecided
Unassigned

Bug Description

Including innodb_print_all_deadlocks with value 'on' or '1' doesn't enable deadlock logging, and the value doesn't show up in 'show global status'.

Percona-XtraDB-Cluster-server-56-5.6.26-25.12.1.el7.x86_64
Percona-XtraDB-Cluster-galera-3-3.12.2-1.rhel7.x86_64

Ramesh Sivaraman (rameshvs02) wrote :

Could not reproduce the issue.

mysql> show variables like '%innodb_print_all_deadlocks%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_print_all_deadlocks | ON |
+----------------------------+-------+
1 row in set (0.00 sec)

mysql> select version();
+-----------------------+
| version() |
+-----------------------+
| 5.6.26-74.0-25.12-log |
+-----------------------+
1 row in set (0.00 sec)

mysql> \! grep innodb_print_all_deadlocks /etc/my.cnf
innodb_print_all_deadlocks=1
mysql>

Error log

2015-11-20 04:52:56 8436 [Note] WSREP: (44f4637b, 'tcp://0.0.0.0:4567') turning message relay requesting off
2015-11-20 05:15:39 7f6976732700InnoDB: transactions deadlock detected, dumping detailed information.
2015-11-20 05:15:39 7f6976732700
*** (1) TRANSACTION:
TRANSACTION 2385, ACTIVE 20 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f6976773700, query id 88 localhost root updating
UPDATE t2 SET i=99 WHERE i=9
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2385 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 2386, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 6, OS thread handle 0x7f6976732700, query id 89 localhost root updating
UPDATE t1 SET i=11 WHERE i=1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t2` trx id 2386 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t1` trx id 2386 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (2)

mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_deadlocks | 1 |
+------------------+-------+
1 row in set (0.00 sec)

mysql>

Changed in percona-xtradb-cluster:
status: New → Invalid
Dom (launchpal) wrote :

Can you confirm you are using the latest 5.6.26 PXC release? Others are reporting similar issues:
https://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/43147-deadlocks

Krunal Bauskar (krunal-bauskar) wrote :

I tried the use-case on ubuntu-15.04 using binary tarball freshly downloaded from "https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/Percona-XtraDB-Cluster-5.6.26-25.12/binary/tarball/Percona-XtraDB-Cluster-5.6.26-rel74.0-25.12.1.Linux.x86_64.tar.gz" to be consistent.

I couldn't reproduce the issue.

| wsrep_provider_version | 3.12.2(rf3e626d)

mysql>
mysql> select version();
+-------------------+
| version() |
+-------------------+
| 5.6.26-74.0-25.12 |
+-------------------+
1 row in set (0.00 sec)

Do you have any other custom setting in my.cnf that can cause this ?

Ramesh Sivaraman (rameshvs02) wrote :

Used latest PXC version

mysql> show status like 'wsrep_provider_version';
+------------------------+------------------+
| Variable_name | Value |
+------------------------+------------------+
| wsrep_provider_version | 3.12.2(rf3e626d) |
+------------------------+------------------+
1 row in set (0.00 sec)

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

I tried following steps to reproduce the issue

1) Download latest binary tar ball from https://www.percona.com/downloads/Percona-XtraDB-Cluster-56/LATEST/
2) Start PXC nodes with innodb_print_all_deadlocks=1
3) Simulate following deadlock tc to reproduce the issue

CREATE TABLE t1 (i int);
CREATE TABLE t2 (i int);

INSERT t1 SELECT 1;
INSERT t2 SELECT 9;

#session1
START TRANSACTION;
UPDATE t1 SET i=11 WHERE i=1;
select SLEEP(15);
UPDATE t2 SET i=99 WHERE i=9;
COMMIT;

#session 2
START TRANSACTION;
UPDATE t2 SET i=99 WHERE i=9;
select SLEEP(15);
UPDATE t1 SET i=11 WHERE i=1;
COMMIT;

4) Check deadlock status and error log

Can you re-check (probably with the steps I mentioned to be doubly sure)

Dom (launchpal) wrote :
Download full text (4.7 KiB)

So I've just invoked deadlock following guide here:
 https://www.percona.com/blog/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

Here you can see the counter doesn't increase:

mysql> update autoinc set j='node1' where i=2
    -> ;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> show global status like '%Innodb_deadlocks%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_deadlocks | 0 |
+------------------+-------+
1 row in set (0.00 sec)

Nothing is logged to error log.

mysql> show status like 'wsrep_provider_version';
+------------------------+------------------+
| Variable_name | Value |
+------------------------+------------------+
| wsrep_provider_version | 3.12.2(rf3e626d) |
+------------------------+------------------+
1 row in set (0.00 sec)

mysql> select @@version,@@version_comment;
+--------------------+---------------------------------------------------------------------------------------------+
| @@version | @@version_comment |
+--------------------+---------------------------------------------------------------------------------------------+
| 5.6.26-74.0-56-log | Percona XtraDB Cluster (GPL), Release rel74.0, Revision 1, WSREP version 25.12, wsrep_25.12 |
+--------------------+---------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show variables like '%innodb_print_all_deadlocks%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| innodb_print_all_deadlocks | ON |
+----------------------------+-------+
1 row in set (0.00 sec)

I've since completely revamped my.cnf to inline with percona online wizard and galera docs, here is current my.cnf:

[mysqld]

######################
# General options
######################

datadir = /var/lib/mysql
tmpdir = /tmp
user = mysql
socket = /var/lib/mysql/mysql.sock
pid-file = mysqld.pid
port = 3306
sysdate-is-now = 1
memlock = 0

# MyISAM
key-buffer-size = 24M
myisam-recover = FORCE,BACKUP

# SAFETY #
max-allowed-packet = 16M
max-connect-errors = 1000000
skip-name-resolve
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZER
O,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
innodb = FORCE
innodb-strict-mode = 1

# DATA STORAGE #
datadir = /var/lib/mysql/

# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 500
thread-cache-size = 50
open-files-limit = 65535...

Read more...

Dom (launchpal) wrote :

Guess I should also change my sst secret now :)

Krunal Bauskar (krunal-bauskar) wrote :

Dom,

I am not sure why you expect a innodb_deadlock count to increase with the given use-case.

Let me get some background/semantics:

a. innodb_deadlock is used to track the deadlock traced by the innodb engine on the given node.

b. blog use-case is not inducing a deadlock on node level but at cluster level. innodb_deadlock is not meant to capture cluster level deadlock. From the node perspective there wasn't any deadlock or can't be as node-1 has only one active session.
node-1 trx failed due to certification failure at cluster level.

So let's not mixup use of innodb_deadlock to monitor cluster deadlocks. They are 2 different semantics.

Also, note that if the same use-case is tried on same node InnoDB locks will not allow the use-case to proceed but cluster is allowing it due to pessimistic locking protocol (this is even mentioned in blog)

------------

Hope this clarifies the issue.

Dom (launchpal) wrote :

Hi Krunal

Ah! It could be made much clearer that deadlock counters/logging is only relevent at the local innodb level. I couldn't use Ramesh's steps to invoke deadlock - I just get Lock wait timeout exceeded, but instead I invoked a local deadlock using innotop (commands D, w). And I do now see the deadlock counter increase and the deadlock logs to error log.

So, is there a way to log the cluster deadlocks? This could be made much clearer in the pxc docs/release notes, I'm not the only one who has made this confusion.

Krunal Bauskar (krunal-bauskar) wrote :

Good to hear you got it.

Sure, I will ask the doc-team to make it more clear to avoid such confusion in future.

Dom (launchpal) wrote :

Is there a way to print cluster deadlocks to file? Otherwise how do we identify and fix them?

Dom,

Please check --wsrep_log_conflicts=1

For example when I tried this for the use-case mentioned in Jay's blog I got following information

-------------------------

*** Priority TRANSACTION:
TRANSACTION 2318, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
1 lock struct(s), heap size 376, 0 row lock(s)
MySQL thread id 1, OS thread handle 0x7f6071b97700, query id 24 System lock

*** Victim TRANSACTION:
TRANSACTION 2317, ACTIVE 11 sec
2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 4, OS thread handle 0x7f60704ce700, query id 22 localhost root cleaning up
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `PRIMARY` of table `test`.`autoinc` trx id 2317 lock_mode X locks rec but not gap
2015-11-23 16:03:33 28537 [Note] WSREP: cluster conflict due to high priority abort for threads:
2015-11-23 16:03:33 28537 [Note] WSREP: Winning thread:
   THD: 1, mode: applier, state: executing, conflict: no conflict, seqno: 5
   SQL: (null)
2015-11-23 16:03:33 28537 [Note] WSREP: Victim thread:
   THD: 4, mode: local, state: idle, conflict: no conflict, seqno: -1
   SQL: (null)

---------------

I assume this will surely help solve your purpose.

Dom (launchpal) wrote :

Yes, wsrep_log_conflicts works perfectly. Thanks!

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1866

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers