Bug #1657737 “Print locking transaction which cause “Lock wait t...” : Bugs : MySQL Server

Laurynas Biveinis (laurynas-biveinis) on 2017-01-20

tags:

added: upstream

Sveta Smirnova (svetasmirnova) on 2017-10-09

tags:

added: sfr-92

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-11-10:

#1

The implementation without audit plugin can be found here:

https://github.com/percona/percona-server/pull/1970
https://github.com/percona/percona-server/pull/1971

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-11-10:

#2

The above implementation is merged to trunk, but the task is not closed because the necessity of audit plugin using as well as necessity of some additional information requesting from I_S and PFS is still under the question.

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2017-12-20:

#3

Vlad, please set the milestones too

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-12-28:

#4

Laurynas, done.

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-12-28:

#5

This is my draft implementation with blocking statements history and output with audit log plugin instead off error log: https://github.com/vlad-lesin/percona-server/tree/lwt-bug-lp1657737-5.7

The audit log plugins needs fixes to have the ability to filter out only the certain kinds of events. Currently audit_log_policy is mutually exclusive, except "ALL". It's highly desirable to have the ability to output, for example, QUERIES+LOCK messages.

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-12-28:

#6

A little FAQ about the current implementation(https://github.com/percona/percona-server/pull/1970, https://github.com/percona/percona-server/pull/1971).

Q: How to turn on the feature output?
A: There is new global boolean variable print_lock_wait_timeout_info which turns on the functionality. It is turned off by default.

Q: How to get query with query id from error log output?
A: The query id can be seen in "show process list" output. Besides query id is logged in general query log which can be turned on with "log=/path_to/mysql_query.log" option. See also https://dev.mysql.com/doc/refman/5.7/en/query-log.html.

Q: What is thread_id in error log output?
A: This is mysql thread id which is shown in "show processlist" results.

Q: What the relation between PFS thread id and thread id which is output to error log by the feature?
A: PFS has their own thread id space. Our thread id corresponds to "PROCESSLIST_ID" field in performance_schema.events_statements_current. If we have blocking thread id then we can get blocking thread statements history with the following query:

   SELECT s.SQL_TEXT FROM performance_schema.events_statements_history s
   INNER JOIN performance_schema.threads t ON t.THREAD_ID = s.THREAD_ID
   WHERE t.PROCESSLIST_ID = %d
   UNION
   SELECT s.SQL_TEXT FROM performance_schema.events_statements_current s
   INNER JOIN performance_schema.threads t ON t.THREAD_ID = s.THREAD_ID
   WHERE t.PROCESSLIST_ID = %d;

Where PROCESSLIST_ID is thread id from error log output.

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2017-12-28:

#7

The question is why don't output blocking transaction query string instead of or along with query id.

Let's consider the code from this https://github.com/percona/percona-server/pull/1971 pull request.

This is the place in the code where the list of blocking transactions is obtained:

Thread 28 "mysqld" hit Breakpoint 2, lock_wait_suspend_thread (
thr=0x7fffa4010418)
at ./storage/innobase/lock/lock0wait.cc:323
323 blocking[blocking_count].trx_id = lock_get_trx_id(curr_lock);
(gdb) l
318 lock_queue_iterator_reset(&iter, wait_lock, ULINT_UNDEFINED);
319 for (curr_lock = lock_queue_iterator_get_prev(&iter);
320 curr_lock != NULL;
321 curr_lock = lock_queue_iterator_get_prev(&iter)) {
322 if (lock_has_to_wait(trx->lock.wait_lock, curr_lock)) {
323 blocking[blocking_count].trx_id = lock_get_trx_id(curr_lock);
324 blocking[blocking_count].thread_id =
325 curr_lock->trx->mysql_thd ?
326 thd_get_thread_id(curr_lock->trx->mysql_thd) : 0;
327 blocking[blocking_count].query_id =

If we look at the THD object of blocking transaction we will see that it does not contain query string:

(gdb) p curr_lock->trx->mysql_thd->m_query_string
$2 = {str = 0x0, length = 0}

The question is why? On what exactly step the string was cleaned up? Let's look on the following stack trace:

Thread 27 "mysqld" hit Breakpoint 6, THD::reset_query (this=0x7fffa0128060) at ./sql/sql_class.h:4564
4564 void reset_query() { set_query(LEX_CSTRING()); }
(gdb) bt
#0 THD::reset_query (this=0x7fffa0128060) at ./sql/sql_class.h:4564
#1 0x000000000164747b in dispatch_command (thd=0x7fffa0128060, com_data=0x7ffff11c2de0, command=COM_QUERY)
at ./sql/sql_parse.cc:1946
#2 0x0000000001644b80 in do_command (thd=0x7fffa0128060) at ./sql/sql_parse.cc:1021
#3 0x0000000001796f22 in handle_connection (arg=0x3949180) at ./sql/conn_handler/connection_handler_per_thread.cc:312
#4 0x0000000001e7e5ad in pfs_spawn_thread (arg=0x3962a00) at ./storage/perfschema/pfs.cc:2188
#5 0x00007ffff6f5e6ba in start_thread (arg=0x7ffff11c3700) at pthread_create.c:333
#6 0x00007ffff63f33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

The query string is cleaned up after the statement result is sent. So it can't be accessed from the place where the list of blocking transactions is obtained.

The query string can be obtained from performance_schema.events_statements_current table. This requires additional code to execute query inside of the server(what is preferable for better portability) or get access to internal PFS data directly, what is easier, but can require PFS code patching and increases code cohesion.

The question is why don't output blocking transaction query string instead of or along with query id.

Let's consider the code from this https://github.com/percona/percona-server/pull/1971 pull request.

This is the place in the code where the list of blocking transactions is obtained:

Thread 28 "mysqld" hit Breakpoint 2, lock_wait_suspend_thread (
    thr=0x7fffa4010418)
    at ./storage/innobase/lock/lock0wait.cc:323
323                                             blocking[blocking_count].trx_id = lock_get_trx_id(curr_lock);
(gdb) l
318                             lock_queue_iterator_reset(&iter, wait_lock, ULINT_UNDEFINED);
319                             for (curr_lock = lock_queue_iterator_get_prev(&iter);
320                                     curr_lock != NULL;
321                                     curr_lock = lock_queue_iterator_get_prev(&iter)) {
322                                     if (lock_has_to_wait(trx->lock.wait_lock, curr_lock)) {
323                                             blocking[blocking_count].trx_id = lock_get_trx_id(curr_lock);
324                                             blocking[blocking_count].thread_id =
325                                                     curr_lock->trx->mysql_thd ?
326                                                     thd_get_thread_id(curr_lock->trx->mysql_thd) : 0;
327                                             blocking[blocking_count].query_id =

If we look at the THD object of blocking transaction we will see that it does not contain query string:

(gdb) p curr_lock->trx->mysql_thd->m_query_string
$2 = {str = 0x0, length = 0}

The question is why? On what exactly step the string was cleaned up? Let's look on the following stack trace:

Thread 27 "mysqld" hit Breakpoint 6, THD::reset_query (this=0x7fffa0128060) at ./sql/sql_class.h:4564
4564      void reset_query() { set_query(LEX_CSTRING()); }
(gdb) bt
#0  THD::reset_query (this=0x7fffa0128060) at ./sql/sql_class.h:4564
#1  0x000000000164747b in dispatch_command (thd=0x7fffa0128060, com_data=0x7ffff11c2de0, command=COM_QUERY)
    at ./sql/sql_parse.cc:1946
#2  0x0000000001644b80 in do_command (thd=0x7fffa0128060) at ./sql/sql_parse.cc:1021
#3  0x0000000001796f22 in handle_connection (arg=0x3949180) at ./sql/conn_handler/connection_handler_per_thread.cc:312
#4  0x0000000001e7e5ad in pfs_spawn_thread (arg=0x3962a00) at ./storage/perfschema/pfs.cc:2188
#5  0x00007ffff6f5e6ba in start_thread (arg=0x7ffff11c3700) at pthread_create.c:333
#6  0x00007ffff63f33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

The query string is cleaned up after the statement result is sent. So it can't be accessed from the place where the list of blocking transactions is obtained.

The query string can be obtained from performance_schema.events_statements_current table. This requires additional code to execute query inside of the server(what is preferable for better portability) or get access to internal PFS data directly, what is easier, but can require PFS code patching and increases code cohesion.

Revision history for this message

Vlad Lesin (vlad-lesin) wrote on 2018-01-04:

#8

In the above FAQ please change print_lock_wait_timeout_info to innodb_print_lock_wait_timeout_info.

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-25:

#9

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-255

	Status	Importance	Assigned to	Milestone
MySQL Server	Unknown	Unknown	mysql-bugs #84563
Percona Server moved to https://jira.percona.com/projects/PS	Status tracked in 5.7
5.5	Won't Fix	Undecided	Unassigned
5.6	Fix Released	Wishlist	Vlad Lesin	Percona Server moved to https://jira.percona.com/projects/PS 5.6.38-83.0
5.7	Fix Released	Wishlist	Vlad Lesin	Percona Server moved to https://jira.percona.com/projects/PS 5.7.20-18

MySQL Server

Print locking transaction which cause "Lock wait timeout" errors into log file

Bug Description

Other bug subscribers

Remote bug watches