Percona Server with XtraDB

percona_innodb_kill_idle_trx_locks sporadically fails

Reported by Stewart Smith on 2011-10-10
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Server
High
Stewart Smith
5.1
High
Stewart Smith
5.5
High
Stewart Smith

Bug Description

e.g. http://jenkins.percona.com/view/Percona%20Server%205.5/job/percona-server-5.5-trunk/65/BUILD_TYPE=debug,Host=ubuntu-lucid-32bit/testReport/junit/(root)/main/percona_innodb_kill_idle_trx_locks/

_StringException: Text attachment: traceback
------------
Comment: Server [mysqld.1 - pid: 8455, winpid: 8455, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
InnoDB: DEBUG: update_statistics for test/t1.
InnoDB: DEBUG: update_statistics for test/t1.
111010 11:42:12 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=4
max_threads=151
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60178 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x30000
/mnt/workspace/percona-server-5.5-trunk/BUILD_TYPE/debug/Host/ubuntu-lucid-32bit/Percona-Server-5.5.15-rel21.0/sql/mysqld(my_print_stacktrace+0x33)[0x8463da4]
/mnt/workspace/percona-server-5.5-trunk/BUILD_TYPE/debug/Host/ubuntu-lucid-32bit/Percona-Server-5.5.15-rel21.0/sql/mysqld(handle_segfault+0x2ee)[0x813cb0c]
[0xb7783400]
/mnt/workspace/percona-server-5.5-trunk/BUILD_TYPE/debug/Host/ubuntu-lucid-32bit/Percona-Server-5.5.15-rel21.0/sql/mysqld(innobase_thd_kill+0x1d)[0x846dcf5]
/mnt/workspace/percona-server-5.5-trunk/BUILD_TYPE/debug/Host/ubuntu-lucid-32bit/Percona-Server-5.5.15-rel21.0/sql/mysqld[0x84c0d45]
/lib/tls/i686/nosegneg/libpthread.so.0(+0x5985)[0xb7761985]
/lib/tls/i686/nosegneg/libc.so.6(clone+0x5e)[0xb756f13e]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
----------SERVER LOG END-------------
mysqltest failed but provided no output
The result from queries just before the failure was:
< snip >
BEGIN;
SELECT * FROM t1 FOR UPDATE;
a
1
2
3
UPDATE t1 set a=4;
SELECT * FROM t1;
a
4
4
4
DROP TABLE t1;
SET GLOBAL innodb_kill_idle_transaction=0;
SHOW GLOBAL VARIABLES LIKE 'innodb_kill_idle_transaction';
Variable_name Value
innodb_kill_idle_transaction 0
SELECT * FROM INFORMATION_SCHEMA.GLOBAL_VARIABLES WHERE VARIABLE_NAME='innodb_kill_idle_transaction';
VARIABLE_NAME VARIABLE_VALUE
INNODB_KILL_IDLE_TRANSACTION 0

Logfile:
CURRENT_TEST: main.percona_innodb_kill_idle_trx_locks
------------

Stewart Smith (stewart) wrote :
Download full text (3.8 KiB)

Also in 5.1: e.g. http://jenkins.percona.com/view/Releases/job/percona-server-5.1-trunk/lastCompletedBuild/BUILD_TYPE=debug-no-Werror,Host=debian6-32/testReport/(root)/main/percona_innodb_kill_idle_trx/

_StringException: Text attachment: traceback
------------
Comment: Server [mysqld.1 - pid: 30471, winpid: 30471, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
Got error 22 when trying to lock mutex at sql_class.cc, line 489
111009 13:02:14 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=4
max_threads=151
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60053 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x30000
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld(my_print_stacktrace+0x22) [0x8532f3c]
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld(handle_segfault+0x25a) [0x81ff7df]
[0xb780d400]
/lib/i686/cmov/libc.so.6(abort+0x182) [0xb7540b82]
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld(safe_mutex_lock+0x150) [0x853b72f]
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld(thd_kill+0x35) [0x81f03b5]
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld(innobase_thd_kill+0x11) [0x83ddb10]
/home/jenkins/workspace/percona-server-5.1-trunk/BUILD_TYPE/debug-no-Werror/Host/debian6-32/Percona-Server-5.1.59-rel12.9/sql/mysqld() [0x84470e5]
/lib/i686/cmov/libpthread.so.0(+0x5955) [0xb77ee955]
/lib/i686/cmov/libc.so.6(clone+0x5e) [0xb75dee7e]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
----------SERVER LOG END-------------

Logfile:
CURRENT_TEST: main.percona_innodb_kill_idle_trx
mysqltest: At line 25: query 'SELECT * FROM t1' failed: 2013: Lost connection to MySQL server at 'reading initial communication packet', system error: 104

The result from queries just before the failure was:
< snip >
INNODB_KILL_IDLE_TRANSACTION 0
SET GLOBAL innodb_kill_idle_transaction=1;
SHOW GLOBAL VARIABLES LIKE 'innodb_kill_i...

Read more...

Changed in percona-server:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Yasufumi Kinoshita (yasufumi-kinoshita)

It should be race condition about trx->mysql_thd.
*THD base operation might not be safe if we release kernel_mutex temporally to call kill_thd for latching order.
I will rewrite based on thd->thread_id based kill_thd operation.

I never can reproduce the crash...

The fix was pushed.
Please test again.

Stewart Smith (stewart) wrote :

Please run jenkins param-build and do merge proposal.

Stewart Smith (stewart) wrote :

I do not think this is fix released - no merge proposal of new changes.

Lixun Peng (P.Linux) (plx) wrote :

I think this function will not kill transaction that only have query "SELECT", because "SELECT" command needn't lock.
Only if the transaction contain "UPDATE/DELTE/INSERT" and not type "COMMT" a long time, it must be killed.

I modified source code on our system.

I add "enum enum_sql_command last_sql_command;" on class THD, it record the last Query's command type, default value is "SQLCOM_SELECT". I init it on THD::THD, add "last_sql_command = SQLCOM_SELECT;".

And I update this value on function mysql_execute_command, only if the query is "UPDATE/DELTE/INSERT", I update the last_sql_command. Then if the last_sql_command is not select, I can consider the thread is contain MODIFY query.
  if (lex->sql_command == SQLCOM_UPDATE
          || thd->last_sql_command == SQLCOM_INSERT
          || thd->last_sql_command == SQLCOM_DELETE
          || thd->last_sql_command == SQLCOM_INSERT_SELECT ) {
    thd->last_sql_command = lex->sql_command;
  } else if (lex->sql_command == SQLCOM_COMMIT) {
      thd->last_sql_command = SQLCOM_SELECT;
  }

And in function thd_kill, if found the transaction only contain SELECT, it will not be killed.

while ((tmp=it++))
    {
        if (tmp->command == COM_DAEMON || tmp->last_sql_command == SQLCOM_SELECT )
            continue;
        if (tmp->thread_id == id)
        {
            pthread_mutex_lock(&tmp->LOCK_thd_data);
            break;
        }
    }

Alexey Kopytov (akopytov) wrote :

I think innodb_kill_idle_trx works for transactions consisting of only SELECT queries. Feel free to open a separate bug if it doesn't.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers