tmp->start_utime > now_utime impossibility leading to negative TIME_MS values in processlist

Bug #1379582 reported by Roel Van de Paar on 2014-10-10
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Medium
Sergei Glushchenko
5.1
Won't Fix
Undecided
Unassigned
5.5
Fix Released
Medium
Sergei Glushchenko
5.6
Fix Released
Medium
Sergei Glushchenko

Bug Description

(gdb) bt
#0 0x00007fd5817e9771 in pthread_kill () from /lib64/libpthread.so.0
#1 0x00000000006beafd in handle_fatal_signal (sig=6) at /bzr/5.6.hacked_dbg/sql/signal_handler.cc:236
#2 <signal handler called>
#3 0x00007fd57fd9c5c9 in raise () from /lib64/libc.so.6
#4 0x00007fd57fd9dcd8 in abort () from /lib64/libc.so.6
#5 0x00007fd57fd95536 in __assert_fail_base () from /lib64/libc.so.6
#6 0x00007fd57fd955e2 in __assert_fail () from /lib64/libc.so.6
#7 0x00000000007a5ee2 in fill_schema_processlist (thd=thd@entry=0x25ab330, tables=tables@entry=0x7fd500004e90, cond=<optimized out>) at /bzr/5.6.hacked_dbg/sql/sql_show.cc:2304
#8 0x00000000007a447a in do_fill_table (join_table=0x7fd50000b948, table_list=0x7fd500004e90, thd=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_show.cc:7904
#9 get_schema_tables_result (join=join@entry=0x7fd500006280, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /bzr/5.6.hacked_dbg/sql/sql_show.cc:8005
#10 0x000000000077f195 in JOIN::prepare_result (this=this@entry=0x7fd500006280, columns_list=columns_list@entry=0x7fd55cf752c0) at /bzr/5.6.hacked_dbg/sql/sql_select.cc:823
#11 0x00000000007316e0 in JOIN::exec (this=0x7fd500006280) at /bzr/5.6.hacked_dbg/sql/sql_executor.cc:116
#12 0x0000000000785125 in mysql_execute_select (free_join=true, select_lex=0x25adf90, thd=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_select.cc:1100
#13 mysql_select (thd=thd@entry=0x25ab330, tables=0x7fd500004e90, wild_num=0, fields=..., conds=0x0, order=order@entry=0x25ae158, group=group@entry=0x25ae090, having=0x0, select_options=2684619520, result=result@entry=0x7fd500006258, unit=unit@entry=0x25ad948, select_lex=select_lex@entry=0x25adf90) at /bzr/5.6.hacked_dbg/sql/sql_select.cc:1221
#14 0x0000000000785bd6 in handle_select (thd=thd@entry=0x25ab330, result=result@entry=0x7fd500006258, setup_tables_done_option=setup_tables_done_option@entry=0) at /bzr/5.6.hacked_dbg/sql/sql_select.cc:110
#15 0x000000000058680c in execute_sqlcom_select (thd=thd@entry=0x25ab330, all_tables=0x7fd500004e90) at /bzr/5.6.hacked_dbg/sql/sql_parse.cc:5597
#16 0x000000000075be71 in mysql_execute_command (thd=thd@entry=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_parse.cc:2958
#17 0x0000000000760898 in mysql_parse (thd=thd@entry=0x25ab330, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fd55cf76530)at /bzr/5.6.hacked_dbg/sql/sql_parse.cc:6773
#18 0x00000000007625f4 in dispatch_command (command=<optimized out>, command@entry=COM_QUERY, thd=thd@entry=0x25ab330, packet=packet@entry=0x25af841 "SELECT TIME_MS FROM INFORMATION_SCHEMA.PROCESSLIST;", packet_length=packet_length@entry=51) at /bzr/5.6.hacked_dbg/sql/sql_parse.cc:1432
#19 0x000000000076408e in do_command (thd=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_parse.cc:1049
#20 0x0000000000727fd2 in do_handle_one_connection (thd_arg=thd_arg@entry=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_connect.cc:1532
#21 0x00000000007280c0 in handle_one_connection (arg=arg@entry=0x25ab330) at /bzr/5.6.hacked_dbg/sql/sql_connect.cc:1443
#22 0x00000000009fd663 in pfs_spawn_thread (arg=0x25ab130) at /bzr/5.6.hacked_dbg/storage/perfschema/pfs.cc:1860
#23 0x00007fd5817e4df3 in start_thread () from /lib64/libpthread.so.0
#24 0x00007fd57fe5d01d in clone () from /lib64/libc.so.6
(gdb) f 7
#7 0x00000000007a5ee2 in fill_schema_processlist (thd=thd@entry=0x25ab330, tables=tables@entry=0x7fd500004e90, cond=<optimized out>) at /bzr/5.6.hacked_dbg/sql/sql_show.cc:2304
2304 /bzr/5.6.hacked_dbg/sql/sql_show.cc: No such file or directory.
(gdb) p (((now_utime - tmp->start_utime) / 1000) > 100000 )
+p (((now_utime - tmp->start_utime) / 1000) > 100000 )
$1 = true
(gdb) p ((now_utime - tmp->start_utime) / 1000)
+p ((now_utime - tmp->start_utime) / 1000)
$2 = 18446744073709547
(gdb) p now_utime
+p now_utime
$3 = 1412938018519405
(gdb) p tmp->start_utime
+p tmp->start_utime
$4 = 1412938018523469
(gdb) # started in the future, impossible

Roel Van de Paar (roel11) wrote :

===================== Testcase
1. Patch code

$ bzr diff
=== modified file 'sql/sql_show.cc'
--- sql/sql_show.cc 2014-09-26 04:43:24 +0000
+++ sql/sql_show.cc 2014-10-10 10:15:06 +0000
@@ -2301,6 +2301,8 @@
       table->field[8]->store(((tmp->start_utime ?
                                now_utime - tmp->start_utime : 0)/ 1000));

+DBUG_ASSERT( (tmp->start_utime != 0) && (((now_utime - tmp->start_utime) / 1000) > 100000 ) );
+
       mysql_mutex_lock(&tmp->LOCK_thd_data);
       /* ROWS_SENT */
       table->field[9]->store((ulonglong) tmp->get_sent_row_count());

2. Recompile

3. Modify uploaded pquery-run-hack.sh to set correct BASEDIR (point to modded build) and INFILE;
BASEDIR=/sda/Percona-Server-5.6.21-rel69.0-676-debug.Linux.x86_64
INFILE=/sda/814189/high_time0.sql # See upload

===================== core/mysqld/ldd
See upload

Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :
Roel Van de Paar (roel11) wrote :

Also propose to add debug assert for (now_utime < tmp->start_utime) which seeems impossible - correct?

Roel Van de Paar (roel11) wrote :

The values are not "huge", they are actually "negative".

Some further debugging. Add this just before the debug assert;

fprintf(stderr, "tmp->start_utime = %llu, now_utime = %llu, offset= %llu, div=%llu, 2ndassertpart=%d, fullassert=%d\n", tmp->start_utime, now_utime, (now_utime - tmp->start_utime), ((now_utime - tmp->start_utime) / 1000), (((now_utime - tmp->start_utime) / 1000) > 100000 ), ( (tmp->start_utime != 0) && (((now_utime - tmp->start_utime) / 1000) > 100000 ) ));

And re-run, then check error log;

tmp->start_utime = 1412966213142460, now_utime = 1412966213143724, offset= 1264, div=1, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213142076, now_utime = 1412966213143724, offset= 1648, div=1, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213142355, now_utime = 1412966213143724, offset= 1369, div=1, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213143788, now_utime = 1412966213143724, offset= 18446744073709551552, div=18446744073709551, 2ndassertpart=1, fullassert=1 <------------- bingo.
tmp->start_utime = 1412966213144450, now_utime = 1412966213143724, offset= 18446744073709550890, div=18446744073709550, 2ndassertpart=1, fullassert=1
tmp->start_utime = 1412966213142266, now_utime = 1412966213143724, offset= 1458, div=1, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213144002, now_utime = 1412966213143724, offset= 18446744073709551338, div=18446744073709551, 2ndassertpart=1, fullassert=1
tmp->start_utime = 1412966213142439, now_utime = 1412966213143724, offset= 1285, div=1, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213143210, now_utime = 1412966213143724, offset= 514, div=0, 2ndassertpart=0, fullassert=0
tmp->start_utime = 1412966213143469, now_utime = 1412966213143724, offset= 255, div=0, 2ndassertpart=0, fullassert=0

summary: - tmp->start_utime > now_utime impossibility leading to huge TIME_MS
+ tmp->start_utime > now_utime impossibility leading to negative TIME_MS
values in processlist
Roel Van de Paar (roel11) wrote :

DBUG_ASSERT( (now_utime - tmp->start_utime) < 0 ); /* Is a better option */

Roel Van de Paar (roel11) wrote :

or simply as mentioned above; DBUG_ASSERT( now_utime < tmp->start_utime);

Roel Van de Paar (roel11) wrote :

Btw, to use pquery-run-hack.sh, copy it into percona-qa folder obtained via bzr branch lp:percona-qa first.

tags: added: upstream
Roel Van de Paar (roel11) wrote :

No NTPD involvement.

tags: added: i48583
tags: added: i51284

fill_schema_processlist works as following:

1. Remember current time in now_utime
2. Lock LOCK_thd_remove, so threads will not be removed from the list of threads, but new threads can be added
3. For each thread calculate TIME_MS as now_utime - tmp->start_utime
4. Unlock LOCK_thd_remove

It is possible for the new thread to be created while we doing 3. Natural fix would be to set TIME_MS=0 when now_utime < tmp->start_utime.

Removing the upstream bug as we fix a case not related to the non-monotonic clocks.

tags: removed: upstream
no longer affects: mysql-server

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1544

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.