Assertion `user_stats->concurrent_connections > 0' failed.

Bug #1661488 reported by Roel Van de Paar on 2017-02-03
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Status tracked in 5.7
5.5
Fix Released
High
Nickolay Ihalainen
5.6
Fix Released
High
Nickolay Ihalainen
5.7
Fix Released
High
Nickolay Ihalainen

Bug Description

2017-02-02T20:32:10.706414Z 0 [Note] /sda/PS300117-percona-server-5.7.17-11-linux-x86_64-debug/bin/mysqld: ready for connections.
Version: '5.7.17-11-debug' socket: '/sda/PS300117-percona-server-5.7.17-11-linux-x86_64-debug/socket.sock' port: 12438 MySQL Community Server (GPL)
mysqld: /git/PS-5.7.17_dbg/sql/sql_connect.cc:575: void update_global_user_stats_with_user(THD*, USER_STATS*, time_t): Assertion `user_stats->concurrent_connections > 0' failed.
20:32:38 UTC - mysqld got signal 6 ;

1. Start server (no special arguments)
2. open 2 client connection to same server (c1, c2)
3. Through one client connection (say c1)
   execute "set global userstat=1; FLUSH CLIENT_STATISTICS;"
4. Now quit the other client connection (c2)
6. Now quit the client c1 and see server crash.

Credit for testcase & original bug find goes to Krunal

Core was generated by `/sda/PS300117-percona-server-5.7.17-11-linux-x86_64-debug/bin/mysqld --no-defau'.
Program terminated with signal 6, Aborted.
#0 0x00007fbe74b17741 in __pthread_kill (threadid=<optimized out>, signo=6) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
61 val = INTERNAL_SYSCALL (tgkill, err, 3, THREAD_GETMEM (THREAD_SELF, pid),
(gdb) bt
#0 0x00007fbe74b17741 in __pthread_kill (threadid=<optimized out>, signo=6) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
#1 0x000000000185027e in my_write_core (sig=6) at /git/PS-5.7.17_dbg/mysys/stacktrace.c:249
#2 0x0000000000e81bc1 in handle_fatal_signal (sig=6) at /git/PS-5.7.17_dbg/sql/signal_handler.cc:223
#3 <signal handler called>
#4 0x00007fbe72eab1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007fbe72eac8c8 in __GI_abort () at abort.c:90
#6 0x00007fbe72ea4146 in __assert_fail_base (fmt=0x7fbe72ff53a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x203ac80 "user_stats->concurrent_connections > 0", file=file@entry=0x203ab18 "/git/PS-5.7.17_dbg/sql/sql_connect.cc", line=line@entry=575,
    function=function@entry=0x203b080 <update_global_user_stats_with_user(THD*, st_user_stats*, long)::__PRETTY_FUNCTION__> "void update_global_user_stats_with_user(THD*, USER_STATS*, time_t)") at assert.c:92
#7 0x00007fbe72ea41f2 in __GI___assert_fail (assertion=0x203ac80 "user_stats->concurrent_connections > 0", file=0x203ab18 "/git/PS-5.7.17_dbg/sql/sql_connect.cc", line=575,
    function=0x203b080 <update_global_user_stats_with_user(THD*, st_user_stats*, long)::__PRETTY_FUNCTION__> "void update_global_user_stats_with_user(THD*, USER_STATS*, time_t)")
    at assert.c:101
#8 0x00000000014c4ebf in update_global_user_stats_with_user (thd=0x7fbe2f419000, user_stats=0x7fbe2f4223a0, now=1486067558) at /git/PS-5.7.17_dbg/sql/sql_connect.cc:575
#9 0x00000000014c5240 in update_global_user_stats (thd=0x7fbe2f419000, create_user=false, now=1486067558) at /git/PS-5.7.17_dbg/sql/sql_connect.cc:628
#10 0x00000000014c693a in end_connection (thd=0x7fbe2f419000) at /git/PS-5.7.17_dbg/sql/sql_connect.cc:1355
#11 0x0000000001656e20 in handle_connection (arg=0x7fbe5981f6a0) at /git/PS-5.7.17_dbg/sql/conn_handler/connection_handler_per_thread.cc:315
#12 0x000000000187ee4d in pfs_spawn_thread (arg=0x7fbe68f91420) at /git/PS-5.7.17_dbg/storage/perfschema/pfs.cc:2188
#13 0x00007fbe74b12dc5 in start_thread (arg=0x7fbe75104700) at pthread_create.c:308
#14 0x00007fbe72f6d73d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Changed in percona-server:
assignee: Laurynas Biveinis (laurynas-biveinis) → nobody

This is a regression from the fix for bug 728082. It should be present in 5.5 too.

tags: added: regression
Nickolay Ihalainen (ihanick) wrote :

REFRESH_USER_STATS (FLUSH CLIENT_STATISTICS)
dropping global storage for statistics:
      free_global_user_stats();
      init_global_user_stats();

Thus it's not possible to track current connections for existing connections after FLUSH CLIENT_STATISTICS.

Simplest fix, but require documentation note that concurrent_connections could not include active threads after flush client_statistics:
- if (thd->diff_disconnects && thd->diff_denied_connections == 0) {

+ if (thd->diff_disconnects && thd->diff_denied_connections == 0 && user_stats->concurrent_connections) {
- DBUG_ASSERT(user_stats->concurrent_connections > 0);
    user_stats->concurrent_connections-= thd->diff_disconnects;
  }

Logically correct (but introducing redundant overhead) could be clean all counters in each user stat structure except for current connections number.

Nickolay, what do MariaDB and Facebook MySQL do?

Nickolay Ihalainen (ihanick) wrote :

There is no userstats patch in webscalesql-5.6,
728082 is not fixed in mariadb (concurrent_connections not tracked at all), so it's not affected by 1661488.

Original work:
https://github.com/google/mysql/blob/mysql-5.1.63/12-user-stats/sql/sql_stats.cc#L726

Scan whole userstats table
if entry->concurrent_connections non-zero, clear all counters
if there is no active connections (concurrent_connections = 0), free this particular hash table item.

Nickolay, please check Facebook MySQL not WebscaleSQL?

Nickolay Ihalainen (ihanick) wrote :

https://github.com/facebook/mysql-5.6.git

reload_acl_and_cache => if (options & REFRESH_STATISTICS) => reset_global_user_stats https://github.com/facebook/mysql-5.6/blob/9cbb76e8505ee343b91de537a8212e8a384cec9b/sql/sql_connect.cc#L1234

Loops against all user connections and resets stats.

User statistics in facebook mysql is a part of user resources, thus it's not working in the same way as in original google patches or percona/mariadb, e.g. number of concurrent connections obtained from user resources.

But anyway both google and facebook implementations looping. So I can:
- loop accross all hash items (my_hash_element for(0;<global_user_stats.records))
- my_hash_delete item if concurrent_connections == 0
- cut user_stats values from init_user_stats to separate reset_user_stats function (it could be possible to use init_user_stats to reset values, but it uses strncpy(user_stats->user, user, sizeof(user_stats->user)); and if user_stats->user will be the same pointer as user it should be UB, because strncpy defined with restrict: extern char *strncpy (char *__restrict __dest, const char *__restrict __src, size_t __n) ) or allow to have init_user_stats user=NULL and do not use strncpy if user parameter is null to re-init already existing user stats.

Sounds good, please proceed

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1057

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers