deadlock on server shutdown

Bug #1284670 reported by Teemu Ollakka on 2014-02-25
Affects                                               Importance  Assigned to
MySQL patches by Codership (status tracked in 5.6)
  5.5                                                 High        Teemu Ollakka
  5.6                                                 High        Teemu Ollakka
Percona XtraDB Cluster (status tracked in 5.6)
  5.5                                                 Undecided   Unassigned
  5.6                                                 Undecided   Unassigned

Bug Description

While running the seesaw test, servers started to hang consistently during shutdown. Further investigation showed that in each hang two threads were waiting on the same condition variable:

Thread 2 (Thread 0x7f5fd30ea700 (LWP 5373)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000526572 in inline_mysql_cond_wait (
    that=0x10108e0 <COND_thread_count>, mutex=0x1010de0 <LOCK_thread_count>,
    src_file=0x9bad80 "/home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc",
    src_line=4785)
    at /home/teemu/work/bzr/codership-mysql/5.5/include/mysql/psi/mysql_thread.h:980
#2 wsrep_wait_appliers_close (thd=thd@entry=0x0)
    at /home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc:4785
#3 0x00000000006571b1 in wsrep_stop_replication (thd=thd@entry=0x0)
    at /home/teemu/work/bzr/codership-mysql/5.5/sql/wsrep_mysqld.cc:706
#4 0x00000000005252ec in kill_server (sig_ptr=0x0)
    at /home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc:1391
#5 0x000000000052536e in kill_server_thread (arg=<optimized out>)
    at /home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc:1425
#6 0x00007f6004eb2f6e in start_thread (arg=0x7f5fd30ea700) at pthread_create.c:311
#7 0x00007f6003f9b9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) p wsrep_running_threads
$1 = 0

Code corresponding to thread 2 frame 2:

  while (wsrep_running_threads > 0)
  {
    mysql_cond_wait(&COND_thread_count,&LOCK_thread_count);
    DBUG_PRINT("quit",("One thread died (count=%u)",thread_count));
  }

Thread 1 (Thread 0x7f6005b1d740 (LWP 3426)):
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000528e76 in inline_mysql_cond_wait (
    that=0x10108e0 <COND_thread_count>, mutex=0x1010de0 <LOCK_thread_count>,
    src_file=0x9bad80 "/home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc",
    src_line=5392)
    at /home/teemu/work/bzr/codership-mysql/5.5/include/mysql/psi/mysql_thread.h:980
#2 mysqld_main (argc=54, argv=0x2a982c8)
    at /home/teemu/work/bzr/codership-mysql/5.5/sql/mysqld.cc:5392
#3 0x00007f6003ec2de5 in __libc_start_main (main=0x50a520 <main(int, char**)>,
    argc=15, ubp_av=0x7fff8ac89f98, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fff8ac89f88) at libc-start.c:260
#4 0x000000000051c0bd in _start ()
(gdb) p ready_to_exit
$2 = false

Code corresponding to thread 1 frame 2:

  mysql_mutex_lock(&LOCK_thread_count);
  while (!ready_to_exit)
    mysql_cond_wait(&COND_thread_count, &LOCK_thread_count);
  mysql_mutex_unlock(&LOCK_thread_count);

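The probable cause is that start_wsrep_THD() uses mysql_cond_signal() instead of mysql_cond_broadcast() when it decrements wsrep_running_threads. A signal wakes only one waiter, so when the rollbacker thread exits, thread 1 can be woken instead of thread 2. Thread 1 finds ready_to_exit still false and goes back to waiting, while thread 2 never rechecks wsrep_running_threads (already 0), so both threads stay blocked.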
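
The pattern can be reproduced outside the server. The following is a minimal sketch in standard C++, not code from the codership-mysql tree: ShutdownState, run_shutdown_demo() and the three lambdas are hypothetical stand-ins, and the assumption that the first waiter sets ready_to_exit once it is released is made only for illustration. Two threads wait on one condition variable with different predicates, and the exiting worker wakes them with notify_all(), the counterpart of mysql_cond_broadcast().

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the server globals named in the report.
struct ShutdownState {
  std::mutex lock;               // plays the role of LOCK_thread_count
  std::condition_variable cond;  // plays the role of COND_thread_count
  int running_threads = 1;       // plays the role of wsrep_running_threads
  bool ready_to_exit = false;    // same name as in the report
};

// Returns true when both waiters have been released.
bool run_shutdown_demo() {
  ShutdownState s;
  bool killer_done = false;
  bool main_done = false;

  // Waiter A (like thread 2): waits until no worker threads remain.
  // Assumption of this sketch: it then sets ready_to_exit itself.
  std::thread killer([&] {
    std::unique_lock<std::mutex> guard(s.lock);
    s.cond.wait(guard, [&] { return s.running_threads == 0; });
    s.ready_to_exit = true;
    killer_done = true;
    s.cond.notify_all();  // release the waiter on ready_to_exit
  });

  // Waiter B (like thread 1): waits on the SAME condition variable,
  // but for a different predicate.
  std::thread main_waiter([&] {
    std::unique_lock<std::mutex> guard(s.lock);
    s.cond.wait(guard, [&] { return s.ready_to_exit; });
    main_done = true;
  });

  // Exiting worker (like the rollbacker thread): decrements the counter.
  std::thread worker([&] {
    std::lock_guard<std::mutex> guard(s.lock);
    assert(s.running_threads > 0);
    --s.running_threads;
    // notify_one() here could wake only main_waiter, which would see
    // ready_to_exit == false and sleep again; killer would never wake.
    s.cond.notify_all();
  });

  worker.join();
  killer.join();
  main_waiter.join();
  return killer_done && main_done;
}
```

Calling run_shutdown_demo() returns true because every waiter rechecks its own predicate after the broadcast. Replacing the worker's notify_all() with notify_one() makes the outcome depend on which waiter the runtime happens to wake, which matches the intermittent hang described above.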

Teemu Ollakka (teemu-ollakka) wrote :

"Fixed" in 5.6 branch by completely removing unused wsrep_running_threads and cond signalling around it: http://bazaar.launchpad.net/~codership/codership-mysql/5.6/revision/4047
