InnoDB: Warning: a long semaphore wait

Bug #787667 reported by Teemu Ollakka
Affects          Status       Importance  Assigned to  Milestone
codership-maria  In Progress  High        Unassigned

Bug Description

One of the nodes hung, logging the following message many times:

InnoDB: Warning: a long semaphore wait:
--Thread 2294283152 has waited at lock/lock0lock.c line 4127 for 241.00 seconds the semaphore:
Mutex at 0x8770768 '&kernel_mutex', lock var 1
waiters flag 1

It appeared that the BF applier was hung in wsrep_innobase_kill_one_trx(), trying to lock the victim thd with wsrep_thd_LOCK() while holding LOCK_open. The victim thd, in turn, was trying to lock LOCK_open while holding thd->LOCK_wsrep_thd, at:

#0 0xb7f9a424 in __kernel_vsyscall ()
#1 0xb7f36c99 in __lll_lock_wait () from /lib/i686/cmov/libpthread.so.0
#2 0xb7f32104 in _L_lock_936 () from /lib/i686/cmov/libpthread.so.0
#3 0xb7f3202e in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
#4 0x08237e0d in close_thread_tables (thd=0x895c0ce0) at sql_base.cc:1202
#5 0x081fdf9a in dispatch_command (command=COM_QUERY, thd=0x895c0ce0,
    packet=0x89525019 "", packet_length=194) at sql_parse.cc:2027
#6 0x082004b2 in do_command (thd=0x895c0ce0) at sql_parse.cc:1040
#7 0x081eae6e in handle_one_connection (arg=0x895c0ce0) at sql_connect.cc:1170
#8 0xb7f304c0 in start_thread () from /lib/i686/cmov/libpthread.so.0
#9 0xb7d8e6de in clone () from /lib/i686/cmov/libc.so.6

Inspecting the code revealed that thd->LOCK_wsrep_thd is not released in dispatch_command() before entering close_thread_tables(). The two threads therefore take the same pair of mutexes in opposite order, which leads to a deadlock.
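
The lock ordering problem can be illustrated with a minimal standalone sketch. This is not the server code or the actual fix: `lock_open`, `Thd`, `applier_kill_victim()`, `victim_finish_command()` and `run_rounds()` are hypothetical stand-ins for LOCK_open, the thd, and the two code paths described above. It assumes the deadlock is avoided by having the victim drop its per-thd mutex before it takes the global one, so that it never holds both in the reverse order of the applier.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins, not the real MySQL/wsrep definitions.
std::mutex lock_open;               // plays the role of the global LOCK_open

struct Thd {
    std::mutex lock_wsrep_thd;      // plays the role of thd->LOCK_wsrep_thd
    bool killed = false;            // written only while lock_wsrep_thd is held
    int tables_closed = 0;          // written only while lock_open is held
};

// Applier side: holds the global lock, then locks the victim,
// which is the order described in the report.
void applier_kill_victim(Thd& victim) {
    std::lock_guard<std::mutex> open_guard(lock_open);
    std::lock_guard<std::mutex> thd_guard(victim.lock_wsrep_thd);
    victim.killed = true;
}

// Victim side: the per-thd lock is released BEFORE the global lock is
// taken, so this thread never waits for lock_open while holding
// lock_wsrep_thd.
void victim_finish_command(Thd& thd) {
    {
        std::lock_guard<std::mutex> thd_guard(thd.lock_wsrep_thd);
        // ... inspect or update per-thd replication state here ...
    }   // lock_wsrep_thd released at the end of this scope
    std::lock_guard<std::mutex> open_guard(lock_open);
    ++thd.tables_closed;            // stands in for close_thread_tables()
}

// Runs both sides concurrently and returns how many times the victim
// finished its command. With the ordering above it always terminates.
int run_rounds(int rounds) {
    Thd thd;
    std::thread applier([&] {
        for (int i = 0; i < rounds; ++i) applier_kill_victim(thd);
    });
    std::thread victim([&] {
        for (int i = 0; i < rounds; ++i) victim_finish_command(thd);
    });
    applier.join();
    victim.join();
    // Sanity check: the applier marked the victim iff it ran at least once.
    assert(thd.killed == (rounds > 0));
    return thd.tables_closed;
}
```

If `victim_finish_command()` instead kept `thd_guard` alive while acquiring `lock_open`, the two threads could each hold one mutex and wait forever for the other, which matches the hang shown in the backtrace.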

Changed in codership-maria:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Teemu Ollakka (teemu-ollakka)
assignee: Teemu Ollakka (teemu-ollakka) → nobody