deadlock between PurgeAndDiscard() and apply_trx()

Reported by Teemu Ollakka on 2013-06-10
This bug affects 2 people
Affects                 Status        Importance  Assigned to    Milestone
Galera                  Fix Released  High        Teemu Ollakka  23.2.6
Percona XtraDB Cluster  Fix Released  Undecided   Unassigned     5.5.31-24.8

Bug Description

The Galera provider may deadlock if an applier thread is still executing apply_trx() while the processing of a commit cut causes the corresponding trx to be purged from the cert index. The two threads try to lock the cert index and the trx in opposite order.
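
The lock-order inversion described above can be illustrated with a minimal C++ sketch. All names here (Trx, CertIndex, apply_path, purge_path, run_demo) are hypothetical stand-ins and not Galera's actual classes, and this is not necessarily the fix that was committed. It only shows one common remedy: when two code paths need the same pair of mutexes, acquiring both through std::lock() avoids the deadlock even if the two paths name the mutexes in different order.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a transaction handle with its own mutex.
struct Trx {
    std::mutex mtx;
    bool applied = false;
    bool purged = false;
};

// Hypothetical stand-in for the certification index.
struct CertIndex {
    std::mutex mtx;
    std::map<long, Trx*> entries;
};

// Applier side: names the trx mutex first, then the index mutex.
// std::lock() acquires both without deadlocking, whatever the order.
void apply_path(CertIndex& index, Trx& trx) {
    std::lock(trx.mtx, index.mtx);
    std::lock_guard<std::mutex> trx_guard(trx.mtx, std::adopt_lock);
    std::lock_guard<std::mutex> index_guard(index.mtx, std::adopt_lock);
    trx.applied = true;
}

// Purge side: names the index mutex first, then the trx mutex.
void purge_path(CertIndex& index, Trx& trx, long seqno) {
    std::lock(index.mtx, trx.mtx);
    std::lock_guard<std::mutex> index_guard(index.mtx, std::adopt_lock);
    std::lock_guard<std::mutex> trx_guard(trx.mtx, std::adopt_lock);
    index.entries.erase(seqno);
    trx.purged = true;
}

// Runs both paths concurrently for a number of rounds and returns how
// many rounds completed with the trx both applied and purged.
int run_demo(int rounds) {
    int completed = 0;
    for (int i = 0; i < rounds; ++i) {
        CertIndex index;
        Trx trx;
        const long seqno = i;
        index.entries[seqno] = &trx;

        std::thread applier([&] { apply_path(index, trx); });
        std::thread purger([&] { purge_path(index, trx, seqno); });
        applier.join();
        purger.join();

        if (trx.applied && trx.purged && index.entries.empty()) {
            ++completed;
        }
    }
    return completed;
}
```

If each path instead took its two mutexes with separate pthread_mutex_lock()-style calls in the order shown, the two threads could each end up holding one mutex and waiting for the other, which matches the pair of backtraces in this report. Calling run_demo(200) from a main() should return 200 because every round finishes.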

Thread backtraces:

Thread 28 (Thread 0x7f995d2ef700 (LWP 27500)):
#0 0x00007fb6c447389c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007fb6c446f065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007fb6c446eeba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fb6c24d6457 in gu::Lock::Lock(gu::Mutex const&) () from /usr/lib/galera/libgalera_smm.so
#4 0x00007fb6c25e7300 in galera::ReplicatorSMM::apply_trx(void*, galera::TrxHandle*) () from /usr/lib/galera/libgalera_smm.so
#5 0x00007fb6c25e77c5 in galera::ReplicatorSMM::process_trx(void*, galera::TrxHandle*) () from /usr/lib/galera/libgalera_smm.so
#6 0x00007fb6c25be124 in galera::GcsActionSource::dispatch(void*, gcs_action const&) () from /usr/lib/galera/libgalera_smm.so
#7 0x00007fb6c25be39a in galera::GcsActionSource::process(void*) () from /usr/lib/galera/libgalera_smm.so
#8 0x00007fb6c25df175 in galera::ReplicatorSMM::async_recv(void*) () from /usr/lib/galera/libgalera_smm.so
#9 0x00007fb6c25f6f93 in galera_recv () from /usr/lib/galera/libgalera_smm.so
#10 0x00007fb6c5ab1ab1 in wsrep_replication_process(THD*) ()
#11 0x00007fb6c5a2d71b in start_wsrep_THD ()
#12 0x00007fb6c446ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007fb6c3b9acbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x0000000000000000 in ?? ()

Thread 21 (Thread 0x7f99d88eb700 (LWP 27507)):
#0 0x00007fb6c447389c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007fb6c446f065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007fb6c446eeba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fb6c25b9c4d in galera::Certification::PurgeAndDiscard::operator()(std::pair<long const, galera::TrxHandle*>&) const () from /usr/lib/galera/libgalera_smm.so
#4 0x00007fb6c25b429c in galera::Certification::purge_trxs_upto_(long) () from /usr/lib/galera/libgalera_smm.so
#5 0x00007fb6c25de292 in galera::ReplicatorSMM::process_commit_cut(long, long) () from /usr/lib/galera/libgalera_smm.so
#6 0x00007fb6c25be161 in galera::GcsActionSource::dispatch(void*, gcs_action const&) () from /usr/lib/galera/libgalera_smm.so
#7 0x00007fb6c25be39a in galera::GcsActionSource::process(void*) () from /usr/lib/galera/libgalera_smm.so
#8 0x00007fb6c25df175 in galera::ReplicatorSMM::async_recv(void*) () from /usr/lib/galera/libgalera_smm.so
#9 0x00007fb6c25f6f93 in galera_recv () from /usr/lib/galera/libgalera_smm.so
#10 0x00007fb6c5ab1ab1 in wsrep_replication_process(THD*) ()
#11 0x00007fb6c5a2d71b in start_wsrep_THD ()
#12 0x00007fb6c446ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007fb6c3b9acbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x0000000000000000 in ?? ()

Changed in galera:
milestone: none → 23.2.6
assignee: nobody → Teemu Ollakka (teemu-ollakka)
Changed in percona-xtradb-cluster:
milestone: none → 5.5.31-24.8
Changed in galera:
importance: Undecided → High
status: New → Confirmed
Alex Yurchenko (ayurchen) wrote:

Potential fix committed in r152.

Changed in galera:
status: Confirmed → In Progress
status: In Progress → Fix Committed
Changed in percona-xtradb-cluster:
status: New → Fix Released
Changed in galera:
status: Fix Committed → Fix Released