Exceptions that happen before monitor release must be handled internally.

Bug #1307741 reported by Alex Yurchenko on 2014-04-14
Affects: Galera
  Importance: Medium
  Assigned to: Alex Yurchenko
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC), status tracked in 5.6
  5.5: Status: Invalid, Importance: Undecided, Assigned to: Unassigned
  5.6: Status: New, Importance: Undecided, Assigned to: Unassigned

Bug Description

...and either release/cancel the monitor or abort the process, as otherwise the caller has no ability to wake up and cancel remaining threads waiting to enter the monitor.

E.g. mysqld would hang with:

#0 0x00002b84b88fc2d4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00000000005a3745 in inline_mysql_cond_wait (src_line=5818, mutex=0x1311a60, that=0x1312160, src_file=<optimized out>) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/include/mysql/psi/mysql_thread.h:1162
#2 wsrep_wait_appliers_close (thd=0x0) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:5818
#3 0x00000000005aea41 in wsrep_stop_replication (thd=0x0) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_mysqld.cc:719
#4 0x00000000005ab627 in kill_server (sig_ptr=0x0) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:1761
#5 0x00000000005ab87e in kill_server_thread (arg=<optimized out>) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:1795
#6 0x0000000000af703a in pfs_spawn_thread (arg=0x38aab2d0) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/storage/perfschema/pfs.cc:1858
#7 0x00002b84b88f7b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00002b84ba43b0ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9 0x0000000000000000 in ?? ()

while slave threads would stay in:

#0 0x00002b84b88fc2d4 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00002b8540860a40 in galera::Monitor<galera::ReplicatorSMM::LocalOrder>::enter(galera::ReplicatorSMM::LocalOrder&) () from /usr/local/mysql/lib/libgalera_smm.so
#2 0x00002b8540854999 in galera::ReplicatorSMM::cert (this=0x3041e10, trx=0x2bcb31356de0) at galera/src/replicator_smm.cpp:1655
#3 0x00002b854085c561 in galera::ReplicatorSMM::process_trx (this=0x3041e10, recv_ctx=0x2bc4e6e6ba70, trx=0x2bcb31356de0) at galera/src/replicator_smm.cpp:1203
#4 0x00002b85408375b9 in galera::GcsActionSource::dispatch (this=0x30423f8, recv_ctx=0x2bc4e6e6ba70, act=..., exit_loop=@0x2bc4d925814f: false) at galera/src/gcs_action_source.cpp:118
#5 0x00002b8540837a93 in galera::GcsActionSource::process (this=0x30423f8, recv_ctx=0x2bc4e6e6ba70, exit_loop=@0x2bc4d925814f: false) at galera/src/gcs_action_source.cpp:177
#6 0x00002b8540856883 in galera::ReplicatorSMM::async_recv (this=0x3041e10, recv_ctx=0x2bc4e6e6ba70) at galera/src/replicator_smm.cpp:354
#7 0x00002b854086b573 in galera_recv (gh=<optimized out>, recv_ctx=<optimized out>) at galera/src/wsrep_provider.cpp:231
#8 0x00000000005baf2f in wsrep_replication_process (thd=0x2bc4e6e6ba70) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_thd.cc:309
#9 0x00000000005abc7e in start_wsrep_THD (arg=0x5baee0) at /mnt/workspace/percona-xtradb-cluster-5.6-binary/label_exp/centos5-64/target/percona-build.X13540/src/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:5502
#10 0x00002b84b88f7b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00002b84ba43b0ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
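
A minimal sketch of the requested behaviour, assuming a simplified ordered monitor. OrderedMonitor, MonitorGuard and process_in_order are hypothetical names for illustration only; this is not the actual galera::Monitor API or the eventual fix. The idea is that an exception thrown between enter and leave is handled inside the callee, and the monitor is released on the way out so that later seqnos are not left in pthread_cond_wait forever.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical ordered monitor: seqno N may enter only after N-1 has left.
class OrderedMonitor {
public:
    void enter(std::int64_t seqno) {
        std::unique_lock<std::mutex> lock(mtx_);
        // Blocks until all preceding seqnos have left the monitor.
        cond_.wait(lock, [&] { return last_left_ + 1 == seqno; });
    }

    void leave(std::int64_t seqno) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            assert(seqno == last_left_ + 1);  // only the current holder may leave
            last_left_ = seqno;
        }
        cond_.notify_all();  // wake waiters so the next seqno can enter
    }

private:
    std::mutex mtx_;
    std::condition_variable cond_;
    std::int64_t last_left_ = 0;
};

// RAII guard: the monitor is released even if the protected code throws.
class MonitorGuard {
public:
    MonitorGuard(OrderedMonitor& m, std::int64_t seqno) : m_(m), seqno_(seqno) {
        m_.enter(seqno_);
    }
    ~MonitorGuard() { m_.leave(seqno_); }
    MonitorGuard(const MonitorGuard&) = delete;
    MonitorGuard& operator=(const MonitorGuard&) = delete;

private:
    OrderedMonitor& m_;
    std::int64_t seqno_;
};

// Runs `work` inside the monitor. Exceptions are handled here, internally:
// the guard has already released the monitor by the time a handler runs.
// Returns true on success, false if `work` threw a std::exception.
bool process_in_order(OrderedMonitor& m, std::int64_t seqno,
                      const std::function<void()>& work) {
    try {
        MonitorGuard guard(m, seqno);
        work();
        return true;
    } catch (const std::exception& e) {
        std::cerr << "seqno " << seqno << " failed: " << e.what() << '\n';
        return false;
    } catch (...) {
        // Unknown failure: abort the process rather than risk a hang.
        std::abort();
    }
}
```

With this shape, if the work for seqno 2 throws, threads waiting for seqnos 3 and 4 still get to enter once the guard for 2 is destroyed. Without the guard (or an equivalent explicit release/cancel), they would stay blocked in enter(), as in the slave thread backtrace above.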

Alex Yurchenko (ayurchen) wrote :

This should be fixed during 4.x refactoring.

Changed in galera:
assignee: nobody → Alex Yurchenko (ayurchen)
importance: Undecided → Medium
milestone: none → 4.0
status: New → Confirmed

Ovais Tariq (ovais-tariq) wrote :

Alex,

Attached is the output of:
set print pretty
thr 30
f 6
p this->cert_

tags: added: i38783

Alex Yurchenko (ayurchen) wrote :

Ovais, this gdb dump is actually unrelated to this bug. In any case, it does not offer much insight into why pthread_create() returned ENOMEM.

Ovais Tariq (ovais-tariq) wrote :

Ahh ok

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1666
