Bug #1264809 “regression test for lp:1019473 causes server abor...” : Bugs : MySQL patches by Codership

Seppo Jaakola (seppo-jaakola) on 2013-12-29

Changed in codership-mysql:
status:	New → In Progress
assignee:	nobody → Seppo Jaakola (seppo-jaakola)
importance:	Undecided → High
milestone:	none → 5.5.35-25.10

Revision history for this message

Seppo Jaakola (seppo-jaakola) wrote on 2013-12-29:

#1

The problem may relate to fix in: lp:1100496

The anatomy of the issue is as follows:

If a table does not have a primary key, nor any other unique, but has some non-unique keys defined, then DML on the table will result in taking X locks on the non-unique key. Because, the key is non-unique, there will be excessive record X locks on the table, and during parallel replication, some of the excessive X locks, granted for separate transactions, may conflict.

Galera does parallel applying control based on MD5 hash sum on the whole tuple, and it can be that two slave appliers are updating separate rows, but still end up in lock conflict due to these non-unique keys. (if mysql would use full table scan, we would not hit this issue)

Revision history for this message

Seppo Jaakola (seppo-jaakola) wrote on 2013-12-29:

#2

A fix trial was pushed into wsrep-5.5, in revision: http://bazaar.launchpad.net/~codership/codership-mysql/wsrep-5.5/revision/3935

Here, we populate all non-unique keys for writes on table which does not have PK (nor other unique key), but has some non-unique keys. This may end up in big number of excessive keys and BF aborts on writes to such tables. But experiments will tell more.

Revision history for this message

Teemu Ollakka (teemu-ollakka) wrote on 2014-02-25:

#3

Merged proposed fix from 5.5 r 3935 to 5.6: http://bazaar.launchpad.net/~codership/codership-mysql/5.6/revision/4044

Ovais Tariq (ovais-tariq) on 2014-03-13

tags:

added: i40190

Revision history for this message

Seppo Jaakola (seppo-jaakola) wrote on 2014-03-13:

#4

Fix for the wsrep-5.6 made it to the release: https://launchpad.net/codership-mysql/5.6/5.6.16-25.5
It was not possible to set target milestone anymore to that release

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-13:

#5

Raghu,

When should we expect the fix for this bug to be merged into PXC and available as a release? I have a customer affected by this bug.

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-24:

#6

Download full text (3.2 KiB)

I have a crash similar to the one reported in the bug even after upgrade to 5.6.15-25.5:

BF-BF X lock conflict
RECORD LOCKS space id 6910 page no 97 n bits 160 index `GEN_CLUST_INDEX` of table `db`.`tbl_without_pk` trx id 3254927279 lock_mode X locks rec but not gap
08:12:36 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=246
max_threads=1002
thread_count=46
connection_count=46
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 407967 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fb760000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb7a6ef8d88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x902645]
/usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x6801e4]
/lib64/libpthread.so.0(+0xfae0)[0x7fcec84c2ae0]
/lib64/libc.so.6(gsignal+0x35)[0x7fcec6934ba5]
/lib64/libc.so.6(abort+0x17b)[0x7fcec69364bb]
/usr/sbin/mysqld[0x9950db]
/usr/sbin/mysqld[0x99973f]
/usr/sbin/mysqld[0x999ae3]
/usr/sbin/mysqld[0xa0b4ff]
/usr/sbin/mysqld[0xa11679]
/usr/sbin/mysqld[0x968857]
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x9c)[0x5c087c]
/usr/sbin/mysqld(_ZN14Rows_log_event24do_table_scan_and_updateEPK14Relay_log_info+0x198)[0x89fec8]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xd8f)[0x8a59df]
/usr/sbin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x68)[0x8a6c38]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x5ce)[0x5bd38e]
/usr/lib64/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xb1)[0x7fce5c60e211]
/usr/lib64/libgalera_smm.so(+0x1ab345)[0x7fce5c647345]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x283)[0x7fce5c6481b3]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x45)[0x7fce5c648aa5]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x2c9)[0x7fce5c623a89]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x63)[0x7fce5c623f63]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x93)[0x7fce5c642db3]
/usr/lib64/libgalera_smm.so(galera_recv+0x23)[0x7fce5c657aa3]
/usr/sbin/mysqld[0x5be11f]
/usr/sbin/mysqld(start_wsrep_THD+0x3fe)[0x5ae42e]
/lib64/libpthread.so.0(+0x7ddb)[0x7fcec84baddb]
/lib64/libc.so.6(clone+0x6d)[0x7fcec69e2a1d]

Trying to get some variables.
Some pointers may be invalid an...

I have a crash similar to the one reported in the bug even after upgrade to 5.6.15-25.5:

BF-BF X lock conflict
RECORD LOCKS space id 6910 page no 97 n bits 160 index `GEN_CLUST_INDEX` of table `db`.`tbl_without_pk` trx id 3254927279 lock_mode X locks rec but not gap
08:12:36 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=246
max_threads=1002
thread_count=46
connection_count=46
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 407967 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fb760000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb7a6ef8d88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x902645]
/usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x6801e4]
/lib64/libpthread.so.0(+0xfae0)[0x7fcec84c2ae0]
/lib64/libc.so.6(gsignal+0x35)[0x7fcec6934ba5]
/lib64/libc.so.6(abort+0x17b)[0x7fcec69364bb]
/usr/sbin/mysqld[0x9950db]
/usr/sbin/mysqld[0x99973f]
/usr/sbin/mysqld[0x999ae3]
/usr/sbin/mysqld[0xa0b4ff]
/usr/sbin/mysqld[0xa11679]
/usr/sbin/mysqld[0x968857]
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x9c)[0x5c087c]
/usr/sbin/mysqld(_ZN14Rows_log_event24do_table_scan_and_updateEPK14Relay_log_info+0x198)[0x89fec8]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xd8f)[0x8a59df]
/usr/sbin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x68)[0x8a6c38]
/usr/sbin/mysqld(_Z14wsrep_apply_cbPvPKvmjPK14wsrep_trx_meta+0x5ce)[0x5bd38e]
/usr/lib64/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xb1)[0x7fce5c60e211]
/usr/lib64/libgalera_smm.so(+0x1ab345)[0x7fce5c647345]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x283)[0x7fce5c6481b3]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x45)[0x7fce5c648aa5]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x2c9)[0x7fce5c623a89]
/usr/lib64/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x63)[0x7fce5c623f63]
/usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x93)[0x7fce5c642db3]
/usr/lib64/libgalera_smm.so(galera_recv+0x23)[0x7fce5c657aa3]
/usr/sbin/mysqld[0x5be11f]
/usr/sbin/mysqld(start_wsrep_THD+0x3fe)[0x5ae42e]
/lib64/libpthread.so.0(+0x7ddb)[0x7fcec84baddb]
/lib64/libc.so.6(clone+0x6d)[0x7fcec69e2a1d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 12
Status: NOT_KILLED

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-24:

#7

Seppo you mentioned here https://bugs.launchpad.net/codership-mysql/+bug/1264809/comments/1 that "if mysql would use full table scan, we would not hit this issue". In this case based on the stack trace it looks as though MySQL indeed was doing a full table scan.

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-24:

#8

gdb-40190-1264809-std.txt Edit (78.2 KiB, text/plain)

Attaching the standard backtrace taken using `thread apply all bt`

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-24:

#9

gdb-40190-1264809-full.txt Edit (217.6 KiB, text/plain)

Attaching the full backtrace taken using `thread apply all bt full`

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-03-24:

#10

Download full text (10.4 KiB)

Looks like thread 38 and 33 have contending locks and both are doing a full table scan:

--
Thread 38 (Thread 0x7ff571abe700 (LWP 6433)):
+bt
#0 0x00007ffff7bcda5c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000009baee3 in os_cond_wait (event=0x525fa580, reset_sig_count=1) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/os/os0sync.cc:214
#2 os_event_wait_low (event=0x525fa580, reset_sig_count=1) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/os/os0sync.cc:610
#3 0x0000000000a2b00a in sync_array_wait_event (arr=0x1504d80, index=0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/sync/sync0arr.cc:433
#4 0x0000000000a2d461 in mutex_spin_wait (_mutex=0x7fe8e95f3088, high_priority=false, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /us
r/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/sync/sync0sync.cc:671
#5 0x0000000000992abc in mutex_enter_func (mutex=0x7fe8e95f3088, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /usr/src/debug/Percona-X
traDB-Cluster-5.6.15/storage/innobase/include/sync0sync.ic:273
#6 pfs_mutex_enter_func (mutex=0x7fe8e95f3088, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.
15/storage/innobase/include/sync0sync.ic:350
#7 0x0000000000996890 in lock_rec_convert_impl_to_expl (block=0x7fec37f04c00, rec=0x7fedb7178082 "", index=0x7fe8c801f238, offsets=0x7ff571ab9810) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:6478
#8 0x0000000000999ab1 in lock_clust_rec_read_check_and_lock (flags=<value optimized out>, block=0x7fec37f04c00, rec=<value optimized out>, index=0x7fe8c801f238, offsets=<value optimized out>, mode=<value optimized out>, gap_mode=1024, thr=0x7fe89c014170) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:6820
#9 0x0000000000a0b4ff in sel_set_rec_lock (block=0x7fec37f04c00, rec=0x7fedb7178082 "", index=0x7fe8c801f238, offsets=0x7ff571ab9810, mode=<value optimized out>, type=<value optimized out>, thr=0x7fe89c014170) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:1012
#10 0x0000000000a11679 in row_search_for_mysql (buf=0x7fe89c0127b0 "\376\rafrica\\dariogusv", mode=1, prebuilt=0x7fe89c013a78, match_mode=<value optimized out>, direction=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:4506
#11 0x0000000000968857 in ha_innobase::general_fetch (this=0x7fe89c010a80, buf=0x7fe89c0127b0 "\376\rafrica\\dariogusv", direction=1, match_mode=0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/handler/ha_innodb.cc:8969
#12 0x00000000005c087c in handler::ha_rnd_next (this=0x7fe89c010a80, buf=0x7fe89c0127b0 "\376\rafrica\\dar...

Looks like thread 38 and 33 have contending locks and both are doing a full table scan:

--
Thread 38 (Thread 0x7ff571abe700 (LWP 6433)):
+bt
#0  0x00007ffff7bcda5c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000009baee3 in os_cond_wait (event=0x525fa580, reset_sig_count=1) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/os/os0sync.cc:214
#2  os_event_wait_low (event=0x525fa580, reset_sig_count=1) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/os/os0sync.cc:610
#3  0x0000000000a2b00a in sync_array_wait_event (arr=0x1504d80, index=0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/sync/sync0arr.cc:433
#4  0x0000000000a2d461 in mutex_spin_wait (_mutex=0x7fe8e95f3088, high_priority=false, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /us
r/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/sync/sync0sync.cc:671
#5  0x0000000000992abc in mutex_enter_func (mutex=0x7fe8e95f3088, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /usr/src/debug/Percona-X
traDB-Cluster-5.6.15/storage/innobase/include/sync0sync.ic:273
#6  pfs_mutex_enter_func (mutex=0x7fe8e95f3088, file_name=0xc2b610 "/mnt/workspace/percona-xtradb-cluster-5.6-rpms/label_exp/centos6-64/target/BUILD/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc", line=6478) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.
15/storage/innobase/include/sync0sync.ic:350
#7  0x0000000000996890 in lock_rec_convert_impl_to_expl (block=0x7fec37f04c00, rec=0x7fedb7178082 "", index=0x7fe8c801f238, offsets=0x7ff571ab9810) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:6478
#8  0x0000000000999ab1 in lock_clust_rec_read_check_and_lock (flags=<value optimized out>, block=0x7fec37f04c00, rec=<value optimized out>, index=0x7fe8c801f238, offsets=<value optimized out>, mode=<value optimized out>, gap_mode=1024, thr=0x7fe89c014170) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:6820
#9  0x0000000000a0b4ff in sel_set_rec_lock (block=0x7fec37f04c00, rec=0x7fedb7178082 "", index=0x7fe8c801f238, offsets=0x7ff571ab9810, mode=<value optimized out>, type=<value optimized out>, thr=0x7fe89c014170) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:1012
#10 0x0000000000a11679 in row_search_for_mysql (buf=0x7fe89c0127b0 "\376\rafrica\\dariogusv", mode=1, prebuilt=0x7fe89c013a78, match_mode=<value optimized out>, direction=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:4506
#11 0x0000000000968857 in ha_innobase::general_fetch (this=0x7fe89c010a80, buf=0x7fe89c0127b0 "\376\rafrica\\dariogusv", direction=1, match_mode=0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/handler/ha_innodb.cc:8969
#12 0x00000000005c087c in handler::ha_rnd_next (this=0x7fe89c010a80, buf=0x7fe89c0127b0 "\376\rafrica\\dariogusv") at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/handler.cc:2798
#13 0x000000000089fec8 in Rows_log_event::do_table_scan_and_update (this=0x7fe89c00fd80, rli=0x7fe89c00bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:10862
#14 0x00000000008a59df in Rows_log_event::do_apply_event (this=0x7fe89c00fd80, rli=0x7fe89c00bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:11247
#15 0x00000000008a6c38 in Log_event::apply_event (this=0x7fe89c00fd80, rli=0x7fe89c00bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:2997
#16 0x00000000005bd38e in wsrep_apply_events (ctx=0x7fe89c000990, buf=0x7ff56800de0f, buf_len=444, flags=<value optimized out>, meta=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_applier.cc:150
#17 wsrep_apply_cb (ctx=0x7fe89c000990, buf=0x7ff56800de0f, buf_len=444, flags=<value optimized out>, meta=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_applier.cc:226
#18 0x00007fff8bcfa291 in galera::TrxHandle::apply (this=0x7ff56800d9d0, recv_ctx=0x7fe89c000990, apply_cb=0x5bcdc0 <wsrep_apply_cb(void*, void const*, size_t, uint32_t, wsrep_trx_meta_t const*)>, meta=...) at galera/src/trx_handle.cpp:304
#19 0x00007fff8bd333f5 in apply_trx_ws (recv_ctx=0x7fe89c000990, apply_cb=0x5bcdc0 <wsrep_apply_cb(void*, void const*, size_t, uint32_t, wsrep_trx_meta_t const*)>, commit_cb=0x5bc9a0 <wsrep_commit_cb(void*, uint32_t, wsrep_trx_meta_t const*, wsrep_bool_t*, bool)>, trx=..., meta=...) at galera/src/replicator_smm.cpp:39
#20 0x00007fff8bd34263 in galera::ReplicatorSMM::apply_trx (this=0x1471b50, recv_ctx=0x7fe89c000990, trx=0x7ff56800d9d0) at galera/src/replicator_smm.cpp:419
#21 0x00007fff8bd3efa2 in galera::ReplicatorSMM::recv_IST (this=0x1471b50, recv_ctx=0x7fe89c000990) at galera/src/replicator_str.cpp:760
#22 0x00007fff8bd2f10f in galera::ReplicatorSMM::async_recv (this=0x1471b50, recv_ctx=0x7fe89c000990) at galera/src/replicator_smm.cpp:357
#23 0x00007fff8bd43ba3 in galera_recv (gh=<value optimized out>, recv_ctx=<value optimized out>) at galera/src/wsrep_provider.cpp:231
#24 0x00000000005be11f in wsrep_replication_process (thd=0x7fe89c000990) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_thd.cc:309
#25 0x00000000005ae42e in start_wsrep_THD (arg=0x5be0d0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:5502
#26 0x00007ffff7bc9ddb in start_thread () from /lib64/libpthread.so.0
#27 0x00007ffff60f1a1d in clone () from /lib64/libc.so.6

Thread 33 (Thread 0x7ff571c03700 (LWP 6428)):
+bt
#0  0x00007ffff6043ba5 in raise () from /lib64/libc.so.6
#1  0x00007ffff60454bb in abort () from /lib64/libc.so.6
#2  0x00000000009950db in lock_rec_has_to_wait (mode=1027, block=<value optimized out>, heap_no=3, trx=0x7fe8c8017be8) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:1051
#3  lock_rec_other_has_conflicting (mode=1027, block=<value optimized out>, heap_no=3, trx=0x7fe8c8017be8) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:1702
#4  0x000000000099973f in lock_rec_lock_slow (impl=0, mode=1027, block=0x7fec37f04d80, heap_no=3, index=0x7fe8c801f238, thr=0x7fe8c8017390) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:2484
#5  lock_rec_lock (impl=0, mode=1027, block=0x7fec37f04d80, heap_no=3, index=0x7fe8c801f238, thr=0x7fe8c8017390) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:2578
#6  0x0000000000999ae3 in lock_clust_rec_read_check_and_lock (flags=<value optimized out>, block=0x7fec37f04d80, rec=<value optimized out>, index=0x7fe8c801f238, offsets=<value optimized out>, mode=<value optimized out>, gap_mode=1024, thr=0x7fe8c8017390) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/lock/lock0lock.cc:6831
#7  0x0000000000a0b4ff in sel_set_rec_lock (block=0x7fec37f04d80, rec=0x7fedb717c127 "", index=0x7fe8c801f238, offsets=0x7ff571bfe810, mode=<value optimized out>, type=<value optimized out>, thr=0x7fe8c8017390) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:1012
#8  0x0000000000a11679 in row_search_for_mysql (buf=0x7fe8c8015dc0 "\376\020australia\\kevindutler", mode=1, prebuilt=0x7fe8c8016c98, match_mode=<value optimized out>, direction=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/row/row0sel.cc:4506
#9  0x0000000000968857 in ha_innobase::general_fetch (this=0x7fe8c8014090, buf=0x7fe8c8015dc0 "\376\020australia\\kevindutler", direction=1, match_mode=0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/storage/innobase/handler/ha_innodb.cc:8969
#10 0x00000000005c087c in handler::ha_rnd_next (this=0x7fe8c8014090, buf=0x7fe8c8015dc0 "\376\020australia\\kevindutler") at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/handler.cc:2798
#11 0x000000000089fec8 in Rows_log_event::do_table_scan_and_update (this=0x7fe8c800fd80, rli=0x7fe8c800bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:10862
#12 0x00000000008a59df in Rows_log_event::do_apply_event (this=0x7fe8c800fd80, rli=0x7fe8c800bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:11247
#13 0x00000000008a6c38 in Log_event::apply_event (this=0x7fe8c800fd80, rli=0x7fe8c800bed0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/log_event.cc:2997
#14 0x00000000005bd38e in wsrep_apply_events (ctx=0x7fe8c8000990, buf=0x7ff56800b61f, buf_len=454, flags=<value optimized out>, meta=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_applier.cc:150
#15 wsrep_apply_cb (ctx=0x7fe8c8000990, buf=0x7ff56800b61f, buf_len=454, flags=<value optimized out>, meta=<value optimized out>) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_applier.cc:226
#16 0x00007fff8bcfa291 in galera::TrxHandle::apply (this=0x7ff56800b1e0, recv_ctx=0x7fe8c8000990, apply_cb=0x5bcdc0 <wsrep_apply_cb(void*, void const*, size_t, uint32_t, wsrep_trx_meta_t const*)>, meta=...) at galera/src/trx_handle.cpp:304
#17 0x00007fff8bd333f5 in apply_trx_ws (recv_ctx=0x7fe8c8000990, apply_cb=0x5bcdc0 <wsrep_apply_cb(void*, void const*, size_t, uint32_t, wsrep_trx_meta_t const*)>, commit_cb=0x5bc9a0 <wsrep_commit_cb(void*, uint32_t, wsrep_trx_meta_t const*, wsrep_bool_t*, bool)>, trx=..., meta=...) at galera/src/replicator_smm.cpp:39
#18 0x00007fff8bd34263 in galera::ReplicatorSMM::apply_trx (this=0x1471b50, recv_ctx=0x7fe8c8000990, trx=0x7ff56800b1e0) at galera/src/replicator_smm.cpp:419
#19 0x00007fff8bd3efa2 in galera::ReplicatorSMM::recv_IST (this=0x1471b50, recv_ctx=0x7fe8c8000990) at galera/src/replicator_str.cpp:760
#20 0x00007fff8bd2f10f in galera::ReplicatorSMM::async_recv (this=0x1471b50, recv_ctx=0x7fe8c8000990) at galera/src/replicator_smm.cpp:357
#21 0x00007fff8bd43ba3 in galera_recv (gh=<value optimized out>, recv_ctx=<value optimized out>) at galera/src/wsrep_provider.cpp:231
#22 0x00000000005be11f in wsrep_replication_process (thd=0x7fe8c8000990) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/wsrep_thd.cc:309
#23 0x00000000005ae42e in start_wsrep_THD (arg=0x5be0d0) at /usr/src/debug/Percona-XtraDB-Cluster-5.6.15/sql/mysqld.cc:5502
#24 0x00007ffff7bc9ddb in start_thread () from /lib64/libpthread.so.0
#25 0x00007ffff60f1a1d in clone () from /lib64/libc.so.6

The index is a clustered index created automatically by InnoDB for tables that do not have implicit primary key

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-05-09:

#11

This was definitely not fixed in 5.6.15-25.5.

See my comment https://bugs.launchpad.net/codership-mysql/+bug/1264809/comments/6 which shows that I had a crash even after upgrading to 5.6.15-25.5

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-05-13:

#12

@Ovais,

I tested the latest PXC build (with galera 3.5, which was also not available when you tested) and was not able to crash the server.

I used a simplified form of https://github.com/codership/galera/blob/3.x/tests/regressions/lp1019473/run.sh here: https://gist.github.com/fc7f0a3a40c0823d8624

I used the binary equivalent of http://www.percona.com/downloads/TESTING/Percona-XtraDB-Cluster-56/5.6.15-25.5/5.6/771/ and http://www.percona.com/downloads/TESTING/Percona-XtraDB-Cluster-galera-56/galera-3.x/216/

I tested with upto 16 slave threads.

If the test doesn't fit the pattern that you were testing, we can open a
separate bug (since this bug is about regression of lp:1019473), and provide a
testcase if any.

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-05-14:

#13

The test case does fit the patter. Could galera 3.5 have anything to do with preventing the crash? And you are right, when I had tested galera 3.5 was not available.

Revision history for this message

Alex Yurchenko (ayurchen) wrote on 2014-05-20:

#14

This is unlikely to do anything with Galera version. Could it be that you tested with wrong MySQL-wsrep or PXC release?

Revision history for this message

Ovais Tariq (ovais-tariq) wrote on 2014-05-20:

#15

@Alex,

So initially the customer was running Percona XtraDB Cluster 5.6.15-25.4 when he hit this bug, this is before my comment #6. After Percona XtraDB Cluster 5.6.15-25.5 was released on 3/20, I upgraded the nodes to 5.6.15-25.5 and still had the crash.
Comment #6 and following comments are from crashes on 5.6.15-25.5.

This bug is being hit perhaps in a different way?
In comment #1 Seppo indicated that "if mysql would use full table scan, we would not hit this issue".
However, based on the crash that I had and the stack traces in comment #8 and #9, we can see that MySQL is doing a full table scan, but still hits this issue.

Unfortunately, I do not have the specific test case anymore. The customer added primary keys to the tables that were hit with the issue, so I won't be able to reproduce it anymore.

I suppose you might get some required information from the attached gdb back traces?

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2014-05-22:

#16

One more thing, the PXC that I used for regression test was slightly ahead of
25.5 release (with a few fixes for unique keys both in PXC and Galera), so you may want to give it a try, preferrably
on test environment (if any) wherein it was crashing earlier.

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-18:

#17

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1565

	Status	Importance	Assigned to	Milestone
MySQL patches by Codership	Status tracked in 5.6
5.5	Fix Released	High	Seppo Jaakola	MySQL patches by Codership 5.5.37-25.10
5.6	Fix Released	High	Seppo Jaakola	MySQL patches by Codership 5.6.16-25.5
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC	Status tracked in 5.6
5.5	Fix Released	Undecided	Unassigned	Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC 5.5.37-25.10
5.6	Fix Released	Undecided	Unassigned	Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC 5.6.15-25.5

MySQL patches by Codership

regression test for lp:1019473 causes server abort

Bug Description

Related branches

Other bug subscribers

Bug attachments

Remote bug watches