exclusive key does not depend on shared key

Bug #1036774 reported by Teemu Ollakka on 2012-08-14
This bug affects 4 people

Affects                  Status        Importance  Assigned to    Milestone
Galera                   Fix Released  High        Teemu Ollakka  23.2.2
Percona XtraDB Cluster   Fix Released  Undecided   Unassigned     percona-xtradb-cluster-5.5.27
  (moved to https://jira.percona.com/projects/PXC)

Bug Description

When an exclusive key matches a key in the certification index, it should always check for a shared reference (even if an exclusive reference exists) and update the dependency accordingly.

This bug causes some failures in regression tests for https://bugs.launchpad.net/codership-mysql/+bug/1013978
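
For illustration only, here is a minimal C++ sketch of the dependency rule described above. This is not the actual Galera code; every type and member name (KeyEntry, update_dependency, etc.) is an assumption made for the example.
===
#include <algorithm>
#include <cstdio>

typedef long long seqno_t;

// One entry in the certification index for a given key (illustrative only).
struct KeyEntry
{
    seqno_t last_exclusive; // seqno of the last trx holding an exclusive reference
    seqno_t last_shared;    // seqno of the last trx holding a shared reference
};

// Update the "depends on" seqno for a trx adding an exclusive reference to a
// key already present in the index. The point of the fix as described above:
// the shared reference must be consulted even when an exclusive one exists.
seqno_t update_dependency(const KeyEntry& entry, seqno_t depends_seqno)
{
    depends_seqno = std::max(depends_seqno, entry.last_exclusive);
    depends_seqno = std::max(depends_seqno, entry.last_shared); // must not be skipped
    return depends_seqno;
}

int main()
{
    KeyEntry entry = { /* last_exclusive */ 10, /* last_shared */ 15 };
    // Without the shared-reference check the dependency would stop at 10.
    std::printf("depends on seqno %lld\n", update_dependency(entry, 5)); // prints 15
    return 0;
}
===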

Related branches

Changed in galera:
milestone: none → 23.2.2
importance: Undecided → High
Teemu Ollakka (teemu-ollakka) wrote :
Changed in galera:
assignee: nobody → Teemu Ollakka (teemu-ollakka)
status: New → Fix Committed
Changed in percona-xtradb-cluster:
milestone: none → percona-xtradb-cluster-5.5.27
status: New → Fix Released

Well, after the fix was applied, the server started to crash on replication with the following messages in the log:
===
120911 14:03:08 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.27-log' socket: '/home/b_mgl/mysql.sock' port: 4001 Percona XtraDB Cluster (GPL), wsrep_23.6.r356
120911 14:03:49 [Note] WSREP: ready state reached
120911 14:03:49 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000589' at position 691502044, relay log './slave-bin.000472' position: 442999719
120911 14:03:49 [Note] Slave I/O thread: connected to master 'replicant@192.168.194.185:4001',replication started in log 'master-bin.000589' at position 691502444
04:03:49 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=10
max_threads=151
thread_count=10
connection_count=10
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338675 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2ac724000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

stack_bottom = 2ac72003fe78 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7c5fb5]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x6a00f4]
/lib64/libpthread.so.0(+0xf500)[0x2ac58f33f500]
/usr/sbin/mysqld(wsrep_append_foreign_key+0xa2)[0x816cc2]
/usr/sbin/mysqld[0x84dc80]
/usr/sbin/mysqld[0x85100e]
/usr/sbin/mysqld[0x85218a]
/usr/sbin/mysqld[0x83ba01]
/usr/sbin/mysqld[0x81bc2f]
/usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0x5e)[0x6a4aee]
/usr/sbin/mysqld(_ZN21Delete_rows_log_event11do_exec_rowEPK14Relay_log_info+0x148)[0x7428f8]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x22d)[0x7480fd]
/usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x125)[0x5317b5]
/usr/sbin/mysqld[0x535af7]
/usr/sbin/mysqld(handle_slave_sql+0xa45)[0x537025]
/lib64/libpthread.so.0(+0x7851)[0x2ac58f337851]
/lib64/libc.so.6(clone+0x6d)[0x2ac5900ff11d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0):
Connection ID (thread ID): 12
Status: NOT_KILLED
===

Seppo Jaakola (seppo-jaakola) wrote :

The crash is in the slave SQL thread; are you using both MySQL replication and Galera replication? Please show your configuration and an overview of your cluster topology.

Galera Cluster is compatible with MySQL replication in *certain* use cases. However, the server should not crash even if the configuration is not supported.


> The crash is in slave SQL thread, are you using both mysql replication and Galera replication?

Yes, I am. However, the crash happens even when there is just one Galera node present (the one that is a slave to the master running the original MySQL 5.5).

> Please show your configuration and overview of your cluster topology.

It's quite simple. I was trying to migrate from master-slave to Galera-based replication, so currently I have 3 nodes:

1. node a is the original master;
2. node b is the original slave, where I replaced MySQL with Percona XtraDB Cluster and enabled replication to synchronise with the master;
3. node c is an exact copy of node b with a slightly updated my.cnf to make it work with Galera.

The relevant parts of my.cnf from nodes b and c follow:

=== node b ===
[mysqld]
innodb_buffer_pool_size=4G
innodb_file_per_table
innodb_flush_log_at_trx_commit=0
innodb_log_file_size=128M
innodb_flush_method=O_DIRECT
table_open_cache=2048
query_cache_size=32M
read_rnd_buffer_size = 1M
binlog-format=ROW
report-host=xtradb-1
relay-log=slave-bin
log-bin=master-bin
log-slave-updates
server-id=10011
expire_logs_days=4

wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider_options ="gmcast.listen_addr=tcp://0.0.0.0:5001;ist.recv_addr=192.168.5.1:6001; "
wsrep_cluster_name=mgl_cluster
wsrep_node_name=xtradb-1
wsrep_node_address=192.168.5.1
wsrep_slave_threads=8
wsrep_sst_method=xtrabackup
wsrep_sst_receive_address=192.168.5.1:7001

default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1

[mysqld_safe]
wsrep_urls=gcomm://192.168.5.1:5001,gcomm://192.168.5.2:5001,gcomm://
===

node c's my.cnf is the same except for the following bits:
== node c ===
#log-slave-updates
wsrep_provider_options ="gmcast.listen_addr=tcp://0.0.0.0:5001;ist.recv_addr=192.168.5.2:6001; "
wsrep_node_name=xtradb-2
wsrep_node_address=192.168.5.2
wsrep_sst_receive_address=192.168.5.2:7001
===

Node a is running MySQL 5.5.18 and has the exact same config file (except for the Galera-specific bits).

Now, with just node a running, I start node b and wait until it has successfully initialized. Then I start MySQL replication to synchronize node b with the master (node a). At first I was getting a lot of the following messages:
===
WSREP: skipping FK key append
===

and then, at some point, I got:
===

14:11:22 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=11
max_threads=151
thread_count=9
connection_count=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338675 K bytes of memory
Hope that's ok; if not, decrease some vari...
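
Purely as a speculative illustration: the crash frame (wsrep_append_foreign_key, reached from Delete_rows_log_event::do_exec_row via handler::ha_delete_row) suggests the FK-key append path was entered by the async slave SQL thread without the context it expects. The sketch below shows the general shape of a defensive guard corresponding to the "WSREP: skipping FK key append" notice in the log. It is not the actual server code; every name except wsrep_append_foreign_key (which appears in the stack trace) is an assumption.
===
#include <cstddef>

struct THD;        // server thread descriptor (opaque in this sketch)
struct ForeignKey; // FK metadata (opaque in this sketch)

// Assumed helpers, not real MySQL/Galera APIs.
bool wsrep_thd_has_trx_context(const THD* thd) { return thd != NULL; }
bool append_key_parts(THD*, const ForeignKey*) { return true; }

// Sketch of the guard one would expect at the top of the FK key append path.
int wsrep_append_foreign_key(THD* thd, const ForeignKey* fk)
{
    // An event applied by the asynchronous slave SQL thread may have no wsrep
    // transaction context at all; bail out instead of dereferencing it.
    if (thd == NULL || fk == NULL || !wsrep_thd_has_trx_context(thd))
    {
        // The "WSREP: skipping FK key append" case from the log above.
        return 0;
    }
    return append_key_parts(thd, fk) ? 0 : 1;
}

int main()
{
    // Exercising the guard with no thread context returns cleanly.
    return wsrep_append_foreign_key(NULL, NULL);
}
===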


Changed in galera:
status: Fix Committed → Fix Released

Alex,

Could you share the commit reference, please? I see that the packages at Percona haven't been updated yet. I've also tried to find the corresponding fix in the repository, but failed to locate the commit. Thanks!

Alex Yurchenko (ayurchen) wrote :

Dmitry, please see comment #1 above: https://bugs.launchpad.net/galera/+bug/1036774/comments/1

The fix was committed in revision 130. If you mean the problem with the slave thread crash that you reported above, I can't see how it can be related.

If you're still having that crash, please try the same test with the latest code from lp:codership-mysql/5.5.

Seppo Jaakola (seppo-jaakola) wrote :

Dmitry, I'm working on bug https://bugs.launchpad.net/galera/+bug/1078346, which shows a quite similar stack trace (except that it happens through a direct client connection).

Can you reproduce this slave crash? Do you have the relay log files from the time of the crash available?

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1240
