MySQL patches by Codership

Server crashes when wsrep_cluster_address is changed and wsrep_sst_method is not mysqldump

Bug #600250 reported by Alex Yurchenko on 2010-06-30

This bug affects 1 person

Affects                      Status         Importance   Assigned to      Milestone
MySQL patches by Codership   Fix Released   Undecided    Alex Yurchenko   MySQL patches by Codership 0.8

Bug Description

The current code crashes the server when wsrep_cluster_address is changed: one thread tries to initialize the wsrep provider while another tries to shut down mysqld, because a non-mysqldump snapshot is not supported once the storage engines are initialized.

Error log:
100629 16:21:14 [Note] WSREP: Stop replication
100629 16:21:14 [Note] WSREP: Closing send monitor...
100629 16:21:14 [Note] WSREP: Closed send monitor.
100629 16:21:14 [Note] WSREP: gcomm: terminating thread
100629 16:21:14 [Note] WSREP: gcomm: joining thread
100629 16:21:14 [Note] WSREP: gcomm: closing backend
100629 16:21:14 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
100629 16:21:14 [Warning] WSREP: socket in state 0
100629 16:21:14 [Note] WSREP: gcomm: closed
100629 16:21:14 [Note] WSREP: Flow-control interval: [0, 1]
100629 16:21:14 [Note] WSREP: Received NON-PRIMARY.
100629 16:21:14 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 565558)
100629 16:21:14 [Note] WSREP: Received self-leave message.
100629 16:21:14 [Note] WSREP: Flow-control interval: [-6917529027641081856, -9223372036854775808]
100629 16:21:14 [Note] WSREP: Received SELF-LEAVE. Closing connection.
100629 16:21:14 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 565558)
100629 16:21:14 [Note] WSREP: RECV thread exiting 0: Success
100629 16:21:14 [Note] WSREP: recv_thread() joined.
100629 16:21:14 [Note] WSREP: Closing slave action queue.
100629 16:21:14 [Note] WSREP: Closed GCS connection
100629 16:21:14 [Note] WSREP: gcs_recv() returned -77 (File descriptor in bad state)
...
100629 16:21:14 [Note] wsrep recv thread exiting (code:5)
100629 16:21:14 [Note] WSREP starting shutdown
...
100629 16:21:14 [Note] /tmp/galera/mysql/libexec/mysqld: Normal shutdown

100629 16:21:14 [Note] WSREP: gcs_recv() returned -77 (File descriptor in bad state)
100629 16:21:14 [Note] WSREP: Stop replication
...
100629 16:21:14 [Note] wsrep recv thread exiting (code:5)
100629 16:21:14 [Note] WSREP: mm_galera_recv(): return 0
100629 16:21:14 [Note] wsrep recv thread exiting (code:0)
100629 16:21:16 [Note] WSREP: rollbacker thread exiting
100629 16:21:16 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 565558)
...
*** glibc detected *** /tmp/galera/mysql/libexec/mysqld: corrupted double-linked list: 0x000000000ddf3370 ***
100629 16:21:16 [Note] WSREP:
100629 16:21:16 [Note] WSREP: Start replication
100629 16:21:16 [Note] WSREP: Provider options: log_debug = 0; persistent_writesets = 0; local_cache_size = 20971520; dbug_spec = ;
100629 16:21:16 [Note] WSREP: Configured state: 7cbe6e26-83c6-11df-0800-ef472a8480ef:565558

Changing wsrep_cluster_address should be supported unless a snapshot is really needed afterwards (when we only want to get out of a split brain, it is not).

Suggested solution: mimic a successful SST (state snapshot transfer), since we already have the state UUID and state seqno, and rely on the provider logic to discover whether the current state UUID and seqno fail to correspond to the group's.
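
To illustrate the idea, here is a minimal Python sketch of the comparison the suggestion relies on. It is not the actual provider code: the names ReplicationState, parse_state, state_matches_group and on_cluster_address_change are hypothetical, and treating "corresponds" as "same UUID and same seqno" is an assumption made for illustration. The state string is the "Configured state" value from the error log above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationState:
    """A replication position: state UUID plus sequence number (hypothetical type)."""
    uuid: str
    seqno: int


def parse_state(text: str) -> ReplicationState:
    """Parse a 'UUID:seqno' string as printed in the 'Configured state' log line."""
    uuid, seqno = text.rsplit(":", 1)
    return ReplicationState(uuid=uuid, seqno=int(seqno))


def state_matches_group(local: ReplicationState, group: ReplicationState) -> bool:
    """Assumption: the states correspond when both UUID and seqno are equal."""
    return local.uuid == group.uuid and local.seqno == group.seqno


def on_cluster_address_change(local: ReplicationState, group: ReplicationState) -> str:
    """Hypothetical handler for a changed cluster address.

    Rather than shutting the server down because a non-mysqldump snapshot
    cannot be taken, report the local state as if an SST had just completed
    and let the comparison decide whether a real state transfer is needed.
    """
    if state_matches_group(local, group):
        return "continue"
    return "state transfer required"


local = parse_state("7cbe6e26-83c6-11df-0800-ef472a8480ef:565558")
same_group = ReplicationState(local.uuid, 565558)
other_group = ReplicationState("00000000-0000-0000-0000-000000000000", 1)

print(on_cluster_address_change(local, same_group))
print(on_cluster_address_change(local, other_group))
```

In the split-brain case mentioned above, the node's state would match the group's and no snapshot would be requested; a mismatch is left for the provider to detect and handle.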

Alex Yurchenko (ayurchen) on 2010-06-30

Changed in codership-mysql:
assignee:	nobody → Alex Yurchenko (ayurchen)
milestone:	none → 0.8
status:	New → Confirmed

Alex Yurchenko (ayurchen) wrote on 2011-06-21:

duplicate of lp:711993

Changed in codership-mysql:
status:	Confirmed → Fix Released
