sometimes IST does not start

Bug #985747 reported by Alex Yurchenko
This bug affects 2 people
Affects: Galera
Importance: High
Assigned to: Teemu Ollakka

Bug Description

Reported in https://groups.google.com/forum/?fromgroups#!topic/codership-team/_5_cPHOEVtI

- node1 : named cygnus, the node that was stopped and is now coming back as Joiner; it should properly receive IST.
- node2 : named laurane, the node that was chosen as Donor.

node1's shutdown was initiated at 14:46:47 as the log shows.

node1's log :

120419 14:30:02 [Note] WSREP: MDL conflict -> BF abort
request: (2 seqno 5154113 mode 1 Qstate 1 cmd 0 8 TRUNCATE TABLE poller_time)
granted: (583918 seqno 5154114 mode 0 Qstate 1 cmd 3 5 INSERT INTO poller_time (poller_id, pid, start_time, end_time) VALUES (0, 19630, NOW()

120419 14:46:47 [Note] /opt/mysql-galera/bin/mysqld: Normal shutdown

120419 14:46:47 [Note] WSREP: Stop replication
120419 14:46:47 [Note] WSREP: Closing send monitor...
120419 14:46:47 [Note] WSREP: Closed send monitor.
120419 14:46:47 [Note] WSREP: gcomm: terminating thread
120419 14:46:47 [Note] WSREP: gcomm: joining thread
120419 14:46:47 [Note] WSREP: gcomm: closing backend
120419 14:46:47 [Note] WSREP: evs::proto(2d6df99a-886e-11e1-0800-15e7fc053cbb, LEAVING, view_id(REG,2d6df99a-886e-11e1-0800-15e7fc053cbb,75)) uuid c61c9f05-8914-11e1-0800-60eb1083cb96 missing from install message, assuming partitioned
120419 14:46:47 [Note] WSREP: evs::proto(2d6df99a-886e-11e1-0800-15e7fc053cbb, LEAVING, view_id(REG,2d6df99a-886e-11e1-0800-15e7fc053cbb,75)) uuid d2e4fb87-8870-11e1-0800-38327ad7aba0 missing from install message, assuming partitioned
120419 14:46:47 [Note] WSREP: GMCast::handle_stable_view: view(view_id(NON_PRIM,2d6df99a-886e-11e1-0800-15e7fc053cbb,75) memb {
        2d6df99a-886e-11e1-0800-15e7fc053cbb,
} joined {
} left {
} partitioned {
        c61c9f05-8914-11e1-0800-60eb1083cb96,
        d2e4fb87-8870-11e1-0800-38327ad7aba0,
})
120419 14:46:47 [Note] WSREP: New COMPONENT: primary = no, my_idx = 0, memb_num = 1
120419 14:46:47 [Note] WSREP: GMCast::handle_stable_view: view((empty))
120419 14:46:47 [Note] WSREP: gcomm: closed
120419 14:46:47 [Note] WSREP: Flow-control interval: [8, 16]
120419 14:46:47 [Note] WSREP: Received NON-PRIMARY.
120419 14:46:47 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 5160086)
120419 14:46:47 [Note] WSREP: Received self-leave message.
120419 14:46:47 [Note] WSREP: Flow-control interval: [0, 0]
120419 14:46:47 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120419 14:46:47 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 5160086)
120419 14:46:47 [Note] WSREP: RECV thread exiting 0: Success
120419 14:46:47 [Note] WSREP: recv_thread() joined.
120419 14:46:47 [Note] WSREP: Closing slave action queue.
120419 14:46:47 [Note] WSREP: New cluster view: global state: 61408137-81c8-11e1-0800-5dbd477990e1:5160086, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 1
120419 14:46:48 [Note] WSREP: New cluster view: global state: 61408137-81c8-11e1-0800-5dbd477990e1:5160086, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 1
120419 14:46:48 [Note] WSREP: applier thread exiting (code:0)
120419 14:46:49 [Note] WSREP: SST kill local trx: 588248
120419 14:46:49 [Note] WSREP: rollbacker thread exiting
120419 14:46:49 [Note] Event Scheduler: Purging the queue. 0 events
120419 14:46:49 [Note] WSREP: dtor state: CLOSED
120419 14:46:49 [Note] WSREP: mon: entered 3 oooe fraction 0 oool fraction 0
120419 14:46:49 [Note] WSREP: mon: entered 3 oooe fraction 0 oool fraction 0
120419 14:46:49 [Note] WSREP: mon: entered 930739 oooe fraction 0 oool fraction 0.000114962
120419 14:46:49 [Note] WSREP: cert index usage at exit 804
120419 14:46:49 [Note] WSREP: cert trx map usage at exit 309
120419 14:46:49 [Note] WSREP: deps set usage at exit 0
120419 14:46:49 [Note] WSREP: avg deps dist 330.854
120419 14:46:49 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
120419 14:46:49 [Note] WSREP: Shifting CLOSED -> DESTROYED (TO: 5160086)
120419 14:46:49 [Note] WSREP: Flushing memory map to disk...
120419 14:46:50 [Note] unregister_replicator OK
120419 14:46:50 InnoDB: Starting shutdown...
120419 14:46:53 InnoDB: Shutdown completed; log sequence number 12275710641
120419 14:46:53 [Note] /opt/mysql-galera/bin/mysqld: Shutdown complete

120419 14:46:54 mysqld_safe mysqld from pid file /opt/mysql-galera/data/mysqld.pid ended
120419 14:47:38 mysqld_safe Starting mysqld daemon with databases from /opt/mysql-galera/data
120419 14:47:38 [Note] WSREP: wsrep_load(): loading provider library '/opt/mysql-galera/lib/plugin/libgalera_smm.so'
120419 14:47:38 [Note] WSREP: wsrep_load(): Galera 2.0(rXXXX) by Codership Oy <email address hidden> loaded succesfully.
120419 14:47:38 [Note] WSREP: Reusing existing '/opt/mysql-galera/data//galera.cache'.
120419 14:47:38 [Note] WSREP: Passing config to GCS: gcache.dir = /opt/mysql-galera/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /opt/mysql-galera/data//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
120419 14:47:38 [Note] WSREP: wsrep_sst_grab()
120419 14:47:38 [Note] WSREP: Start replication
120419 14:47:38 [Note] WSREP: Found saved state: 61408137-81c8-11e1-0800-5dbd477990e1:5160086
120419 14:47:38 [Note] WSREP: Assign initial position for certification: 5160086, protocol version: -1
120419 14:47:38 [Note] WSREP: Setting initial position to 61408137-81c8-11e1-0800-5dbd477990e1:5160086
120419 14:47:38 [Note] WSREP: protonet asio version 0
120419 14:47:38 [Note] WSREP: backend: asio
120419 14:47:38 [Note] WSREP: GMCast version 0
120419 14:47:38 [Note] WSREP: (d86affd3-8a1d-11e1-0800-5ea3f3e2042e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
120419 14:47:38 [Note] WSREP: (d86affd3-8a1d-11e1-0800-5ea3f3e2042e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
120419 14:47:38 [Note] WSREP: EVS version 0
120419 14:47:38 [Note] WSREP: PC version 0
120419 14:47:38 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.0.5:4567'
120419 14:47:39 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,c61c9f05-8914-11e1-0800-60eb1083cb96,77) memb {
        c61c9f05-8914-11e1-0800-60eb1083cb96,
        d2e4fb87-8870-11e1-0800-38327ad7aba0,
        d86affd3-8a1d-11e1-0800-5ea3f3e2042e,
} joined {
} left {
} partitioned {
})
120419 14:47:39 [Note] WSREP: declaring c61c9f05-8914-11e1-0800-60eb1083cb96 stable
120419 14:47:39 [Note] WSREP: declaring d2e4fb87-8870-11e1-0800-38327ad7aba0 stable
120419 14:47:39 [Note] WSREP: gcomm: connected
120419 14:47:39 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
120419 14:47:39 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
120419 14:47:39 [Note] WSREP: Opened channel 'my_wsrep_cluster'
120419 14:47:39 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 2, memb_num = 3
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120419 14:47:39 [Note] WSREP: Waiting for SST to complete.
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: sent state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 0 (laurane.lolomin.net)
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 1 (vmfedora1)
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 2 (cygnus.lolomin.net)
120419 14:47:39 [Note] WSREP: Quorum results:
        version = 2,
        component = PRIMARY,
        conf_id = 63,
        members = 2/3 (joined/total),
        act_id = 5160427,
        last_appl. = -1,
        protocols = 0/3/1 (gcs/repl/appl),
        group UUID = 61408137-81c8-11e1-0800-5dbd477990e1
120419 14:47:39 [Note] WSREP: Flow-control interval: [14, 28]
120419 14:47:39 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 5160427)
120419 14:47:39 [Note] WSREP: State transfer required:
        Group state: 61408137-81c8-11e1-0800-5dbd477990e1:5160427
        Local state: 61408137-81c8-11e1-0800-5dbd477990e1:5160086
120419 14:47:39 [Note] WSREP: New cluster view: global state: 61408137-81c8-11e1-0800-5dbd477990e1:5160427, view# 64: Primary, number of nodes: 3, my index: 2, protocol version 1
120419 14:47:39 [Warning] WSREP: Gap in state sequence. Need state transfer.
120419 14:47:41 [Note] WSREP: Running: 'wsrep_sst_rsync 'joiner' '192.168.0.1' 'sst:5T13wPid' '/opt/mysql-galera/data/' '/etc/my-galera.cnf' '6751' 2>sst.err'
120419 14:47:42 [Note] WSREP: Prepared SST request: rsync|192.168.0.1:4444/rsync_sst
ERROR 2003 (HY000): Can't connect to MySQL server on 'cygnus' (111)
120419 14:47:42 [Note] WSREP: Assign initial position for certification: 5160427, protocol version: 2
120419 14:47:42 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.0.1:4568
120419 14:47:42 [Note] WSREP: Node 2 (cygnus.lolomin.net) requested state transfer from '*any*'. Selected 0 (laurane.lolomin.net)(SYNCED) as donor.
120419 14:47:42 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 5160434)
120419 14:47:42 [Note] WSREP: Requesting state transfer: success, donor: 0
120419 14:47:44 [Note] WSREP: SST complete, seqno: 5160086
120419 14:47:44 InnoDB: The InnoDB memory heap is disabled
120419 14:47:44 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120419 14:47:44 InnoDB: Compressed tables use zlib 1.2.3
120419 14:47:44 InnoDB: Using Linux native AIO
120419 14:47:44 InnoDB: Initializing buffer pool, size = 512.0M
120419 14:47:44 InnoDB: Completed initialization of buffer pool
120419 14:47:44 InnoDB: highest supported file format is Barracuda.
120419 14:47:45 InnoDB: Waiting for the background threads to start
120419 14:47:46 InnoDB: 1.1.8 started; log sequence number 12275710641
120419 14:47:47 [Note] Event Scheduler: Loaded 0 events
120419 14:47:47 [Note] WSREP: Signalling provider to continue.
120419 14:47:47 [Note] WSREP: Received SST: 61408137-81c8-11e1-0800-5dbd477990e1:5160086
120419 14:47:47 [Note] /opt/mysql-galera/bin/mysqld: ready for connections.
Version: '5.5.20-log' socket: '/opt/mysql-galera/data/mysql.sock' port: 3306 Source distribution, wsrep_23.4.r3713
120419 14:47:47 [Note] WSREP: SST received: 61408137-81c8-11e1-0800-5dbd477990e1:5160086
120419 14:47:47 [Note] WSREP: Receiving IST: 341 writesets, seqnos 5160086-5160427

node2's log :

120419 10:50:02 [Note] WSREP: MDL conflict -> BF abort
request: (1 seqno 5080636 mode 1 Qstate 1 cmd 0 8 TRUNCATE TABLE poller_time)
granted: (346474 seqno 5080637 mode 0 Qstate 1 cmd 3 5 INSERT INTO poller_time (poller_id, pid, start_time, end_time) VALUES (0, 14193, NOW()
120419 14:46:47 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,c61c9f05-8914-11e1-0800-60eb1083cb96,76) memb {
        c61c9f05-8914-11e1-0800-60eb1083cb96,
        d2e4fb87-8870-11e1-0800-38327ad7aba0,
} joined {
} left {
} partitioned {
        2d6df99a-886e-11e1-0800-15e7fc053cbb,
})
120419 14:46:47 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 2
120419 14:46:47 [Note] WSREP: forgetting 2d6df99a-886e-11e1-0800-15e7fc053cbb (tcp://192.168.0.1:4567)
120419 14:46:47 [Note] WSREP: declaring d2e4fb87-8870-11e1-0800-38327ad7aba0 stable
120419 14:46:47 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ba05ef61-8a1d-11e1-0800-058e29c0df9f
120419 14:46:47 [Note] WSREP: STATE EXCHANGE: sent state msg: ba05ef61-8a1d-11e1-0800-058e29c0df9f
120419 14:46:47 [Note] WSREP: STATE EXCHANGE: got state msg: ba05ef61-8a1d-11e1-0800-058e29c0df9f from 0 (laurane.lolomin.net)
120419 14:46:47 [Note] WSREP: STATE EXCHANGE: got state msg: ba05ef61-8a1d-11e1-0800-058e29c0df9f from 1 (vmfedora1)
120419 14:46:47 [Note] WSREP: Quorum results:
        version = 2,
        component = PRIMARY,
        conf_id = 62,
        members = 2/2 (joined/total),
        act_id = 5160086,
        last_appl. = 5159779,
        protocols = 0/3/1 (gcs/repl/appl),
        group UUID = 61408137-81c8-11e1-0800-5dbd477990e1
120419 14:46:47 [Note] WSREP: Flow-control interval: [12, 23]
120419 14:46:47 [Note] WSREP: New cluster view: global state: 61408137-81c8-11e1-0800-5dbd477990e1:5160086, view# 63: Primary, number of nodes: 2, my index: 0, protocol version 1
120419 14:46:48 [Note] WSREP: Assign initial position for certification: 5160086, protocol version: 2
120419 14:46:52 [Note] WSREP: cleaning up 2d6df99a-886e-11e1-0800-15e7fc053cbb (tcp://192.168.0.1:4567)
120419 14:47:39 [Note] WSREP: GMCast::handle_stable_view: view(view_id(PRIM,c61c9f05-8914-11e1-0800-60eb1083cb96,77) memb {
        c61c9f05-8914-11e1-0800-60eb1083cb96,
        d2e4fb87-8870-11e1-0800-38327ad7aba0,
        d86affd3-8a1d-11e1-0800-5ea3f3e2042e,
} joined {
} left {
} partitioned {
})
120419 14:47:39 [Note] WSREP: New COMPONENT: primary = yes, my_idx = 0, memb_num = 3
120419 14:47:39 [Note] WSREP: declaring d2e4fb87-8870-11e1-0800-38327ad7aba0 stable
120419 14:47:39 [Note] WSREP: declaring d86affd3-8a1d-11e1-0800-5ea3f3e2042e stable
120419 14:47:39 [Note] WSREP: STATE_EXCHANGE: sent state UUID: d911f3b9-8a1d-11e1-0800-94d10f7c556a
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: sent state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 0 (laurane.lolomin.net)
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 1 (vmfedora1)
120419 14:47:39 [Note] WSREP: STATE EXCHANGE: got state msg: d911f3b9-8a1d-11e1-0800-94d10f7c556a from 2 (cygnus.lolomin.net)
120419 14:47:39 [Note] WSREP: Quorum results:
        version = 2,
        component = PRIMARY,
        conf_id = 63,
        members = 2/3 (joined/total),
        act_id = 5160427,
        last_appl. = 5160161,
        protocols = 0/3/1 (gcs/repl/appl),
        group UUID = 61408137-81c8-11e1-0800-5dbd477990e1
120419 14:47:39 [Note] WSREP: Flow-control interval: [14, 28]
120419 14:47:39 [Note] WSREP: New cluster view: global state: 61408137-81c8-11e1-0800-5dbd477990e1:5160427, view# 64: Primary, number of nodes: 3, my index: 0, protocol version 1
120419 14:47:40 [Note] WSREP: Assign initial position for certification: 5160427, protocol version: 2
120419 14:47:42 [Note] WSREP: Node 2 (cygnus.lolomin.net) requested state transfer from '*any*'. Selected 0 (laurane.lolomin.net)(SYNCED) as donor.
120419 14:47:42 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 5160434)
120419 14:47:42 [Note] WSREP: IST request: 61408137-81c8-11e1-0800-5dbd477990e1:5160086-5160427|tcp://192.168.0.1:4568
120419 14:47:43 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '192.168.0.1:4444/rsync_sst' 'sst:5T13wPid' '/opt/mysql-galera/data/' '/etc/my-galera.cnf' '61408137-81c8-11e1-0800-5dbd477990e1' '5160086' '1''
120419 14:47:43 [Note] WSREP: sst_donor_thread signaled with 0
120419 14:47:43 [Note] WSREP: async IST sender starting to serve tcp://192.168.0.1:4568 sending 5160087-5160427

node1's netstat :
# netstat -lpvnt | grep -iE "(3306|4567|4568)"
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 6751/mysqld
tcp 0 0 192.168.0.1:3306 0.0.0.0:* LISTEN 6751/mysqld

node2's netstat :
# netstat -lpvnt | grep -iE "(3306|4567|4568)"
tcp 0 0 192.168.0.5:3306 0.0.0.0:* LISTEN 23714/mysqld
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 23714/mysqld
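
Note that node1's output lists ports 3306 and 4567 but not 4568, even though its log claims "Prepared IST receiver, listening at: tcp://192.168.0.1:4568". A quick probe from the donor can confirm whether the IST port is actually reachable; a minimal sketch (hypothetical helper, not part of Galera):

```python
import socket

def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from the donor: port_open("192.168.0.1", 4568) should be True once the
# joiner has really opened its IST receiver; in this report it stays closed,
# matching the missing 4568 line in netstat.
```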


Changed in galera:
importance: Undecided → High
milestone: none → 23.2.1
status: New → Confirmed
Revision history for this message
Henrik Ingo (hingo) wrote :

Just out of curiosity: It seems Galera is trying to do both IST and SST at the same time? Is this normal or part of the bug?

Alex Yurchenko (ayurchen) wrote : Re: [Bug 985747] Re: sometimes IST does not start

On 2012-04-20 14:07, Henrik Ingo wrote:
> Just out of curiosity: It seems Galera is trying to do both IST and
> SST at the same time? Is this normal or part of the bug?

It is a part of the protocol:

- Connect
- SST (if needed)
- SE initialization
- IST (if possible)
- ready for connections

If IST is possible, SST is "faked".
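
The sequence above can be sketched as a joiner-side decision. This is an illustrative simplification with hypothetical names, not the actual implementation (the real logic lives inside the Galera provider, keyed on what the donor still holds in its gcache):

```python
def choose_state_transfer(local_seqno, group_seqno, donor_cached_range):
    """Pick IST when the donor still caches every missing writeset,
    otherwise fall back to a full SST.  Simplified illustration only."""
    lo, hi = donor_cached_range  # seqnos still present in the donor's gcache
    missing_first, missing_last = local_seqno + 1, group_seqno
    if missing_first > missing_last:
        return "none"            # already up to date
    if lo <= missing_first and missing_last <= hi:
        return "ist"             # SST is "faked"; only the gap is streamed
    return "sst"                 # gap too old for the gcache: full snapshot

# The joiner in this report was at 5160086 with the group at 5160427, so
# with those writesets still cached an IST of 341 writesets suffices.
print(choose_state_transfer(5160086, 5160427, (5150000, 5160427)))  # ist
```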

Revision history for this message
Laurent Minost (lolomin) wrote :

Hi,

I'm facing this bug very frequently when restarting our nodes; it does not happen every time, but roughly 80% of the time. The most problematic behaviour is that the two nodes then stall in a state where each seems to wait for the other for the IST to finish, so both nodes become unavailable, which requires direct human action to recover. As new information: on the node that should receive the IST, I have to kill -9 mysqld to recover, because it cannot stop properly by itself.

node1 (receiver/recovering node) :
...
Version: '5.5.20-log' socket: '/opt/mysql-galera/data/mysql.sock' port: 3306 Source distribution, wsrep_23.4.rXXXX
120426 9:39:06 [Note] WSREP: SST received: 61408137-81c8-11e1-0800-5dbd477990e1:8373272
120426 9:39:06 [Note] WSREP: Receiving IST: 114 writesets, seqnos 8373272-8373386
...

node2 (donor) :
...
120426 9:39:05 [Note] WSREP: IST request: 61408137-81c8-11e1-0800-5dbd477990e1:8373272-8373386|tcp://192.168.0.10:4568
120426 9:39:05 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '192.168.0.10:4444/rsync_sst' 'sst:5T13wPid' '/opt/mysql-galera/data/' '/etc/my-galera.cnf' '61408137-81c8-11e1-0800-5dbd477990e1' '8373272' '1''
120426 9:39:05 [Note] WSREP: sst_donor_thread signaled with 0
120426 9:39:05 [Note] WSREP: async IST sender starting to serve tcp://192.168.0.10:4568 sending 8373273-8373386
...

As I'm able to reproduce the problem "easily" for the moment, don't hesitate to ask if you need any other information or debug traces.

Alex Yurchenko (ayurchen) wrote :

Hi Laurent,

We have identified one bug in the IST handshake serialisation and are working on fixing that.

When IST fails in your cluster, does it happen between _identical_ or different Galera binaries? By identical I mean libgalera_smm.so having the same checksum. If it is always different, then we might be fixing it.

If not, could you create a 2-node cluster where this can be easily reproduced and give us "sufficient" access to it? (By "sufficient" I mean enough privileges to gdb the mysqld process and run tcpdump)

Regards,
Alex

Laurent Minost (lolomin) wrote :

Hi Alexei,

Here are the md5sums of each Galera lib:

[root@cygnus plugin]# md5sum libgalera_smm.so
581e4b135ad57ca1bacdb98a1f666dac libgalera_smm.so

[root@laurane plugin]# md5sum libgalera_smm.so
47b9432d9c5802e4ba7499dec58a404a libgalera_smm.so

[root@vmfedora1 plugin]# md5sum libgalera_smm.so
c828b9624b9c869a2137498b29aaedb6 libgalera_smm.so

Since the compilation was done on each node independently, the md5sums are different.

Regards,

Laurent

Alex Yurchenko (ayurchen) wrote :

Could you try the same binary on all of the nodes?

Laurent Minost (lolomin) wrote :

Hi Alexei,

Sorry, I think this is not possible, because the first node is 64-bit and the other two nodes are 32-bit.

I tried to replace the 64-bit libgalera_smm.so on the first node with the 32-bit libgalera_smm.so from the second node, and I got:

120430 09:55:32 mysqld_safe Starting mysqld daemon with databases from /opt/mysql-galera/data
120430 9:55:32 [Note] WSREP: wsrep_load(): loading provider library '/opt/mysql-galera/lib/plugin/libgalera_smm.so'
120430 9:55:32 [ERROR] WSREP: wsrep_load(): dlopen(): /opt/mysql-galera/lib/plugin/libgalera_smm.so: wrong ELF class: ELFCLASS32
120430 9:55:32 [ERROR] WSREP: wsrep_load(/opt/mysql-galera/lib/plugin/libgalera_smm.so) failed: Invalid argument (22). Reverting to no provider.
120430 9:55:32 [Note] WSREP: wsrep_load(): loading provider library 'none'
120430 9:55:32 [ERROR] Aborting

120430 9:55:32 [Note] WSREP: Service disconnected.
120430 9:55:33 [Note] WSREP: Some threads may fail to exit.
120430 9:55:33 [Note] /opt/mysql-galera/bin/mysqld: Shutdown complete

120430 09:55:33 mysqld_safe mysqld from pid file /opt/mysql-galera/data/mysqld.pid ended

What was the purpose of having all nodes run the same binary, please?
Thanks.

Laurent

Alex Yurchenko (ayurchen) wrote :

Ok, this is exactly the issue we're working on: the IST handshake between 32-bit and 64-bit platforms is broken.
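
Galera itself is C++, but the class of bug being referred to here, serializing handshake fields with platform-native sizes instead of fixed-width, explicit-endian types, can be illustrated with Python's struct module (an analogy, not the actual Galera code):

```python
import struct

# A handshake field serialized with native sizes/alignment ('@') depends on
# the platform: 'l' is 4 bytes on 32-bit Linux but 8 bytes on x86_64, so a
# mixed-architecture peer misparses everything that follows.  Fixed-width,
# explicit-endian formats ('!q' = big-endian signed 64-bit) give the same
# layout everywhere.
seqno = 5160427

portable = struct.pack("!q", seqno)            # always 8 bytes, any platform
assert struct.calcsize("!q") == 8              # guaranteed by the format spec
assert struct.unpack("!q", portable)[0] == seqno

# struct.calcsize("@l") is what *this* interpreter would put on the wire:
# 8 on a typical 64-bit Linux build, 4 on a 32-bit one.
print(struct.calcsize("@l"))
```

On a 64-bit Linux build the final print shows 8 and on a 32-bit build 4, which is exactly the kind of disagreement that can break a cross-architecture handshake.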

Laurent Minost (lolomin) wrote :

Hi Alex,

Ok, I see. If you need any other tests, I can run them for you; don't hesitate to ask. I cannot provide direct access to these machines due to NAT/firewall restrictions that would be hard and slow to work around.

thanks,

Laurent

Laurent Minost (lolomin) wrote :

Hi Alex,

As I've just noticed this behaviour, I thought it could be useful for you:

IST seems to not work when the donor is i686 and the receiver is x64, but when the donor is x64 and the receiver is i686 it seems to work properly. This was the case a few minutes ago when I restarted one of my nodes, and I also noticed this behaviour a few days ago:

120504 16:16:20 [Note] WSREP: Assign initial position for certification: 12196479, protocol version: 2
120504 16:16:20 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.0.5:4568
120504 16:16:20 [Note] WSREP: Node 2 (laurane.lolomin.net) requested state transfer from '*any*'. Selected 0 (cygnus.lolomin.net)(SYNCED) as donor.
120504 16:16:20 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 12196493)
120504 16:16:20 [Note] WSREP: Requesting state transfer: success, donor: 0
120504 16:16:21 [Note] WSREP: SST complete, seqno: 12196440
120504 16:16:21 InnoDB: The InnoDB memory heap is disabled
120504 16:16:21 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120504 16:16:21 InnoDB: Compressed tables use zlib 1.2.3
120504 16:16:21 InnoDB: Using Linux native AIO
120504 16:16:21 InnoDB: Initializing buffer pool, size = 384.0M
120504 16:16:21 InnoDB: Completed initialization of buffer pool
120504 16:16:21 InnoDB: highest supported file format is Barracuda.
120504 16:16:21 InnoDB: Waiting for the background threads to start
120504 16:16:22 InnoDB: 1.1.8 started; log sequence number 17525844419
120504 16:16:22 [Note] Event Scheduler: Loaded 0 events
120504 16:16:22 [Note] WSREP: Signalling provider to continue.
120504 16:16:22 [Note] WSREP: Received SST: 61408137-81c8-11e1-0800-5dbd477990e1:12196440
120504 16:16:22 [Note] WSREP: SST received: 61408137-81c8-11e1-0800-5dbd477990e1:12196440
120504 16:16:22 [Note] WSREP: Receiving IST: 39 writesets, seqnos 12196440-12196479
120504 16:16:22 [Note] /opt/mysql-galera/bin/mysqld: ready for connections.
Version: '5.5.23-log' socket: '/opt/mysql-galera/data/mysql.sock' port: 3306 Source distribution, wsrep_23.5.r3743

I hope this is useful and helps toward a solution for the problem, even though I understand that mixing architectures is probably not recommended, at least for production use ...

Regards,

Laurent

Alex Yurchenko (ayurchen) wrote :

fix committed in r127

Changed in galera:
assignee: nobody → Teemu Ollakka (teemu-ollakka)
status: Confirmed → Fix Committed
Laurent Minost (lolomin) wrote :

Hi,

I've just tested r127 on my nodes. Maybe I've done something wrong, but IST now seems to start; however, after the nodes resync they crash each other. You will find the logs attached. If you need any other information, I will keep the r127 libgalera_smm.so to do some future tests.

Regards,

Laurent

Laurent Minost (lolomin) wrote :

Second log; sorry, I didn't find how to attach it to the first comment ...

Alex Yurchenko (ayurchen) wrote :

What does the following message from haproxy mean?

Message from syslogd@localhost at May 14 17:10:31 ...
 haproxy[21584]: proxy galeracluster has no server available!
120514 17:10:34 [Note] WSREP: Stop replication

Does it have any side effects? It looks like someone/something stopped the donor in the middle of IST.

Laurent Minost (lolomin) wrote :

Hi Alexey,

The message only indicates to the admin that the pool of three servers that HAProxy load-balances across is now empty, and that no server is available to serve requests.
HAProxy takes no action on Galera replication; it is only used in front of the Galera cluster to load-balance requests to the nodes that are available. It knows whether a node is available by querying a port on each server, which returns HTTP 200 if the Galera node status is Synced (4) and HTTP 503 otherwise.
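
The check described above (commonly a clustercheck-style script; the exact setup here is not shown) boils down to mapping wsrep_local_state to an HTTP status. A minimal sketch with hypothetical names:

```python
SYNCED = 4  # wsrep_local_state value for a node that is fully in sync

def health_status(wsrep_local_state):
    """HTTP status HAProxy should see for this node.
    200 -> node usable, 503 -> node removed from the pool.  Sketch only."""
    return 200 if wsrep_local_state == SYNCED else 503

# Donor/Desynced, Joiner, etc. all report 503, so HAProxy drops those nodes.
print(health_status(4), health_status(2))  # 200 503
```

If every node is donating, joining, or down at the same time, the pool goes empty, which matches the "no server available" syslog message quoted above.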

I have no clue why the replication stopped as you say. The only thing done was to start one node (node1, 64-bit) with the new libgalera_smm.so, which compiled properly without any warnings or errors, then start the second node (node2, 32-bit) with another libgalera_smm.so compiled on that node, and then collect the logs you have attached.

If you need me to do any others tests, don't hesitate.

Regards,

Laurent

Alex Yurchenko (ayurchen) wrote :

Are you running Cluster Control?

Laurent Minost (lolomin) wrote :

Yes, I'm running the Cluster Control monitor, which is hosted on a different machine; the cmon agent runs on every node. Do you think there could be a relation between what we are seeing and Cluster Control?

Laurent Minost (lolomin) wrote :

Hmm, I've just checked the Job Messages view in the Cluster Control web frontend and saw some entries that seem to match the behaviour you're describing ("It looks like someone/something have stopped donor in the middle of IST."):

1397 1397 Galera Cluster Recovery FINISHED Galera Cluster Recovery 2012-05-14 17:10:57

If I click on this entry, I can see details of what Cluster Control seems to have done:

Job Id Message Id Message Return Code Time
1397 47 192.168.0.1: unloading wsrep_provider: SET GLOBAL wsrep_provider='none' 0 2012-05-14 17:10:43
1397 48 192.168.0.1: loading wsrep_provider: SET GLOBAL wsrep_provider='/usr/lib/galera/libgalera_smm.so' 0 2012-05-14 17:10:43
1397 49 192.168.0.5: unloading wsrep_provider: SET GLOBAL wsrep_provider='none' 0 2012-05-14 17:10:55
1397 50 192.168.0.5: loading wsrep_provider: SET GLOBAL wsrep_provider='/usr/lib/galera/libgalera_smm.so' 0 2012-05-14 17:10:55
1397 51 Electing master 192.168.0.1 : SET GLOBAL wsrep_cluster_address='gcomm://' 0 2012-05-14 17:10:55
1397 52 Reviving 192.168.0.5 : SET GLOBAL wsrep_cluster_address='gcomm://192.168.0.1:4567' 0 2012-05-14 17:10:57
1397 53 Job finished 0 2012-05-14 17:10:57

So I really think we have found the culprit for this test ... sorry for that, Alexey, you have been losing time because of my mistake.
I will disable Cluster Control and test the new lib with the IST fix again; I will let you know.
Thanks again for your expertise.

Alex Yurchenko (ayurchen) wrote :

Yes, Cluster Control has a known issue with this. In short, it assumes that there is no outside interaction with the cluster, so when you restart nodes manually it thinks they have failed and tries to restart them. The Severalnines guys seem to be working on it; you may want to ask them about the progress if Cluster Control is important for you.

Laurent Minost (lolomin) wrote :

Hi Alex,

Finally, I cut off the Cluster Control main agent on the fourth host, so it sees nothing of what happens on the nodes. By the way, I find this behaviour of restarting nodes by itself unusable, because we have no control over it and cannot disable it ... I will see with Severalnines what can be done about that ...

Concerning the IST fix: I installed the new libgalera_smm.so (revision 127) again, and IST now seems to work properly between x86 and x64 for the few node restarts I did, so I think this bug is now fixed. I should be able to confirm it fully in the next few days, after more time and more stop/start cycles.

Again, thks for your time, support and clear explanations ! ;)

Changed in galera:
status: Fix Committed → Fix Released
Vinay Joosery (vinaa) wrote :

Hi folks,
ClusterControl must be shut down when doing manual maintenance (i.e. bypassing the ClusterControl GUI).
This instruction is now printed in the web interface.

The reason is that ClusterControl cannot know whether the admin shut down the Galera node or the node crashed.

If you shut down the node via ClusterControl GUI, then ClusterControl will not restart it.

Best regards,
Vinay

www.severalnines.com

Vinay Joosery (vinaa) wrote :

Hello again,
This Laurent guy does not give up easily :)

As of v1.1.32, it is possible to disable Galera autorecovery in ClusterControl.
There is a new configuration parameter in cmon.cnf: enable_autorecovery (default 1 = enabled; 0 = disabled, manual recovery only).
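
Based on that description, the cmon.cnf fragment would presumably look like this (a sketch; the parameter name is taken from the comment above, but consult the Severalnines release notes for the authoritative syntax):

```ini
# cmon.cnf
# 1 = ClusterControl restarts failed Galera nodes automatically (default)
# 0 = manual recovery only; safe while restarting nodes by hand
enable_autorecovery=0
```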

More information on v1.1.32 and upgrade instructions can be found here:
http://support.severalnines.com/entries/21633407-released-clustercontrol-v1-1-32

Best regards,
Vinay
