IST is not working between different EC2 availability zones

Bug #1013356 reported by Alex Yurchenko
This bug affects 2 people
Affects: Galera
Status: Fix Released
Importance: High
Assigned to: Teemu Ollakka
Milestone: 23.2.2

Bug Description

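This is apparently because the IST module replaces the provided DNS name with a locally resolved IP address. The joiner then advertises its private address, which the donor in the other zone fails to reach: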

On joiner:

120614 19:18:51 [Note] WSREP: Passing config to GCS: base_host = ec2-184-169-211-46.us-west-1.compute.amazonaws.com; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 1G; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
...
120614 19:18:53 [Note] WSREP: State transfer required:
 Group state: 42c8a9df-b653-11e1-0800-d5a2c0ae8603:1
 Local state: 42c8a9df-b653-11e1-0800-d5a2c0ae8603:0
120614 19:18:53 [Note] WSREP: New cluster view: global state: 42c8a9df-b653-11e1-0800-d5a2c0ae8603:1, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 2
120614 19:18:53 [Warning] WSREP: Gap in state sequence. Need state transfer.
120614 19:18:55 [Note] WSREP: Running: 'wsrep_sst_rsync 'joiner' 'ec2-184-169-211-46.us-west-1.compute.amazonaws.com' '' '/var/lib/mysql/' '/etc/my.cnf' '21607' 2>sst.err'
120614 19:18:55 [Note] WSREP: Prepared SST request: rsync|ec2-184-169-211-46.us-west-1.compute.amazonaws.com:4444/rsync_sst
120614 19:18:55 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120614 19:18:55 [Note] WSREP: Assign initial position for certification: 1, protocol version: 2
120614 19:18:55 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.171.75.241:4568
...
120614 19:18:58 [Note] WSREP: SST received: 42c8a9df-b653-11e1-0800-d5a2c0ae8603:0
120614 19:18:58 [Note] WSREP: Receiving IST: 1 writesets, seqnos 0-1
120614 19:18:58 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.23' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL), wsrep_23.6.r3755, wsrep_23.6.r3755
120614 19:19:58 [Warning] WSREP: 0 (ip-10-12-126-42): State transfer to 1 (ip-10-171-75-241) failed: -110 (Connection timed out)
120614 19:19:58 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():712: Will never receive state. Need to abort.

On donor:

120614 19:18:55 [Note] WSREP: IST request: 42c8a9df-b653-11e1-0800-d5a2c0ae8603:0-1|tcp://10.171.75.241:4568
120614 19:18:55 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120614 19:18:55 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' 'ec2-184-169-211-46.us-west-1.compute.amazonaws.com:4444/rsync_sst' '(null)' '/var/lib/mysql/' '/etc/my.cnf' '42c8a9df-b653-11e1-0800-d5a2c0ae8603' '0' '1''
120614 19:18:55 [Note] WSREP: sst_donor_thread signaled with 0
120614 19:19:58 [ERROR] WSREP: IST failed: IST sender, failed to connect 'tcp://10.171.75.241:4568': Connection timed out: 110 (Connection timed out)
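The mismatch in the logs above can be checked mechanically. The following is only a minimal sketch, not part of Galera: the helper name `ist_addr_is_private` is hypothetical, and it assumes the IST address has the `tcp://host:port` form shown in the "Prepared IST receiver" line. It reports whether the advertised host is a private IP literal, which would not be routable from another zone.

```python
import ipaddress
from urllib.parse import urlsplit


def ist_addr_is_private(ist_url: str) -> bool:
    """Return True if the host part of an IST URL is a private IP literal.

    Hypothetical helper for illustration; not a Galera API.
    """
    host = urlsplit(ist_url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # The host is a DNS name rather than an IP literal,
        # so no local substitution has taken place.
        return False


# Address taken from the joiner log: "listening at: tcp://10.171.75.241:4568"
print(ist_addr_is_private("tcp://10.171.75.241:4568"))  # True

# Hypothetical URL that keeps the configured DNS name instead.
print(ist_addr_is_private(
    "tcp://ec2-184-169-211-46.us-west-1.compute.amazonaws.com:4568"))  # False
```

A `True` result for the IST address while the SST request still carries the DNS name matches the symptom reported here.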

Changed in galera:
assignee: nobody → Teemu Ollakka (teemu-ollakka)
importance: Undecided → High
milestone: none → 23.2.2
status: New → Confirmed
Teemu Ollakka (teemu-ollakka) wrote :
Changed in galera:
status: Confirmed → Fix Committed
Changed in galera:
status: Fix Committed → Fix Released