main.percona_bug1008609 breaks any replication test further in the same run

Bug #1515602 reported by Laurynas Biveinis on 2015-11-12
Affects: Percona Server (moved to https://jira.percona.com/projects/PS); status tracked in 5.7

Series  Status        Importance  Assigned to
5.6     New           High        Unassigned
5.7     Fix Released  High        Laurynas Biveinis

Bug Description

Once main.percona_bug1008609 completes, every replication test that runs afterwards on any MTR worker fails because the slaves cannot connect to the master. For example:

main.auth_rpl [ fail ]
        Test ended at 2015-11-12 14:06:00

CURRENT_TEST: main.auth_rpl
mysqltest: In included file ./include/wait_for_slave_param.inc at line 156:
included from ./include/wait_for_slave_io_to_start.inc at line 44:
included from ./include/wait_for_slave_to_start.inc at line 30:
included from ./include/start_slave.inc at line 46:
included from ./include/rpl_for_each_connection.inc at line 63:
included from ./include/rpl_start_slaves.inc at line 30:
included from ./include/rpl_init.inc at line 463:
included from ./include/master-slave.inc at line 51:
included from /Users/laurynas/percona/mysql-server/mysql-test/t/auth_rpl.test at line 4:
At line 156: Timeout in include/wait_for_slave_param.inc

The result from queries just before the failure was:
< snip >
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name Pos Event_type Server_id End_log_pos Info

**** slave_relay_info on server_1 ****
SELECT * FROM mysql.slave_relay_log_info;
Number_of_lines Relay_log_name Relay_log_pos Master_log_name Master_log_pos Sql_delay Number_of_workers Id Channel_name

**** slave_master_info on server_1 ****
SELECT * FROM mysql.slave_master_info;
Number_of_lines Master_log_name Master_log_pos Host User_name User_password Port Connect_retry Enabled_ssl Ssl_ca Ssl_capath Ssl_cert Ssl_cipher Ssl_key Ssl_verify_server_cert Heartbeat Bind Ignored_server_ids Uuid Retry_count Ssl_crl Ssl_crlpath Enabled_auto_position Channel_name

**** mysql.gtid_executed on server_1 ****
SELECT * FROM mysql.gtid_executed;
source_uuid interval_start interval_end
rpl_topology= 1->2
rand_seed: '' _rand_state: ''
extra debug info if any: ''
rpl_topology=1->2
connection server_2;
safe_process[91283]: Child process: 91284, exit: 1

The slave's error log shows that it fails to connect to the master because its server UUID is identical to the master's.

This is caused by main.percona_bug1008609 running mysqld --bootstrap with --datadir pointing to the master MTR datadir, which is later copied to the worker master and any slave working directories. This datadir contains an auto.cnf holding a server UUID, created during bootstrap. MTR deletes that file after the initial --bootstrap run so that any servers started later generate their own UUIDs. MTR also re-copies the master datadir for servers to use as needed.

However, percona_bug1008609 re-creates auto.cnf with a UUID, which is then copied for the servers to use, resulting in identical UUIDs for masters and slaves.
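
The duplication described above can be checked for mechanically. The following is a minimal Python sketch, not part of MTR: it assumes each auto.cnf has the usual [auto] section with a server-uuid key, and the function names and the list of datadir paths passed in are hypothetical.

```python
import configparser
from collections import defaultdict
from pathlib import Path


def read_server_uuid(datadir):
    """Return the server-uuid stored in <datadir>/auto.cnf, or None if absent."""
    auto_cnf = Path(datadir) / "auto.cnf"
    if not auto_cnf.is_file():
        return None
    parser = configparser.ConfigParser()
    parser.read(auto_cnf)
    # fallback covers both a missing [auto] section and a missing key
    return parser.get("auto", "server-uuid", fallback=None)


def find_shared_uuids(datadirs):
    """Map each UUID found in more than one datadir to the datadirs using it."""
    by_uuid = defaultdict(list)
    for datadir in datadirs:
        uuid = read_server_uuid(datadir)
        if uuid is not None:
            by_uuid[uuid].append(str(datadir))
    return {uuid: dirs for uuid, dirs in by_uuid.items() if len(dirs) > 1}
```

Running find_shared_uuids over the master and slave datadirs of a worker should return an empty dict on a healthy run. A non-empty result would indicate that an auto.cnf was copied along with the template datadir, which is the situation this report describes.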

Tags: ci

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-942
