Percona Server with XtraDB

crash-resistant replication doesn't work when InnoDB operates with binary log disabled

Reported by Nickolay Ihalainen on 2012-12-20
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server
Status tracked in 5.6
5.1
High
Sergei Glushchenko
5.5
High
Sergei Glushchenko
5.6
Medium
Sergei Glushchenko

Bug Description

Steps to reproduce:
1) setup a slave (S1) ( (5.5.17-rel22.1-log) for already existing master (M)
2) make a copy via xtrabackup
3) create a new mysql instance (S2) (5.5.28-rel29.2) from backup taken at 2). Execute change master to make S2 a slave for S1 (new master)
4) check if replication is working correctly from S1 to S2.
5) execute kill -9 on mysqld process on S2
6) During recovery, old master log file and master log position are shown after
 InnoDB: In a MySQL replication slave the last master binlog file

Not seeing the expected result. But logs need to be further verified.

 The hosts were prepared as described in the bug report.

Full slave error log is here http://sprunge.us/RERT (ignore the 1062 error for now)

Few show slave status outputs: http://sprunge.us/MFNJ

Interesting to note:

a)

show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: Percona
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 75733104
               Relay_Log_File: archie-relay-bin.000004
                Relay_Log_Pos: 14733102
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 72653284
              Relay_Log_Space: 39013793
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 1
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 392
1 row in set (0.00 sec)

mysql>
mysql> show slave status\G
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)
ERROR:
Can't connect to the server

b)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 79107507, file name mysql-bin.000004
InnoDB: and relay log file
InnoDB: position 21187325, file name ./archie-relay-bin.000004
InnoDB: Last MySQL binlog file position 0 22958867, file name /var/lib/mysql/mysql-bin.000002

As can be seen, the position looks good.

@Nickolay, can you verify the above results? I crashed it twice so as to get more results.

Nickolay Ihalainen (ihanick) wrote :
Download full text (3.7 KiB)

I have 3 Virtual machines:
c1:
[mysqld]
server_id=1
innodb_file_per_table
log-bin=mysql1_binlog

c2: slave of c1, donor for c3, future master of c3
[mysqld]
server_id=2
innodb_file_per_table
log-bin=mysql2_binlog
relay-log=mysql2_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0
log_slave_updates

c3: new server,
[mysqld]
server_id=3
innodb_file_per_table
log-bin=mysql3_binlog
relay-log=mysql3_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0

Steps to reproduce:
1) setup c1 with single innodb table
2) setup replication from c1(master) to c2(slave)
3) on c1 insert rows:
start while true ; do mysql test -e 'insert x values(1000);' ; done
4) c2, send kill -9 to mysqld_safe & mysqld
5) copy /var/lib/mysql from c2 to c3
6) close connections via tcp 3306 from c3 to c1
7) start mysql on c2
8) start mysql on c3
9) on c3: stop slave; change master (not working) ; reset slave;change master; start slave;
9) check on c3 if replication is up&running
10) on c3 kill -9 on mysqld pid
11) replication is not working now on c3, because it wont to use relay log names from c2.
SHOW SLAVE STATUS\G shows:
             Master_Log_File: mysql2_binlog.000015
          Read_Master_Log_Pos: 211757
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002

InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 842094, file name ./mysql2_relaylog.000008
InnoDB: master log: position 2341964, file name mysql1_binlog.000002

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log file
InnoDB: position 839409, file name ./mysql2_relaylog.000008
InnoDB: Last MySQL binlog file position 0 797205, file name ./mysql2_binlog.000014

              Master_Log_File: mysql1_binlog.000002
          Read_Master_Log_Pos: 2341964
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002
             Slave_IO_Running: No
            Slave_SQL_Running: No

Can't run change master:
change master to master_host='192.168.2.148', master_user='repl', master_password='pass',master_log_file='mysql2_binlog.000014', master_log_pos=797205;
130412 21:40:26 [ERROR] Failed to open the relay log './mysql2_relaylog.000008' (relay_log_pos 839409)

reset slave;
now change master is working

kill -9 to mysqld on new server

130412 21:42:05 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 211907, file name ./mysql3_relaylog.000003
InnoDB: master log: position 211757, file name mysql2_binlog.000015

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log f...

Read more...

Nickolay -

What VM software and client OSes do you use?

I am able to reproduce this following instructions provided by Nickolay.

The workaround is to turn on log-slave-updates on c3 as well.

summary: - crash resistant replication doesn't work correctly after change master
- or reset slave
+ crash-resistant replication doesn't work when InnoDB operates with
+ binary log disabled

Will port the testcase to 5.6.

David Smid (datasmid) wrote :

Regression on Fedora 19 with 5.5.32-31.0

[12:23:00][Step 2/4] rpl.rpl_truncate_3innodb 'mix' [ pass ] 612
[12:23:00][Step 2/4] rpl.rpl_typeconv_innodb 'mix' [ skipped ] Doesn't support --binlog-format='mixed'
[12:23:15][Step 2/4] rpl.rpl_percona_bug1092593 'mix' [ fail ] Found warnings/errors in server log file!
[12:23:15][Step 2/4] Test ended at 2013-07-05 12:23:15
[12:23:15][Step 2/4] line
[12:23:15][Step 2/4] 130705 13:23:06 InnoDB: Error: table `mysqld.2`.`#sql2784_5e_1` does not exist in the InnoDB internal
[12:23:15][Step 2/4] 130705 13:23:06 InnoDB: Error: table `mysqld.2`.`#sql2784_e8_1` does not exist in the InnoDB internal
[12:23:15][Step 2/4] ^ Found warnings in /home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/mysqld.2.err
[12:23:15][Step 2/4] ok
[12:23:15][Step 2/4]
[12:23:15][Step 2/4] - saving '/home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/rpl.rpl_percona_bug1092593-mix/' to '/home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/rpl.rpl_percona_bug1092593-mix/'
[12:23:15][Step 2/4] mysql-test-run: *** ERROR: Not all tests completed
[12:23:15][Step 2/4]
[12:23:15][Step 2/4] Only 341 of 2518 completed.

David -

Please open a new bug for this. Thanks!

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers