DROP TABLE IF EXISTS may brake replication if slave has replication filters

Bug #1475107 reported by Jericho Rivera
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Medium
Yura Sorokin
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
Medium
Yura Sorokin

Bug Description

Copied from original bug description in mysql bug #77684 reported by FernandoL

Description:
If a replica has replication filters on a given database and DROP TABLE IF EXISTS is issued on a table that holds a key being used as Foreign Key by another table the statement will fail as expected with error:

ERROR 1217 (23000): Cannot delete or update a parent row: a foreign key constraint fails

however this will break replication on the replica with:

Last_SQL_Error: Query caused different errors on master and slave. Error on master: message (format)='Cannot delete or update a parent row: a foreign key constraint fails' error code=1217 ; Error on slave: actual message='no error', error code=0. Default database: 'db1'. Query: 'DROP TABLE IF EXISTS `table1` /* generated by server */'

How to repeat:
Setup master with:

binlog_format=STATEMENT
or
binlog_format=ROW

Setup replica with:

replicate-ignore-db = db1
replicate-wild-ignore-table = db1.%

Then run on the master the following statements:

CREATE DATABASE `db1`;

USE `db1`;

CREATE TABLE `table1` (`ID` bigint(20) primary key) ENGINE=InnoDB;

CREATE TABLE `table2` ( `ID` bigint(20) NOT NULL AUTO_INCREMENT, `DIVISION_ID` bigint(20) DEFAULT NULL, PRIMARY KEY (`ID`), KEY `FK_TABLE1_DIVISION_1` (`DIVISION_ID`), CONSTRAINT `FK_TABLE1_DIVISION_1` FOREIGN KEY (`DIVISION_ID`) REFERENCES `table1` (`ID`) ON DELETE CASCADE ) ENGINE=InnoDB;

DROP TABLE IF EXISTS `db1`.`table1`;

Suggested fix:
The problem seems to be related to the "USE" above as the following works as expected:

CREATE DATABASE `db1`;
CREATE TABLE `db1`.`table1` (`ID` bigint(20) primary key) ENGINE=InnoDB;
CREATE TABLE `db1`.`table2` ( `ID` bigint(20) NOT NULL AUTO_INCREMENT, `DIVISION_ID` bigint(20) DEFAULT NULL, PRIMARY KEY (`ID`), KEY `FK_TABLE1_DIVISION_1` (`DIVISION_ID`), CONSTRAINT `FK_TABLE1_DIVISION_1` FOREIGN KEY (`DIVISION_ID`) REFERENCES `db1`.`table1` (`ID`) ON DELETE CASCADE ) ENGINE=InnoDB;
DROP TABLE IF EXISTS `db1`.`table1`;

however if you add an USE `db1` after the CREATE DATABASE statement the replication error will follow.

Revision history for this message
Nilnandan Joshi (nilnandan-joshi) wrote :

Verified.

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| ardi |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.00 sec)

mysql> show master status \g
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 120 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> CREATE DATABASE `db1`;
Query OK, 1 row affected (0.01 sec)

mysql> USE `db1`;
Database changed
mysql> CREATE TABLE `table1` (`ID` bigint(20) primary key) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE `table2` ( `ID` bigint(20) NOT NULL AUTO_INCREMENT, `DIVISION_ID` bigint(20) DEFAULT NULL, PRIMARY KEY (`ID`), KEY `FK_TABLE1_DIVISION_1` (`DIVISION_ID`), CONSTRAINT `FK_TABLE1_DIVISION_1` FOREIGN KEY (`DIVISION_ID`) REFERENCES `table1` (`ID`) ON DELETE CASCADE ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.02 sec)

mysql>
mysql> DROP TABLE IF EXISTS `db1`.`table1`;
ERROR 1217 (23000): Cannot delete or update a parent row: a foreign key constraint fails
mysql>
mysql>

Changed in percona-server:
status: New → Confirmed
tags: added: upstream
Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

Consider 2 test cases included in
https://github.com/percona-ysorokin/percona-server/commit/5749da6de8e4d1ecd4970ad32886379b3bb80c57

1. We are using fully-qualified table names (rpl_bug1475107working.test).
  DROP TABLE IF EXISTS db1.table1;
2. We are using "USE db1;" statement (rpl_bug1475107working.test).
  USE db1;
  ...
  DROP TABLE IF EXISTS table1;

The difference in behavior turned out to be with mysql_execute_command() at sql_parse.cc:2609
The following code fragment behaves differently in case (1) and (2)

sql_parse.cc:2694
#ifdef HAVE_REPLICATION
  if (unlikely(thd->slave_thread))
  {
    // Database filters.
    if (lex->sql_command != SQLCOM_BEGIN &&
        lex->sql_command != SQLCOM_COMMIT &&
        lex->sql_command != SQLCOM_SAVEPOINT &&
        lex->sql_command != SQLCOM_ROLLBACK &&
        lex->sql_command != SQLCOM_ROLLBACK_TO_SAVEPOINT &&
        !rpl_filter->db_ok(thd->db))
      DBUG_RETURN(0);

The problem is that in case (1) "thd->db" is set to a database name which is not even mentioned in the original SQL statement "DROP TABLE IF EXISTS db1.table1;". In case of running via MTR, it was set to "test". So, rpl_filter->db_ok(thd->db) checking returned "true" as database "test" is not a name in the replication ignore list. Therefore, "DBUG_RETURN(0)" was skipped.
Later, at sql_parse.cc:2785
    if (!(lex->sql_command == SQLCOM_UPDATE_MULTI) &&
 !(lex->sql_command == SQLCOM_SET_OPTION) &&
 !(lex->sql_command == SQLCOM_DROP_TABLE &&
          lex->drop_temporary && lex->drop_if_exists) &&
        all_tables_not_ok(thd, all_tables))
    {
      /* we warn the slave SQL thread */
      my_message(ER_SLAVE_IGNORED_TABLE, ER(ER_SLAVE_IGNORED_TABLE), MYF(0));

this function set error in thd->m_stmt_da to ER_SLAVE_IGNORED_TABLE and immediately returned "0".

Case (2) is completely different. thd->db was set to "db1" and therefore function execution terminated at
if(... && !rpl_filter->db_ok(thd->db))
  DBUG_RETURN(0);
However, it did not change anything in "thd->m_stmt_da". "sql_errno" there was still 0.
After that, in Query_log_event::do_apply_event() we compare master error with slave error
log_event.cc:4968

    if ((expected_error && expected_error != actual_error &&
         !concurrency_error_code(expected_error)) &&
        !ignored_error_code(actual_error) &&
        !ignored_error_code(expected_error))
    {
      rli->report(ERROR_LEVEL, 0,
                      "\
Query caused different errors on master and slave. \
Error on master: message (format)='%s' error code=%d ; \
Error on slave: actual message='%s', error code=%d. \
Default database: '%s'. Query: '%s'",
                      ER_SAFE(expected_error),
                      expected_error,
                      actual_error ? thd->get_stmt_da()->message() : "no error",
                      actual_error,
                      print_slave_db_safe(db), query_arg);

To sum up, for the "non-failing" case (1) it works only because we pass invalid database name in rpl_filter->db_ok(). At the same time, "failing" case (2) fails only because sql_errno wasn't changed (was still zero) when the statement was ignored.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

Problem also exist in the original MySQL 5.6 and 5.7 servers but not in 5.5.

Revision history for this message
Yura Sorokin (yura-sorokin) wrote :

The problem turned out to be with missing "if (is_trans_keyword() || rpl_filter->db_ok(thd->db))" statement in the "Query_log_event::do_apply_event()" method.
Most probably it was lost during one of the the 5.5 <-> 5.6 merges.

Here is a suggested fix
https://github.com/percona-ysorokin/percona-server/commit/3b7feaefa9a1c28fdff68552b70432d4ee1e098d

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Please re-check:

$ git log --oneline -S is_trans_keyword <5.6 branch> -- sql/log_event.cc
45424a3 Bug#18145032: NO EMPTY TRANSACTION IS CREATED FOR A FILTERED CREATE TEMPORARY TABLE WITH GTIDS Bug#18095502: NO EMPTY TRANSACTION IS CREATED FOR REPLICATE-IGNORE-DB OR REPLICATE-DO-DB
dbb832c Bug #50914 mysqlbinlog not handling drop of current default database
2049d1a Bug #50407 mysqlbinlog --database=X produces bad output for SAVEPOINTs

commit 45424a3 removes that line in 5.6, and it looks like it's done on purpose there?

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1639

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.