XA PREPARE inconsistent with XTRABACKUP

Bug #1651941 reported by David Zhao on 2016-12-22
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
MySQL Server     Unknown        Unknown
Percona Server (moved to https://jira.percona.com/projects/PS; status tracked in 5.7)
  5.5            New            Undecided    Unassigned
  5.6            New            Undecided    Unassigned
  5.7            Fix Released   High         Unassigned

Bug Description

XTRABACKUP runs the following sequence to make sure that the InnoDB redo log it copies and the binlog position it records are consistent:

FLUSH TABLE WITH READ LOCK; ---- (1)
....
FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS; ---- (2)
....
--- copy innodb redo log files ---- (3)
SHOW MASTER STATUS; ---- (4)
UNLOCK TABLES; ---- (5)

At the beginning of every transaction commit, the server acquires the global COMMIT lock in shared mode. FTWRL acquires the same lock in exclusive mode, so the FTWRL at (1) is intended to guarantee that, while XTRABACKUP runs steps (1) through (5), no transaction is prepared in InnoDB and no transaction's binlog events are flushed to the binlog file.

However, XA PREPARE does not acquire the COMMIT lock. It still prepares the transaction in InnoDB and flushes the transaction's binlog events to the binlog file (in the flush stage), so prepared transactions can be lost:

Suppose transaction T1 is prepared via XA PREPARE between stmt (2) and (3) above. Stmt (4) then returns a binlog position right after T1's binlog events, but T1's InnoDB redo log records are still in the redo log buffer: they were not flushed by stmt (2) and not copied at (3). T1 is therefore lost when the DB instance is restored later, and the restored instance's InnoDB data and binlog data are inconsistent: T1 exists in the binlog but not in InnoDB.
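The scenario above can be replayed with a small toy model. This is only a sketch under stated assumptions: `ToyServer`, `flush_engine_logs`, `xa_prepare` and `backup_with_race` are hypothetical names invented for illustration, and the model tracks nothing but a redo log buffer, a redo log file and a binlog, following the behavior described in this report rather than the actual server code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical model: only the three pieces of state the report mentions."""
    redo_buffer: list = field(default_factory=list)  # redo records still in memory
    redo_file: list = field(default_factory=list)    # redo records on disk
    binlog: list = field(default_factory=list)       # binlog events on disk

    def flush_engine_logs(self):
        # Models stmt (2): move buffered redo records to the redo file.
        self.redo_file.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def xa_prepare(self, xid):
        # As described in the report: the prepare leaves redo in the buffer,
        # flushes the binlog events to the binlog file, and takes no COMMIT lock.
        self.redo_buffer.append(xid)
        self.binlog.append(xid)


def backup_with_race(server, racing_xid):
    server.flush_engine_logs()             # stmt (2)
    server.xa_prepare(racing_xid)          # slips in between (2) and (3)
    copied_redo = list(server.redo_file)   # step (3): copy redo log files
    binlog_pos = len(server.binlog)        # stmt (4): note the binlog position
    return copied_redo, server.binlog[:binlog_pos]


server = ToyServer()
server.xa_prepare("T0")                    # prepared before the backup starts
copied_redo, binlog_prefix = backup_with_race(server, "T1")

# Transactions the restored binlog knows about but the copied redo log does not.
missing = [xid for xid in binlog_prefix if xid not in copied_redo]
print("in binlog but not in copied redo:", missing)
```

In this model T0 survives because stmt (2) flushed it before the copy, while T1 appears only in the binlog prefix, which is the inconsistency described above.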

So I think this is a bug, and it exists in many MySQL versions, although I am using Percona-Server-MySQL-5.7.16-10. The official MySQL team may not see this as a bug, since working correctly with XTRABACKUP is not their concern, so I am reporting it to you here. My fix is to acquire the global COMMIT lock before calling ha_prepare() in Sql_cmd_xa_prepare::trans_xa_prepare. My patch is attached here.
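The idea behind the fix can be illustrated with a toy lock model. This is a sketch, not the attached patch: `CommitLock` and this `xa_prepare` function are hypothetical stand-ins, the shared/exclusive modes follow the wording of this report, and a blocked prepare is modeled as returning `False` where the real server would wait for UNLOCK TABLES.

```python
class CommitLock:
    """Hypothetical stand-in for the global COMMIT lock:
    any number of shared holders, or a single exclusive holder."""

    def __init__(self):
        self.shared_holders = 0
        self.exclusive_held = False

    def try_acquire_shared(self):
        if self.exclusive_held:
            return False
        self.shared_holders += 1
        return True

    def release_shared(self):
        self.shared_holders -= 1

    def try_acquire_exclusive(self):
        if self.exclusive_held or self.shared_holders > 0:
            return False
        self.exclusive_held = True
        return True

    def release_exclusive(self):
        self.exclusive_held = False


def xa_prepare(lock, prepared, xid, take_commit_lock):
    """Return True if the prepare ran, False if it would have to wait."""
    if take_commit_lock and not lock.try_acquire_shared():
        return False                # blocked by FTWRL until UNLOCK TABLES
    prepared.append(xid)            # models the engine prepare and binlog flush
    if take_commit_lock:
        lock.release_shared()
    return True


lock = CommitLock()
prepared = []

ftwrl_ok = lock.try_acquire_exclusive()                               # stmt (1)
unpatched = xa_prepare(lock, prepared, "T1", take_commit_lock=False)  # old behavior
patched = xa_prepare(lock, prepared, "T2", take_commit_lock=True)     # proposed behavior
lock.release_exclusive()                                              # stmt (5)
after_unlock = xa_prepare(lock, prepared, "T2", take_commit_lock=True)

print(ftwrl_ok, unpatched, patched, after_unlock, prepared)
```

With the lock taken, the prepare of T2 cannot run while the backup holds the lock and only goes through after it is released; without it, T1 is prepared in the middle of the backup window.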

BTW, the XA PREPARE command does the InnoDB prepare AFTER the flush stage, which is probably wrong too. I reported that bug to the MySQL team: http://bugs.mysql.com/bug.php?id=84297

David Zhao (david.zhao.cn) wrote :

I updated the patch and it works for me. The formatting is not perfect (I squeezed some extra changes into it later).

David, IMHO upstream would be interested in this bug report too; it is possible that their MySQL Enterprise Backup is affected in the same way.

tags: added: contribution
tags: added: upstream
Yura Sorokin (yura-sorokin) wrote :

Fixed in the 5.7.19 upstream merge PR
https://github.com/percona/percona-server/pull/1892

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1043
