ROLLBACK to SAVEPOINT does not rollback correctly on slaves

Bug #1700593 reported by Romuald Brunet on 2017-06-26
This bug affects 2 people
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC)
Status: Fix Committed
Importance: Undecided
Assigned to: Krunal Bauskar

Bug Description

This may be related to #1524948 (but in my case the client isn't killed)

When using ROLLBACK TO SAVEPOINT on a cluster member, the other cluster members seem to roll back to the beginning of the transaction, not to the savepoint.

This leads to a consistency error in the cluster, and most probably to a node shutdown once that data is accessed again.

Example:

-- ========================================================
CREATE TABLE example (
    id INT NOT NULL PRIMARY KEY,
    name VARCHAR(64));

BEGIN;

  -- Will **not** be replicated to the other members
  INSERT INTO example VALUES(1, 'first!');

  SAVEPOINT savepoint001;
  -- no need for any operation here
  -- (and no change if there is)
  ROLLBACK TO SAVEPOINT savepoint001;

  -- Will show in the other nodes
  INSERT INTO example VALUES(2, 'second');

COMMIT;

SELECT * FROM example;

/*
On executed member
+----+--------+
| id | name   |
+----+--------+
|  1 | first! |
|  2 | second |
+----+--------+

Other members:
+----+--------+
| id | name   |
+----+--------+
|  2 | second |
+----+--------+

*/
-- ========================================================

Server version: 5.6.36-82.0-56 Percona XtraDB Cluster (GPL), Release rel82.0, Revision de7a681, WSREP version 26.20, wsrep_26.20

I also tested with 5.7.18-15-57-log (WSREP version 29.20) and it worked as expected (both rows are present on all members)

affects: percona-server → percona-xtradb-cluster
Sveta Smirnova (svetasmirnova) wrote :

Thank you for the report.

I cannot repeat described behavior. Please attach configuration files for all nodes and error log files from all nodes.

Changed in percona-xtradb-cluster:
status: New → Incomplete
Romuald Brunet (j-romuald) wrote :

I've attached the configuration and logs.

This is the default Debian configuration from http://repo.percona.com/apt/ with additions from the Percona tutorials for master/master configuration :p

For the logs I've stopped/started both members, and made a select in both after running example script

Romuald Brunet (j-romuald) wrote :

I've attached the configuration and logs

The configuration is the default one from the Debian package of http://repo.percona.com/apt/, with additions from the master/master tutorial :p

I've stopped / started both servers and pasted the logs from the start, dropped the example table, then re-run the script (with the same result)

Romuald Brunet (j-romuald) wrote :

(Whoops, Launchpad didn't show the last log, so I thought I had forgotten to paste it)

Changed in percona-xtradb-cluster:
status: Incomplete → Opinion
status: Opinion → New
Jonathan Schafer (jschafer75) wrote :

Our tests ran successfully with 5.6.28-25.14.1, but failed in the same way as the OP's did on 5.6.37-26.21.1 and 5.6.36-26

Changed in percona-xtradb-cluster:
assignee: nobody → Krunal Bauskar (krunal-bauskar)

This error happens when log-bin=0

Changed in percona-xtradb-cluster:
status: New → Confirmed
Romuald Brunet (j-romuald) wrote :

FYI it does seem to be enabled in our production cluster (log_bin = ON). I don't think we ever disabled it.

1. Attached configuration has log-bin disabled. (default is off)

2. I couldn't reproduce it with log-bin=1 and code confirms the same.

3. I expect the issue also exists in 5.7.19 (and even 5.7.18).

As per your comment it seems like you tested it with 5.7.18 with log-bin=1 and couldn't reproduce it.

"I also tested with 5.7.18-15-57-log (WSREP version 29.20) and it worked as expected (both rows are present on all members)"

"5.7.18-15-57-log" ... the -log suffix suggests log-bin=1

commit 481c6e517ecba6acdfd2adc323921dc1434d2d74
Merge: 09dbf65 66b8ade
Author: Krunal Bauskar <email address hidden>
Date: Wed Oct 25 11:37:20 2017 +0530

    Merge pull request #559 from kbauskar/5.6-pxc-883

    - PXC#883: ROLLBACK to SAVEPOINT does not rollback correctly on slaves

commit 66b8ade02a2fcfa3c107e30355b2e8349287fcf9
Author: Krunal Bauskar <email address hidden>
Date: Tue Oct 24 11:24:17 2017 +0530

    - PXC#883: ROLLBACK to SAVEPOINT does not rollback correctly on slaves

      * ROLLBACK to savepoint involves 2 main actions:
        - truncate binlog to the said position
        - rollback at innodb storage engine level.

        wsrep plugin has no role to play in savepoint rollback and
        call to wsrep plugin should be avoided in such case.

      * What would happen if we call wsrep plugin rollback action
        for savepoint rollback ?
        * wsrep plugin doesn't differentiate between normal and savepoint
          rollback and so will cause complete transaction rollback thereby
          discarding valid action even before the savepoint.
        * this drawback currently is limited to log-bin=0 but ideal solution
          is to avoid calling wsrep plugin for savepoint rollback as it doesn't
          have role to play.
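
The reasoning in the commit message above can be illustrated with a small toy model. This is only a sketch of one reading of that message, not the actual PXC or wsrep code: the class `ToyNode`, its `buggy` flag, and the `writeset` list are hypothetical names made up for illustration. The model assumes that each node keeps its local rows and a separate list of changes queued for replication, and that the faulty path empties that whole list on a savepoint rollback.

```python
class ToyNode:
    """Hypothetical toy model of one cluster member (not real PXC code)."""

    def __init__(self, buggy):
        # buggy=True: a savepoint rollback also triggers the full
        # replication-side rollback (the behavior the commit describes).
        self.buggy = buggy
        self.local_rows = []   # rows of the open transaction on this node
        self.writeset = []     # row changes queued for the other members
        self.savepoints = {}   # name -> (local position, writeset position)

    def insert(self, row):
        self.local_rows.append(row)
        self.writeset.append(row)

    def savepoint(self, name):
        self.savepoints[name] = (len(self.local_rows), len(self.writeset))

    def rollback_to_savepoint(self, name):
        local_pos, ws_pos = self.savepoints[name]
        # Storage-engine level: undo only what came after the savepoint.
        del self.local_rows[local_pos:]
        if self.buggy:
            # Replication side does not distinguish savepoint rollback from
            # a full rollback, so everything queued so far is discarded.
            self.writeset.clear()
        else:
            # Fixed behavior: only truncate back to the savepoint position.
            del self.writeset[ws_pos:]

    def commit(self):
        # Returns (rows on the executing member, rows sent to other members).
        return list(self.local_rows), list(self.writeset)


def run_example(buggy):
    """Replays the transaction from the bug description on the toy model."""
    node = ToyNode(buggy)
    node.insert((1, "first!"))
    node.savepoint("savepoint001")
    node.rollback_to_savepoint("savepoint001")
    node.insert((2, "second"))
    return node.commit()


if __name__ == "__main__":
    print("buggy:", run_example(buggy=True))
    print("fixed:", run_example(buggy=False))
```

With `buggy=True` the model yields both rows on the executing member but only `(2, 'second')` for the other members, which matches the output in the bug description. With `buggy=False` both rows reach every member.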

Changed in percona-xtradb-cluster:
status: Confirmed → Fix Committed
Romuald Brunet (j-romuald) wrote :

Many thanks :)

Jonathan Schafer (jschafer75) wrote :

Thanks for the fix y'all!

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-883
