wsrep_osu_method=RSU allows only one ALTER TABLE to run concurrently

Bug #1330944 reported by Kenny Gryp
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Invalid
Undecided
Unassigned

Bug Description

Version: 5.6.15-56 Percona XtraDB Cluster (GPL), Release 25.5, Revision 759, wsrep_25.5.r4061

Running schema changes in parallel is useful In order to speed up the schema changes, so it uses more cpu cores. This also requires nodes to spend less time in 'maintenance mode'.
This currently fails with RSU.

The same error is given as in bug 1330941

Example

On a single node, with 2 sessions, do...

First create 2 tables and put some data in it, so that the ALTER TABLE takes a few seconds, enough to start another ALTER TABLE on another table in another session.

session1 mysql> set global wsrep_osu_method=rsu;
Query OK, 0 rows affected (0.00 sec)

RSU

session1 mysql> alter table test add key (a);

Add an index, immediately run the other ALTER on the second table in the second session:

session2 mysql> alter table test2 add key (a);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

You immediately get a deadlock issue.

Log output:

2014-06-17 10:26:04 3875 [Note] WSREP: Member 0.0 (node1) desyncs itself from group
2014-06-17 10:26:04 3875 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 33)
2014-06-17 10:26:04 3875 [Note] WSREP: Provider paused at 62eb8c72-f601-11e3-b42c-ab6847529d86:33 (70)
2014-06-17 10:26:05 3875 [ERROR] WSREP: Node desync failed.: 11 (Resource temporarily unavailable)
     at galera/src/replicator_smm.cpp:desync():1623
2014-06-17 10:26:05 3875 [Warning] WSREP: RSU desync failed 3 for alter table test2 add key (a)
2014-06-17 10:26:05 3875 [Warning] WSREP: ALTER TABLE isolation failure
2014-06-17 10:26:06 3875 [Note] WSREP: resuming provider at 70
2014-06-17 10:26:06 3875 [Note] WSREP: Provider resumed.
2014-06-17 10:26:06 3875 [Note] WSREP: Member 0.0 (node1) resyncs itself to group
2014-06-17 10:26:06 3875 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 33)
2014-06-17 10:26:06 3875 [Note] WSREP: Member 0.0 (node1) synced with group.
2014-06-17 10:26:06 3875 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 33)

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Yes, this must be one of the limitations of RSU. You can run only one
RSU at a time.

On the galera side, a RSU desyncs the node. So, the second RSU fails because
second desync actually returns error.

This needs to be fixed on galera end to make desync after first one
idempotent(?).

Changed in percona-xtradb-cluster:
status: New → Confirmed
Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

So the issue is operating as per the design I dont see any bug in this case.

Changed in percona-xtradb-cluster:
status: Confirmed → Invalid
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1690

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.