wsrep_osu_method=RSU allows only one ALTER TABLE to run concurrently
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Invalid | Undecided | Unassigned | |
Bug Description
Version: 5.6.15-56 Percona XtraDB Cluster (GPL), Release 25.5, Revision 759, wsrep_25.5.r4061
Running schema changes in parallel is useful because it speeds them up by using more CPU cores. It also means nodes spend less time in 'maintenance mode'.
This currently fails with RSU.
The same error is given as in bug 1330941
Example
On a single node, with 2 sessions, do...
First create 2 tables and put some data in them, so that an ALTER TABLE takes a few seconds: long enough to start another ALTER TABLE on the other table in another session.
session1 mysql> set global wsrep_osu_
Query OK, 0 rows affected (0.00 sec)
RSU
session1 mysql> alter table test add key (a);
While the index is being added, immediately run the other ALTER on the second table in the second session:
session2 mysql> alter table test2 add key (a);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
The second session immediately fails with a deadlock error.
Log output:
2014-06-17 10:26:04 3875 [Note] WSREP: Member 0.0 (node1) desyncs itself from group
2014-06-17 10:26:04 3875 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 33)
2014-06-17 10:26:04 3875 [Note] WSREP: Provider paused at 62eb8c72-
2014-06-17 10:26:05 3875 [ERROR] WSREP: Node desync failed.: 11 (Resource temporarily unavailable)
at galera/
2014-06-17 10:26:05 3875 [Warning] WSREP: RSU desync failed 3 for alter table test2 add key (a)
2014-06-17 10:26:05 3875 [Warning] WSREP: ALTER TABLE isolation failure
2014-06-17 10:26:06 3875 [Note] WSREP: resuming provider at 70
2014-06-17 10:26:06 3875 [Note] WSREP: Provider resumed.
2014-06-17 10:26:06 3875 [Note] WSREP: Member 0.0 (node1) resyncs itself to group
2014-06-17 10:26:06 3875 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 33)
2014-06-17 10:26:06 3875 [Note] WSREP: Member 0.0 (node1) synced with group.
2014-06-17 10:26:06 3875 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 33)
Changed in percona-xtradb-cluster:
status: New → Confirmed
Yes, this must be one of the limitations of RSU. You can run only one
RSU at a time.
On the Galera side, an RSU desyncs the node, so the second RSU fails because
the second desync returns an error.
This needs to be fixed on the Galera end, by making any desync after the first
one idempotent(?).
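
To illustrate the difference, here is a minimal Python model of the two behaviours. It is only a sketch, not Galera code: the class and function names (`SingleDesyncNode`, `CountedDesyncNode`, `run_two_overlapping_rsu`) are made up for this example, and the "fixed" variant assumes desync requests would be reference-counted rather than strictly idempotent, so that the node resyncs only after the last ALTER finishes.

```python
import errno


class DesyncError(Exception):
    """Raised when a desync or resync request is refused."""


class SingleDesyncNode:
    """Models the reported behaviour: a second desync is refused."""

    def __init__(self):
        self.desynced = False

    def desync(self):
        if self.desynced:
            # Corresponds to "Node desync failed.: 11 (Resource temporarily
            # unavailable)" in the log; EAGAIN is that error on Linux.
            raise DesyncError(errno.EAGAIN)
        self.desynced = True

    def resync(self):
        self.desynced = False


class CountedDesyncNode:
    """Hypothetical fix (assumption): nested desyncs are counted."""

    def __init__(self):
        self.desync_count = 0

    @property
    def desynced(self):
        return self.desync_count > 0

    def desync(self):
        # Repeated desyncs succeed; the node simply stays desynced.
        self.desync_count += 1

    def resync(self):
        if self.desync_count == 0:
            raise DesyncError("resync without matching desync")
        self.desync_count -= 1


def run_two_overlapping_rsu(node):
    """Start two overlapping RSU operations; return True if both could start."""
    node.desync()        # first ALTER TABLE enters RSU
    try:
        node.desync()    # second ALTER TABLE starts while the first is running
    except DesyncError:
        node.resync()    # first ALTER finishes; the second one was rejected
        return False
    node.resync()        # second ALTER finishes, node must stay desynced
    node.resync()        # first ALTER finishes, node can resync now
    return True


if __name__ == "__main__":
    print("single desync:", run_two_overlapping_rsu(SingleDesyncNode()))
    print("counted desync:", run_two_overlapping_rsu(CountedDesyncNode()))
```

With the counted variant the node stays desynced until every overlapping ALTER has called `resync()`, which is what parallel RSU schema changes would need.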