TOI ALTER TABLE blocks all DML's for duration of ALTER TABLE in separate databases

Bug #928919 reported by Patrick Zoblisein
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Galera
Confirmed
Wishlist
Alex Yurchenko
MySQL patches by Codership
Confirmed
Wishlist
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Confirmed
Wishlist
Unassigned
5.6
Confirmed
Wishlist
Unassigned

Bug Description

Hi,

Using Percona XtraDB Cluster with Galera - 5.5.17-22.1.log and am seeing long-running ALTER TABLE statements in Node1-Database A block all writes on Node2-Database B using wsrep_OSU_method='TOI'.

--Node 1 - Database "n126db1" - table "t1"

CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `hostname` varchar(64) NOT NULL,
  `port` int(11) NOT NULL,
  `instime` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18042764 DEFAULT CHARSET=latin1;

--Node 2 - Database "n127db2" - table "t2"

CREATE TABLE `t2` (
  `c1` int(11) NOT NULL,
  `c2` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Node 1 (n126db1) - ALTER TABLE t1 add column c5 int;
Node 2 (n127db2) - insert into t2 values (1,2);

mysql> show processlist;
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
| 1 | event_scheduler | localhost | NULL | Daemon | 64607 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 2 | system user | | NULL | Sleep | 64606 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 3 | system user | | n126db1 | Sleep | 44 | copy to tmp table | alter table t1 add column c5 int | 0 | 0 | 4066970 |
| 4 | system user | | NULL | Sleep | 566 | committed 3569220 | NULL | 0 | 0 | 1 |
| 5 | system user | | NULL | Sleep | 566 | committed 3569199 | NULL | 0 | 0 | 1 |
| 6 | system user | | NULL | Sleep | 566 | committed 3569222 | NULL | 0 | 0 | 1 |
| 7 | system user | | NULL | Sleep | 566 | committed 3569215 | NULL | 0 | 0 | 1 |
| 8 | system user | | NULL | Sleep | 566 | committed 3569219 | NULL | 0 | 0 | 1 |
| 9 | system user | | NULL | Sleep | 566 | committed 3569211 | NULL | 0 | 0 | 1 |
| 10 | system user | | NULL | Sleep | 566 | committed 3569223 | NULL | 0 | 0 | 1 |
| 4329 | root | localhost | n127db2 | Query | 42 | query end | insert into t2 values (1,2) | 0 | 0 | 2 |
| 4339 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 1 |
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
12 rows in set (0.00 sec)

The documentation at http://www.codership.com/wiki/doku.php?id=rolling_schema_upgrade states that part of the database in the whole cluster is locked - but the above test seems to show that ALL databases on the node are locked.

Contrast this behavior when the test is run on a single-RSU node. The ALTER TABLE takes just as long - but inserts into the other database are not blocked.

Thanks
Patrick

no longer affects: codership-mysql
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Default setting of replicator.commit_order (IN-ORDER) prevents out-of-order committing of changes to the database, hence connection hanging in "query end" state: DML was applied, but commit postponed till the end of DDL.

Possible actions:
1) Leave as it is and fix documentation
2) Disable commit order control for TOI actions
The latter is probably easier.

Changed in galera:
assignee: nobody → Alex Yurchenko (ayurchen)
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Certification still works as expected, postponed till next release.

Changed in galera:
importance: Undecided → Wishlist
milestone: none → 23.2.1
status: New → Confirmed
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

One important consideration - the need to commit db changes (including DDLs) in total order makes it nearly impossible to fix.

Changed in galera:
milestone: 23.2.1 → none
status: Confirmed → Won't Fix
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

After more consideration this appears to be an issue specific to MySQL DDL processing and solution lies there.
closely related https://bugs.launchpad.net/codership-mysql/+bug/1257069

Changed in codership-mysql:
importance: Undecided → Wishlist
status: New → Confirmed
Changed in galera:
milestone: none → 4.0
status: Won't Fix → Confirmed
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1156

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.