TOI ALTER TABLE blocks all DML's for duration of ALTER TABLE in separate databases

Bug #928919 reported by Patrick Zoblisein on 2012-02-08
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Galera
Wishlist
Alex Yurchenko
MySQL patches by Codership
Wishlist
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Confirmed
Wishlist
Unassigned
5.6
Confirmed
Wishlist
Unassigned

Bug Description

Hi,

Using Percona XtraDB Cluster with Galera - 5.5.17-22.1.log and am seeing long-running ALTER TABLE statements in Node1-Database A block all writes on Node2-Database B using wsrep_OSU_method='TOI'.

--Node 1 - Database "n126db1" - table "t1"

CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `hostname` varchar(64) NOT NULL,
  `port` int(11) NOT NULL,
  `instime` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18042764 DEFAULT CHARSET=latin1;

--Node 2 - Database "n127db2" - table "t2"

CREATE TABLE `t2` (
  `c1` int(11) NOT NULL,
  `c2` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Node 1 (n126db1) - ALTER TABLE t1 add column c5 int;
Node 2 (n127db2) - insert into t2 values (1,2);

mysql> show processlist;
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
| 1 | event_scheduler | localhost | NULL | Daemon | 64607 | Waiting on empty queue | NULL | 0 | 0 | 1 |
| 2 | system user | | NULL | Sleep | 64606 | wsrep aborter idle | NULL | 0 | 0 | 1 |
| 3 | system user | | n126db1 | Sleep | 44 | copy to tmp table | alter table t1 add column c5 int | 0 | 0 | 4066970 |
| 4 | system user | | NULL | Sleep | 566 | committed 3569220 | NULL | 0 | 0 | 1 |
| 5 | system user | | NULL | Sleep | 566 | committed 3569199 | NULL | 0 | 0 | 1 |
| 6 | system user | | NULL | Sleep | 566 | committed 3569222 | NULL | 0 | 0 | 1 |
| 7 | system user | | NULL | Sleep | 566 | committed 3569215 | NULL | 0 | 0 | 1 |
| 8 | system user | | NULL | Sleep | 566 | committed 3569219 | NULL | 0 | 0 | 1 |
| 9 | system user | | NULL | Sleep | 566 | committed 3569211 | NULL | 0 | 0 | 1 |
| 10 | system user | | NULL | Sleep | 566 | committed 3569223 | NULL | 0 | 0 | 1 |
| 4329 | root | localhost | n127db2 | Query | 42 | query end | insert into t2 values (1,2) | 0 | 0 | 2 |
| 4339 | root | localhost | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 1 |
+------+-----------------+-----------+---------+---------+-------+------------------------+----------------------------------+-----------+---------------+-----------+
12 rows in set (0.00 sec)

The documentation at http://www.codership.com/wiki/doku.php?id=rolling_schema_upgrade states that part of the database in the whole cluster is locked - but the above test seems to show that ALL databases on the node are locked.

Contrast this behavior when the test is run on a single-RSU node. The ALTER TABLE takes just as long - but inserts into the other database are not blocked.

Thanks
Patrick

no longer affects: codership-mysql
Alex Yurchenko (ayurchen) wrote :

Default setting of replicator.commit_order (IN-ORDER) prevents out-of-order committing of changes to the database, hence connection hanging in "query end" state: DML was applied, but commit postponed till the end of DDL.

Possible actions:
1) Leave as it is and fix documentation
2) Disable commit order control for TOI actions
The latter is probably easier.

Changed in galera:
assignee: nobody → Alex Yurchenko (ayurchen)
Alex Yurchenko (ayurchen) wrote :

Certification still works as expected, postponed till next release.

Changed in galera:
importance: Undecided → Wishlist
milestone: none → 23.2.1
status: New → Confirmed
Alex Yurchenko (ayurchen) wrote :

One important consideration - the need to commit db changes (including DDLs) in total order makes it nearly impossible to fix.

Changed in galera:
milestone: 23.2.1 → none
status: Confirmed → Won't Fix
Alex Yurchenko (ayurchen) wrote :

After more consideration this appears to be an issue specific to MySQL DDL processing and solution lies there.
closely related https://bugs.launchpad.net/codership-mysql/+bug/1257069

Changed in codership-mysql:
importance: Undecided → Wishlist
status: New → Confirmed
Changed in galera:
milestone: none → 4.0
status: Won't Fix → Confirmed

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1156

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers