ALTER TABLE ... ENGINE=INNODB in one database blocks ALTER TABLE in other databases

Bug #1257069 reported by Nirbhay Choubey
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Confirmed
Undecided
Unassigned

Bug Description

Originally logged here : https://mariadb.atlassian.net/browse/MDEV-5158

Problem Description:

During simultaneous execution of ALTERs on same node, one ALTER TABLE can block the
other ALTER TABLE command even when both are running on two tables from different
databases. This seem to be known limitation.

How to reproduce:

1) Start a mysqld instance with wsrep on.
2) Connection 1:
    A long-running ALTER TABLE command on an innodb table say test1.t1

3) Connection 2:
    Simultaneously, run another ALTER TABLE on a different innodb table say test2.t2

4) Connection 3:
    Monitor the progress from a different connection:
        > SHOW PROCESSLIST;
        ..
        | 4 | root | localhost:38774 | test1 | Query | 4 | copy to tmp table | alter table t1 engine=myisam | 26.185 |
        | 6 | root | localhost:38784 | test2 | Query | 2 | checking permissions | alter table t2 engine=myisam | 0.000 |
        ..

tags: added: maria-5.5-galera
Revision history for this message
Aleksey Sanin (aleksey-l) wrote :
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

This is a conceptual problem with wsrep replication model caused by the need to control the order in which results of DDL operation become effective (when DDL operation ends). From wsrep perspective it is then that the database state undergoes the atomic change. And this change must happen in the same order on all nodes - no sooner, no later. So far MySQL implementation of DDL operations does not provide any means to control when they become effective. Since we have no way to control this, we need to pause execution of any other operations until current DDL ends.

In other words, what is important is when DDL operation ends, and for MySQL it is a matter of luck. Contrast this to transaction execution, for which we have COMMIT.

no longer affects: galera
Revision history for this message
Jeff Armstrong (n-launchpa7-f) wrote :

The point is that the DDL is in another database altogether.

We (and many other) application developers partition (not shard) our applications for scalability using the Database as a strongly walled partition. We never run DDL, transactions or updates that span multiple Databases.

As servers grow more and more powerful, the circumstance where a database may share a server with another database will grow. We have >400 databases per server with current commodity hardware and expect this to grow

Perhaps you should allow an option that we could enable that would limit this serialisation to within a single database. e.g.

wsrep_ddl_lock=server|database

We could then tune the setting to match our application.

Regards
Jeff

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Hi Jeff,

I understand all that, but the problem is not in *locking* (we have all means to allow parallel applying even within a single table). It is in *ordering*. We need each DDL to become effective on every node in exactly the same order, otherwise all this GTID stuff is for naught. Since ATM there is no "commit" operation for MySQL DDL, the only way to ensure the order is to start it in order and wait till it ends without letting anybody else to commit (we call it "total order isolation").

In other words solution to this will require some serious DDL execution hacking.

Regards,
Alex

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

essentially a duplicate of https://bugs.launchpad.net/codership-mysql/+bug/928919 , but has better comments.

Changed in codership-mysql:
status: New → Confirmed
Revision history for this message
Jeff Armstrong (n-launchpa7-f) wrote :

Hi Alex,

Please forgive my imprecision - I appreciate there is a difference between locking and ordering (serialisation).

My comment is that the serialisation of DDL is only important within the scope of a single Database - for us (and I believe many others) the scope for TOI is the Database and not the Server.

We operate a collection of Servers, and view these as hardware / network containers for Databases (Rather than running 400 instances of the MySQL server on different Ports / IPs).

All our transactional logic, ACID requirements etc. are scoped within a single Database. We move Databases from server to server as required by performance, and for efficient network usage etc. Application and web logic all follow the Database.

Would it be possible (clearly at the cost of memory / resources) to consider a mode whereby the scope of the GTID is per Database rather than per Server?

Regards
Jeff

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

> Would it be possible (clearly at the cost of memory / resources) to consider a mode whereby the scope of the GTID is per Database rather than per Server?

Logically this is one way to go about it. It probably won't be that big deal in terms of computational resources (what is few thousand GTID sequences?), but it would be a serious engineering challenge - it is not just more than one GTID sequence, it is also a changing number of sequences. So I'm not sure it is easier to do than enhancing DDL processing.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.