Hanging "System Lock" when executing "flush table ... for export"

Bug #1454441 reported by suanmeiguo on 2015-05-12
112
This bug affects 17 people
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
High
Sergei Glushchenko
5.1
Invalid
Undecided
Unassigned
5.5
Invalid
Undecided
Unassigned
5.6
Fix Released
High
Sergei Glushchenko

Bug Description

Description:
I was trying to run "flush table ... for export" on a table. There weren't any operations on that table, the query should go through. But it didn't and the query was hanging on "System Lock" for a long time. Sometimes it hangs for minutes and sometimes it hangs for hours or days.

The environment is a busy database with a lot of DML. The table is a new table without any data, and there was no operations on it. However, there were other DML running on other tables.

We found the requirements to reproduce this issue are:
1. change buffer is actively used (ibuf_size > 1)
2. the table to be flushed has no records in change buffer (we created a new table to demonstrate this)

How to repeat:
We can now reproduce this issue in a very simple environment. So attached a link to our Vagrant file that was used to reproduce this issue.

https://github.com/ftb-dataengineering/Virtual-Machine/tree/master/system-lock

Here are 4 simple steps to reproduce the hanging system lock:

1. [terminal] vagrant up & vagrant ssh
2. [terminal] sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=50000000 --mysql-db=test --mysql-user=vagrant --mysql-password=vagrant prepare
3. [sql] create table app (id int primary key);
4. [sql] flush tables app for export;

Suggested fix:
We found that in percona-server/storage/innobase/ibuf/ibuf0ibuf.cc ibuf_contract_in_background function, it's stuck in the while loop. Basically the function still tries to flush the ibuf when there's no data for that specific table_id.

In this line: n_bytes = ibuf_merge(table_id, &n_pag2, FALSE);
When there's no records in change buffer for table A, it should return 0 for n_bytes, hence the loop can be broken later on. However it returns 1 instead of 0.

Eventually the issue is from ibuf_merge_space, there's a line:
++sum_sizes;

For unknown reason it plus 1, and it becomes the return value for n_bytes.

If there's purpose for ++sum_sizes, there should be a check in ibuf_contract_in_background function. Or the ++sum_sizes should be fixed.

There is a temporary solution to resolve the hanging system lock:
set global innodb_change_buffer_max_size = 0;
or
set global innodb_change_buffering = none;

This will clear the change buffer and hence break the while loop.

This bug is also report on mysql.
https://bugs.mysql.com/bug.php?id=77011

tags: added: upstream
Download full text (6.5 KiB)

This was easy enough to verify with PS 5.6.24. While sysbench was running, I tried:

[openxs@centos ~]$ mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.24-72.2-log Percona Server (GPL), Release 72.2, Revision 8d0f85b

Copyright (c) 2009-2015 Percona LLC and/or its affiliates
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table app(id int primary key);
Query OK, 0 rows affected (0.97 sec)

mysql> flush tables app for export;
Query OK, 0 rows affected (0.01 sec)

mysql> flush tables app for export;
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)

mysql> flush tables app for export;
Query OK, 0 rows affected (0.00 sec)

mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)

mysql> flush tables app for export;

... and now it hanged. In other session we can see:

mysql> show processlist;
+----+------+-----------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
| 1 | root | localhost | test | Query | 76 | System lock | flush tables app for export | 0 | 0 |
| 2 | root | localhost | test | Query | 2 | update | INSERT INTO sbtest1(k, c, pad) VALUES(2513807, '05488995274-59078070332-49503422272-20145529088-7820 | 0 | 0 |
| 3 | root | localhost | test | Query | 0 | init | show processlist | 0 | 0 |
+----+------+-----------+------+---------+------+-------------+------------------------------------------------------------------------------------------------------+-----------+---------------+
3 rows in set (0.01 sec)

and, indeed, at this time change buffer is used:

mysql> show engine innodb status\G
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
2015-05-25 19:50:21 7f345b658700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 37 seconds
-----------------
BACKGROUND THREAD
----------...

Read more...

tags: added: i56733
Roel Van de Paar (roel11) wrote :

This bug looks upstream-familiar (besides 77011). If still needed, I can go and find/check. Let me know.

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-900

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.