Percona Server crashes with segmentation fault when disk quota is reached
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
| MySQL Server |
Unknown
|
Unknown
|
||
| Percona Server moved to https://jira.percona.com/projects/PS |
Fix Released
|
Low
|
George Ormond Lorch III | |
| 5.1 |
Invalid
|
Undecided
|
Unassigned | |
| 5.5 |
Won't Fix
|
Low
|
George Ormond Lorch III |
Bug Description
Percona Server crashes with segmentation fault when mysqld tries to write data to a table that would mean extending the tablespace file, and disk quota is reached outside the database directory, but on the same volume.
=== OS:
DISTRIB_ID=Ubuntu
DISTRIB_
DISTRIB_
DISTRIB_
2.6.32-341-ec2 #42-Ubuntu SMP Tue Dec 6 14:56:13 UTC 2011 x86_64 GNU/Linux
=== Percona Server version
Server version: 5.5.24-55-log Percona Server (GPL), Release 26.0
-- Filesystem and mount options
/dev/sdm /vol/ebs1 xfs rw,nodiratime,
=== Test case
-- create a new group and set quota of 10M for testing purposes
groupadd free2
xfs_quota -x -c 'limit -g bsoft=10m bhard=10m free2' /vol/ebs1
-- create database and change the group to free2 and setgid for the database directory so that all files created in the directory are owned by the user free2
mysql [(none)]> create database free2test;
chgrp -R free2 /vol/ebs1/
chmod +s /vol/ebs1/
-- create the table as follows in the database free2test
mysql [(none)]> use free2test;
Database changed
mysql [free2test]> CREATE TABLE `t1` ( `i` int(11) not null auto_increment primary key, c char(255) default 'dummy_text' ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.03 sec)
-- now create a 9M file outside of the database directory but in /vol/ebs1 where the quota is applied and change its group to free2 (the one on which quota is set)
dd if=/dev/zero of=/vol/
chgrp free2 /vol/ebs1/
-- now insert values into the table till MySQL crashes
insert into t1(i) values (null);
insert into t1 select null, c from t1; <-- 11th time this is executed and MySQL crashes
The gdb backtrace of the crash is attached. The section to look at is "Thread 1 (Thread 10457)", the segmentation fault is caused inside the function btr_page_
Related branches
- Laurynas Biveinis (community): Needs Fixing on 2013-04-16
- Alexey Kopytov (community): Approve on 2012-11-28
-
Diff: 16 lines (+5/-1)1 file modifiedPercona-Server/storage/innobase/fsp/fsp0fsp.c (+5/-1)
Ovais Tariq (ovais-tariq) wrote : | #1 |
Alexey Kopytov (akopytov) wrote : | #3 |
I believe it applies to upstream as well. In which case, it should also have an upstream bug report linked.
Ovais Tariq (ovais-tariq) wrote : | #4 |
I am able to repeat the crash with the test case provided above on both Percona Server 5.5.24 and 5.5.28-29.1 on CentOS 5.8
Ovais Tariq (ovais-tariq) wrote : | #5 |
The crash is also repeatable on upstream 5.5.28 and the stack trace is exactly the same. Attached you will find the complete backtrace generated from the core file.
Based on the trace, the call stack is probably like: (in reverse order)
fsp_try_
fsp_reserve_
fseg_alloc_
btr_page_alloc_low
btr_page_alloc
btr_page_
(with no NULL checking in btr_page_alloc and btr_page_
if (new_block) {
buf_block_
}
)
fsp_try_
Non-innodb code -- my_write in my_write.c handles this with a loop till space is available (with special handling for EDQUOT) whereas os_file_write_func doesn't do that and fails if ENOSPC/EDQUOT.
Ovais Tariq (ovais-tariq) wrote : | #7 |
I have reported the bug upstream: http://
Ovais -
Thanks. Could you also check 5.1 (since we will start fixing it there), and if debug builds crash differently?
Ovais Tariq (ovais-tariq) wrote : | #9 |
Error log of MySQl 5.5.28 crash
Ovais Tariq (ovais-tariq) wrote : | #10 |
Laurynas,
I am not able to repeat the crash on MySQL 5.1.66. The exact steps when done on 5.1.66 are handled gracefully and when there is no space available the following reported:
mysql> insert into t1 select null, c from t1;
ERROR 1114 (HY000): The table 't1' is full
mysql> insert into t1 select null, c from t1;
ERROR 1114 (HY000): The table 't1' is full
The error log entries are also different:
Version: '5.1.66-community' socket: '/mnt/mysql/
121117 3:06:58 InnoDB: Error: Write to file ./free2test/t1.ibd failed at offset 0 131072.
InnoDB: 16384 bytes should have been written, only -1 were written.
InnoDB: Operating system error number 122.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 122 means 'Disk quota exceeded'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://
121117 3:06:58 [ERROR] /usr/sbin/mysqld: The table 't1' is full
121117 3:07:13 [ERROR] /usr/sbin/mysqld: The table 't1' is full
Note that MySQL 5.5.28 reports "Operating system error number 0" while 5.1.66 reports "Operating system error number 122" which is more correct.
Ovais -
Thanks. Can you test the 5.5 debug build too?
Ovais Tariq (ovais-tariq) wrote : | #12 |
Laurynas,
I am able to reproduce the crash on upstream 5.5.28 debug-build. The debug-build crashes with an assertion failure instead of segmentation fault. Attached is the gdb backtrace of the crashed 5.5.28 debug-build
Ovais Tariq (ovais-tariq) wrote : | #13 |
Attached is the error log of the crashed upstream 5.5.28 debug-build
George Ormond Lorch III (gl-az) wrote : | #14 |
I tried this against 5.5.-28.29.1 on CentOS 6, using xfs and the sequence described above but with no crash. I properly got on the 15th insert:
mysql> insert into t1 select null, c from t1;
ERROR 1114 (HY000): The table 't1' is full
Perhaps there is some minor difference in xfs between Ubuntu, CentOS 5 and Centos 6?
I will try with upstream 5.5.28...
George Ormond Lorch III (gl-az) wrote : | #15 |
OK, must be some difference in xfs, MySQL 5.5.28 behaves correctly as well (unless I am missing something in the reproduction).
I will try to create a VM with Ubuntu and see if I can reproduce there...
1)
The assertion failure in #12 and #13 is the effect and not the cause of the crash.
btr0btr.c:
new_block = btr_page_
new_page = buf_block_
btr_page_alloc returns NULL and buf_block_get_frame fails on that assertion.
It falls for this assertion
ut_ad(block);
Interestingly, PS has non-debug assertions in place there:
ut_a(srv_
if (srv_pass_
return(0);
}
ut_ad(block);
Now, the reason why that assertion is not getting triggered in MySQL/PS release builds (and instead crashing with signal 11) is because:
=======
#ifdef UNIV_DEBUG
/******
Gets a pointer to the memory frame of a block.
@return pointer to the frame */
UNIV_INLINE
buf_frame_t*
buf_block_
/*=====
const buf_block_t* block) /*!< in: pointer to the control block */
__attribute_
#else /* UNIV_DEBUG */
# define buf_block_
#endif /* UNIV_DEBUG */
=======
ie. in non UNIV_DEBUG builds, it is a macro, hence the backtrace just shows only upto btr_page_
2) Regarding 'The table ... is full', fsp_reserve_
@Ovais, can you provide the cnf file you used for the MySQL/PS instances mentioned above? (also was it a shared tablespace (spaceid=0) or single tablespace)
Ovais Tariq (ovais-tariq) wrote : | #17 |
Here is the my.cnf file I used in the tests:
[client]
socket = /tmp/mysql.sock
[mysqld]
datadir = /mnt/mysql-55
user = mysql
socket = /tmp/mysql.sock
innodb_
core-file
[mysqld_safe]
core-file-size = unlimited
Ovais Tariq (ovais-tariq) wrote : | #18 |
George,
Did you test with innodb_
George Ormond Lorch III (gl-az) wrote : | #19 |
Ovais, oddly enough, no I did not use file_per_table which is quite odd as I almost always use it. Anyway, I retested on CentOS 6 with PS 5.5.28-29.1 debug builds (both with and without WITH_DEBUG=ON) and file_per table, also tried a few different things on what files are contained in the free2 group and get either:
"ERROR 1114 (HY000): The table 't1' is full"
from mysql if I "chgrp -R free2 /5gb/mysql/
121120 9:26:02 [ERROR] /usr/local/
121120 9:26:02 [ERROR] /usr/local/
from mysqld if I follow the test procedure exactly.
I failed miserably at getting CentOS 5 VM running and building yesterday, wow, Stewart was correct, it really does, uhh, stink...so I will try Ubuntu 10.4 lucid to try and save some time rather than fighting anymore with CentOS 5 and see if I can reproduce.
My specifics:
mount:
/dev/sdb on /5gb type xfs (rw,nodiratime,
my.cnf:
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld]
port = 3306
socket = /tmp/mysql.sock
skip-external-
key_buffer_size = 256M
max_allowed_packet = 1M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_
myisam_
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 8
log-bin=mysql-bin
binlog_format=mixed
server-id = 1
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
core-file
[root@localhost glorch]# ls -l /5gb/mysql/
total 149616
drwx------. 2 mysql free2 45 Nov 20 09:25 free2test
-rw-rw----. 1 mysql free2 18874368 Nov 20 09:35 ibdata1
-rw-rw----. 1 mysql free2 67108864 Nov 20 09:35 ib_logfile0
-rw-rw----. 1 mysql free2 67108864 Nov 20 09:23 ib_logfile1
-rw-rw----. 1 mysql free2 5 Nov 20 09:23 localhost.pid
drwx------. 2 mysql free2 4096 Nov 20 09:23 mysql
-rw-rw----. 1 mysql free2 27338 Nov 20 09:23 mysql-bin.000001
-rw-rw----. 1 mysql free2 126 Nov 20 09:23 mysql-bin.000002
-rw-rw----. 1 mysql free2 65536 Nov 20 09:35 mysql-bin.000003
-rw-rw----. 1 mysql free2 57 Nov 20 09:23 mysql-bin.index
drwx------. 2 mysql free2 4096 Nov 20 09:23 performance_schema
drwx------. 2 mysql free2 6 Nov 20 09:23 test
[root@localhost glorch]# ls -l /5gb/mysql/
total 9232
-rw-rw----. 1 mysql free2 65 Nov 20 09:24 db.opt
-rw-rw----. 1 mysql mysql 8578 Nov 20 09:25 t1.frm
-rw-rw----. 1 mysql mysql 9437184 Nov 20 09:35 t1.ibd
Ovais Tariq (ovais-tariq) wrote : | #20 |
George,
I only get the crash when I create a file out of MySQL but within the same group on which the quota is applied. Perhaps you didn't actually run the steps in the sequence that I mentioned? And I see that you also have binlogging enabled, so one of the messages that you see "Waiting for someone to free space" is binlogging related.
I will try to run my test case on a CentOS 6 VM and let you know.
George Ormond Lorch III (gl-az) wrote : | #21 |
OK, I got it Ovais, was just replying when I got your response here. My directory structure was slightly different than yours. I got it under Ubuntu 10.04.4 lucid using your my.cnf and a closer approximation of your directory structure. On initial inspection, Raghavendras analysis of the issue above is spot on. I will switch back to CentOS 6 (my preferred environment) and retry there, then continue on with the fix. Thanks for all the info.
George Ormond Lorch III (gl-az) wrote : | #22 |
Annnd, I got it in CentOS 6 now too. Thanks a bunch guys...possible fix forthcoming.
Now, that we have verified it, we need to see what should the behaviour be in this case -- whether to return DB_OUT_
Also, error handling is required in btr_cur_
......
} else {
*rec = btr_page_
}
......
(big_rec needs to be set NULL as well as DB_OUT_
Ovais Tariq (ovais-tariq) wrote : | #24 |
Raghu,
Related to what you are saying that InnoDB should spin and wait for space to be available, there is something similar in upstream 5.7.0 where MySQL retries the operation several times before giving up, see this bug report: http://
George Ormond Lorch III (gl-az) wrote : | #25 |
OK, so this is rather nasty, in PS 5.5.28:
btr0cur.
fsp0fsp.
So then btr0cur.
It then calls:
btr0btr.
OK, so, lets fix the math error in fsp_reserve_
n_free_up = (size - free_limit) / FSP_EXTENT_SIZE;
to:
if (size <= free_limit) {
n_free_up = 0;
} else if (alloc_type == FSP_UNDO) {
n_free_up = (size - free_limit) / FSP_EXTENT_SIZE;
}
OK, so this solves the issue for _small_ inserts, like if the test above just did something like:
for i in `seq 2000`; do mysql -e 'INSERT INTO free2test.t1(i) VALUES (null)'; done;
But, if we follow the sequence prescribed above by doing:
mysql -e 'INSERT INTO free2test.t1 SELECT null FROM free2test.t1'
Then we have a big problem. There is no space left to perform the rollback. If we set a simple break point at the end of fsp_reserve_
b fsp0fsp.c:3088 if space > 0 && (!success || !n_pages_added)
This will stop us every time we 'detect' the error...
Here is the call stack where we first detect the space issue:
#0 fsp_reserve_
#1 0x0000000000879ecc in btr_cur_
at /home/glorch/
#2 0x0000000000960e12 in row_ins_
#3 0x0000000000962681 in row_ins_index_entry (index=
#4 0x00000000009631c5 in row_ins_index_e...
Ovais Tariq (ovais-tariq) wrote : | #26 |
George,
I repeated the test case with innodb_
121127 5:38:57 InnoDB: Error: Write to file ./ibdata1 failed at offset 0 20971520.
InnoDB: 1048576 bytes should have been written, only 196608 were written.
InnoDB: Operating system error number 0.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 0 means 'Success'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://
You can see that the error log entry makes reference to partial write and that the error number was 0, which is exactly what is written to the error log when InnoDB crashed. However, repeatedly executing the insert ... select only results in the following message:
121127 5:38:57 [ERROR] /usr/sbin/mysqld: The table 't1' is full
121127 5:39:04 [ERROR] /usr/sbin/mysqld: The table 't1' is full
121127 5:39:06 [ERROR] /usr/sbin/mysqld: The table 't1' is full
Could this be taken as a workaround "using single shared tablespace with innodb_
George Ormond Lorch III (gl-az) wrote : | #27 |
Ovais, innodb_
I can make the server crash or fail gracefully by simply varying the number of rows inserted at one time which varies the both the required space to perform the inserts _and_ the space required to perform a rollback. By inserting a single row at a time, there is almost always going to be enough space left to perform a rollback when enough extents can't be reserved for the insert. By increasing the number of rows inserted at once until the balance of space reaches a point where the row size requirement is just less than the available free space is where the rollback crash will occur since then the attempt to reserve any extents to perform the rollback will fail, sending the server into a massive anxiety attack and shutting down.
So changing where the space is allocated is really just pushing peas around the plate.
See http://
The money quote:
"[17 Mar 2009 9:39] Marko Mäkelä
Fixing this bug would be a major undertaking, and it might affect the performance of InnoDB as well. InnoDB assumes in many places that space is available for the undo log records. We could add a check (and a rollback) to each place where something is written to the undo log, but that would likely move the problem elsewhere, since B-tree delete operations sometimes require extra space. We could also shut down InnoDB in the event of such a failure, but that would require extra checking (possibly even locking) in the MySQL-to-InnoDB interface.
The bottom line is that the root cause (the InnoDB system tablespace being too small) cannot be fixed without a server restart. Currently, InnoDB can't be shut down and restarted independently of MySQL. The fix is too risky to be implemented in the stable releases of MySQL."
Ovais Tariq (ovais-tariq) wrote : | #28 |
George,
Based on Marko's comments in this bug report http://
If there really would be any performance regression, I think many users would be more concerned by that rather than the fixes in tablespace management :)
I don't think "Fix Released" is a correct status for this bug. The server still crashes on full disk errors. What has been fixed is bug 1083700 (bug in reserved extent number calculation), and this one should be "Won't Fix" instead.
Percona now uses JIRA for bug reports so this bug report is migrated to: https:/
Is this reproducible with -
- Current 5.5 release?
- Current 5.1 release?
- Is this bug Percona Server-specific or is it reproducible with MySQL too?
Also would be useful to try reproducing this using a debug build.