Tokubackup initialisation fails with SIGFPE intermittently
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona Server moved to https://jira.percona.com/projects/PS | Status tracked in 5.7 | | |
5.5 | Invalid | Undecided | Unassigned |
5.6 | Opinion | High | Vlad Lesin |
5.7 | Opinion | High | Vlad Lesin |
Bug Description
On 5.6 trunk, intermittently:
./mtr --force --max-test-fail=0 --suite-
...
worker[1] mysql-test-run: WARNING: Process [mysqld.1 - pid: 30046, winpid: 30046, exit: 256] died after mysql-test-run waited 0.2 seconds for /mnt/workspace/
...
Core was generated by `/mnt/workspace
Program terminated with signal 8, Arithmetic exception.
#0 0x00002b5fc7797f4b in file_hash_
file=0x12a705e0 "/selinux/mls")
at /mnt/workspace/
136 return (the_hash[
#0 0x00002b5fc7797f4b in file_hash_
file=0x12a705e0 "/selinux/mls")
at /mnt/workspace/
#1 0x00002b5fc7797e45 in file_hash_
full_
at /mnt/workspace/
#2 0x00002b5fc7797dcc in file_hash_
this=
at /mnt/workspace/
#3 0x00002b5fc7797cfa in file_hash_
this=
file=
at /mnt/workspace/
#4 0x00002b5fc77a0af2 in manager:
this=
at /mnt/workspace/
#5 0x00002b5fc779ef9e in manager::open (this=0x2b5fc79
file=
at /mnt/workspace/
#6 0x00002b5fc77a5f36 in open (file=0x7fff9ee
at /mnt/workspace/
#7 0x0000003d99a09cde in is_selinux_
from /lib64/
#8 0x0000003d99a0fe0e in ?? () from /lib64/
#9 0x0000003d99a10306 in ?? () from /lib64/
#10 0x0000003d9894f798 in __CTOR_LIST__ () from /lib64/libc.so.6
#11 0x00002b5fc7dd5078 in ?? ()
#12 0x00002b5fc778d000 in ?? ()
#13 0x0000003d99a03f6b in _init () from /lib64/
#14 0x00002b5fc7bc44a8 in ?? ()
#15 0x0000003d97e0d4ab in call_init () from /lib64/
#16 0x0000003d97e0d5b5 in _dl_init_internal ()
from /lib64/
#17 0x0000003d97e00aaa in _dl_start_user () from /lib64/
#18 0x0000000000000010 in ?? ()
at /opt/percona-
#19 0x00007fff9ee4b6f4 in ?? ()
#20 0x00007fff9ee4b754 in ?? ()
#21 0x00007fff9ee4b76f in ?? ()
#22 0x00007fff9ee4b7e6 in ?? ()
#23 0x00007fff9ee4b7f8 in ?? ()
#24 0x00007fff9ee4b817 in ?? ()
#25 0x00007fff9ee4b882 in ?? ()
#26 0x00007fff9ee4bb7d in ?? ()
#27 0x00007fff9ee4bbf5 in ?? ()
#28 0x00007fff9ee4bc91 in ?? ()
#29 0x00007fff9ee4bcb3 in ?? ()
#30 0x00007fff9ee4bcd3 in ?? ()
#31 0x00007fff9ee4bcf2 in ?? ()
#32 0x00007fff9ee4bd0f in ?? ()
#33 0x00007fff9ee4bd24 in ?? ()
#34 0x00007fff9ee4bd3f in ?? ()
#35 0x0000000000000000 in ?? ()
Looks like division by zero in % m_size?
tags: added: tokubackup tokudb
Can't repeat the crash.
file_hash_table::m_size is explicitly initialized to 1 in the file_hash_table ctor and is changed only during rehashing in file_hash_table::maybe_resize().
So I have no solid explanation for how m_size can become zero. The only theoretical possibility is an overflow in file_hash_table::maybe_resize() at this line:
m_size = m_size + m_count;
m_count is the number of files in the hash table. I don't think there are enough files on the test host to cause an overflow.
But there is one more way for m_count to become big enough to overflow: the m_count decrement in file_hash_table::remove(). If that decrement ever ran with m_count already at zero, m_count would wrap around to size_t(-1).
For now, the only thing I can do is add asserts to the code in the weak hope of catching the moment when m_size becomes zero or m_count becomes size_t(-1).
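To make the suspected chain concrete, here is a minimal sketch of how a single spurious decrement could drive m_size to zero, together with the kind of checks the asserts would perform. It is not the tokubackup code: counters_sketch and all of its member functions are hypothetical stand-ins, and the only assumptions taken from the analysis above are that both counters are size_t, that m_size starts at 1, and that the resize step is m_size = m_size + m_count.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for the two counters discussed above.
// This is NOT the real file_hash_table from tokubackup.
struct counters_sketch {
    std::size_t m_size = 1;   // assumed: the ctor sets m_size to 1
    std::size_t m_count = 0;  // assumed: number of files in the table

    // Unchecked decrement, as suspected in remove().
    void remove_unchecked() { --m_count; }

    // Unchecked form of the line quoted from maybe_resize().
    void resize_unchecked() { m_size = m_size + m_count; }

    // Checked decrement: refuses to go below zero.
    bool remove_checked() {
        if (m_count == 0) return false;
        --m_count;
        return true;
    }

    // Checked resize: refuses to wrap around.
    bool resize_checked() {
        if (m_count > std::numeric_limits<std::size_t>::max() - m_size)
            return false;
        m_size += m_count;
        return true;
    }
};

// Returns m_size after one spurious remove followed by a resize.
inline std::size_t size_after_spurious_remove() {
    counters_sketch c;
    c.remove_unchecked();  // 0 - 1 wraps to size_t(-1)
    c.resize_unchecked();  // 1 + size_t(-1) wraps to 0
    return c.m_size;       // any later "% m_size" would divide by zero
}
```

Unsigned wrap-around is well defined in C++, so the sketch itself is safe to run. In a debug build, the conditions tested by remove_checked() and resize_checked() could be turned into assert() calls, so the process would abort at the first bad update rather than at the later modulo.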