Comment 0 for bug 1648015

Adrian Słowik (slowik-adrian) wrote :

A PXC 5.7.14 cluster was set up with nodes db1 and db2, and garbd as arbitrator.
During a "DROP DATABASE" statement the cluster hung and killed itself.
(Only one DROP DATABASE statement was issued at a time, and the bug still occurred several times.)
After hitting this bug several times we migrated the databases back to PXC 5.6.

Now, with only db1 and garbd as arbitrator, I tried to reproduce the problem (there is no second node to replicate to, only the arbitrator). I started 3 loops dropping databases in parallel, and the node hung:
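For reference, the parallel drop loops looked roughly like this. This is a sketch: the name prefixes, loop count, and connection options are placeholders, not the exact values from the original test.

```shell
#!/bin/sh
# Sketch of the reproduction. Prefixes (s/t/z), the loop count, and the
# mysql connection options are assumptions, not the original values.

# Emit DROP DATABASE statements for one name prefix.
gen_drops() {
    prefix=$1
    for i in 1 2 3; do
        echo "DROP DATABASE IF EXISTS ${prefix}000000${i};"
    done
}

# Feed each batch to the mysql client, three batches in parallel --
# the concurrent DDL is what triggered the TO-isolation hang on the
# single node. Guarded so the script is safe to run by accident.
if [ "${RUN_REPRO:-0}" = "1" ] && command -v mysql >/dev/null 2>&1; then
    gen_drops s | mysql -uroot &
    gen_drops t | mysql -uroot &
    gen_drops z | mysql -uroot &
    wait
fi
```

Setting RUN_REPRO=1 runs the three loops concurrently against the local node; each DROP DATABASE replicates through TO isolation, matching the "wsrep: preparing for TO isolation" states in the process list below.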

Process list:
+---------+-------------+-----------+------+---------+--------+-----------------------------------+------------------------+-----------+---------------+
| Id      | User        | Host      | db   | Command | Time   | State                             | Info                   | Rows_sent | Rows_examined |
+---------+-------------+-----------+------+---------+--------+-----------------------------------+------------------------+-----------+---------------+
|       1 | system user |           | NULL | Sleep   | 162439 | wsrep: aborter idle               | NULL                   |         0 |             0 |
|       2 | system user |           | NULL | Sleep   |  75375 | wsrep: applier idle               | NULL                   |         0 |             0 |
| 8218611 | root        | localhost | NULL | Query   |    843 | System lock                       | Drop database s0017522 |         0 |             0 |
| 8218614 | root        | localhost | NULL | Query   |    842 | wsrep: preparing for TO isolation | Drop database z0004779 |         0 |             0 |
| 8218622 | root        | localhost | NULL | Query   |    833 | wsrep: preparing for TO isolation | Drop database t0000391 |         0 |             0 |
| 8218666 | root        | localhost | NULL | Query   |      0 | starting                          | show full processlist  |         0 |             0 |
+---------+-------------+-----------+------+---------+--------+-----------------------------------+------------------------+-----------+---------------+

Log file:

2016-12-07T09:37:16.688769Z 0 [ERROR] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
2016-12-07 10:37:16 0x7f61866e1700 InnoDB: Assertion failure in thread 140056843917056 in file ut0ut.cc line 917
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
09:37:16 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=8388608
read_buffer_size=262144
max_used_connections=1301
max_threads=1301
thread_count=6
connection_count=4
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1025588 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/opt/mysql/bin/mysqld(my_print_stacktrace+0x47)[0x562f40c89e08]
/opt/mysql/bin/mysqld(handle_fatal_signal+0x46c)[0x562f4047cd27]
/lib64/libpthread.so.0(+0x10ec0)[0x7f73029feec0]
/lib64/libc.so.6(gsignal+0x3b)[0x7f7301e475eb]
/lib64/libc.so.6(abort+0x180)[0x7f7301e48a10]
/opt/mysql/bin/mysqld(+0x62726f)[0x562f4043f26f]
/opt/mysql/bin/mysqld(_ZN2ib5fatalD1Ev+0x6f)[0x562f40efcaaf]
/opt/mysql/bin/mysqld(srv_error_monitor_thread+0xeea)[0x562f40e98236]
/lib64/libpthread.so.0(+0x74d9)[0x7f73029f54d9]
/lib64/libc.so.6(clone+0x6d)[0x7f7301f01a7d]
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.

Data from our monitoring at the time of the hang:
- a high rate of disk write operations (1230 ops/sec)
- many InnoDB data fsyncs (491/sec)
- many InnoDB pages read (1580/sec)
- many files opened (729/sec)