Comment 0 for bug 1704301

Revision history for this message
Andrew Garner (muzazzi) wrote : InnoDB crashes on ALTER TABLE

Reproduced under Percona Server 5.7.18-14 (Linux/Ubuntu 14.04) and 5.7.17-11 (Homebrew / OS X 10.12) in separate environments.

InnoDB: Assertion failure in thread 140333360805632 in file dict0dict.cc line 1856
InnoDB: Failing assertion: dict_sys->size > 0
InnoDB: We intentionally generate a memory trap.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
06:10:20 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=3
max_threads=751
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 306770 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fa1acabcd90
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa1e81eee40 thread_stack 0x40000
/opt/percona-server/bin/mysqld(my_print_stacktrace+0x2c)[0xed36ac]
/opt/percona-server/bin/mysqld(handle_fatal_signal+0x461)[0x7a0e91]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fa217bf7330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fa217034c37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fa217038028]
/opt/percona-server/bin/mysqld[0x76f442]
/opt/percona-server/bin/mysqld(_Z26dict_table_rename_in_cacheP12dict_table_tPKcm+0x142b)[0x11af7cb]
/opt/percona-server/bin/mysqld(_ZN11ha_innobase26commit_inplace_alter_tableEP5TABLEP18Alter_inplace_infob+0x17ca)[0xfa657a]
/opt/percona-server/bin/mysqld[0x76133d]
/opt/percona-server/bin/mysqld(_Z17mysql_alter_tableP3THDPKcS2_P24st_ha_create_informationP10TABLE_LISTP10Alter_info+0x4177)[0xd1f8f7]
/opt/percona-server/bin/mysqld(_ZN19Sql_cmd_alter_table7executeEP3THD+0x517)[0xe0c7f7]
/opt/percona-server/bin/mysqld(_Z21mysql_execute_commandP3THDb+0x1909)[0xcac249]
/opt/percona-server/bin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x5c5)[0xcb2375]
/opt/percona-server/bin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xaba)[0xcb2eba]
/opt/percona-server/bin/mysqld(_Z10do_commandP3THD+0x1b7)[0xcb4907]
/opt/percona-server/bin/mysqld(handle_connection+0x2a0)[0xd796f0]
/opt/percona-server/bin/mysqld(pfs_spawn_thread+0x1b4)[0xf07174]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fa217bef184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fa2170fbffd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fa1acb2d210): ALTER TABLE `instances` ADD COLUMN `resurrection_paused` tinyint(1)
Connection ID (thread ID): 427
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.

I've attached a snippet of the workload that led to this initially. Running just this snippet for just a single table from this migration in a loop will eventually crash mysqld - this happens reliably in < 1 minute on my local machine.

This seems similar to the symptoms reported in https://bugs.mysql.com/bug.php?id=84672. I see that was difficult to reproduce and hand-waved away as hardware issues.

Looking at Data dictionary memory we (sometimes) see weird results just before a crash - I suppose this is dict_sys->size going negative briefly:

$ while true;do mysql -sse 'SHOW ENGINE INNODB STATUS\G' | egrep '^Dictionary memory allocated';sleep 0.25;done
Dictionary memory allocated 9881
Dictionary memory allocated 22969
Dictionary memory allocated 42601
Dictionary memory allocated 49145
Dictionary memory allocated 18446744073709549046
Dictionary memory allocated 18446744073709549046
Dictionary memory allocated 32745
Dictionary memory allocated 19657
Dictionary memory allocated 18446744073709545734
Dictionary memory allocated 18446744073709545734
Dictionary memory allocated 7206
Dictionary memory allocated 42512
Dictionary memory allocated 13033
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 104

As a workaround, enabling old_alter_table=ON seems to avoid the crasher - at least a test will run overnight without crashing.