Repeated assertion failures on 5.6 when promoted to master

Bug #1614935 reported by Louis Rouxel
This bug affects 4 people
Affects: MySQL Server (Status: Unknown, Importance: Unknown)
Affects: Percona Server, moved to https://jira.percona.com/projects/PS (Status: New, Importance: Undecided, Assigned to: Unassigned)

Bug Description

I have a server that had been running successfully on Percona Server 5.5 for years (as a master, then as a slave).
We upgraded it to Percona Server 5.6 and it worked properly as a slave for several weeks.
Once promoted to master, it crashes repeatedly until it is switched back to slave status.

The master workload is quite intensive (5 TB dataset, 300,000 tables, over 1,000 simultaneous connections, up to 1,000 queries per second), while the slave workload is much lighter.

I know that repeated assertion failures usually indicate a corrupted InnoDB tablespace, but the fact that everything works properly while the server is a slave, including full backups, makes me think we might have another kind of issue here.
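
If it were tablespace corruption, one would expect it to be detectable regardless of whether the server is acting as a master or a slave. A minimal sanity check along those lines (the database and table names below are only placeholders, not our actual schema) would be something like:

-- Placeholder names; run against the tables touched by the crashing workload.
CHECK TABLE mydb.mytable1 EXTENDED;
CHECK TABLE mydb.mytable2 EXTENDED;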

Version: 5.6.31-rel77.0

It has happened 6 times within 24 hours, sometimes very quickly after startup.

The error is *always* "Assertion failure in thread xxx in file trx0trx.cc line 1384
Failing assertion: UT_LIST_GET_LEN(trx->lock.trx_locks) == 0"

Error log:

Version: '5.6.31-77.0-log' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Percona Server (GPL), Release 77.0, Revision 5c1061c
2016-08-18 15:49:49 2be931e43700 InnoDB: Assertion failure in thread 48280564414208 in file trx0trx.cc line 1384
InnoDB: Failing assertion: UT_LIST_GET_LEN(trx->lock.trx_locks) == 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:49:49 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=65536
max_used_connections=439
max_threads=3002
thread_count=434
connection_count=432
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 458592 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2be7c814d000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2be931e42e00 thread_stack 0x40000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x2e)[0x8e212e]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x491)[0x6a7e41]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x2b9ff27e50a0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x2b9ff4486125]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x180)[0x2b9ff44893a0]
/usr/local/mysql/bin/mysqld[0xa2a579]
/usr/local/mysql/bin/mysqld[0xa2ad9d]
/usr/local/mysql/bin/mysqld[0xa2b395]
/usr/local/mysql/bin/mysqld[0x947618]
/usr/local/mysql/bin/mysqld(_Z13ha_commit_lowP3THDbb+0x86)[0x605786]
/usr/local/mysql/bin/mysqld(_ZN13MYSQL_BIN_LOG6commitEP3THDb+0x1a9)[0x89e279]
/usr/local/mysql/bin/mysqld(_Z15ha_commit_transP3THDbb+0x2f7)[0x605df7]
/usr/local/mysql/bin/mysqld(_Z17trans_commit_stmtP3THD+0x29)[0x799ca9]
/usr/local/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x908)[0x71abd8]
/usr/local/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x598)[0x71f578]
/usr/local/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xf01)[0x720511]
/usr/local/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x15f)[0x6f40ff]
/usr/local/mysql/bin/mysqld(handle_one_connection+0x47)[0x6f41e7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2b9ff27dcb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b9ff4531fbd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (2be7b7a91010): is an invalid pointer
Connection ID (thread ID): 5429
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
160818 15:49:53 mysqld_safe Number of processes running now: 0
160818 15:49:53 mysqld_safe mysqld restarted

Configuration:

[mysqld]
slave-skip-errors = 1062
log-slave-updates = ON
query_cache_size = 0
query_cache_type = 0
ssl-ca = /etc/mysql/ssl/ca-cert.pem
ssl-cert = /etc/mysql/ssl/cert.pem
ssl-key = /etc/mysql/ssl/cert.key
skip-log-warnings
read_only = 0
basedir = /usr/local/mysql/
log-bin = /mysql/log/bin-log
relay-log = /mysql/log/relay-bin
server-id = 1
binlog_format = STATEMENT
skip-name-resolve
skip-external-locking
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
datadir = /var/lib/mysql/data
tmpdir = /var/lib/mysql/tmp
bind-address = 0.0.0.0
key_buffer_size = 32M
read_buffer_size = 64K
read_rnd_buffer_size = 1M
sort_buffer_size = 64K
max_allowed_packet = 200M
long_query_time = 0
max_connections = 3000
back_log = 500
delayed_queue_size = 50000
character-set-server = utf8
default-storage-engine = InnoDB
thread_cache_size = 100
expire_logs_days = 1
low-priority-updates
innodb_stats_on_metadata = 0
innodb_file_per_table
innodb_data_file_path = ibdata1:10M:autoextend
innodb_flush_method = O_DIRECT
innodb_buffer_pool_size = 270G
innodb_log_file_size = 4G
innodb_log_buffer_size = 16M
innodb_thread_sleep_delay = 1000
innodb_thread_concurrency = 36
innodb_log_file_size = 8G
innodb_io_capacity = 6000
innodb_log_group_home_dir = /mysql/iblog
performance_schema = OFF
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_file_format = barracuda
max-connect-errors = 1000000
log-error = /mysql/log/mysql-error.log
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_purge_threads = 1
table_open_cache = 81920
innodb_open_files = 131072
open-files-limit = 166851
table-definition-cache = 8192
innodb_flush_neighbors = 0
innodb_read_ahead = 0
binlog_checksum = none

I've worked around the problem by switching production back to a 5.5 server and leaving this 5.6 server as a slave.

Tags: upstream
Louis Rouxel (lrouxel) wrote:

Hi,

We have investigated this crash and it is not related to concurrency or to write operations.
It is actually triggered by a specific kind of query, involving CONVERT_TZ, that was not run on the slaves.

This is an upstream MySQL bug.
We have created a ticket with additional details and specific ways to reproduce the crash:
http://bugs.mysql.com/bug.php?id=82910
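
For illustration only, the triggering statements have this general shape (the table and column names here are placeholders; the actual reproduction case is in the upstream ticket above):

-- Placeholder schema; see the upstream MySQL ticket for the real test case.
SELECT CONVERT_TZ(created_at, 'UTC', 'Europe/Paris')
FROM orders
WHERE id = 42;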

Shahriyar Rzayev (rzayev-sehriyar) wrote:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3519
