Master database crashes apparently triggered by network outage
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Percona Server moved to https://jira.percona.com/projects/PS | New | Undecided | Unassigned |
Bug Description
Network issues between two datacenters triggered an apparent MySQL bug that crashed a large number of our MySQL masters.
Linux <HOSTNAME> 3.18.27-
The crash happened on several different versions of MySQL:
5.6.21-70.1
5.6.28-76.1
5.6.29-76.2
5.6.30-76.3
5.6.31-77.0
The following is the relevant output from an error log:
2016-09-02 09:27:13 12003 [Warning] Aborted connection 32826055 to db: 'unconnected' user: 'repl' host: 'XXX.XXX.XXX.XXX' (Failed on my_net_write())
2016-09-02 09:27:23 12003 [Note] Start binlog_dump to master_
2016-09-02 09:27:36 12003 [Warning] Aborted connection 32826292 to db: 'unconnected' user: 'repl' host: 'XXX.XXX.XXX.XXX' (Got an error reading communication packets)
09:27:40 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://
key_buffer_
read_buffer_
max_used_
max_threads=16386
thread_count=6709
connection_
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7ec9a6a3b000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7ee802b6be40 thread_stack 0x30000
/usr/sbin/
/usr/sbin/
/lib/x86_
/usr/sbin/
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7ec9fd87e010): is an invalid pointer
Connection ID (thread ID): 32826829
Status: NOT_KILLED
You may download the Percona Server operations manual by visiting
http://
in the manual which will help you identify the cause of the crash.
160902 09:27:43 mysqld_safe Number of processes running now: 0
160902 09:27:43 mysqld_safe mysqld restarted
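For triage across many masters, it may help to reduce each error log to a short timeline of aborted replication connections followed by the crash signal. The following is a minimal sketch, assuming the log lines keep the exact format shown in the excerpt above. The function name `crash_timeline` and the tuple layout are hypothetical and chosen only for illustration.

```python
import re

# Timestamped warning lines, for example:
# 2016-09-02 09:27:13 12003 [Warning] Aborted connection 32826055 to db: ...
ABORTED = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ \[Warning\] "
    r"Aborted connection (?P<conn>\d+) to db: '[^']*' "
    r"user: '(?P<user>[^']*)' host: '[^']*' \((?P<reason>.*)\)$"
)

# The crash line carries only a time of day, for example:
# 09:27:40 UTC - mysqld got signal 11 ;
SIGNAL = re.compile(r"^(?P<time>\d{2}:\d{2}:\d{2}) UTC - mysqld got signal (?P<sig>\d+)")


def crash_timeline(lines):
    """Return (time, kind, user, detail) tuples in the order they appear."""
    events = []
    for raw in lines:
        line = raw.strip()
        m = ABORTED.match(line)
        if m:
            # Keep only the HH:MM:SS part so both event kinds compare easily.
            events.append((m["ts"][11:], "aborted", m["user"], m["reason"]))
            continue
        m = SIGNAL.match(line)
        if m:
            events.append((m["time"], "signal", None, m["sig"]))
    return events


# Lines copied from the excerpt above (truncated lines are left out).
SAMPLE = [
    "2016-09-02 09:27:13 12003 [Warning] Aborted connection 32826055 to db: 'unconnected' user: 'repl' host: 'XXX.XXX.XXX.XXX' (Failed on my_net_write())",
    "2016-09-02 09:27:36 12003 [Warning] Aborted connection 32826292 to db: 'unconnected' user: 'repl' host: 'XXX.XXX.XXX.XXX' (Got an error reading communication packets)",
    "09:27:40 UTC - mysqld got signal 11 ;",
]

if __name__ == "__main__":
    for event in crash_timeline(SAMPLE):
        print(event)
```

Run against the excerpt, this lists two aborted `repl` connections and then signal 11, which makes it easier to compare the sequence across hosts.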
=======
My.cnf
=======
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# especially if they contain "#" chars...
# Remember to edit /etc/mysql/
[client]
port = 3306
socket = /var/run/
# Here are entries for some specific programs
# The following values assume you have at least 32M ram
# This was formerly known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/
nice = 0
numa_interleave = 1
flush_caches = 1
open_files_limit = 65536
[mysqld]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/
socket = /var/run/
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
default-time-zone = '+0:00'
skip-external-
# don't do dns or something
skip-name-resolve
performance_schema = 0
# it'd be cool to set this to only listen in 127.0.0.1 and 10.x.y.z but not
# the public IP. I don't know if bind-address accepts multiple values,
# though
bind-address = 0.0.0.0
#
# * Fine Tuning
#
# These are the Debian defaults and probably need to be tuned
key_buffer = 16M
max_allowed_packet = 128M
thread_stack = 192K
thread_cache_size = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
max_connections = 16384
max_user_
#table_cache = 64
#thread_concurrency = 10
#
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 16M
#
# * Logging and Replication
#
# Both locations get rotated by the cronjob.
# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!
#general_log_file = /var/log/
#general_log = 1
#
# Error log - should be very few entries.
#
log-warnings = 2
log_error = /var/log/
#
# Here you can see queries with especially long duration
slow_query_log = 1
slow_query_log_file = /var/log/
long_query_time = 10
#log-queries-
log_slow_verbosity = microtime,innodb
slow_query_
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
server-id = 35791131
report-host = schemalessdb473
log-bin = /var/lib/
auto_increment_
auto_increment_
enforce-
gtid-mode = ON
# force an fsync every statement (trading performance to avoid corruption)
sync_binlog = 1
log-slave-updates
expire_logs_days = 5
slave-net-time = 30
max_binlog_size = 1G
binlog_format = MIXED
table_definitio
table_open_
lock_wait_timeout = 300
relay_log_
relay_log_recovery = ON
default-
#binlog_do_db = include_
#binlog_ignore_db = include_
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
innodb_
innodb_flush_method = O_DIRECT
innodb_
innodb_
innodb_file_format = ANTELOPE
innodb_
innodb_
#
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=
# ssl-cert=
# ssl-key=
character-
collation-
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
single-transaction
[mysql]
#no-auto-rehash # faster start of mysql but no tab completion
[isamchk]
key_buffer = 16M
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
# vim: set syntax=conf:
Here are the relevant logs from another MySQL server that crashed:
2016-09-02 08:40:55 58538 [Note] While initializing dump thread for slave with UUID <xxx>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(1506741).
2016-09-02 08:40:55 58538 [Note] Start binlog_dump to master_thread_id(8749347) slave_server(41716794), pos(, 4)
2016-09-02 08:41:35 58538 [Warning] Aborted connection 8749347 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (Failed on my_net_write())
2016-09-02 08:41:40 58538 [Note] While initializing dump thread for slave with UUID <xxx>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(1506741).
2016-09-02 08:41:40 58538 [Note] Start binlog_dump to master_thread_id(8749361) slave_server(41716794), pos(, 4)
2016-09-02 08:41:50 58538 [Warning] Aborted connection 1506741 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (failed on net_flush())
2016-09-02 08:41:54 58538 [Note] While initializing dump thread for slave with UUID <xxx>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(8749333).
2016-09-02 08:41:54 58538 [Note] Start binlog_dump to master_thread_id(8749365) slave_server(409262272), pos(, 4)
2016-09-02 08:42:04 58538 [Warning] Aborted connection 8749333 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (Failed on my_net_write())
2016-09-02 08:43:14 58538 [Warning] Aborted connection 8749365 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (Failed on my_net_write())
2016-09-02 08:43:29 58538 [Warning] Aborted connection 8749361 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (Failed on my_net_write())
2016-09-02 08:43:33 58538 [Warning] Aborted connection 8749377 to db: 'unconnected' user: 'repl' host: 'X.X.X.X' (Got an error reading communication packets)
08:43:53 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=830
max_threads=4098
thread_count=811
connection_count=811
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1816973 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fa07171e000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa13ec5be10 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8e66dc]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x66bcb1]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fb37fdfdcb0]
/usr/sbin/mysqld[0x1320e40]
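The second log interleaves zombie dump thread kills with aborted `repl` connections, so it can be useful to separate the aborted connections that the master killed itself from those that failed for another reason. The following is a rough sketch, assuming the message wording shown in the log above. The name `zombie_summary` and its return shape are hypothetical, and the matching is only a heuristic on connection IDs, not a statement about the root cause.

```python
import re
from collections import Counter

# "Master is killing the zombie dump thread(1506741)."
ZOMBIE = re.compile(r"Master is killing the zombie dump thread\((\d+)\)")
# "[Warning] Aborted connection 8749347 to db: ..."
ABORTED = re.compile(r"\[Warning\] Aborted connection (\d+) ")


def zombie_summary(lines):
    """Count zombie kills per thread ID and list aborted connections
    whose ID never appears in a zombie-kill message."""
    kills = Counter()
    aborted = []
    for line in lines:
        m = ZOMBIE.search(line)
        if m:
            kills[int(m.group(1))] += 1
        m = ABORTED.search(line)
        if m:
            aborted.append(int(m.group(1)))
    unexplained = [conn for conn in aborted if conn not in kills]
    return kills, unexplained


NOTE = ("58538 [Note] While initializing dump thread for slave with UUID <xxx>, "
        "found a zombie dump thread with the same UUID. "
        "Master is killing the zombie dump thread({}).")
WARN = ("58538 [Warning] Aborted connection {} to db: 'unconnected' "
        "user: 'repl' host: 'X.X.X.X' ({})")

# Events rebuilt from the log above, in the order they were written.
SAMPLE = [
    "2016-09-02 08:40:55 " + NOTE.format(1506741),
    "2016-09-02 08:41:35 " + WARN.format(8749347, "Failed on my_net_write()"),
    "2016-09-02 08:41:40 " + NOTE.format(1506741),
    "2016-09-02 08:41:50 " + WARN.format(1506741, "failed on net_flush()"),
    "2016-09-02 08:41:54 " + NOTE.format(8749333),
    "2016-09-02 08:42:04 " + WARN.format(8749333, "Failed on my_net_write()"),
    "2016-09-02 08:43:14 " + WARN.format(8749365, "Failed on my_net_write()"),
    "2016-09-02 08:43:29 " + WARN.format(8749361, "Failed on my_net_write()"),
    "2016-09-02 08:43:33 " + WARN.format(8749377, "Got an error reading communication packets"),
]

if __name__ == "__main__":
    kills, unexplained = zombie_summary(SAMPLE)
    print("zombie kills per thread:", dict(kills))
    print("aborted without a zombie kill:", unexplained)
```

On this log, thread 1506741 is reported as killed twice, and four of the six aborted connections have no matching zombie-kill message, including the last one before the crash.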