server hang on writes after thread-pool turned on
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Percona Server moved to https://jira.percona.com/projects/PS | New | Undecided | Unassigned | |
Bug Description
I can not fully confirm it's related to changing to "thread_
The symptom was that our web service went down because it could not connect to the MySQL server, which reported 'too many connections'. (The extra port was not accessible either, so I could not use the console.) The server has a 32K connection limit, but the connection count is typically only a few thousand. lsof showed that the process had a very large number of connections in the 'CLOSE_WAIT' state, which means mysqld had not called 'close' on them. Meanwhile, the load, CPU, and disk I/O on this MySQL server were very low. I attached gdb to the process and saw that many threads were in:
#0 pthread_
#1 0x00000000008d5fd7 in ?? ()
#2 0x00000000008d6532 in thr_lock ()
#3 0x00000000008d6bdb in thr_multi_lock ()
#4 0x00000000007d2c99 in mysql_lock_
...
(I attached the full gdb output)
We have an in-house MySQL traffic monitoring tool based on sniffing inbound and outbound packets. It showed that many write queries got stuck all of a sudden, and no response packets left the server after that.
We finally had to kill the server and restart it.
+------
| Variable_name | Value |
+------
| innodb_version | 5.6.31-77.0 |
| protocol_version | 10 |
| slave_type_
| tls_version | TLSv1.1,TLSv1.2 |
| version | 5.6.31-77.0-log |
| version_comment | Percona Server (GPL), Release 77.0, Revision 5c1061c |
| version_
| version_compile_os | debian-linux-gnu |
+------
thread pool variables I set in my.cnf:
thread_
extra_max_
extra_port=3307
thread_
other variables:
sql_mode = STRICT_ALL_TABLES
key_buffer = 32M
max_allowed_packet = 16M
thread_stack = 256K
thread_cache_size = 64
default_
max_connections = 32000
table_open_cache = 10240
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_
innodb_flush_method = O_DIRECT
innodb_
innodb_
innodb_io_capacity = 10000
max_connect_errors = 1000000
#thread_concurrency = 10
back_log = 8192
I found the cause: mysqldump was running while there were a lot of write operations.
Basically, mysqldump takes a read lock at the beginning, and by default the queries issued by mysqldump go to the low-priority queue, so the later unlock-tables query has no chance of being scheduled if too many writes have arrived and are blocked on the lock.
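To make the starvation argument concrete, here is a toy model in C++. It is only a sketch under my own assumptions, not the actual thread pool code: the names (Job, unlock_gets_scheduled) are hypothetical, the backlog of writes is assumed to be queued before any worker picks up a job, and a worker that picks up a write while the read lock is held is assumed to stay blocked for good.

```cpp
#include <deque>

// Toy model only: two FIFO queues and a fixed number of workers.
// Job, unlock_gets_scheduled and the parameters are hypothetical names,
// not identifiers from the Percona Server source.
enum class Job { Write, Unlock };

// Returns true if the UNLOCK TABLES job is ever picked up by a worker
// while the dump connection still holds the read lock.
bool unlock_gets_scheduled(int workers, int queued_writes,
                           bool unlock_is_high_prio) {
    std::deque<Job> high, low;

    // Writes arrive first and all land in the low-priority queue.
    for (int i = 0; i < queued_writes; ++i) low.push_back(Job::Write);

    // The unlock either waits behind the writes or goes to the
    // high-priority queue.
    (unlock_is_high_prio ? high : low).push_back(Job::Unlock);

    int free_workers = workers;
    while (free_workers > 0 && (!high.empty() || !low.empty())) {
        // A free worker always serves the high-priority queue first.
        std::deque<Job>& q = !high.empty() ? high : low;
        Job job = q.front();
        q.pop_front();

        if (job == Job::Unlock) return true;  // lock released, writes can proceed

        // A write blocks on the table lock and keeps its worker occupied.
        --free_workers;
    }
    return false;  // every worker is stuck behind the lock
}
```

In this model, 4 workers and 100 queued writes never reach the unlock when it sits in the low-priority queue, but reach it immediately when it is treated as high priority.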
Fix suggestion: in unix.cc, connection_is_high_prio(): change transaction_active(c->thd) to transaction_active(c->thd) || thd_in_lock_tables(c->thd))
In sql/threadpool_
change
c->tickets > 0 && thd_is_
to
c->tickets > 0 && (thd_is_
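The effect of the proposed change can be sketched with stand-in types. This is an illustration under my own assumptions, not the real server code: Thd and Connection are hypothetical structs with plain boolean fields in place of the server's accessor functions, and high_prio_before / high_prio_after are made-up names for the condition before and after the change.

```cpp
// Stand-in types for illustration only; the real server uses its own
// connection and THD structures and accessor functions.
struct Thd {
    bool transaction_active;  // connection is inside a transaction
    bool in_lock_tables;      // connection holds locks taken via LOCK TABLES
};

struct Connection {
    int tickets;  // remaining high-priority tickets
    Thd thd;
};

// Condition before the change: only an open transaction qualifies.
bool high_prio_before(const Connection& c) {
    return c.tickets > 0 && c.thd.transaction_active;
}

// Condition after the change: holding table locks qualifies as well,
// so the statement that releases them is not starved.
bool high_prio_after(const Connection& c) {
    return c.tickets > 0 &&
           (c.thd.transaction_active || c.thd.in_lock_tables);
}
```

A connection that holds table locks but has no open transaction, which is the mysqldump case described above, is low priority under the first condition and high priority under the second. A connection with no tickets left stays low priority either way.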