PS 5.6-QA tree in revid 418 (and earlier) locks up in mysqld

Bug #1222694 reported by Roel Van de Paar on 2013-09-09
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Invalid
Critical
Roel Van de Paar
Random Query Generator
Critical
Roel Van de Paar

Bug Description

[roel@qaserver randgen]$ /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysql -uroot -S/ssd//236684/current1_8/tmp/master.sock
(...nothing happens...)
^C
[roel@qaserver randgen]$ gdb /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysqld --pid 25247
[...]
(gdb) thread apply all bt
+thread apply all bt

Thread 1 (process 25247):
+bt
#0 0x0000003636edf253 in poll () from /lib64/libc.so.6
#1 0x00000000005a44ba in handle_connections_sockets () at /bzr/5.6-QA_dbg/Percona-Server-5.6.12-rc60.4/sql/mysqld.cc:6258
#2 0x00000000005ab49b in mysqld_main (argc=73, argv=0x267a208) at /bzr/5.6-QA_dbg/Percona-Server-5.6.12-rc60.4/sql/mysqld.cc:5673
#3 0x0000003636e1ecdd in __libc_start_main () from /lib64/libc.so.6
#4 0x000000000059c735 in _start ()

Tags: qa Edit Tag help
Roel Van de Paar (roel11) wrote :

[roel@qaserver randgen]$ ps -ef | grep 25247
roel 25247 1 0 17:14 pts/1 00:00:22 /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysqld --no-defaults --basedir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64 --character-sets-dir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/share/charsets --secure-file-priv=/ssd//236684/current1_8 --log-bin-trust-function-creators --character-set-server=latin1 --lc-messages-dir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/share --tmpdir=/ssd//236684/current1_8/tmp --connect-timeout=60 --pid-file=/ssd//236684/current1_8/run/master.pid --port=13140 --socket=/ssd//236684/current1_8/tmp/master.sock --datadir=/ssd//236684/current1_8/master-data --log-output=table,file --general-log --general-log-file=/ssd//236684/current1_8/log/master.log --slow-query-log --slow-query-log-file=/ssd//236684/current1_8/log/master-slow.log --server-id=1 --loose-innodb_data_file_path=ibdata1:10M:autoextend --local-infile --loose-skip-ndbcluster --key_buffer_size=1M --sort_buffer=256K --max_heap_table_size=1M --ssl-ca=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/cacert.pem --ssl-cert=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/server-cert.pem --ssl-key=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/server-key.pem --core-file --loose-new --sql-mode=no_engine_substitution --relay-log=slave-relay-bin --loose-innodb --secure-file-priv= --max-allowed-packet=16Mb --loose-innodb-status-file=1 --master-retry-count=65535 --loose-debug-assert-if-crashed-table --loose-debug-assert-on-error --skip-name-resolve --log-output=none --sql_mode=ONLY_FULL_GROUP_BY --utility-user-password=test --innodb_change_buffering=inserts --innodb_changed_pages=FORCE --innodb_flush_log_at_trx_commit=2 --innodb_flush_method=O_DIRECT --innodb_log_arch_dir=/ssd//236684/_epoch/1378710870805672167 --innodb_log_archive=1 --innodb_log_block_size=4096 --innodb_log_buffer_size=1048576 --innodb_log_files_in_group=3 --innodb_max_changed_pages=0 --innodb_track_changed_pages=1 --innodb_use_global_flush_log_at_trx_commit=0 --innodb-buffer-pool-populate --loose-readonly-key-cache-division-limit=1 --query_cache_size=1048576 --query_cache_type=1 --readonly-key-cache-block-size=1 --readonly-loose-max-connect-errors=1 --skip-innodb_doublewrite --thread_handling=pool-of-threads --userstat --utility-user=roel@% --utility-user-schema-access=mysqlinformation_schema --innodb_purge_threads=1 --innodb_adaptive_hash_index_partitions=16 --innodb_buffer_pool_instances=4 --skip-log-bin --core-file --open-files-limit=1024

Roel Van de Paar (roel11) wrote :

It's blocking 5.6-QA testing (full runs restarted on 18/9 fully blocked)

tags: added: 56qual
Roel Van de Paar (roel11) wrote :
Download full text (3.2 KiB)

<Roel> (gdb) thread apply all bt
<Roel> +thread apply all bt
<Roel> Thread 1 (process 5354):
<Roel> +bt
<Roel> #0 0x0000003636edf253 in poll () from /lib64/libc.so.6
<Roel> #1 0x00000000005a450a in handle_connections_sockets () at /bzr/5.6-QA_dbg/Percona-Server-5.6.13-rc60.5/sql/mysqld.cc:6259
<Roel> #2 0x00000000005ab4eb in mysqld_main (argc=76, argv=0x2ca0228) at /bzr/5.6-QA_dbg/Percona-Server-5.6.13-rc60.5/sql/mysqld.cc:5674
<Roel> #3 0x0000003636e1ecdd in __libc_start_main () from /lib64/libc.so.6
<Roel> #4 0x000000000059c785 in _start ()
<Roel> that's it (return to prompt)
<Roel> laurynas,
<Roel> [roel@qaserver 528526]$ /ssd/Percona-Server-5.6.13-rc60.5-440-debug.Linux.x86_64/bin/mysqladmin --socket=/ssd//528526/current1_2/tmp/master.sock ping
<Roel> no prompt return
<laurynas> I don't understand why it's only one thread
<laurynas> where did the rest go
<Roel> idem
<laurynas> error log?
<Roel> good one, checking
<Roel> 2013-09-18 06:44:56 5354 [Note] /ssd/Percona-Server-5.6.13-rc60.5-440-debug.Linux.x86_64/bin/mysqld: ready for connections.
<Roel> Version: '5.6.13-rc60.5-debug-log' socket: '/ssd//528526/current1_2/tmp/master.sock' port: 13020 Percona Server with XtraDB (GPL), Release rc60.5, Revision 440-debug
<Roel> 2013-09-18 06:45:39 5354 [ERROR] Threadpool could not create additional thread to handle queries, because the number of allowed threads was reached. Increasing 'thread_pool_max_threads' parameter can help in this situation.
<Roel> If 'extra_port' parameter is set, you can still connect to the database with superuser account (it must be TCP connection using extra_port as TCP port) and troubleshoot the situation. A likely cause of pool blocks are clients that lock resources for long time. 'show processlist' or 'show engine innodb status' can give additional hints.
<Roel> 2013-09-18 06:45:39 5354 [Note] Threadpool has been blocked for 30 seconds
<Roel> ah
<Roel> [roel@qaserver 528526]$ grep "thread.pool.max.threads" /ssd/randgen/conf/percona_qa/5.6/5.6.yy
<Roel> SET GLOBAL thread_pool_max_threads = zero_to_thousand |
<Roel> still, seems odd?
<laurynas> hmm wait
<laurynas> Roel: zero is not a valid value for tread_pool_max_threads
<Roel> good qa includes boundary value testing :)
<laurynas> with low max thread number tpool might become blocked by design
<Roel> zero_to_thousand:
<Roel> 0 | 1 | 2 | 10 | 100 | 150 | 200 | 250 | 300 | 400 | 500 | 600 | 650 | 700 | 800 | 900 | 999 | 1000 ;
<laurynas> but I still have no idea where did the rest of threads go
<laurynas> zero is invalid-invalid, as in, value range checks will reject it outright
<Roel> ack
<laurynas> now 1 and 2 will stop the server, and probably 10 too
<Roel> but that " Threadpool has been blocked for 30 seconds" was posted ~3-4 hours ago...
<laurynas> Roel: 100 150 200 300 500 600 1000 value list should be OK (even that is long-ish)
<Roel> nothing since then
<Roel> laurynas, should the message not repeat...
<Roel> and the server still reply to pings
<Roel> etc.
<laurynas> not with threadpool
<Roel> cpu's are idle, so no loops
<laurynas> pings go as an incoming request, which I guess needs a worker thread and there are none
<Roel> right
<Roel> ...

Read more...

Roel Van de Paar (roel11) wrote :

Fixed in RQG revid 911

no longer affects: percona-server/5.6
Changed in randgen:
status: New → Fix Committed
importance: Undecided → Critical
assignee: nobody → Roel Van de Paar (roel11)
Changed in percona-server:
status: Fix Committed → Invalid
tags: removed: 56qual qablock

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-371

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers