PS 5.6-QA tree in revid 418 (and earlier) locks up in mysqld

Bug #1222694 reported by Roel Van de Paar
256
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Invalid
Critical
Roel Van de Paar
Random Query Generator
Fix Committed
Critical
Roel Van de Paar

Bug Description

[roel@qaserver randgen]$ /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysql -uroot -S/ssd//236684/current1_8/tmp/master.sock
(...nothing happens...)
^C
[roel@qaserver randgen]$ gdb /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysqld --pid 25247
[...]
(gdb) thread apply all bt
+thread apply all bt

Thread 1 (process 25247):
+bt
#0 0x0000003636edf253 in poll () from /lib64/libc.so.6
#1 0x00000000005a44ba in handle_connections_sockets () at /bzr/5.6-QA_dbg/Percona-Server-5.6.12-rc60.4/sql/mysqld.cc:6258
#2 0x00000000005ab49b in mysqld_main (argc=73, argv=0x267a208) at /bzr/5.6-QA_dbg/Percona-Server-5.6.12-rc60.4/sql/mysqld.cc:5673
#3 0x0000003636e1ecdd in __libc_start_main () from /lib64/libc.so.6
#4 0x000000000059c735 in _start ()

Tags: qa
Revision history for this message
Roel Van de Paar (roel11) wrote :

[roel@qaserver randgen]$ ps -ef | grep 25247
roel 25247 1 0 17:14 pts/1 00:00:22 /ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/bin/mysqld --no-defaults --basedir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64 --character-sets-dir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/share/charsets --secure-file-priv=/ssd//236684/current1_8 --log-bin-trust-function-creators --character-set-server=latin1 --lc-messages-dir=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/share --tmpdir=/ssd//236684/current1_8/tmp --connect-timeout=60 --pid-file=/ssd//236684/current1_8/run/master.pid --port=13140 --socket=/ssd//236684/current1_8/tmp/master.sock --datadir=/ssd//236684/current1_8/master-data --log-output=table,file --general-log --general-log-file=/ssd//236684/current1_8/log/master.log --slow-query-log --slow-query-log-file=/ssd//236684/current1_8/log/master-slow.log --server-id=1 --loose-innodb_data_file_path=ibdata1:10M:autoextend --local-infile --loose-skip-ndbcluster --key_buffer_size=1M --sort_buffer=256K --max_heap_table_size=1M --ssl-ca=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/cacert.pem --ssl-cert=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/server-cert.pem --ssl-key=/ssd/Percona-Server-5.6.12-rc60.4-418-debug.Linux.x86_64/mysql-test/std_data/server-key.pem --core-file --loose-new --sql-mode=no_engine_substitution --relay-log=slave-relay-bin --loose-innodb --secure-file-priv= --max-allowed-packet=16Mb --loose-innodb-status-file=1 --master-retry-count=65535 --loose-debug-assert-if-crashed-table --loose-debug-assert-on-error --skip-name-resolve --log-output=none --sql_mode=ONLY_FULL_GROUP_BY --utility-user-password=test --innodb_change_buffering=inserts --innodb_changed_pages=FORCE --innodb_flush_log_at_trx_commit=2 --innodb_flush_method=O_DIRECT --innodb_log_arch_dir=/ssd//236684/_epoch/1378710870805672167 --innodb_log_archive=1 --innodb_log_block_size=4096 --innodb_log_buffer_size=1048576 --innodb_log_files_in_group=3 --innodb_max_changed_pages=0 --innodb_track_changed_pages=1 --innodb_use_global_flush_log_at_trx_commit=0 --innodb-buffer-pool-populate --loose-readonly-key-cache-division-limit=1 --query_cache_size=1048576 --query_cache_type=1 --readonly-key-cache-block-size=1 --readonly-loose-max-connect-errors=1 --skip-innodb_doublewrite --thread_handling=pool-of-threads --userstat --utility-user=roel@% --utility-user-schema-access=mysqlinformation_schema --innodb_purge_threads=1 --innodb_adaptive_hash_index_partitions=16 --innodb_buffer_pool_instances=4 --skip-log-bin --core-file --open-files-limit=1024

Revision history for this message
Roel Van de Paar (roel11) wrote :

It's blocking 5.6-QA testing (full runs restarted on 18/9 fully blocked)

tags: added: 56qual
Revision history for this message
Roel Van de Paar (roel11) wrote :
Download full text (3.2 KiB)

<Roel> (gdb) thread apply all bt
<Roel> +thread apply all bt
<Roel> Thread 1 (process 5354):
<Roel> +bt
<Roel> #0 0x0000003636edf253 in poll () from /lib64/libc.so.6
<Roel> #1 0x00000000005a450a in handle_connections_sockets () at /bzr/5.6-QA_dbg/Percona-Server-5.6.13-rc60.5/sql/mysqld.cc:6259
<Roel> #2 0x00000000005ab4eb in mysqld_main (argc=76, argv=0x2ca0228) at /bzr/5.6-QA_dbg/Percona-Server-5.6.13-rc60.5/sql/mysqld.cc:5674
<Roel> #3 0x0000003636e1ecdd in __libc_start_main () from /lib64/libc.so.6
<Roel> #4 0x000000000059c785 in _start ()
<Roel> that's it (return to prompt)
<Roel> laurynas,
<Roel> [roel@qaserver 528526]$ /ssd/Percona-Server-5.6.13-rc60.5-440-debug.Linux.x86_64/bin/mysqladmin --socket=/ssd//528526/current1_2/tmp/master.sock ping
<Roel> no prompt return
<laurynas> I don't understand why it's only one thread
<laurynas> where did the rest go
<Roel> idem
<laurynas> error log?
<Roel> good one, checking
<Roel> 2013-09-18 06:44:56 5354 [Note] /ssd/Percona-Server-5.6.13-rc60.5-440-debug.Linux.x86_64/bin/mysqld: ready for connections.
<Roel> Version: '5.6.13-rc60.5-debug-log' socket: '/ssd//528526/current1_2/tmp/master.sock' port: 13020 Percona Server with XtraDB (GPL), Release rc60.5, Revision 440-debug
<Roel> 2013-09-18 06:45:39 5354 [ERROR] Threadpool could not create additional thread to handle queries, because the number of allowed threads was reached. Increasing 'thread_pool_max_threads' parameter can help in this situation.
<Roel> If 'extra_port' parameter is set, you can still connect to the database with superuser account (it must be TCP connection using extra_port as TCP port) and troubleshoot the situation. A likely cause of pool blocks are clients that lock resources for long time. 'show processlist' or 'show engine innodb status' can give additional hints.
<Roel> 2013-09-18 06:45:39 5354 [Note] Threadpool has been blocked for 30 seconds
<Roel> ah
<Roel> [roel@qaserver 528526]$ grep "thread.pool.max.threads" /ssd/randgen/conf/percona_qa/5.6/5.6.yy
<Roel> SET GLOBAL thread_pool_max_threads = zero_to_thousand |
<Roel> still, seems odd?
<laurynas> hmm wait
<laurynas> Roel: zero is not a valid value for tread_pool_max_threads
<Roel> good qa includes boundary value testing :)
<laurynas> with low max thread number tpool might become blocked by design
<Roel> zero_to_thousand:
<Roel> 0 | 1 | 2 | 10 | 100 | 150 | 200 | 250 | 300 | 400 | 500 | 600 | 650 | 700 | 800 | 900 | 999 | 1000 ;
<laurynas> but I still have no idea where did the rest of threads go
<laurynas> zero is invalid-invalid, as in, value range checks will reject it outright
<Roel> ack
<laurynas> now 1 and 2 will stop the server, and probably 10 too
<Roel> but that " Threadpool has been blocked for 30 seconds" was posted ~3-4 hours ago...
<laurynas> Roel: 100 150 200 300 500 600 1000 value list should be OK (even that is long-ish)
<Roel> nothing since then
<Roel> laurynas, should the message not repeat...
<Roel> and the server still reply to pings
<Roel> etc.
<laurynas> not with threadpool
<Roel> cpu's are idle, so no loops
<laurynas> pings go as an incoming request, which I guess needs a worker thread and there are none
<Roel> right
<Roel> ...

Read more...

Revision history for this message
Roel Van de Paar (roel11) wrote :

Fixed in RQG revid 911

no longer affects: percona-server/5.6
Changed in randgen:
status: New → Fix Committed
importance: Undecided → Critical
assignee: nobody → Roel Van de Paar (roel11)
Changed in percona-server:
status: Fix Committed → Invalid
tags: removed: 56qual qablock
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-371

Nathan Chen (nchen1)
information type: Public → Public Security
To post a comment you must log in.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.