Thread pool: new threads are created too slowly even with tuning
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MariaDB |
Fix Released
|
Undecided
|
Vladislav Vaintroub |
Bug Description
If there are at least one active threads in a group, it takes long time to create a new one.
In the provided test case (1), two threads are running select from a big table, while the third one is trying to send a very fast query not depending on the table at all (e.g. SELECT NULL FROM DUAL). The timer in the test case starts right before the query and ends right after it, so the resulting time provided by MTR relates to this query only.
In the traditional case, without thread pool, the query takes a few milliseconds (1-3 ms in my tests).
With thread-pool-size=1 it takes 1.685 seconds.
I'm using thread-pool-size=1 for more obvious measurements, in fact the issue is scalable, it only requires more threads to reproduce it with a bigger pool size. For example, with thread-pool-size=4 and 10 threads querying the big table, it again takes about 1.7 seconds. Even with the minimum value of thread-
If you run the same test with count=10 (the number of threads querying the table) and with thread-pool-size=1 and default thread-
There might be various ways how such a delay might impact real-life scenarios. To name one, assume you have a job (or just user inflow) that connects to the database every 0.5 second. Each connection runs a 5-second query and exits.
With the traditional thread handling (one thread per connection), even with a 1-core machine it's a tiny load which won't make any problem. After initial growth to 10-12 open connections or so, it will stabilize and be maintained this way as long as needed.
With the artificial delay on request processing, the connections will start piling up (since the requests arrive faster than they are processed). With default parameters on the same 1-core machine it will be only a matter of a minute or so to hit the max connections limit.
The second test case is an example of a script which can be used to observe the problem. It's not pretty, and it also might stop working the same way and will need adjustments after the bug#909537 is fixed, but it reflects the idea, as it's difficult to implement nicely using standard tools.
# Test case 1
# At the moment, it could be easier implemented with sleeps,
# but I want to separate it from the bug#909537.
--enable_
SET @tp_size=
SET @tp_idle=
SET GLOBAL thread_pool_size=1;
SET GLOBAL thread_
--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings
CREATE TABLE t1 ( i INT );
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT INTO t1 SELECT a.i FROM
t1 a, t1 b, t1 c, t1 d, t1 e,
t1 f, t1 g, t1 h, t1 i, t1 j;
--let $count=2
--let $id=$count
while ($id)
{
--connect(
--dec $id
}
# Wait till old idle threads are gone
--sleep 2
--let $id=$count
while ($id)
{
--connection con$id
--send
eval SELECT COUNT(*) FROM t1 WHERE i=$id;
--dec $id
}
--connection default
--start_timer
SELECT NULL FROM DUAL;
--end_timer
SHOW GLOBAL STATUS LIKE 'threadpool%
SHOW PROCESSLIST;
DROP TABLE t1;
SET GLOBAL thread_
SET GLOBAL thread_
--exit
# End of test case 1
# Test case 2
# It assumes you already have the server running
# Tune the sleep time depending on your threadpool parameters
debug/client/mysql -uroot -e "SET GLOBAL lock_wait_
while [ 1 ]
do
debug/
sleep 0.5
done
# End of test case 2
Related branches
description: | updated |
summary: |
- Thread pool: new threads are created too slowly even with maximum tuning + Thread pool: new threads are created too slowly even with tuning |
Please take a look onthe description how thread scheduling is done in SQL Server. The best description is perhaps in Ken Hendersons "internals" book, but there is some information spread around on the internet too (google for SQLServer, ums, scheduling). www.vineetgupta .com/2005/ 11/sql- server- non-preemptive- scheduling- ums-clr- and-fiber- mode/
For example, this one
http://
also describes how SQLServer schedules its tasks, whereby not in great level of details.
It is important to understand, zhat unlike OS scheduler which is preemptive, every pooling scheduler is non-preemptive (more or less, we allow some level of preemption). The objective of preemptive scheduler is giving every task a chance to execute. The objective of non-preemptive scheduler is maximizing performance. Quoting from the article above "job in life is to ensure that there is one unblocked thread executing on that CPU at any given point of time, and all other threads are blocked".
If we followed this rule we would never create an additional thread if there is already an active thread in the group. We do that however, even if it is at the speed which you do not like. We can reduce the stall limit, and we can probably talk about an additional QoS feature "max_queuing_time" parameter (which is I think what you ultimately would like to have - limiting queueing time for a request). But before that I would like to ensure the general objectives and techniques of thread pooling are understood and agreed upon.
Having said this, the example looks like specifically crafted thread pooling anti-pattern. It assumes never ending avalanche of long non-yielding CPU-intensive queries, and it assumes a rather questionably designed scheduled job (there is a better job scheduler in MySQL for such things btw:) . For such things I guess thread- per-connection would work better (and, even if you have threadpooling in MariaDB, you can have a separate thread- per-connection scheduler too, so you may decide according to your needs).