Comment 6 for bug 1221608

Roel Van de Paar (roel11) wrote :

I think we have located the cause of this, and it looks like it was related to the RQG yy (SQL) grammar:

PAST: SET GLOBAL thread_pool_max_threads = zero_to_thousand
NOW: SET GLOBAL thread_pool_max_threads = hundred_to_thousand

zero_to_thousand:
        0 | 1 | 2 | 10 | 100 | 150 | 200 | 250 | 300 | 400 | 500 | 600 | 650 | 700 | 800 | 900 | 999 | 1000 ;

hundred_to_thousand:
        100 | 150 | 200 | 250 | 300 | 400 | 500 | 600 | 650 | 700 | 800 | 900 | 999 | 1000 ;

If the issue is seen again, it has a different cause (i.e. a minimum of 100 should be plenty, given a maximum of --threads=25 in RQG).
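To make the reasoning behind the grammar change concrete, here is a minimal Python sketch that lists which values from each rule fall below the maximum RQG client thread count. The helper name is hypothetical, and the premise that a thread_pool_max_threads cap below the number of client threads is what triggers the stall is an assumption drawn from this comment, not a verified condition.

```python
# Values copied from the two grammar rules quoted above.
ZERO_TO_THOUSAND = [0, 1, 2, 10, 100, 150, 200, 250, 300, 400, 500,
                    600, 650, 700, 800, 900, 999, 1000]
HUNDRED_TO_THOUSAND = [100, 150, 200, 250, 300, 400, 500,
                       600, 650, 700, 800, 900, 999, 1000]

# Maximum --threads value used in RQG, as stated in this comment.
RQG_MAX_THREADS = 25


def values_below_client_count(values, client_threads=RQG_MAX_THREADS):
    """Hypothetical helper: return the settings smaller than the client thread count.

    Assumption: such settings are the ones that could starve the thread pool.
    """
    return [v for v in values if v < client_threads]


print(values_below_client_count(ZERO_TO_THOUSAND))     # [0, 1, 2, 10]
print(values_below_client_count(HUNDRED_TO_THOUSAND))  # []
```

Under that assumption, only the old rule could produce a cap lower than the number of RQG clients, which is consistent with the grammar change above.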

However, even if this is the cause, the problem remains that mysqld completely freezes/locks up and there is nothing one can do with it. The last message shown is "Threadpool has been blocked for 30 seconds", and then it just sits there (with the one possible exception that shutdown works correctly, as per the log above, but this would need re-verification).

If this is how the threadpool locks up the server, then at the very least the error message should be repeated regularly, so that it is clearer what is happening. Or maybe another timeout of some sort would be an idea?
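As a rough illustration of the "repeat the message regularly" idea, the Python sketch below computes when a repeated warning would fire during a stall. This is not how the server implements its stall check. The function name and the 30-second repeat interval are assumptions, and only the message wording is taken from the log line quoted above.

```python
def stall_warning_times(blocked_seconds, interval=30):
    """Hypothetical helper: elapsed-second marks at which a warning would repeat."""
    return list(range(interval, blocked_seconds + 1, interval))


# For a stall lasting 95 seconds, a warning would be logged at 30, 60 and 90 seconds.
for mark in stall_warning_times(95):
    print(f"Threadpool has been blocked for {mark} seconds")
```

With repeated messages like these, anyone watching the error log could tell that the stall is ongoing rather than a one-off event.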

There is another oddity: as per chats with Laurynas, when 'thread apply all bt' is executed in gdb against a server hanging like this, only a single thread is shown. This raises an important question: is the server still actually doing anything? (I.e. are the "other" live threads still alive?) Can the development team look into adding an MTR testcase for this? It should be relatively easy with a low thread_pool_max_threads setting plus a larger number of executing threads, combined with DEBUG_SYNC if required.

Since the answer to that last question would be critical if true (i.e. the server is not processing anything in the locked-up state), I will mark this bug as critical and 56qual until we can prove otherwise.