pt-kill kills prepared statements without checking busy-time

Bug #1016272 reported by Gavin Towey on 2012-06-21
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Percona Toolkit
Carlos Salguero

Bug Description

We're using pt-kill with the following options:
pt-kill --busy-time 120s --print --match-info "^(select|SELECT)" --interval 10

The idea is to only kill select queries running longer than 120 seconds.

However there's a problem with the logic that pt-kill uses to kill queries. If the Command is not equal to Query, then busy time is ignored, and other match parameters are free to take effect.

In this case, a prepared statement is in the "Execute" Command state, and running a SELECT in the Info part. It gets killed immediately by pt-kill every time.

The current assumption that only items in the processlist with Command=Query are "busy" But that isn't the case.

I think the right fix is to say that any query which is *not idle* is busy.

That is to change the current busy time check from:
if ( $find_spec{busy_time} && ($query->{Command} || '') eq 'Query' ) {

if ( $find_spec{busy_time} && ($query->{Command} || '') ne 'Sleep' ) {

However, I understand that this isn't a trivial change. The smallest change which would fix this issue is to add Execute to the if statement as well as Query, but that seems like it could leave the script vulnerable to similar issues.

The MySQL documentation in this case seems to agree with my case: that many queries not having Command=Query should actually be considered busy:

Gavin Towey (gtowey) wrote :

By the way, we can work around this by adding --ignore-command Execute, but it seems the default behavior does not work as intended.

affects: perconatools → percona-toolkit
tags: added: pt-kill
Changed in percona-toolkit:
status: New → Triaged
Baron Schwartz (baron-xaprb) wrote :

I believe you. I think the tool doesn't work intuitively in any case. I'm not sure whether we will fix this (since you have a workaround) or wait until we redesign/rebuild the tool entirely.

Max Bubenick (max-bubenick) wrote :

Hi, can we resume this bug?

Is still present, and with Gavin's workaround on comment #1 you can't kill queries on Execute state which is needed in several cases.

tags: added: rdba
Sergei Turchanov (turchanov) wrote :

This bug is still present and bites us very hard. It kills prepared statements with busy time less than was configured.

I made my own fix. To be on a safe side I added 6 additional states which correspond to prepared statements (COM_STMT_PREPARE, COM_STMT_EXECUTE, COM_STMT_SEND_LONG_DATA, COM_STMT_CLOSE, COM_STMT_RESET, COM_STMT_FETCH).

--- pt-kill.orig 2017-06-21 11:10:30.000000000 +1000
+++ 2017-06-22 11:34:32.000000000 +1000
@@ -3520,7 +3520,16 @@ sub find {
          next QUERY;

- if ( $find_spec{busy_time} && ($query->{Command} || '') eq 'Query' ) {
+ my %busy_commands = (
+ "Query" => 1,
+ "Execute" => 1,
+ "Prepare" => 1,
+ "Fetch" => 1,
+ "Long Data" => 1,
+ "Reset stmt" => 1,
+ "Close stmt" => 1,
+ );
+ if ( $find_spec{busy_time} && exists($busy_commands{$query->{Command} || ''}) ) {
          next QUERY unless defined($query->{Time});
          if ( $query->{Time} < $find_spec{busy_time} ) {
             PTDEBUG && _d("Query isn't running long enough");

I personally like "if ( $find_spec{busy_time} && ($query->{Command} || '') ne 'Sleep' ) {" variant as it looks much shorter.

But this is a very serious bug which must be addressed ASAP.

Changed in percona-toolkit:
importance: Undecided → Medium
assignee: nobody → Carlos Salguero (carlos-salguero)
tags: added: pt167
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers