unpredictable log_slow_filter behaviour

Bug #1030783 reported by kernel on 2012-07-30
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Medium
Laurynas Biveinis
5.1
Invalid
Medium
Unassigned
5.5
Fix Released
Medium
Laurynas Biveinis

Bug Description

log_slow_filter behaviour is a total mess in 5.5.25a-27.1. Tested on debian squeeze package + compiled from src on another machine. Slow log configuration is in attached file. During testing I've manipulated local log_slow_filter var, global remained intact with value as shown in attached file.

Test query plan:
# Query_time: 5.859329 Lock_time: 0.000176 Rows_sent: 393 Rows_examined: 1869321 Rows_affected: 0 Rows_read: 393
# Bytes_sent: 35898 Tmp_tables: 1 Tmp_disk_tables: 0 Tmp_table_sizes: 127040
# InnoDB_trx_id: 3807
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: Yes Tmp_table_on_disk: No
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0

*** full_scan,full_join,tmp_table_on_disk,filesort_on_disk
logged despite not mathed

*** full_scan
logged despite not matched

*** full_join
not logged, as expected.
however, another statement with plan
# QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: No
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0
was logged while shouldn't

*** tmp_table
not logged despite matched

*** tmp_table_on_disk
logged despite not matched

*** filesort
not logged despite matched

*** filesort_on_disk
logged despite not matched

Also percona_log_slow_filter testcase has code for only one filter value - full_join, others should be added. And "negative" (that should NOT be logged) statements should be tested as well.

Related branches

Confirmed that it is broken.

Few instances:

1. If log_slow_filter="full_scan" the query is logged even if it is not a full_scan

This is due to a bug:

a)sys_vars.cc:

const char *log_slow_filter_name[]= { "qc_miss", "full_scan", "full_join",
                                      "tmp_table", "tmp_table_on_disk", "filesort", "filesort_on_disk", 0};

b)sql_class.h:

#define QPLAN_NONE 0
#define QPLAN_QC 1 << 0
#define QPLAN_QC_NO 1 << 1
#define QPLAN_FULL_SCAN 1 << 2
#define QPLAN_FULL_JOIN 1 << 3
#define QPLAN_TMP_TABLE 1 << 4
#define QPLAN_TMP_DISK 1 << 5
#define QPLAN_FILESORT 1 << 6
#define QPLAN_FILESORT_DISK 1 << 7

c)
  /* Follow the slow log filter configuration. */
  if (thd->variables.log_slow_filter != 0 &&
      (!(thd->variables.log_slow_filter & thd->query_plan_flags) ||
       ((thd->variables.log_slow_filter & SLOG_F_QC_NO) &&
        (thd->query_plan_flags & QPLAN_QC))))
    DBUG_VOID_RETURN;

So, as you can see the #defines and the set variables are not aligned.

For instance, for a non-table scan simple query with query cache miss thd->query_flags=2 (QPLAN_QC_NO)

But in the set, full_scan = 2 and qc_miss=1,
so

thd->variables.log_slow_filter & thd->query_plan_flags = 2 & 2 => !(2 & 2) is false and the query is NOT filtered out.

The fix is simple:

===========================================================
--- sql/sql_class.h 2012-07-17 22:27:15.000000000 +0530
+++ /tmp/sql_class.h 2012-08-06 04:16:58.041077338 +0530
@@ -80,14 +80,14 @@
   SLOG_RT_SESSION, SLOG_RT_QUERY
 };
 #define QPLAN_NONE 0
-#define QPLAN_QC 1 << 0
-#define QPLAN_QC_NO 1 << 1
-#define QPLAN_FULL_SCAN 1 << 2
-#define QPLAN_FULL_JOIN 1 << 3
-#define QPLAN_TMP_TABLE 1 << 4
-#define QPLAN_TMP_DISK 1 << 5
-#define QPLAN_FILESORT 1 << 6
-#define QPLAN_FILESORT_DISK 1 << 7
+#define QPLAN_QC_NO 1 << 0
+#define QPLAN_FULL_SCAN 1 << 1
+#define QPLAN_FULL_JOIN 1 << 2
+#define QPLAN_TMP_TABLE 1 << 3
+#define QPLAN_TMP_DISK 1 << 4
+#define QPLAN_FILESORT 1 << 5
+#define QPLAN_FILESORT_DISK 1 << 6
+#define QPLAN_QC 1 << 7
 enum enum_log_slow_filter {
   SLOG_F_QC_NO, SLOG_F_FULL_SCAN, SLOG_F_FULL_JOIN,
   SLOG_F_TMP_TABLE, SLOG_F_TMP_DISK, SLOG_F_FILESORT,

--- sql/sql_parse.cc 2012-07-17 22:27:15.000000000 +0530
+++ /tmp/sql_parse.cc 2012-08-06 04:26:48.177788229 +0530
@@ -1534,7 +1534,7 @@
   /* Follow the slow log filter configuration. */
   if (thd->variables.log_slow_filter != 0 &&
       (!(thd->variables.log_slow_filter & thd->query_plan_flags) ||
- ((thd->variables.log_slow_filter & SLOG_F_QC_NO) &&
+ ((thd->variables.log_slow_filter & QPLAN_QC_NO) &&
         (thd->query_plan_flags & QPLAN_QC))))
     DBUG_VOID_RETURN;

=================================================================

(The enum enum_log_slow_filter is also not required and can be removed)

2. If log_filter is anything that is not full_scan, the query is not logged (even if it doesn't belong in any categories above) for same reason as above.

Instead of all those changes, this will also do:

>>bzr diff Percona-Server/sql/sql_parse.cc
=== modified file 'Percona-Server/sql/sql_parse.cc'
--- Percona-Server/sql/sql_parse.cc 2012-08-13 13:42:13 +0000
+++ Percona-Server/sql/sql_parse.cc 2012-08-31 03:56:00 +0000
@@ -1535,7 +1535,7 @@
   /* Follow the slow log filter configuration. */
   if (thd->variables.log_slow_filter != 0 &&
       (!(thd->variables.log_slow_filter & thd->query_plan_flags) ||
- ((thd->variables.log_slow_filter & SLOG_F_QC_NO) &&
+ ((thd->variables.log_slow_filter & (1 << SLOG_F_QC_NO)) &&
         (thd->query_plan_flags & QPLAN_QC))))
     DBUG_VOID_RETURN;

Changed in percona-server:
status: New → Confirmed
Brian Moon (brianlmoon) wrote :

Any update on this? It is hindering my ability to do some query analysis. Thanks.

Stewart Smith (stewart) on 2012-12-10
Changed in percona-server:
importance: Undecided → Medium
status: Confirmed → Triaged
tags: added: slow-extended
George Ormond Lorch III (gl-az) wrote :

Having reviewed the code I can confirm that the full_scan, full_join, tmp_table, tmp_table_on_disk, filesort and filesort_on_disk have absolutely no code implementing the filtering desired. Only the qc_miss is implemented and currently has a fix similar to the one Raghavendra mentioned above.

George Ormond Lorch III (gl-az) wrote :

Actually, that is incorrect, there is code to support it but it was originally incorrect as Raghavendra pointed out above in that it is based on a historic misalignment between QPLAN_* and enum_slow_log_filter. This seems to have been fixed by Laurynas in 02/2013 and this bug was never marked as fixed. There are test cases for these now in t/percona_log_slow_query_plan.test.

no longer affects: percona-server/5.6
Changed in percona-server:
status: Triaged → Fix Released
assignee: nobody → Laurynas Biveinis (laurynas-biveinis)

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-147

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers