pt-query-digest doesn't understand hyphen-padded general log thread ids

Bug #1655689 reported by Sean Purdy
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Confirmed
Undecided
Unassigned

Bug Description

pt-query-digest assumes thread ids are all numeric but thread ids less than 10 digits long seem to be padded with hyphens in general log, at least in this version of mysql (5.5.31)

pt-query-digest version 2.2.20 (and previous)

From general log:

/usr/sbin/mysqld, Version: 5.5.31-0+wheezy1-log ((Debian)). started with:
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
Time Id Command Argument
170110 16:40:38 -937767034 Query select sleep(60 * 10)

Running "pt-query-digest --type genlog" just gives output of:

# No events processed.

debug log shows:

# GeneralLogParser:9883 29278 -975513747 Query select sleep(60 * 10)
#
# GeneralLogParser:9886 29278 Not start of general log event

To actually get something processed, I patched the script to remove the hyphens:

*** pt-query-digest 2017-01-11 15:24:53.121133702 +0000
--- pt-query-digest.new 2017-01-11 15:24:53.117133756 +0000
***************
*** 9855,9861 ****
     \A
     (?:(\d{6}\s+\d{1,2}:\d\d:\d\d|\d{4}-\d{1,2}-\d{1,2}T\d\d:\d\d:\d\d\.\d+Z))? # Timestamp
     \s+
! (?:\s*(\d+)) # Thread ID
     \s
     (\w+) # Command
     \s+
--- 9855,9861 ----
     \A
     (?:(\d{6}\s+\d{1,2}:\d\d:\d\d|\d{4}-\d{1,2}-\d{1,2}T\d\d:\d\d:\d\d\.\d+Z))? # Timestamp
     \s+
! (?:\s*([-\d]+)) # Thread ID
     \s
     (\w+) # Command
     \s+
***************
*** 9886,9891 ****
--- 9886,9892 ----
           PTDEBUG && _d('Not start of general log event');
           next;
        }
+ $thread_id =~ s/^-+//;
        my @properties = ('pos_in_log', $pos_in_log, 'ts', $ts,
           'Thread_id', $thread_id);

This worked - here's the debug output:

# Pipeline:11581 31475 Pipeline process GeneralLogParser
# GeneralLogParser:9883 31475 -975513747 Query select sleep(60 * 10)
#
# GeneralLogParser:9904 31475 Event done
# GeneralLogParser:9959 31475 Properties of event: $VAR1 = [
# 'pos_in_log',
# 208,
# 'ts',
# undef,
# 'Thread_id',
# '975513747',
# 'cmd',
# 'Query',
# 'arg',
# 'select sleep(60 * 10)',
# 'bytes',
# 21,
# 'Query_time',
# 0
# ];

Sean Purdy

Revision history for this message
Sean Purdy (spurdycvl) wrote :
Revision history for this message
Nickolay Ihalainen (ihanick) wrote :

Thread Id in general log printed as a signed number:

https://github.com/percona/percona-server/blob/5.5/sql/log.cc#L2743-L2744
      /* command_type, thread_id */
      length= my_snprintf(buff, 32, "%5ld ", (long) thread_id);

(With sizeof(long) == 4, it's enough to have 1000 connections per second and one month uptime to see negative numbers).

But pt-query-digest assuming thread_id as unsigned integer and ignoring general log entries with negative Thread Id

Changed in percona-toolkit:
status: New → Confirmed
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1407

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.