pt-query-digest parses C-style comments incorrectly

Bug #1125144 reported by Luis Motta Campos
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

This report concerns pt-query-digest 2.1.1.
This is the command line I'm using: 'pt-query-digest --report slow.log'
The version of the server producing the slow query log: 5.1.56-rel12.7-log ((Percona Server (GPL), 12.7, Revision 237))

Luis Motta Campos (luismottacampos) wrote :

Sorry, there's no way to edit a bug description and I pressed the submit button too early.
The issue is that pt-query-digest should ignore certain types of C-style comments before it analyses the query for table and schema names; otherwise, whatever is in the comment can confuse that analysis.

The comment in this query was automatically generated by the Java Hibernate framework. It contains the SQL operation being attempted and the caller method's fully qualified class name, Java-style.
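
To illustrate the kind of confusion described above, here is a minimal Python sketch. It is not pt-query-digest's parser: `naive_tables` is a deliberately simplistic, hypothetical table finder, and both queries are invented for illustration (the query from the original slow log is not reproduced in this report). It only shows how a dotted, Java-style class name inside a comment can be picked up as if it were a table reference.

```python
import re

# Hypothetical queries, invented for this sketch. The second one carries a
# framework-style comment that mentions a fully qualified Java class name.
PLAIN = "select u.id from users u where u.id = 1"
COMMENTED = "/* select u from com.example.User u */ " + PLAIN

# Deliberately naive: treat whatever follows FROM or JOIN as a table name.
# This is an assumption for illustration, NOT how pt-query-digest works.
TABLE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def naive_tables(sql):
    """Return every identifier that follows FROM or JOIN, comments included."""
    return TABLE_RE.findall(sql)


print(naive_tables(PLAIN))      # ['users']
print(naive_tables(COMMENTED))  # ['com.example.User', 'users']
```

With the comment present, the class name is reported alongside the real table, which is why the comment has to be dealt with before table and schema names are extracted.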

Cees de Groot (casedeg) wrote :

Something like the following could do the trick. However, I'm not sure how it would impact the tool in general (quick, ugly diff against the installed script, but the location should be clear).

*** /usr/local/bin/pt-query-digest 2013-02-15 13:13:10.000000000 +0100
--- /tmp/patched 2013-02-15 13:12:37.000000000 +0100
***************
*** 4907,4912 ****
--- 4907,4915 ----
           else {
              PTDEBUG && _d("Got the query/arg line");
              my $arg = substr($stmt, $pos - length($line));
+ # Strip C-style comments
+ $arg =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
+ PTDEBUG && _d('Clean query = ', $arg);
              push @properties, 'arg', $arg, 'bytes', length($arg);
              if ( $args{misc} && $args{misc}->{embed}
                 && ( my ($e) = $arg =~ m/($args{misc}->{embed})/)
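
For readers who want to try the pattern outside the tool, the following is a Python port of the substitution in the diff above, offered as a sketch rather than as part of any fix. The function name `strip_c_comments` is an assumption of this example. The idea is the same as in the Perl one-liner: each match is either a complete `/* ... */` comment, which is dropped, or a quoted string or other run of text, which is captured and kept, so comment-like text inside string literals survives.

```python
import re

# Either a whole C-style comment (no capture) or a piece of text to keep
# (group 1). re.DOTALL mirrors the /s flag of the Perl substitution.
COMMENT_OR_TEXT = re.compile(
    r"""/\*[^*]*\*+(?:[^/*][^*]*\*+)*/   # a complete C-style comment
      | (                                # group 1: text to keep
          "(?:\\.|[^"\\])*"              # double-quoted string
        | '(?:\\.|[^'\\])*'              # single-quoted string
        | .[^/"'\\]*                     # any other run of characters
        )""",
    re.DOTALL | re.VERBOSE,
)


def strip_c_comments(sql):
    """Remove /* ... */ comments, leaving quoted strings untouched."""
    # Group 1 is None when the match was a comment, so it becomes "".
    return COMMENT_OR_TEXT.sub(lambda m: m.group(1) or "", sql)


print(repr(strip_c_comments("/* load com.example.User */ select * from users")))
# ' select * from users'
print(repr(strip_c_comments("select '/* kept */' from t /* dropped */")))
# "select '/* kept */' from t "
```

One caveat: this pattern removes every C-style comment, including MySQL's `/*! ... */` form, which the server treats as executable SQL. That may be one reason to strip only certain kinds of comments rather than all of them.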

Daniel Nichter (daniel-nichter) wrote :

pqd does attempt to handle /* comments */, but in this case it gets it wrong.

Changed in percona-toolkit:
status: New → Triaged
summary: - pt-query-digest wrongly parses queries with C-style comments from the
- slow query log
+ pt-query-digest parses C-style comments incorrectly
tags: added: pt-query-digest slow-log
Changed in percona-toolkit:
milestone: none → 2.2.5
importance: Undecided → Medium
Changed in percona-toolkit:
milestone: 2.2.5 → none
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this report has been migrated to: https://jira.percona.com/browse/PT-602
