broken fingerprinting of specific INSERT/REPLACE queries
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Percona Toolkit moved to https://jira.percona.com/projects/PT |
Won't Fix
|
Medium
|
Frank Cizmich |
Bug Description
When fingerprinting an INSERT/REPLACE query that contains some specific SQL fragments within a string parameter, the resulting fingerprint is invalid.
$ pt-fingerprint --version
= > pt-fingerprint 2.2.6
$ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1),(2)')"
=> replace into a values('insert into foo values(?+)
Notice the single opening quote and improperly normalized string value. The Insert statement in this context is just a string.
Expected output:
=> replace into a values(?+)
More context:
* If the string only contains a single VALUES tuple, the normalization appears to be fine:
$ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1)')"
=> replace into a values(?+)
Some operational context how this bug bubbled up:
We were running pt-query-digest on a MySQL server, tailing its slowlog (with long_query_time=0). pt-query-digest was logging query statistics into a mysql database on that same server. As a result, all queries run by pt-query-digest were also fed back into pt-query-digest.
Now, when a client ran a query "INSERT INTO foo VALUES (1),(2)", the query would be logged by pt-query-digest into the query history table, along with a sample of said query. ("replace into `percona`
This led to a self-reinforcing feedback loop where new "unique" fingerprints were created recursively. pt-query-digest ballooned to consume 27G within one hour. The slow query log -- normally growing at 1G/hr -- grew at a rate of 40G/hr. The outbound network interface was at a 1Gbit/s ceiling, streaming all the new unique queries to all connected slaves.
Obviously, it is bad practice to save monitoring data onto the same cluster as production data, but this is what happened.
Related branches
- David Bennett (community): Needs Fixing
- Daniel Nichter: Pending requested
-
Diff: 434 lines (+103/-34)8 files modifiedbin/pt-fingerprint (+18/-6)
bin/pt-index-usage (+18/-6)
bin/pt-kill (+18/-6)
bin/pt-query-digest (+2/-2)
bin/pt-table-usage (+18/-6)
bin/pt-upgrade (+18/-6)
lib/QueryRewriter.pm (+2/-2)
t/lib/QueryRewriter.t (+9/-0)
summary: |
- broken fingerprinting of specific REPLACE queries + broken fingerprinting of specific INSERT/REPLACE queries |
description: | updated |
tags: | added: pt-fingerprint |
Changed in percona-toolkit: | |
importance: | Undecided → Medium |
status: | New → In Progress |
milestone: | none → 2.2.8 |
assignee: | nobody → Frank Cizmich (frank-cizmich) |
Some triaging:
The diff at the end of the comment is for a test case that reproduces this issue. It appears that the following optimization from QueryRewriter- >fingerprint( ) is a little too greedy and chops off the VALUES incorrectly. When this block is removed, the test succeeds (with some other tests failing, of course).
# mysqldump's INSERT statements will have long values() lists, don't waste :INSERT| REPLACE) (?: IGNORE) ?\s+INTO. +?VALUES\ s*\(.*? \))\s*, \s*\(/is ) {
# time on them... they also tend to segfault Perl on some machines when you
# get to the "# Collapse IN() and VALUES() lists" regex below!
if ( my ($beginning) = $query =~ m/\A((?
$query = $beginning; # Shorten multi-value INSERT statements ASAP
}
=== modified file 't/lib/ QueryRewriter. t' iter.t 2013-09-27 02:16:19 +0000 iter.t 2014-03-09 00:15:46 +0000
--- t/lib/QueryRewr
+++ t/lib/QueryRewr
@@ -247,6 +247,12 @@
);
is( t('insert into foo values( "(1),(2) ")'), >fingerprint( 'select * from foo limit 5'),
+ $qr->fingerprin
+ 'insert into foo values(?+)',
+ 'not confused by parentheses inside a quoted string',
+);
+
+is(
$qr-
'select * from foo limit ?',
'limit alone',