pt-query-digest: a lot of information is removed when doing '--no-report --output'

Bug #1220341 reported by Kenny Gryp
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Confirmed
Undecided
Unassigned

Bug Description

In this particular case, I wanted to do some more complex groupby and ordering on a large slowlog which requires a lot of memory.
 In order to be able to do this without gigabytes of memory and hours of wait time, I wanted to filter out most of the data first and create a smaller slowlog
But unfortunately all that data is lost when doing --no-report --output slowlog

What is lost:
- SET TIMESTAMP
- Time microsecond precision
- Rows_affected
- InnoDB_trx_id
- hostname
- Rows_read
- Bytes_sent
and many many more :(

To reproduce, run:

echo "
# Time: 130828 0:00:00.540005
# User@Host: masked[masked] @ [172.29.126.132]
# Thread_id: 352095 Schema: masked Last_errno: 0 Killed: 0
# Query_time: 0.000208 Lock_time: 0.000072 Rows_sent: 0 Rows_examined: 0 Rows_affected: 1 Rows_read: 0
# Bytes_sent: 11 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: DF756D
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 5
SET timestamp=1377640800;
select 'we_love_daniel';
" | ./pt-query-digest --no-report --output slowlog

Then you will get:

# Time: 130828 0:00:00
# User@Host: masked[masked] @ []
# Thread_id: 352095
# Query_time: 0.000208 Lock_time: 0.000072 Rows_sent: 0 Rows_examined: 0
# QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 5
use masked;
select 'we_love_daniel'
;

Revision history for this message
Kenny Gryp (gryp) wrote :

percona@debian7:/data/percona$ ./pt-query-digest --version
pt-query-digest 2.2.4

Revision history for this message
Kenny Gryp (gryp) wrote :

Attached is a diff that fixes most of it:

This still remains:

-# Time: 130828 0:00:00.540005
+# Time: 130828 0:00:00
 # User@Host: masked[masked] @ [172.29.126.132]
 # Thread_id: 352095 Schema: masked Last_errno: 0 Killed: 0
 # Query_time: 0.000208 Lock_time: 0.000072 Rows_sent: 0 Rows_examined: 0 Rows_affected: 1 Rows_read: 0
@@ -9,5 +9,7 @@
 # InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
 # InnoDB_pages_distinct: 5
+use masked;
 SET timestamp=1377640800;
-select 'we_love_daniel';
+select 'we_love_daniel'
+;

Changed in percona-toolkit:
status: New → Confirmed
Revision history for this message
Tom Desp (tomdesp) wrote :

I'm facing the exact same problem.

I'm using pt-query-digest (v2.2.20) slowlog output format to filter out queries based on their timestamp, to run further digests on the filtered slowlog :

ptqd 11-21.slow.log --output slowlog --no-report --filter '($event->{timestamp} > 1483957800) && ($event->{timestamp} < 1483959600)' > 11-21.halfhour.log

The major point for me is the loss of the query cache hit info.

Original log:

# Time: 170109 12:08:43
# User@Host: pidf_user[pidf_user] @ [10.226.13.132]
# Thread_id: 32829731 Schema: webmbpidf QC_hit: No
# Query_time: 0.015387 Lock_time: 0.000038 Rows_sent: 0 Rows_examined: 0

Filtered log

# Time: 170109 11:59:59
# User@Host: pest_user[pest_user] @ 10.226.13.13 []
# Thread_id: 32805954
# Query_time: 0.000221 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1146

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.