pt-query-digest doesn't properly aggregate by db when using --output=json

Bug #1420064 reported by Evan Klitzke
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Triaged
Undecided
Unassigned

Bug Description

I'm using pt-query-digest on a MySQL instance with multiple logical databases. When I run pt-query-digest in the normal (text based) output mode, for common queries like "COMMIT" that appear across I see them properly reflected in the output:

# Query 5: 142.78 QPS, 0.07x concurrency, ID 0x813031B8BBC3B329 at byte 18549436
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2015-02-04 22:05:15 to 22:05:24
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 6 1285
# Exec time 3 658ms 16us 16ms 512us 839us 770us 596us
# Lock time 0 0 0 0 0 0 0 0
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 0 0 0 0 0 0 0 0
# Rows affecte 0 0 0 0 0 0 0 0
# Bytes sent 0 13.80k 11 11 11 11 0 11
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 7.53k 6 6 6 6 0 6
# String:
# Databases sentinel (889/69%), locket (94/7%)... 11 more
# Hosts 10.30.33.27 (199/15%), 10.30.27.22 (195/15%)... 55 more
# Last errno 0
# Users sentinel_r... (889/69%), locket_rw (94/7%)... 11 more
# Query_time distribution
# 1us
# 10us ################################
# 100us ################################################################
# 1ms ##
# 10ms #
# 100ms
# 1s
# 10s+
commit\G

As you can see above, several databases are listed for COMMIT. All fine and good. When I try with --output=json though the information is lost (output is shown with a bunch of stuff trimmed):

pt-query-digest --output=json tmp-slow-9fdef21f.log | python -m json.tool | less

        {
            "attribute": "fingerprint",
            "checksum": "813031B8BBC3B329",
            "distillate": "COMMIT",
            "metrics": {
                "db": {
                    "value": "vorenus"
                },
                "host": {
                    "value": "10.31.4.95"
                },
                "user": {
                    "value": "vorenus_rw"
                }
            },
            "query_count": 1285,
            "ts_max": "2015-02-04 22:05:24",
            "ts_min": "2015-02-04 22:05:15"
        }

I'd like to ideally get the data broken out by logical database, or if that's not possible/difficult then at least I'd like to see that the query affects multiple databases.

Revision history for this message
Frank Cizmich (frank-cizmich) wrote :

There's the --group-by option, where you can tell pt-q-d to group by fingerprint and db.
Unfortunately it seems to break the --output=json option.

Changed in percona-toolkit:
status: New → Triaged
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1265

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.