pt-diskstats shows incorrect wr_mb_s

Bug #952727 reported by Vadim Tkachenko
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
Percona Toolkit (moved to https://jira.percona.com/projects/PT) | Fix Released | Medium | Baron Schwartz |
2.0 | Fix Released | Medium | Baron Schwartz |
2.1 | Fix Released | Medium | Baron Schwartz |

Bug Description

iostat -dxm 5 shows:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 7765.00 0.00 121.32 32.00 0.00 4.02 0.00 0.00
fiob 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.20 0.00 0.60 0.00 0.00 10.67 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 7540.40 0.00 117.82 32.00 0.00 4.00 0.00 0.00
fiob 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 7500.20 0.00 117.18 32.00 0.00 3.98 0.00 0.00
fiob 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.20 0.00 0.60 0.00 0.00 10.67 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 7452.80 0.00 116.44 32.00 20.08 4.03 0.02 15.94
fiob 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 6.20 0.00 0.80 0.20 0.03 0.00 57.60 0.03 31.80 22.00 2.20
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
fioa 0.00 0.00 0.00 7450.40 0.00 116.41 32.00 20.22 4.02 0.02 15.92
fiob 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

pt-diskstats shows:
  #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg io_s qtime stime
{160} fioa 0.1 4.0 0.0 0% 0.0 26.6 30924.6 6.0 181.5 0% 777.5 25.1 47% 0 30924.6 23.3 0.0
{161} fioa 0.1 4.0 0.0 0% 0.0 26.6 30778.5 6.0 181.1 0% 772.9 25.1 47% 0 30778.6 23.2 0.0
{162} fioa 0.1 4.0 0.0 0% 0.0 26.6 30634.3 6.0 180.7 0% 768.3 25.1 46% 0 30634.4 23.2 0.0
{163} fioa 0.1 4.0 0.0 0% 0.0 26.6 30492.0 6.1 180.3 0% 763.7 25.0 46% 0 30492.1 23.1 0.0
{164} fioa 0.1 4.0 0.0 0% 0.0 26.6 30352.7 6.1 179.9 0% 759.2 25.0 46% 0 30352.8 23.1 0.0

while sysbench is:
[ 280s] reads: 0.00 MB/s writes: 120.01 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 290s] reads: 0.00 MB/s writes: 116.24 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 300s] reads: 0.00 MB/s writes: 121.96 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 310s] reads: 0.00 MB/s writes: 118.33 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 320s] reads: 0.00 MB/s writes: 114.20 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 330s] reads: 0.00 MB/s writes: 118.77 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)
[ 340s] reads: 0.00 MB/s writes: 119.28 MB/s fsyncs: 0.00/s response time: 0.045ms (95%)

So the device is doing close to 120 MB/s, while pt-diskstats shows about 180 MB/s.

I am using async I/O mode in sysbench.

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

I have the impression that pt-diskstats shows stats (at least wr_mb_s) as an average since timestamp #1.
That does not make much sense: while pt-diskstats is running I may change the workload significantly, and I am interested in seeing the actual current stats, not averages from some earlier point in time.
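
The difference between the two calculations can be illustrated with a minimal Python sketch. It is not pt-diskstats code: the workload (240 MB/s for 10 seconds, then 120 MB/s), the sample layout, and the function names are assumptions made up for this example. The only fact taken from the kernel is that /proc/diskstats counts sectors of 512 bytes.

```python
SECTOR_BYTES = 512          # /proc/diskstats reports sectors of 512 bytes
MB = 1024 * 1024


def write_mb_per_s(older, newer):
    """Average write rate between two (timestamp_s, sectors_written) samples."""
    elapsed = newer[0] - older[0]
    if elapsed <= 0:
        return 0.0
    return (newer[1] - older[1]) * SECTOR_BYTES / MB / elapsed


def make_samples():
    """Hypothetical workload: 240 MB/s for 10 s, then 120 MB/s for 10 s."""
    samples = [(0, 0)]
    for t in range(1, 21):
        rate_mb = 240 if t <= 10 else 120
        sectors = samples[-1][1] + rate_mb * MB // SECTOR_BYTES
        samples.append((t, sectors))
    return samples


samples = make_samples()
first = samples[0]

# After the workload drops, compare both ways of computing the rate.
print("t  vs_first  vs_previous")
for prev, cur in zip(samples[10:], samples[11:]):
    print(cur[0],
          round(write_mb_per_s(first, cur), 1),
          round(write_mb_per_s(prev, cur), 1))
```

In this made-up workload the "vs_previous" column drops to 120 as soon as the workload changes, while the "vs_first" column only drifts down slowly toward the long-run average. That slow drift resembles the gradually declining wr_mb_s values in the pt-diskstats output above.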

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Brian, please investigate -- it should be easy to reproduce with a sample of 3 snapshots of /proc/diskstats. Just take 2 samples and repeat the last sample, and if it outputs more than 0 for the second line in the result, we have a bug.
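
The check described above can be sketched in Python as follows. This is an illustration only, not the tool's actual logic: the snapshot lines are invented (including the device major/minor numbers), and the helper names `sectors_written` and `deltas` are hypothetical. The field position of the written-sectors counter follows the standard /proc/diskstats layout.

```python
def sectors_written(snapshot, device):
    """Return the sectors-written counter for a device in one snapshot.

    In /proc/diskstats the counter is the 10th whitespace-separated field
    (index 9), after major, minor, and the device name.
    """
    for line in snapshot.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(device)


def deltas(snapshots, device, baseline):
    """Sectors written per output line, relative to the chosen baseline.

    baseline="previous" compares each snapshot with the one before it;
    baseline="first" compares each snapshot with the very first one.
    """
    values = [sectors_written(s, device) for s in snapshots]
    result = []
    for i in range(1, len(values)):
        reference = values[0] if baseline == "first" else values[i - 1]
        result.append(values[i] - reference)
    return result


# Invented snapshot lines for illustration; the third repeats the second.
SNAP_1 = "252 0 fioa 10 0 80 5 1000 0 32000 400 0 300 405"
SNAP_2 = "252 0 fioa 10 0 80 5 2000 0 64000 800 0 600 805"
snapshots = [SNAP_1, SNAP_2, SNAP_2]

print(deltas(snapshots, "fioa", "previous"))  # second entry is 0: no new I/O
print(deltas(snapshots, "fioa", "first"))     # second entry is non-zero
```

With a previous-sample baseline the repeated snapshot produces a zero on the second line. With a first-sample baseline it does not, which is the symptom this test is meant to expose.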

Changed in percona-toolkit:
assignee: nobody → Brian Fraser (fraserbn)
Revision history for this message
Brian Fraser (fraserbn) wrote :

Just so that this doesn't fall through the cracks, Baron and I talked about this on IRC a couple of days ago. Vadim is spot on: with --group-by disk, the tool displays the difference between the current sample and the first one. For reference, grouping by sample and by all both compare the current sample against the previous one.

That was the intended behavior when it was designed, but obviously it's not working out for everyone. The bare minimum would be to document this, but beyond that, what else? Add an option to choose what it compares against? And if we go down that road, just first/previous, or first/previous/number? The latter sounds useful, but also sounds an awful lot like overengineering.

Changed in percona-toolkit:
status: New → Triaged
Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

This is actually something that got lost in translation from the old tool to the new one. I believe if you look at the old tool you'll see that by default, it does group-by-disk, but only over the last interval of time as it continues to sample new snapshots and print them out. If you changed the mode or something, it would indeed reach back to sample 0 and print out a line representing the differences from $start to $now, but then as it began iterating forward again, it would be $now1 - $now, $now2 - $now1, and so on.

When the tool is working on a static file, on the other hand, group-by-disk always groups the whole thing, from sample 0 to sample N, into one line per disk.

So this is a case where we need a special behavior. It's difficult to explain, specify, and document, but it's the behavior people will expect by default. It's also subtle, which is why it wasn't noticed right away.

If we change the default group-by to "all," will that solve the problem immediately?

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Actually I believe --group-by=all does produce the expected behavior with no further ado. It also produces something reasonable when given a sample file.

Changed in percona-toolkit:
status: Triaged → Confirmed
importance: Undecided → Medium
Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

Guys, I am lost. What kind of input do you need from me?

I personally have never needed an average value calculated from the first measurement.
I always use last minus previous.

For me it would be helpful to get that number by default. I filtered by disk only to show the information relevant to the disk I am interested in.

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

My last couple of comments were really directed at Brian. If you use --group-by=all with this tool, it will do what you expect. We can change this option's default value in the next release so it works as you expect.

So, Vadim, we don't need any more info. Your workaround for now is --group-by=all.

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

This bug was supposed to be fixed in 2.1.1 but the branch was never merged. Launchpad seems to have confused us.

tags: added: pt-diskstats wrong-output
Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

The fix will be released in a couple of days.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-497
