pt-diskstats computes wrong values for md0

Bug #897029 reported by Baron Schwartz
Affects: Percona Toolkit moved to https://jira.percona.com/projects/PT
Status: Fix Released
Importance: Medium
Assigned to: Brian Fraser

Bug Description

The following command-line:

[baron@ginger collected]$ pt-diskstats -g disk -c . *diskstats | cat

Produces a "md0" device line like this:

  #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg
[snip]
 {29} md0 0.5 8.0 0.0 0% 0.0 0.0 2505.0 8.0 9.8 0% 0.0 0.0 0% 0

That is, about 2500 writes per second at 8 KB each are reported as only 9.8 MB/s of writes, which is roughly half of what those two figures imply. Something's wrong here. I will attach the input file.
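
The mismatch can be checked with a few lines of arithmetic. This is only a sketch in Python using the three values from the md0 line above; it assumes 1 MB = 1024 KB, which may not be exactly how pt-diskstats rounds its columns.

```python
# Values copied from the md0 line in the report.
wr_s = 2505.0     # writes per second (wr_s column)
wr_avkb = 8.0     # reported average write size in KB (wr_avkb column)
wr_mb_s = 9.8     # reported write throughput in MB/s (wr_mb_s column)

# Assumption: 1 MB = 1024 KB.
expected_mb_s = wr_s * wr_avkb / 1024   # throughput implied by wr_s and wr_avkb
implied_avkb = wr_mb_s * 1024 / wr_s    # write size implied by wr_s and wr_mb_s

print(f"expected {expected_mb_s:.1f} MB/s, reported {wr_mb_s} MB/s")
print(f"implied average write size: {implied_avkb:.1f} KB")
```

Under that assumption the reported throughput corresponds to an average write of about 4 KB rather than 8 KB, so the columns disagree by a factor of two.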

Baron Schwartz (baron-xaprb) wrote :

Brian Fraser (fraserbn) wrote :

Ah. That's unexpected. /usr/src/linux/Documentation/iostats.txt says this:

"Here are examples of these different formats:
[...]
2.6 diskstats:
   3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
   3 1 hda1 35486 38030 38030 38030
"

and later:

"All merges and timings now happen
at the disk level rather than at both the disk and partition level as
in 2.4. Consequently, you'll see a different statistics output on 2.6 for
partitions from that for disks. There are only *four* fields available
for partitions on 2.6 machines. This is reflected in the examples above.

Field 1 -- # of reads issued
    This is the total number of reads issued to this partition.
Field 2 -- # of sectors read
    This is the total number of sectors requested to be read from this
    partition.
Field 3 -- # of writes issued
    This is the total number of writes issued to this partition.
Field 4 -- # of sectors written
    This is the total number of sectors requested to be written to
    this partition."

pt-diskstats currently handles only the first (full disk) format: it assumes that any line with a device name is in the disk format. When it meets a partition-shaped line, it stores the number of sectors read as the number of reads merged, and so on, which is where the wrong values come from.
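
To illustrate the mix-up, here is a rough Python sketch (not pt-diskstats' actual code). The field names are made up for this example, following the kernel documentation quoted above, and the sample md0 line is hypothetical. `naive_parse` always assumes the eleven-field disk layout, while `parse_line` picks the layout from the number of counters.

```python
# Hypothetical field names; pt-diskstats uses its own internal names.
DISK_FIELDS = [
    "reads", "reads_merged", "sectors_read", "ms_reading",
    "writes", "writes_merged", "sectors_written", "ms_writing",
    "ios_in_progress", "ms_doing_io", "weighted_ms_doing_io",
]
PARTITION_FIELDS = ["reads", "sectors_read", "writes", "sectors_written"]


def naive_parse(line):
    """Mimic the bug: always map the counters onto the disk layout."""
    _major, _minor, device, *counters = line.split()
    return device, dict(zip(DISK_FIELDS, (int(c) for c in counters)))


def parse_line(line):
    """Choose the layout from the number of counters on the line."""
    _major, _minor, device, *counters = line.split()
    values = [int(c) for c in counters]
    if len(values) == len(DISK_FIELDS):
        names = DISK_FIELDS
    elif len(values) == len(PARTITION_FIELDS):
        names = PARTITION_FIELDS
    else:
        raise ValueError(f"unexpected number of counters: {len(values)}")
    return device, dict(zip(names, values))


# Hypothetical short-format line, not taken from the attached input file.
SAMPLE = "9 0 md0 100 800 2505 20040"

print(naive_parse(SAMPLE))  # sectors read end up under reads_merged
print(parse_line(SAMPLE))   # counters land under the partition names
```

With the naive mapping, the second counter (sectors read) is stored as reads merged and the write counters never reach the write fields, which matches the kind of shift described above.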

The Perl port was dying on a malformed line, so I just added a second regex that validates the short format, and now it's working as expected:

  #ts device rd_s rd_avkb rd_mb_s rd_mrg rd_cnc rd_rt wr_s wr_avkb wr_mb_s wr_mrg wr_cnc wr_rt busy in_prg
  {9} md0 0.8 4.0 0.0 0% 0.0 0.0 2590.7 4.0 10.1 0% 0.0 0.0 0% 0

(Note that the above is missing the fix for #838939, so it's still half-broken. But that should be enough for this ticket.)

Changed in percona-toolkit:
status: New → Triaged
Baron Schwartz (baron-xaprb) wrote : Re: [Bug 897029] Re: pt-diskstats computes wrong values for md0

I never explicitly said this, but I designed this tool to work only on
the newer diskstats format, which has more useful information. On
older systems, it's kind of pointless. I'd say if there aren't enough
fields, let's just die with a message like "use iostat instead".

Brian Fraser (fraserbn) wrote :

Ah, I must have explained that badly. The short format actually exists only on 2.6, and only for partitions; if you have one of those and look at /proc/diskstats, you should see the short one.
For 2.4, there are actually even more fields, though I can't find a definition for them. I haven't tested it, but I'm assuming that both the old and short versions croak on those.

Should I still make it die? If it helps, both iostat and Sys::Statistics::Linux::DiskStats consider the two forms correct.

Baron Schwartz (baron-xaprb) wrote :

Let's not die; let's just ignore lines that don't have the detailed fields.
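
One way to read that suggestion, sketched in Python rather than the tool's own Perl: keep only lines that carry at least the eleven detailed counters and silently skip the rest. The function name and the threshold constant are assumptions made for this example; the two sample lines are the ones quoted from the kernel documentation earlier in this report.

```python
DETAILED_COUNTERS = 11  # counters after major, minor and device name


def detailed_lines(lines):
    """Yield only lines that have the detailed (disk-level) counters."""
    for line in lines:
        fields = line.split()
        # Three leading columns: major number, minor number, device name.
        if len(fields) - 3 < DETAILED_COUNTERS:
            continue  # short partition-style line: ignore it, don't die
        yield line


sample = [
    "3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160",
    "3 1 hda1 35486 38030 38030 38030",
]

kept = list(detailed_lines(sample))
print([line.split()[2] for line in kept])
```

Only the hda line survives the filter; the four-counter hda1 line is dropped without an error.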

Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: Triaged → Fix Committed
Changed in percona-toolkit:
importance: Undecided → Medium
assignee: nobody → Brian Fraser (fraserbn)
milestone: none → 2.0.3
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-438
