pt-heartbeat problem with $i++ side effect in make_interval_iter() -> get_next_interval()

Bug #1406422 reported by Robert Barabas
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Expired
Undecided
Unassigned

Bug Description

It appears that make_interval_iter has a side effect problem in its embedded function, get_next_interval. The $i++ inside the return statement is susceptible to side effects. It is platform dependent how $i++ will be incremented.

For instance, on MAC, the counter is incremented past after its value is used in the return:

[18:34:22 rbarabas@procyon:~]
🌀 uname -a
Darwin procyon 14.0.0 Darwin Kernel Version 14.0.0: Fri Sep 19 00:26:44 PDT 2014; root:xnu-2782.1.97~2/RELEASE_X86_64 x86_64

[18:35:19 rbarabas@procyon:~]
🌀 ./pt-heartbeat-test-interval.pl
Current time: 1419896121
Next interval: 1419896122
Fixed interval: 1419896127
Ultimate fix: 1419896127

You can reproduce the issue easily with the following snippet (note that make_interval_iter is from current pt-heartbeat):

[18:33:30 rbarabas@procyon:~]
🌀 cat pt-heartbeat-test-interval.pl
#!/usr/bin/env perl -w

use strict;
use Time::HiRes qw(gettimeofday);

sub make_interval_iter {
   my ( $interval, $skew ) = @_;
   die "I need an interval argument" unless defined $interval;
   my ($s) = gettimeofday();
   my $start_s = $s + 1;
   my $i = 0;
   my $get_next_interval = sub {
      return $start_s + ($interval * $i++) + $skew;
   };
   return $get_next_interval;
}

sub make_interval_iter_quickfix {
   my ( $interval, $skew ) = @_;
   die "I need an interval argument" unless defined $interval;
   my ($s) = gettimeofday();
   my $start_s = $s + 1;
   my $i = 0;
   my $get_next_interval = sub {
      $i++;
      return $start_s + ($interval * $i) + $skew;
   };
   return $get_next_interval;
}

sub make_interval_iter_clean {
   my ( $interval, $skew ) = @_;
   die "I need an interval argument" unless defined $interval;
   my ($s) = gettimeofday();
   my $get_next_interval = sub {
      return $s + 1 + $interval + $skew;
   };
   return $get_next_interval;
}

my $interval = 5;
my $skew = 0;

eval {
    my $i1 = make_interval_iter($interval, $skew)->();
    my $i2 = make_interval_iter_quickfix($interval, $skew)->();
    my $i3 = make_interval_iter_clean($interval, $skew)->();
    printf("Current time: %d\n" .
           "Next interval: %d\n" .
           "Fixed interval: %d\n" .
           "Ultimate fix: %d\n", time, $i1, $i2, $i3);
};

As you can see, above contains two potential fixes. The make_interval_iter function looks like it has remnants of an old code base. If $i is not really used anymore as a real (used to be loop?) variable, perhaps the function could simply be reduced to:

sub make_interval_iter_clean {
   my ( $interval, $skew ) = @_;
   die "I need an interval argument" unless defined $interval;
   my ($s) = gettimeofday();
   my $get_next_interval = sub {
      return $s + 1 + $interval + $skew;
   };
   return $get_next_interval;
}

... instead.

If the embedded function pointer is used/overloaded by others, perhaps it may be better to promote the embedded function into an individual function for clarity and implement the quickfix for the embedded function with explicit argument / variable handling.

Revision history for this message
Nilnandan Joshi (nilnandan-joshi) wrote :

same code I can see in pt-heartbeat 2.2.12

sub make_interval_iter {
   my ( $interval, $skew ) = @_;
   die "I need an interval argument" unless defined $interval;
   my ($s) = gettimeofday();
   my $start_s = $s + 1;
   my $i = 0;
   my $get_next_interval = sub {
      return $start_s + ($interval * $i++) + $skew;
   };
   return $get_next_interval;
}

Changed in percona-toolkit:
status: New → Confirmed
Revision history for this message
Frank Cizmich (frank-cizmich) wrote :

Hello Robert, Nilnandan

Sorry for the late reply.

I don't think the original code is at fault here. The $i variable is meant to be a multiplier of $interval.
Each time the function is invoked $i is post-incremented, meaning that the first time it is invoked it will be 0, next time 1, then 2, etc
As a result, the invocations will return successive values of:
start_time + 1
start_time + 1 + interval
start_time + 1 + 2*interval
start_time + 1 + 3*interval
....

Which is what the code intends.

The related bug report in https://bugs.launchpad.net/percona-toolkit/+bug/1406390
is better solved fixing the race condition pointed out there.

I'll be closing this issue, but feel free to comment if you feel I missed something.

Changed in percona-toolkit:
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona Toolkit because there has been no activity for 60 days.]

Changed in percona-toolkit:
status: Incomplete → Expired
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-1259

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.