pt-fk-error-logger --run-time works differently than pt-deadlock-logger --run-time

Reported by Brian Fraser on 2012-09-26
This bug affects 1 person
Affects: Percona Toolkit
Importance: Medium
Assigned to: Daniel Nichter

Bug Description

Well, really, it's --interval that's different, but it affects --run-time.

pt-deadlock-logger says this about --run-time:
"If L<"--run-time"> is specified but no L<"--interval"> is specified, a default 1 second interval will be used."

This is a bit magical, but it does the right thing. Meanwhile, pt-fk-error-logger's --run-time documentation says nothing about --interval, and if no --interval is specified, the tool silently quits after finding its first fk error.

(originally reported by Migual Angel Nieto)

Brian Fraser (fraserbn) on 2012-09-26
Changed in percona-toolkit:
importance: Undecided → Medium
milestone: none → 2.1.5
status: New → Triaged
tags: added: pt-deadlock-logger pt-fk-error-logger
Changed in percona-toolkit:
assignee: nobody → Daniel Nichter (daniel-nichter)
Changed in percona-toolkit:
status: Triaged → In Progress
Daniel Nichter (daniel-nichter) wrote :

After reviewing --run-time and --interval of these and all tools that have --run-time, my conclusion is that the option is only loosely standardized. Some tools have no default --run-time and run once then exit, others run forever. Some have a default value for --run-time. Then throw --interval into the mix and it's less standardized.

I think we should try to standardize it even more, but we can't do that for 2.1 because it would introduce a backwards incompatibility in one tool or another. So I'm going to retarget this to 2.2.

Changed in percona-toolkit:
milestone: 2.1.6 → 2.2.1
assignee: Daniel Nichter (daniel-nichter) → nobody
status: In Progress → Confirmed
Changed in percona-toolkit:
assignee: nobody → Daniel Nichter (daniel-nichter)
Daniel Nichter (daniel-nichter) wrote :

pt-fk-error-logger and pt-deadlock-logger are now standardized wrt --run-time, --interval, --iterations, and --quiet. Previously, they differed, and had a --print option. Since they're both logging tools, they both run forever now by default and print stuff to STDOUT unless --quiet is given; --dest does not suppress the output (only --quiet does). To run once, use --iterations 1.
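
To illustrate how these options could interact, here is a minimal sketch of a check loop that runs forever unless an iteration or time limit is given. It is not the tools' actual code: run_checks, its argument names, and the 1-second fallback interval are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Percona Toolkit): calls $args{check}
# repeatedly until an optional iteration or run-time limit is reached.
# With neither limit set, it loops forever.
sub run_checks {
   my (%args) = @_;
   my $check      = $args{check};
   # Assumed fallback for this sketch, not the tools' documented default.
   my $interval   = defined $args{interval} ? $args{interval} : 1;
   my $run_time   = $args{run_time};     # undef means no time limit
   my $iterations = $args{iterations};   # undef means no iteration limit

   my $start = time();
   my $done  = 0;
   while (1) {
      $check->();
      $done++;
      last if defined $iterations && $done >= $iterations;
      last if defined $run_time   && time() - $start >= $run_time;
      sleep $interval if $interval > 0;
   }
   return $done;   # number of checks performed
}

# Equivalent in spirit to "--iterations 1": check once, then stop.
my $count = run_checks(
   check      => sub { print "checking for errors\n" },
   iterations => 1,
);
```

In this sketch the limits are tested after each check, so at least one check always runs, and a missing interval never causes an early exit.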

Changed in percona-toolkit:
status: Confirmed → In Progress
Daniel Nichter (daniel-nichter) wrote :

A fix to Cxn.pm resulted from using it in pt-deadlock-logger: when the parent exits after --daemonize, it disconnects the dbh in its DESTROY(). So the same dbh in the child's copy of the Cxn was dead on arrival, so to speak. This affects only pt-kill in 2.1, but it doesn't actually cause an error because pt-kill retries its queries: the first time a child tries to kill something, it fails, reconnects, tries again, and succeeds.

The only simple solution I could think of was:

my $cxn = Cxn->new(
   parent => $o->get('daemonize'),
);

If parent is true in DESTROY, the dbh is not disconnected. Then the child proc has to set $cxn->{parent}=0 after daemonizing. Not the prettiest solution, but it works.
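
The effect of the parent flag on DESTROY can be sketched as follows. This is a simplified illustration, not the real Cxn.pm: SketchCxn, FakeDbh, and connected_after_destroy are hypothetical names, FakeDbh stands in for a DBI handle, and no actual fork or daemonizing takes place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBI handle; it only records disconnect().
package FakeDbh;
sub new        { my ($class) = @_; return bless { connected => 1 }, $class }
sub disconnect { my ($self) = @_; $self->{connected} = 0; return 1 }
sub connected  { my ($self) = @_; return $self->{connected} }

# Simplified connection wrapper for illustration; not the real Cxn.pm.
package SketchCxn;
sub new {
   my ($class, %args) = @_;
   my $self = {
      dbh    => $args{dbh},
      parent => $args{parent} ? 1 : 0,
   };
   return bless $self, $class;
}

sub DESTROY {
   my ($self) = @_;
   # A parent that is about to exit must leave the connection alone,
   # because the child's copy of the object uses the same dbh.
   return if $self->{parent};
   $self->{dbh}->disconnect() if $self->{dbh};
   return;
}

package main;

# Returns true if the dbh is still connected after the wrapper is destroyed.
sub connected_after_destroy {
   my ($parent) = @_;
   my $dbh = FakeDbh->new();
   {
      my $cxn = SketchCxn->new(dbh => $dbh, parent => $parent);
   }  # $cxn goes out of scope here, so DESTROY runs
   return $dbh->connected();
}

print "parent => 1: ", (connected_after_destroy(1) ? "still connected" : "disconnected"), "\n";
print "parent => 0: ", (connected_after_destroy(0) ? "still connected" : "disconnected"), "\n";
```

In a real daemonized tool, the child would clear the flag on its own copy after the fork, so that its DESTROY still disconnects when the child eventually exits.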

Usage of Cxn vs. direct DSNParser is about half-and-half. I originally created Cxn because tools would occasionally die with that "rolling back active statement handle" error when the tool forgot to $dbh->disconnect() or failed to because it crashed. That's why it's done in Cxn::DESTROY(): so it's guaranteed to be done no matter what. Maybe Cxn isn't the best solution given this new bug, but it's in the wild now, so if we want another solution, we'll have to find and make the time to create, implement, and test it thoroughly. That probably won't happen for a while.

Changed in percona-toolkit:
status: In Progress → Fix Committed
Changed in percona-toolkit:
status: Fix Committed → Fix Released