Comment 7 for bug 745738

Robert Collins (lifeless) wrote : Re: [Bug 745738] Re: test_full_integration fails intermittently

On Thu, Mar 31, 2011 at 6:16 PM, John A Meinel <email address hidden> wrote:
> If it is just the one test, it is pretty easy to change the test *as I
> described* and have it pass regularly rather than back it out. I realize
> the test itself was fragile, you could set the timeout to 60min if you
> are unhappy. The point was that if the test was ever broken (by a
> real bug somewhere) I didn't want it to block indefinitely waiting to
> find that bug.
>
> In this case, it is giving false positives, so should be set to a higher
> timeout threshold.
>
> I'm pretty sure this isn't the first flaky test that has been landed in
> the launchpad test suite. I'm not sure why you felt the need to ignore
> the advice I posted here and back out all of the loggerhead tests.

Because of a few things that combine to have me thinking that having
the loggerhead tests run with the lp tests is a mistake.

But first, on the flakiness: we don't know that it's a simple timeout
threshold issue (well, you may, *I* don't). It has cost us at least 10
hours of latency so far, and our policy for dealing with flaky tests
is to disable them; as this test was in loggerhead itself, the way to
do that was to stop running the loggerhead tests when changing LP.

On the larger issue: we don't run the tests for other projects when
committing to LP - not bzr, not zope, etc. All these components can
fail to integrate. This leaves me asking 'why is loggerhead special' -
and I don't have a good answer to that. It seems to me that running
the loggerhead test suite when changing LP is a very heavy hammer to
use to solve the bug 'we deployed an incompatible loggerhead + bzr'.
Some other ways we can catch that:
 - have loggerhead declare its bzr compatibility and check that that
is ok in LP.
 - test loggerhead with the bzr that launchpad is using
 - manual qa
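
To make the first option concrete, here is a rough sketch of what a declared-compatibility check could look like. Everything in it is hypothetical: the name LOGGERHEAD_BZR_COMPAT, the helper functions, and the version numbers are made up for illustration and do not exist in loggerhead, bzr, or LP.

```python
# Hypothetical sketch: none of these names exist in loggerhead, bzr, or LP.

# What loggerhead might declare: (inclusive minimum, exclusive maximum)
# bzr versions it claims to work with. The numbers are illustrative only.
LOGGERHEAD_BZR_COMPAT = ((2, 1, 0), (2, 4, 0))


def parse_version(text):
    """Turn a dotted version string such as '2.3.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(bzr_version, compat=LOGGERHEAD_BZR_COMPAT):
    """Return True if bzr_version falls inside the declared range."""
    minimum, maximum = compat
    return minimum <= parse_version(bzr_version) < maximum
```

A single cheap check along these lines, run on the LP side against the bzr version LP ships, would flag an incompatible pairing without running the whole loggerhead suite. It assumes full three-part version strings.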

-Rob