TestPrunes fails occasionally still

Bug #1471770 reported by Martin Packman
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Canonical Juju
Fix Released
Medium
Tim Penhey

Bug Description

TestPrunes was changed as part of bug bug 1459291 due to intermittent failures, but still fails sometimes.

http://reports.vapour.ws/releases/issue/5565dd10749a5670af648404

FAIL: txnpruner_test.go:22: com_juju_juju_worker_txnpruner_test.TestPrunes.pN57_github.com_juju_juju_worker_txnpruner_test.TxnPrunerSuite

txnpruner_test.go:38:
    c.Assert(td >= interval, jc.IsTrue, gc.Commentf("td=%s", td))
... obtained bool = false
... td=9.29ms

This failure is on ppc64/gccgo but that may not be a factor beyond that setup hitting timing issues more often.

Martin Packman (gz)
Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
Curtis Hovey (sinzui)
description: updated
Changed in juju-core:
assignee: Menno Smits (menno.smits) → nobody
Revision history for this message
Menno Finlay-Smits (menno.smits) wrote :

As noted in a TODO in the code, this code and its tests need to use an injected Clock.

Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
milestone: none → 2.0-beta17
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Menno Smits (menno.smits)
milestone: none → 2.0-beta17
no longer affects: juju-core
Tim Penhey (thumper)
Changed in juju:
assignee: Menno Smits (menno.smits) → Tim Penhey (thumper)
Revision history for this message
Martin Packman (gz) wrote :

From bug 1618560 see Roger's reasoning:

"""
The reason for the failure is that the test relies on the scheduler.

There are two goroutines in play here.

When the test works, this happens:

   txnpruner: receives on timer channe
   txnpruner: calls MaybePruneTransaction, which sends on the test channel
   test: receives on test channel
   test: set t0
   txnpruner: reset timer to 10ms
   txnpruner: receives on timer channel at t0+10ms+something small
   txnpruner: calls MaybePruneTransaction, which sends on the test channel
   test: receives on test channel
   test: set t1
   test: t1 >= t0 + 10ms, all ok

But this can happen instead:

   txnpruner: receives on timer channe
   txnpruner: calls MaybePruneTransaction, which sends on the test channel
   test: receives on test channel
   txnpruner: reset timer to 10ms
   test: set t0
   txnpruner: receives on timer channel at t0+10ms+something small
   txnpruner: calls MaybePruneTransaction, which sends on the test channel
   test: receives on test channel
   test: set t1
   test: t1 < t0 + 10ms, fail

That is, there's nothing stopping the timer from resetting before the
test has decided on what time the previous event was set.

To fix this, the time should be set deterministically before the
code can continue running (for example by using a reply channel
or changing the test mock to send the time on the channel)
"""

Tim Penhey (thumper)
Changed in juju:
status: Triaged → In Progress
Tim Penhey (thumper)
Changed in juju:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.