Intermittent unit test failure: StatusSuite.TestRun

Bug #1780766 reported by Simon Richardson on 2018-07-09
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
juju
Medium
John A Meinel

Bug Description

The following unit test fails intermittently: StatusSuite.TestRun

http://ci.jujucharms.com/job/github-merge-juju/688/console

```
status_test.go:91:
    s.runTestCase(c, test)
status_test.go:122:
    c.Check(ctx.Stdout.(*bytes.Buffer).String(), gc.Equals, out.String())
... obtained string = "" +
... "actions:\n" +
... "- completed at: n/a\n" +
... " status: pending\n"
... expected string = "" +
... "actions:\n" +
... "- action: fakeName\n" +
... " completed at: n/a\n" +
... " id: deadbeef-0000-4000-8000-feedfacebeef\n" +
... " status: some-random-status\n" +
... " unit: \"\"\n"
```

Changed in juju:
status: New → Incomplete
Tim Penhey (thumper) on 2018-07-09
tags: added: intermittent-failure test-failure
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
John A Meinel (jameinel) wrote :

$ time go test -count=100 -race -check.v -check.f StatusSuite.TestRun

fails about 5 times in 100. In different places, but it does fail on my machine.

John A Meinel (jameinel) wrote :

The problem seems to lie in here:
func (c *fakeAPIClient) Actions(args params.Entities) (params.ActionResults, error) {
 // If the test supplies a delay time too long, we'll return an error
 // to prevent the test hanging. If the given wait is up, then return
 // the results; otherwise, return a pending status.

 // First, sync.
 _ = <-time.NewTimer(0 * time.Second).C

 select {
 case _ = <-c.delay.C:
  // The API delay timer is up. Pass pre-canned results back.
  return params.ActionResults{Results: c.actionResults}, c.apiErr
 case _ = <-c.timeout.C:
  // Timeout to prevent tests from hanging.
  return params.ActionResults{}, errors.New("test timed out before wait time")
 default:
  // Timeout should only be nonzero in case we want to test
  // pending behavior with a --wait flag on FetchCommand.
  return params.ActionResults{Results: []params.ActionResult{{
   Status: params.ActionPending,
   Output: map[string]interface{}{},
   Started: time.Date(2015, time.February, 14, 8, 15, 0, 0, time.UTC),
   Enqueued: time.Date(2015, time.February, 14, 8, 13, 0, 0, time.UTC),
  }}}, nil
 }

The issue is that while it is true that the setup does:
  fakeClient := makeFakeClient(
   0*time.Second, // No API delay
   5*time.Second, // 5 second test timeout
   tc.tags,
   tc.results,
   tc.actionsByNames,
   "", // No API error
  )

which creates:
  delay: time.NewTimer(delay),
  timeout: time.NewTimer(timeout),

Apparently there is no guarantee that time.NewTimer(0) will actually be ready before the 'default:' branch is evaluated.

the _ = <time.NewTimer(0 *time.Second).C seems very strange.
Presumably it is meant to handle the case where c.delay.C isn't actually ready, but I don't think it actually gives us that.

Note also that all of these are *real* time.NewTimer calls, not *clock* calls. Which also adds real-clock-skew issues.

I'd like to fix those things, but the easier thing to do is just add '--wait 1s' to all the calls, in which case the code will see the Pending result, and keep trying until it sees something else.

John A Meinel (jameinel) wrote :

Turns out "juju show-action-status" doesn't support --wait, that is only accepted by 'show-action-output --wait'.

So instead, I've just changed the code so that a delay of 0 never goes via a Timer object, it just returns immediately.
On the plus side, that lets me get rid of the "first sync <-" oddity, and also lets the tests run slightly faster (they complete under --race in 11-12ms instead of 15ms.)

And it is more deterministic. yay

Changed in juju:
milestone: none → 2.5.4
assignee: nobody → John A Meinel (jameinel)
importance: Undecided → Medium
status: Incomplete → In Progress
John A Meinel (jameinel) on 2019-04-01
Changed in juju:
milestone: 2.5.4 → 2.5.5
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released
Changed in juju:
milestone: 2.5.6 → 2.5.7
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers