state_leader_test LeadershipSuite.TestCheck has a race condition

Bug #1735153 reported by John A Meinel
This bug affects 2 people
Assigned to: Andrew Wilkins

Bug Description

The test asserts that, after waiting for the leadership lease to expire, the former lease holder no longer holds the lock.
My guess is that while it's true that app/0 has lost the lease, it is possible for the lease to be reacquired before the assertion runs.

The test run here shows the issue:

    c.Check(err, gc.ErrorMatches, `"application/0" is not leader of "application"`)
... value = nil
... regex string = "\"application/0\" is not leader of \"application\""
... Error value is nil

    c.Check(ops2, gc.IsNil)
... value []txn.Op = []txn.Op{txn.Op{C:"leases", Id:"application-leadership#application#", Assert:bson.M{"holder":"application/0"}, Insert:interface {}(nil), Update:interface {}(nil), Remove:false}}

Note: although this was found with --race, I think that is just a case of the race detector changing the timing of the test. I don't think there is an actual data race, just a false assumption that nothing else is going on that would renew the leadership.
That said, inserting time.Sleep() doesn't seem to trigger the issue.

John A Meinel (jameinel)
description: updated
Andrew Wilkins (axwalk) wrote :
John A Meinel (jameinel)
Changed in juju:
assignee: John A Meinel (jameinel) → Andrew Wilkins (axwalk)
Andrew Wilkins (axwalk) wrote :

Back-ported to 2.3 branch in

Changed in juju:
status: In Progress → Fix Committed
no longer affects: juju/trunk
Changed in juju:
milestone: 2.3.0 → 2.4-beta1
Changed in juju:
status: Fix Committed → Fix Released