juju/state: intermittent test failure in SubnetSuite with mongod 3.2

Bug #1588784 reported by Christian Muirhead
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
juju-core
Triaged
Medium
Unassigned

Bug Description

Running SubnetSuite.TestPickNewAddressRace against mongod 3.2 in a loop results in it failing within about 5 runs.

cd $GOPATH/state
go test -c
bash -c "set -e; while true; do JUJU_MONGOD=/usr/lib/juju/mongo3.2/bin/mongod ./state.test -check.v -check.f SubnetSuite.TestPickNewAddressRace; echo; done"

Digging into the test, it's running two transactions in parallel to provoke a conflict between them. The code expects a txn.ErrAborted, but instead gets an error from mgo that looks like this:

&mgo.QueryError{
    Code:11000,
    Message:"E11000 duplicate key error collection: juju.txns.stash index: _id_ dup key: { : { c: \"ipaddresses\", id: \"d50476f7-6a3c-45d4-8428-6dd70515ea34:192.168.1.0\" } }",
    Assertion:false,
}

Running the loop using JUJU_MONGOD=/usr/lib/juju/bin/mongod doesn't ever produce the error.

Example output from running the test in a loop against mongod 3.2:

PASS: /home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:413: SubnetSuite.TestPickNewAddressRace 0.028s
OK: 1 passed
PASS

PASS: /home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:413: SubnetSuite.TestPickNewAddressRace 0.037s
OK: 1 passed
PASS

PASS: /home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:413: SubnetSuite.TestPickNewAddressRace 0.022s
OK: 1 passed
PASS

PASS: /home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:413: SubnetSuite.TestPickNewAddressRace 0.021s
OK: 1 passed
PASS

----------------------------------------------------------------------
FAIL: /home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:413: SubnetSuite.TestPickNewAddressRace

SetUpTest
[LOG] 0:00.001 INFO juju.state opening state, mongo addresses: ["localhost:40933"]; entity <nil>
[LOG] 0:00.001 DEBUG juju.state dialing mongo
[LOG] 0:00.002 INFO juju.mongo dialled mongo successfully on address "127.0.0.1:40933"
[LOG] 0:00.002 DEBUG juju.state connection established
[LOG] 0:00.002 DEBUG juju.state mongodb login successful
[LOG] 0:00.422 INFO juju.mongo dialled mongo successfully on address "127.0.0.1:40933"
[LOG] 0:00.423 INFO juju.state initializing controller model 60d109d3-73a6-4c6b-87ae-4b25fb08e2eb
[LOG] 0:00.597 INFO juju.state running state anonymously; using unique client id
[LOG] 0:00.597 INFO juju.state starting standard state workers
[LOG] 0:00.597 DEBUG juju.state.workers starting leadership lease manager
[LOG] 0:00.599 INFO juju.mongo dialled mongo successfully on address "127.0.0.1:40933"
[LOG] 0:00.602 DEBUG juju.state.workers starting singular lease manager
[LOG] 0:00.602 DEBUG juju.worker.lease waking to check leases at 2016-06-03 12:19:09.460057936 +0100 BST
[LOG] 0:00.605 DEBUG juju.state.workers starting transaction log watcher
[LOG] 0:00.605 DEBUG juju.state.workers starting presence watcher
[LOG] 0:00.605 INFO juju.state creating cloud image metadata storage
[LOG] 0:00.605 DEBUG juju.worker.lease waking to check leases at 2016-06-03 12:19:09.462834777 +0100 BST
[LOG] 0:00.605 INFO juju.state started state for model-60d109d3-73a6-4c6b-87ae-4b25fb08e2eb successfully
SetUpTest done
[LOG] 0:00.619 INFO juju.mongo dialled mongo successfully on address "127.0.0.1:40933"
[LOG] 0:00.619 INFO juju.mongo dialled mongo successfully on address "127.0.0.1:40933"
/home/xtian/dev/go/src/github.com/juju/juju/state/subnets_test.go:451:
    c.Assert(secondError, jc.ErrorIsNil)
... value *errors.Err = &errors.Err{message:"cannot add IP address \"local-cloud:192.168.1.0\"", cause:(*mgo.QueryError)(0xc8203dc140), previous:(*errors.Err)(0xc82046c140), file:"github.com/juju/juju/state/legacy_ipaddresses.go", line:74} ("cannot add IP address
\"local-cloud:192.168.1.0\": E11000 duplicate key error collection: juju.txns.stash index: _id_ dup key: { : { c: \"ipaddresses\", id: \"60d109d3-73a6-4c6b-87ae-4b25fb08e2eb:192.168.1.0\" } }")
... error stack:
        E11000 duplicate key error collection: juju.txns.stash index: _id_ dup key: { : { c: "ipaddresses", id: "60d109d3-73a6-4c6b-87ae-4b25fb08e2eb:192.168.1.0" } }
        github.com/juju/juju/state/legacy_ipaddresses.go:74:
        github.com/juju/juju/state/legacy_ipaddresses.go:74: cannot add IP address "local-cloud:192.168.1.0"

[LOG] 0:00.626 DEBUG juju.state.workers stopped state workers without error
[LOG] 0:00.626 DEBUG juju.state closed state without error
OOPS: 0 passed, 1 FAILED
--- FAIL: TestPackage (1.00s)
FAIL

Revision history for this message
Christian Muirhead (2-xtian) wrote :

This seems like a bug in mgo against mongo3.2 - you can see the same duplicate key error inserting into the stash collection in this standalone program (using mgo.v2-unstable).

Again, this runs two conflicting transactions in parallel expecting one to fail with a txn.ErrAborted. Most of the time it does, but occasionally (after about 50 times through the loop) it returns the duplicate key error instead. Both transactions assert that a base document exists, and then try to insert the same new document. The assertion is needed to cause the error - without it the transactions only fail with txn.ErrAborted as expected.

Run the example with: go run mgo-duplicate-key.go /usr/lib/juju/mongo3.2/bin

Revision history for this message
Christian Muirhead (2-xtian) wrote :

(Thanks Menno for the extra assertion piece!)

Revision history for this message
Christian Muirhead (2-xtian) wrote :

Created a github issue for mgo: https://github.com/go-mgo/mgo/issues/277

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.