Race in github.com/joyent/gosdc/localservices/cloudapi

Bug #1604514 reported by Curtis Hovey
This bug affects 1 person
Affects         Status        Importance  Assigned to         Milestone
Canonical Juju  Fix Released  High        Menno Finlay-Smits
juju-core       Fix Released  High        Menno Finlay-Smits
1.25            Fix Released  High        Menno Finlay-Smits

Bug Description

As seen in
    http://reports.vapour.ws/releases/issue/578e6877749a5643f0c3198c

Race in github.com/joyent/gosdc/localservices/cloudapi
seen from github.com/juju/juju/provider/joyent

This result failed the model-migration branch, but the branch appears to have been merged into master anyway. This issue now affects master.

Curtis Hovey (sinzui)
tags: added: blocker
Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
Menno Finlay-Smits (menno.smits) wrote :

I doubt very much that this is related to the model-migration branch. That branch didn't go anywhere near this area of functionality.

I'm unable to reproduce the problem locally, so it appears to be intermittent.

Still digging...

Changed in juju-core:
status: Triaged → In Progress
tags: removed: blocker regression
Changed in juju-core:
importance: Critical → High
Menno Finlay-Smits (menno.smits) wrote :

The problem certainly isn't new, but it's timing-dependent and only happens occasionally. For this reason I've changed the ticket's importance to High and removed the blocker tag.

The test that triggers the issue is localServerSuite.TestInstancesGathering in github.com/juju/juju/provider/joyent. It can be triggered occasionally using Dave's stress test script (https://github.com/juju/juju/wiki/Stress-Test) like this:

menno@maiwa ~/go/src/github.com/juju/juju/provider/joyent
$ gotest-stress -check.v -check.f TestInstancesGathering

The issue is that the Joyent provider's StopInstances call makes parallel calls to the Joyent DeleteMachine API. This is no doubt fine when talking to the real Joyent API, but the Joyent test double API we use from github.com/joyent/gosdc/localservices/cloudapi isn't goroutine-safe. If multiple DeleteMachine calls are handled at the same time, there is a risk of a data race.
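
To illustrate the shape of the problem, here is a minimal Go sketch of a test double whose shared state is guarded by a mutex so that parallel deletes are safe. Everything in it (fakeCloud, newFakeCloud, stopInstances, Count) is a hypothetical stand-in invented for illustration; it is not the gosdc or juju code, and it is not necessarily how the actual fix works. Removing the mutex from DeleteMachine and running with the race detector enabled would be expected to reproduce the same class of report.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeCloud is a hypothetical test double (NOT the gosdc type).
// The mutex protects the machines map from concurrent access.
type fakeCloud struct {
	mu       sync.Mutex
	machines map[string]bool
}

// newFakeCloud returns a double pre-populated with the given machine IDs.
func newFakeCloud(ids ...string) *fakeCloud {
	c := &fakeCloud{machines: make(map[string]bool)}
	for _, id := range ids {
		c.machines[id] = true
	}
	return c
}

// DeleteMachine removes a machine. Holding the lock for the whole
// check-and-delete keeps concurrent callers from racing on the map.
func (c *fakeCloud) DeleteMachine(id string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.machines[id] {
		return fmt.Errorf("machine %q not found", id)
	}
	delete(c.machines, id)
	return nil
}

// Count reports how many machines remain.
func (c *fakeCloud) Count() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.machines)
}

// stopInstances deletes every machine in its own goroutine, mirroring
// the parallel-delete pattern described above. Each goroutine writes
// only to its own slot of errs, so the slice itself needs no lock.
func stopInstances(c *fakeCloud, ids []string) []error {
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = c.DeleteMachine(id)
		}(i, id)
	}
	wg.Wait()
	return errs
}

func main() {
	ids := []string{"m1", "m2", "m3"}
	c := newFakeCloud(ids...)
	errs := stopInstances(c, ids)
	fmt.Println("errors:", errs, "remaining:", c.Count())
}
```

The alternative is to serialise the calls on the caller's side, but locking inside the double keeps the test double usable from any code path that happens to call it concurrently.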

Menno Finlay-Smits (menno.smits) wrote :

The TestInstancesGathering test is the only one that calls StopInstances with more than one instance at a time, which is why it is the one triggering the race detector.

Fix is here: https://github.com/juju/juju/pull/5839

Menno Finlay-Smits (menno.smits) wrote :
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta13 → none
milestone: none → 2.0-beta13
Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
importance: Undecided → High
status: New → Fix Released