pyjuju

ProvisioningAgent has to deal with eventual consistency

Bug #639888 reported by Gustavo Niemeyer on 2010-09-15

Affects	Status	Importance	Assigned to	Milestone
pyjuju	Triaged	Low	Unassigned	none

Bug Description

The EC2 API is "eventually consistent": a state retrieved from it may not yet reflect recent changes, which makes it hard to base decisions on that state.

The ProvisioningAgent is in charge of firing new machines to cover requested machine states that were never seen, and also to replace machines that were alive but died when they shouldn't have.

Now, imagine the following sequence of actions within the ProvisioningAgent:

1. Acquire the topology lock to ensure no one else attempts changes for now
2. Detect a machine state without an id (new machine requested by the admin)
3. Fire the new machine
4. Store the new machine id in the machine state in zookeeper
5. Release the topology lock
6. Acquire the topology lock again, and start over
7. Detect a machine state with an id (set in 4)
8. Observe that EC2 doesn't know about this id yet (eventual consistency FTW!)
9. Behave as if the machine had died, and fire another machine!
10. Repeat from 4.
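
The sequence above can be reproduced with a small simulation. This is only a sketch: `EventuallyConsistentProvider`, `naive_pass`, and the `instance_id` key are hypothetical names made up for illustration, not the pyjuju or EC2 API, and the topology lock and zookeeper are left out.

```python
class EventuallyConsistentProvider:
    """Hypothetical stand-in for EC2: a launched instance stays
    invisible to list_instances() for `lag` calls."""

    def __init__(self, lag=2):
        self.lag = lag
        self._pending = {}     # instance id -> polls left before it is visible
        self._visible = set()
        self.launched = []     # every instance ever launched, in order

    def launch(self):
        instance_id = "i-%04d" % (len(self.launched) + 1)
        self._pending[instance_id] = self.lag
        self.launched.append(instance_id)
        return instance_id

    def list_instances(self):
        # Report what is visible right now, then age the pending instances.
        result = set(self._visible)
        for instance_id in list(self._pending):
            self._pending[instance_id] -= 1
            if self._pending[instance_id] <= 0:
                del self._pending[instance_id]
                self._visible.add(instance_id)
        return result


def naive_pass(machine_state, provider):
    """One pass of the agent with no protection against stale reads."""
    known = provider.list_instances()
    instance_id = machine_state.get("instance_id")
    # A missing id means a new request; an unknown id is taken as a dead machine.
    if instance_id is None or instance_id not in known:
        machine_state["instance_id"] = provider.launch()


provider = EventuallyConsistentProvider(lag=2)
state = {}  # a single requested machine
for _ in range(3):
    naive_pass(state, provider)

# One request, three passes, three machines launched.
print(provider.launched)
```

In this toy model the agent never settles: each pass replaces the stored id with a fresh one that the provider has not reported yet.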

This problem may be fixed by introducing a "started_time" parameter into the machine state, and having the agent ignore machines which were acted upon recently, so that a machine EC2 has not reported yet is not mistaken for a dead one.
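
A minimal sketch of that check, assuming the machine state is a plain dict. `needs_launch`, `record_launch`, the `instance_id` key, and the 120 second grace period are all assumptions for illustration; only "started_time" comes from the proposal above.

```python
import time

# Assumed value; it would have to be tuned to the lag actually observed.
GRACE_PERIOD = 120.0  # seconds


def record_launch(machine_state, instance_id, now=None):
    """Store the new id together with the time the machine was fired."""
    machine_state["instance_id"] = instance_id
    machine_state["started_time"] = time.time() if now is None else now


def needs_launch(machine_state, known_instance_ids, now=None,
                 grace_period=GRACE_PERIOD):
    """Decide whether the agent should fire a machine for this state."""
    now = time.time() if now is None else now
    instance_id = machine_state.get("instance_id")
    if instance_id is None:
        return True   # new machine requested, never launched
    if instance_id in known_instance_ids:
        return False  # the provider knows the machine
    started = machine_state.get("started_time")
    if started is not None and now - started < grace_period:
        return False  # acted upon recently; the provider may be lagging
    return True       # unknown and old enough to be considered dead


state = {}
record_launch(state, "i-0001", now=1000.0)
# 30 seconds later the provider still returns nothing for this id.
print(needs_launch(state, set(), now=1030.0))
```

The trade-off is that a machine which really dies right after launch is only replaced once the grace period has passed.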

Gustavo Niemeyer (niemeyer) on 2010-09-15

Changed in ensemble:
status:	New → Confirmed
importance:	Undecided → High
description:	updated

Kapil Thangavelu (hazmat) on 2010-12-24

Changed in ensemble:
milestone:	none → 0.4

Kapil Thangavelu (hazmat) on 2011-02-03

Changed in ensemble:
importance:	High → Medium

Kapil Thangavelu (hazmat) on 2011-02-03

Changed in ensemble:
milestone:	0.4 → budapest

Kapil Thangavelu (hazmat) on 2011-02-03

tags:	added: agents

Kapil Thangavelu (hazmat) on 2011-05-11

Changed in ensemble:
milestone:	budapest → dublin

Kapil Thangavelu (hazmat) on 2011-08-17

Changed in ensemble:
milestone:	dublin → none

Curtis Hovey (sinzui) on 2013-10-12

Changed in juju:
status:	Confirmed → Triaged

Curtis Hovey (sinzui) on 2013-10-15

Changed in juju:
importance:	Medium → Low
