uniter: high frequency relation operations cause rate limits to be exceeded
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
juju-core | Fix Released | High | John A Meinel |
Bug Description
A charm that triggers relation operations at a high rate causes too many calls to the provider, which then exceed the provider's rate limits.
To reproduce:
deploy this charm, https:/
juju deploy --repository=
juju deploy --repository=
juju add-relation x1:input x2:output
juju add-relation x2:input x1:output
Once x1 and x2 are deployed, juju debug-log will show the relation-changed hook firing on each unit every 5 seconds (the API server watcher polling limit).
Now increase the number of units:
juju add-unit -n5 x1
juju add-unit -n5 x2
Then watch the log: the uniters will start to fail with errors from the API server related to provider rate limiting.
At this point it is quite difficult to recover, because your EC2 credentials are also used for the AWS console, which makes it hard to terminate the environment manually.
Why does the uniter make calls to the provider for this simple relation-changed hook?
% cat xplod/hooks/
#!/bin/bash
set -xe
V=$(relation-get v)
V=${V:-0}
relation-set v=$(expr $V + 1)
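The feedback loop this hook creates can be seen in a small offline simulation. This is only a sketch: relation_changed, simulate_rounds, x1_v and x2_v are hypothetical stand-ins that do not exist in Juju, and the real relation-get and relation-set tools need a live unit. It assumes one unit on each side and that relation-get reads the remote unit's value, so each side's relation-set triggers the other side's relation-changed hook and the counter never settles.

```bash
#!/bin/bash
# Offline simulation only. Nothing here talks to Juju; the variables below
# stand in for the "v" setting that each unit publishes on the relation.
set -e

x1_v=""
x2_v=""

# relation_changed LOCAL REMOTE
# Mimics the hook body: read the remote unit's v, default it to 0,
# then publish v+1 as the local unit's v.
relation_changed() {
    local local_var=$1 remote_var=$2
    local V=${!remote_var}                     # stands in for: V=$(relation-get v)
    V=${V:-0}
    printf -v "$local_var" '%s' "$((V + 1))"   # stands in for: relation-set v=$(expr $V + 1)
}

# simulate_rounds N
# Each round, x1 reacts to x2's change and then x2 reacts to x1's change.
simulate_rounds() {
    local rounds=${1:-3} i
    x1_v=""
    x2_v=""
    for ((i = 0; i < rounds; i++)); do
        relation_changed x1_v x2_v
        relation_changed x2_v x1_v
        echo "round $i: x1 v=$x1_v x2 v=$x2_v"
    done
}

simulate_rounds 3
```

Every round changes both values, so in a real environment there is always a fresh change for the watcher to deliver. With three rounds the last line printed is "round 2: x1 v=5 x2 v=6".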
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
tags: added: rate-limit
tags: added: performance
Changed in juju-core:
status: Fix Committed → Fix Released
I haven't dug into this deeply, but I'm guessing the problem is something like asking "what is the IP address for one end of the relation", which ends up calling Instance.DNSName, which in turn involves an API call to the EC2 provider.
Right now I'm debugging the case where every unit that starts up ends up triggering 2 calls (one for the StateAddress and one for the APIAddress). This may or may not actually help, but it is probably a place to start looking.