Nested quota -1 limit race condition
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cinder | Won't Fix | Undecided | Unassigned |
Bug Description
There's a race condition when updating quota limits to/from -1 if the project or its children are being actively used with volume create/delete requests.
Take the example with a project hierarchy of:
B (limit = -1, in-use = 3) C (limit = 2, in-use = 0)
Now suppose we update the quota limit of B to 5 while volumes are being deleted from B. To apply the new limit to B, we need to update A so that B now contributes exactly 5 volumes to A's allocated value. Since B's limit is -1, we'll subtract the difference between 5 and B's current usage from A. But this happens while B's usage value is changing (because volumes are being deleted), which causes the race condition. See the code here - https:/
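The interleaving described above can be sketched as a small deterministic model. This is not Cinder code: the function name, A's starting allocated value, and the assumption that a project with limit -1 contributes its in-use count to its parent's allocated value are all illustrative.

```python
def simulate_limit_update(delete_during_update):
    """Toy model of updating B's limit from -1 to 5 (all names hypothetical)."""
    a_allocated = 3   # assumed: B (limit -1) contributes its in-use of 3 to A
    b_in_use = 3
    new_limit = 5

    # The update path reads B's usage first ...
    usage_seen = b_in_use

    if delete_during_update:
        # ... and a concurrent volume delete on B lands before the write.
        # With limit -1 the delete also propagates to A's allocated value.
        b_in_use -= 1
        a_allocated -= 1

    # The update path now adjusts A using the (possibly stale) usage value.
    a_allocated += new_limit - usage_seen
    return a_allocated


# Without the concurrent delete, B ends up contributing exactly 5 to A.
print(simulate_limit_update(delete_during_update=False))
# With the concurrent delete, A's allocated value drifts away from 5.
print(simulate_limit_update(delete_during_update=True))
```

In this model the second call leaves A's allocated value one short of B's new limit, because the adjustment was computed from a usage value that was already out of date.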
There is no locking between checking the usage of a project and updating the allocated value of its parent. In order to fix this, it seems like we'd need to add locking so we can:
1) Get the current usage of the project (e.g. B has in-use of 3)
2) Update the quota limit of the project, so that new reservation requests stop at the appropriate place in the hierarchy (e.g. if B had a child with -1 limit, those reservations should no longer propagate up to A's allocated)
This would require locking in the quota_reserve code as well as any of the quota-update / quota-delete code, which does not seem great, and this would also not help if there were API services running on different servers.
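A minimal sketch of that locking idea follows, assuming a single process and a single shared `threading.Lock`. The class and method names are hypothetical and are not the Cinder quota API. As noted above, a process-local lock like this does nothing for API services running on different servers.

```python
import threading


def contribution(limit, in_use):
    # Assumed model: a project with limit -1 contributes its usage to the
    # parent's allocated value; otherwise it contributes its limit.
    return in_use if limit == -1 else limit


class QuotaState:
    """Toy state for parent A and child B (hypothetical, not Cinder code)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.b_limit = -1
        self.b_in_use = 3
        self.a_allocated = 3

    def delete_volume_from_b(self):
        # The usage path takes the same lock as the limit-update path.
        with self.lock:
            if self.b_in_use == 0:
                return
            self.b_in_use -= 1
            if self.b_limit == -1:
                self.a_allocated -= 1

    def update_b_limit(self, new_limit):
        # Reading usage and adjusting A happen atomically under the lock.
        with self.lock:
            old = contribution(self.b_limit, self.b_in_use)
            new = contribution(new_limit, self.b_in_use)
            self.a_allocated += new - old
            self.b_limit = new_limit


def run_concurrently():
    state = QuotaState()
    threads = [threading.Thread(target=state.delete_volume_from_b)
               for _ in range(3)]
    threads.append(threading.Thread(target=state.update_b_limit, args=(5,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state
```

Whatever order the threads run in, A's allocated value in this model ends up equal to B's new limit, since no delete can slip in between the usage read and the parent update.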
One additional complication is that a reservation could get rolled back at a later date. When -1 values are nested, we create the reservation to affect the parent's allocated value as well as the child's reserved value, and handle these reservations as a group. For instance, if a volume reservation was created on B, there is also a reservation created on A to affect its allocated value, and the reservations are either all committed or all rolled back. Now if a reservation is created for B and we then update the limit of B so that it is no longer -1, we will subtract this reserved volume from A's usage; if the reservation is later rolled back, the quota of A will be updated incorrectly.
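The rollback problem can be illustrated with the same kind of toy model. The variable names, the starting values for A, and the exact bookkeeping are assumptions made for illustration. The real reservation code is more involved.

```python
def rollback_after_limit_update():
    """Toy model of a grouped reservation rolled back after a limit change."""
    a_allocated = 3            # assumed: B (limit -1) contributes in-use of 3
    b_in_use, b_reserved = 3, 0

    # A volume reservation on B is grouped with a matching delta on A.
    b_reserved += 1
    a_allocated += 1
    reservation_group = {"b_reserved": 1, "a_allocated": 1}

    # B's limit changes from -1 to 5: A is adjusted so that B contributes
    # exactly 5, which absorbs the volume that is still only reserved.
    new_limit = 5
    a_allocated += new_limit - (b_in_use + b_reserved)

    # Later the whole group is rolled back, undoing both recorded deltas,
    # although A's delta no longer corresponds to anything.
    b_reserved -= reservation_group["b_reserved"]
    a_allocated -= reservation_group["a_allocated"]
    return a_allocated


# B's limit is 5, so A should hold 5 for B; compare with the returned value.
print(rollback_after_limit_update())
```

In this model A ends up with an allocated value one below B's limit, because the rollback reverses a delta on A that the limit update had already folded into the new contribution.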
It seems like there was a possibility of a race condition before (https:/
This race condition can be exercised by running the tests at https://review.openstack.org/#/c/285640/ and removing "time.sleep(5)" from setUp and resource_cleanup.