maximum recursion possible while setting aggregates in placement
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Medium | Chris Dent | |
Bug Description
It's possible for the _ensure_aggregate code in objects/
" ERROR placement.
The "getting the str" part appears to be a coincidence based on reaching a bad stack depth at that particular moment.
This happened while the placeload script was doing its thing of adding aggregates to 1000 resource providers using asyncio, so concurrency is high and weird. See https:/
It is unlikely that this is going to happen in the real world, but it is the sort of thing it would be nice to be more robust about, perhaps by counting attempts and bailing out?
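The "counting attempts and bailing out" idea could look something like the sketch below. It is not the fix that landed in nova; it only illustrates replacing a recursive retry with a bounded loop. Every name here (`InMemoryAggregates`, `DuplicateEntry`, `TooManyAttempts`, `ensure_aggregate`, `max_attempts`) is hypothetical, and the in-memory store stands in for the real database table.

```python
class DuplicateEntry(Exception):
    """Hypothetical: another writer created the row first."""


class TooManyAttempts(Exception):
    """Hypothetical: gave up after the allowed number of attempts."""


class InMemoryAggregates:
    """Hypothetical stand-in for the aggregates table."""

    def __init__(self):
        self._ids = {}

    def get_id(self, agg_uuid):
        # Return the internal id, or None if the aggregate is unknown.
        return self._ids.get(agg_uuid)

    def create(self, agg_uuid):
        # Mimic a unique constraint on the uuid column.
        if agg_uuid in self._ids:
            raise DuplicateEntry(agg_uuid)
        self._ids[agg_uuid] = len(self._ids) + 1
        return self._ids[agg_uuid]


def ensure_aggregate(store, agg_uuid, max_attempts=10):
    """Return the id for agg_uuid, creating it if needed.

    Retries in a loop rather than by calling itself, so the stack depth
    stays constant no matter how often the race is lost.
    """
    for _ in range(max_attempts):
        agg_id = store.get_id(agg_uuid)
        if agg_id is not None:
            return agg_id
        try:
            return store.create(agg_uuid)
        except DuplicateEntry:
            # Someone else created it between our read and our write;
            # go around again and read the id they wrote.
            continue
    raise TooManyAttempts(
        "could not ensure aggregate %s after %d attempts"
        % (agg_uuid, max_attempts)
    )
```

With a cap in place, a caller that keeps losing the race gets a clear, catchable error instead of a maximum recursion failure at an arbitrary point in the stack.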
Changed in nova: status: Fix Committed → Fix Released
This proved to be a significant issue while working on https://review.openstack.org/#/c/619248/, a performance measuring job that uses placeload. placeload uses aiohttp to make concurrent connections to placement. At high concurrency _ensure_aggregate loops a great deal and blocks the server enough that the client starts experiencing errors because it cannot make a good connection.
I fixed it by preheating the aggregates so that _ensure_aggregate almost always returns after getting the aggregate id, rather than looping to try to create it. With that in place things are very smooth.
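For anyone wanting to do something similar in their own tooling, here is a rough sketch of one way preheating might be arranged. It is an assumption about the general shape, not the actual placeload change: each distinct aggregate is created once, sequentially, before the concurrent requests start. The function `preheat_then_associate` and the `create_aggregate` and `associate` callables are hypothetical; in a real client they would wrap HTTP calls.

```python
import asyncio


async def preheat_then_associate(create_aggregate, associate, provider_aggs):
    """Create every distinct aggregate once, then associate concurrently.

    provider_aggs maps a resource provider name to its list of aggregates.
    """
    # Phase 1: sequential, so no two requests race to create the same row.
    distinct = sorted({agg for aggs in provider_aggs.values() for agg in aggs})
    for agg in distinct:
        await create_aggregate(agg)

    # Phase 2: concurrent; every aggregate already exists by now.
    await asyncio.gather(
        *(associate(rp, aggs) for rp, aggs in provider_aggs.items())
    )


def demo():
    """Record the order of calls using stub coroutines."""
    events = []

    async def create_aggregate(agg):
        events.append(("create", agg))

    async def associate(rp, aggs):
        await asyncio.sleep(0)  # yield, as a real network call would
        events.append(("associate", rp))

    provider_aggs = {"rp1": ["agg-a", "agg-b"], "rp2": ["agg-a"]}
    asyncio.run(
        preheat_then_associate(create_aggregate, associate, provider_aggs)
    )
    return events
```

The trade-off is a short serial warm-up in exchange for the concurrent phase only ever hitting the read path.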
That experience suggests we should fix this, because it seems likely that operators might like to do mass aggregate management and use asyncio-based tools to do it, or maybe something from languages like go where similar behaviour might happen.