[RFE] Make controllers with different lists of supported API extensions behave identically
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Won't Fix | Wishlist | Unassigned |
Bug Description
The idea is to make controllers behave identically at the API layer even when they support different lists of API extensions, whether because they run different major versions or because their configuration files differ.
The primary use case is a rolling upgrade, where controllers of different major versions run side by side and probably serve API requests round-robin behind a frontend load balancer. If version N exposes extensions A, B, C, D while N+1 exposes A, B, C, D, E, then while both versions are running the API /extensions/ endpoint should return [A,B,C,D]. Once all controllers are on the new major version, they can switch to [A,B,C,D,E].
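The rule in the example above amounts to a set intersection over the extension lists of all running controllers. A minimal sketch, assuming each controller's list is available as a plain list of aliases; the function name `common_extensions` and the controller labels are hypothetical and not part of neutron:

```python
def common_extensions(extensions_by_controller):
    """Return the sorted extension aliases supported by every controller.

    `extensions_by_controller` maps a controller name to its list of
    extension aliases (a hypothetical structure used for illustration).
    """
    sets = [set(exts) for exts in extensions_by_controller.values()]
    if not sets:
        # No known controllers: nothing can be advertised.
        return []
    return sorted(set.intersection(*sets))


# Version N and version N+1 running side by side during a rolling upgrade.
peers = {
    "controller-1 (N)": ["A", "B", "C", "D"],
    "controller-2 (N+1)": ["A", "B", "C", "D", "E"],
}
print(common_extensions(peers))
```

With both versions present, extension E is withheld; it is advertised only once every entry in the mapping includes it.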
This proposal implies that controller services are aware of each other and of each other's lists of supported extensions. That would be achieved by storing the lists in a new servers table, similar to the agents tables we already have.
On service startup, each controller will read information about the other controllers from the table and load only those extensions that all controller peers support. We may also introduce a mechanism where a signal triggers a reload of extensions based on the current state of the table, or a periodic reloading thread that checks the table e.g. every 60 seconds. (An alternative would be to discover that information on each API request, but that would be too expensive.)
This proposal does not handle the case where we drop an extension within a single cycle (like replacing the timestamp extension with timestamp_core). We may need to handle those cases by some other means (the easiest being not to allow such drastic in-place replacement of an attribute format).
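The startup flow described above could look roughly like the following. This is only a sketch under stated assumptions: it uses an in-memory SQLite database instead of neutron's real database layer, and the `servers` schema (a `host` column plus a JSON-encoded `extensions` column) and all function names are hypothetical.

```python
import json
import sqlite3


def create_servers_table(conn):
    # Hypothetical schema: one row per controller, extensions stored as JSON.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS servers ("
        "host TEXT PRIMARY KEY, extensions TEXT NOT NULL)"
    )


def register_server(conn, host, extensions):
    # Called on service startup: record (or refresh) this controller's list.
    conn.execute(
        "INSERT OR REPLACE INTO servers (host, extensions) VALUES (?, ?)",
        (host, json.dumps(sorted(extensions))),
    )
    conn.commit()


def load_common_extensions(conn):
    # Read every peer's list and keep only the aliases all of them support.
    rows = conn.execute("SELECT extensions FROM servers").fetchall()
    sets = [set(json.loads(row[0])) for row in rows]
    return sorted(set.intersection(*sets)) if sets else []


conn = sqlite3.connect(":memory:")
create_servers_table(conn)

# Mixed-version deployment: controller-2 has already been upgraded to N+1.
register_server(conn, "controller-1", ["A", "B", "C", "D"])
register_server(conn, "controller-2", ["A", "B", "C", "D", "E"])
during_upgrade = load_common_extensions(conn)

# controller-1 restarts on N+1 and re-registers with the longer list.
register_server(conn, "controller-1", ["A", "B", "C", "D", "E"])
after_upgrade = load_common_extensions(conn)

print(during_upgrade, after_upgrade)
```

A signal handler or a periodic thread would simply call `load_common_extensions` again and reload the extension set if the result changed. The sketch leaves out the harder questions, such as removing rows for controllers that have gone away.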
Changed in neutron:
assignee: nobody → Ihar Hrachyshka (ihar-hrachyshka)
importance: Undecided → Wishlist
tags: added: rfe
tags: added: api
Changed in neutron:
status: New → Triaged
As an aside, many of the rolling upgrade conversations with deployers have involved abandoning round-robin (which is currently the typical load balancing strategy) in favor of sticky sessions, which (in the base case, at least) prevent clients from seeing API extensions appear and disappear "randomly." Disregarding the other upsides and downsides of sticky sessions, that choice may mitigate the impact of this issue.
The proposal here describes a clustering behavior, which (in my experience) is relatively complicated compared to a non-clustered service, is difficult to get correct (what controls the servers table, and what happens when it fails?), and complicates the deployer experience (order of operations, recovering from failure modes, etc.).
Is there any reason why a simpler approach along the lines of feature flags would not solve the same issue? For example, if you assert that new features (i.e. API extensions) are not automatically exposed as a result of the upgrade process, but instead as a result of configuration changes, then new features like API extensions could be deployed as a result of canary deployment processes intended explicitly to roll out and test new features, rather than as automatically appearing along with upgraded code.
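To make the feature-flag idea concrete, here is a minimal sketch, assuming the operator lists the enabled extension aliases in configuration. The function name `exposed_extensions` and the idea of an "enabled" list are hypothetical, not an existing neutron option.

```python
def exposed_extensions(supported, enabled):
    """Expose only extensions that the code supports AND the operator enabled.

    `supported` is what the running code can serve; `enabled` is a
    hypothetical operator-controlled list from configuration.
    """
    enabled_set = set(enabled)
    return [alias for alias in supported if alias in enabled_set]


# Same configuration on both versions during the upgrade: E stays hidden.
flags = ["A", "B", "C", "D"]
old_version = exposed_extensions(["A", "B", "C", "D"], flags)
new_version = exposed_extensions(["A", "B", "C", "D", "E"], flags)

# Later, a deliberate configuration change rolls out E.
flags_after_rollout = ["A", "B", "C", "D", "E"]
rolled_out = exposed_extensions(["A", "B", "C", "D", "E"], flags_after_rollout)

print(old_version, new_version, rolled_out)
```

With this approach no controller needs to know about its peers: upgraded code serves the same list as old code until the configuration is changed.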