Unknown quota resource security_group_rule in neutron-rpc-server
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Committed | Medium | Rodolfo Alonso |
Bug Description
When restarting our linuxbridge-agents, we see exceptions for some of the
networks: Unknown quota resources ['security_
linuxbridge-agent from fully bringing up that network.
Prerequisites:
* run api-server and rpc-server in different processes
We have neutron-server running with uWSGI and start the neutron-rpc-server in another container.
Steps to reproduce:
* have a project with server/
* have an unused default security group
* delete the default security group
* restart the appropriate linuxbridge-agent
Version:
* Ussuri with custom patches on top: https:/
Expected behavior:
linuxbridge-agent should bring up all networks even if the user deleted the
default security group.
Either don't create a default security-group when called via the
linuxbridge-agent instead of the API or make the quota available in the
rpc-server so the default security-group can be created.
Creating/updating a port or creating a network via the API will create the
default security group and fix the problem on the linuxbridge-agent, too. I just
don't think it's acceptable to make the user/admin perform API actions because
the user did something they maybe shouldn't have.
We've also seen the same exception from a dhcp-agent. Attached both a traceback
from linuxbridge as well as from dhcp-agent.
While debugging this, we found that no quota resources are registered in neutron-rpc-server. This can be seen through the eventlet backdoor with these commands:
from neutron.quota import resource_registry;
resource_
tags: added: linuxbridge
Changed in neutron:
status: New → Fix Committed
Hello:
Since [1], the SG DB mixin checks that enough SG rule quota is available. This patch was tested in an environment where the RPC worker was spawned along with the API workers. In that case, the extensions (including the SG and the SG rules) are loaded and the resources are registered [2].
In the case you are presenting, the RPC worker runs as its own service and the quota resources are not loaded. This is indeed a legitimate issue that affects any deployment, regardless of the backend.
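To make the mechanism concrete, here is a minimal toy model of it, not Neutron's actual code. `ToyRegistry`, `load_sg_extension`, the limits, and the local `QuotaResourceUnknown` class are all hypothetical stand-ins; the only assumption taken from the report is that quota resources get registered as a side effect of loading the extensions, which happens in the API process but not in a standalone RPC process.

```python
class QuotaResourceUnknown(Exception):
    """Hypothetical stand-in for the error reported in the traceback."""


class ToyRegistry:
    """Per-process map of quota resource name -> limit (illustrative only)."""

    def __init__(self):
        self.resources = {}

    def register(self, name, limit):
        self.resources[name] = limit

    def make_reservation(self, deltas):
        # A reservation needs every requested resource to be registered.
        unknown = sorted(set(deltas) - set(self.resources))
        if unknown:
            raise QuotaResourceUnknown(f"Unknown quota resources {unknown}")
        return dict(deltas)


def load_sg_extension(registry):
    # Assumption: registration is a side effect of loading the extension.
    registry.register("security_group", 10)
    registry.register("security_group_rule", 100)


# API process: extensions are loaded, so the resources exist.
api_registry = ToyRegistry()
load_sg_extension(api_registry)

# Standalone RPC process: no extensions loaded, so the registry stays empty.
rpc_registry = ToyRegistry()

api_registry.make_reservation({"security_group_rule": 2})  # succeeds

try:
    rpc_registry.make_reservation({"security_group_rule": 2})
    rpc_error = None
except QuotaResourceUnknown as exc:
    rpc_error = str(exc)

print(rpc_error)
```

In this sketch the same request succeeds or fails depending only on which process handles it, which is why the problem does not show up when the RPC worker is spawned alongside the API workers.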
Since [3] (from Xena) this issue is not present anymore because we no longer make a reservation but just check the quota limit. If the method does not find the tracked resource, it just returns without raising any exception.
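As a rough illustration of that behaviour, and again only a sketch rather than the code from [3], a limit check that tolerates untracked resources could look like the following. `check_limit`, `OverQuota`, and the numbers are hypothetical names and values chosen for the example.

```python
class OverQuota(Exception):
    """Hypothetical error for a request that exceeds its limit."""


def check_limit(limits, usage, resource, requested):
    """Return True if the limit was enforced, False if the resource is untracked."""
    limit = limits.get(resource)
    if limit is None:
        # Untracked resource: return quietly instead of raising.
        return False
    used = usage.get(resource, 0)
    if used + requested > limit:
        raise OverQuota(f"{resource}: {used} used + {requested} requested > {limit}")
    return True


usage = {"security_group_rule": 99}

# Standalone RPC process: nothing registered, so the check is a no-op.
rpc_enforced = check_limit({}, usage, "security_group_rule", 2)

# API process: the limit is known and enforced.
api_limits = {"security_group_rule": 100}
api_enforced = check_limit(api_limits, usage, "security_group_rule", 1)

try:
    check_limit(api_limits, usage, "security_group_rule", 2)
    over_quota_raised = False
except OverQuota:
    over_quota_raised = True

print(rpc_enforced, api_enforced, over_quota_raised)
```

The trade-off in this sketch is that a process without registered resources enforces nothing, but it also no longer fails on a request it cannot evaluate.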
This is, without doubt, my fault from when I implemented this patch. However, I would say that a CI job with two independent workers, API and RPC, would have caught this issue. I don't know whether that is currently feasible.
I'll push a patch for stable releases.
Regards.
[1] https://review.opendev.org/c/openstack/neutron/+/701565
[2] https://github.com/openstack/neutron/blob/91decc9514494fa691cdf5ffceb86b775b729164/neutron/extensions/securitygroup.py#L330
[3] https://review.opendev.org/q/Id73368576a948f78a043d7cf0be16661a65626a9