Unknown quota resource security_group_rule in neutron-rpc-server

Bug #1992161 reported by Johannes Kulik
This bug affects 1 person
Affects   Status          Importance   Assigned to      Milestone
neutron   Fix Committed   Medium       Rodolfo Alonso

Bug Description

When restarting our linuxbridge-agents, we see exceptions for some of the
networks: Unknown quota resources ['security_group_rule']. This stops the
linuxbridge-agent from fully bringing up that network.

Prerequisites:
* run the api-server and the rpc-server in different processes
  We have neutron-server running with uWSGI and start the neutron-rpc-server in another container.

Steps to reproduce:
* have a project with server/network/ports
* have an unused default security group
* delete the default security group
* restart the appropriate linuxbridge-agent

Version:
* Ussuri with custom patches on top: https://github.com/sapcc/neutron

Expected behavior:
linuxbridge-agent should bring up all networks even if the user deleted the
default security group.

Either don't create a default security-group when called via the
linuxbridge-agent instead of the API or make the quota available in the
rpc-server so the default security-group can be created.

Creating/updating a port or creating a network via the API will create the
default security group and fix the problem on the linuxbridge-agent, too. I
just don't think it's acceptable to make the user/admin perform API actions
because the user did something they maybe shouldn't have.

We've also seen the same exception from a dhcp-agent. Attached are tracebacks
from both the linuxbridge-agent and the dhcp-agent.

Trying to debug this, we found that no quota resources are registered in neutron-rpc-server. This can be seen when using the eventlet backdoor by these commands:
  from neutron.quota import resource_registry
  resource_registry.get_all_resources()
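
To make the check above repeatable, the registry contents can be compared against the resources the server is expected to manage. This is a minimal sketch, assuming that get_all_resources() returns a mapping keyed by resource name (an assumption, not verified here). The helper missing_quota_resources is hypothetical and not part of neutron; plain dicts stand in for the registry contents so the snippet runs without neutron installed.

```python
def missing_quota_resources(registered, required):
    """Return the required resource names absent from the registry, sorted."""
    return sorted(set(required) - set(registered))


# In the backdoor, `registered` would be the value returned by
# resource_registry.get_all_resources(); plain dicts stand in for it here.
api_like = {"network": object(), "port": object(),
            "security_group": object(), "security_group_rule": object()}
rpc_like = {}  # what we observed in neutron-rpc-server: nothing registered

required = ["security_group", "security_group_rule"]
print(missing_quota_resources(api_like, required))
print(missing_quota_resources(rpc_like, required))
```

An empty result means every required quota resource is registered in that process.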

Johannes Kulik (jkulik) wrote :
tags: added: linuxbridge
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

Since [1], the SG DB mixin checks that enough SG rule quota is available. This patch was tested in an environment where the RPC worker was spawned along with the API workers. In that case, the extensions (including the SG and the SG rules) are loaded and the resources are registered [2].

In the case you are presenting, the RPC worker runs as a standalone service and the quota resources are not loaded. This is indeed a legitimate issue that affects any deployment, regardless of the backend.

Since [3] (from Xena) this issue is not present anymore because we no longer make a reservation but just check the quota limit. If the method does not find the tracked resource, it just returns without raising any exception.
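
The difference between the two behaviours can be illustrated with a toy model. This is only a sketch and not neutron code: ToyQuotaEngine, UnknownQuotaResource, make_reservation and limit_check are hypothetical names chosen for illustration. It assumes an engine that only knows the resources registered in its own process, where a reservation refuses unknown resources while a limit check ignores them.

```python
# Toy model only; class and method names are hypothetical, not neutron's API.
class UnknownQuotaResource(Exception):
    """Raised when a requested resource was never registered."""


class ToyQuotaEngine:
    def __init__(self, limits):
        # limits maps resource name -> maximum count; only registered
        # resources appear here.
        self._limits = dict(limits)
        self._used = {name: 0 for name in limits}

    def make_reservation(self, deltas):
        """Reserve quota; refuses resources that are not registered."""
        unknown = sorted(set(deltas) - set(self._limits))
        if unknown:
            raise UnknownQuotaResource(f"Unknown quota resources {unknown}")
        for name, delta in deltas.items():
            if self._used[name] + delta > self._limits[name]:
                raise ValueError(f"Quota exceeded for {name}")
        for name, delta in deltas.items():
            self._used[name] += delta

    def limit_check(self, deltas):
        """Check limits only; silently ignores unregistered resources."""
        for name, delta in deltas.items():
            if name not in self._limits:
                continue
            if self._used[name] + delta > self._limits[name]:
                raise ValueError(f"Quota exceeded for {name}")


api_engine = ToyQuotaEngine({"security_group_rule": 100})
rpc_engine = ToyQuotaEngine({})  # standalone RPC worker: nothing registered

api_engine.make_reservation({"security_group_rule": 2})  # succeeds
rpc_engine.limit_check({"security_group_rule": 2})       # returns quietly

try:
    rpc_engine.make_reservation({"security_group_rule": 2})
except UnknownQuotaResource as exc:
    message = str(exc)
    print(message)
```

In this model only the reservation path fails in the process with an empty registry, which matches the symptom reported for the standalone RPC worker.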

This is, without doubt, my fault when implementing this patch. However, I would say that having a CI job with two independent workers, API and RPC, would have caught this issue. But I don't know if this is feasible currently.

I'll push a patch for stable releases.

Regards.

[1] https://review.opendev.org/c/openstack/neutron/+/701565
[2] https://github.com/openstack/neutron/blob/91decc9514494fa691cdf5ffceb86b775b729164/neutron/extensions/securitygroup.py#L330
[3] https://review.opendev.org/q/Id73368576a948f78a043d7cf0be16661a65626a9

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/864765

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/864766

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/neutron/+/864767

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/864435
Committed: https://opendev.org/openstack/neutron/commit/02bdd0470246dd768227affa2d6a8dd8328d3463
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 02bdd0470246dd768227affa2d6a8dd8328d3463
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Nov 14 17:43:27 2022 +0000

    [stable-only] Do not fail making reservation when creating a SG

    Do not fail during the creation of a security group when trying to
    make a quota reservation for the security group rules. This feature
    was added in [1], in order to prevent the rule quota excess during
    the security group creation.

    However, as reported in LP#1992161, this method can be called from
    the RPC worker. If this RPC worker is spawned alone (not with the API
    workers), the extensions are not loaded and the security group rule
    quota resources are not created. That means the quota engine does not
    have the security group rules as managed resources (in this worker).

    When a new network (and the first subnet) is created, the DHCP agent
    (or agents) handling this network will try to create the DHCP port.
    If, as commented in the LP bug, the default security group is not
    created, the RPC worker will try to create it. In this case this
    patch skips the quota check.

    This patch is for stable releases only. Since Xena, this check is
    done using a new method called "quota_limit_check" [2]. This method
    does not fail in the related case.

    [1]https://review.opendev.org/q/I0a9b91b09d6260ff96fdba2f0a455de53bbc1f00
    [2]https://review.opendev.org/q/Id73368576a948f78a043d7cf0be16661a65626a9

    Closes-Bug: #1992161
    Related-Bug: #1858680
    Change-Id: I0f20b17c1b13c3cf56de70588fca4a6956d276df
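
The idea described in the commit message, skipping the quota check when the resource is not registered in the current worker, can be sketched as follows. The actual change is not reproduced here; reserve_rules_or_skip, UnknownQuotaResource and the two stand-in reservation functions are hypothetical names used only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


class UnknownQuotaResource(Exception):
    """Hypothetical stand-in for the 'Unknown quota resources' error."""


def reserve_rules_or_skip(make_reservation, rule_count):
    """Try to reserve SG rule quota; skip the check if the resource is unknown.

    Returns True when a reservation was made, False when it was skipped.
    """
    try:
        make_reservation({"security_group_rule": rule_count})
    except UnknownQuotaResource:
        LOG.debug("security_group_rule quota not registered; skipping check")
        return False
    return True


def reservation_in_api_worker(deltas):
    """The resource is registered here, so the reservation goes through."""
    return None


def reservation_in_rpc_worker(deltas):
    """Nothing is registered in a standalone RPC worker."""
    raise UnknownQuotaResource(f"Unknown quota resources {sorted(deltas)}")


api_result = reserve_rules_or_skip(reservation_in_api_worker, 2)
rpc_result = reserve_rules_or_skip(reservation_in_rpc_worker, 2)
print(api_result, rpc_result)
```

Catching only the unknown-resource error keeps a genuine over-quota failure visible in workers where the resource is registered.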

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/864765
Committed: https://opendev.org/openstack/neutron/commit/90865c06afe9780ac3116be9e527da9a75944c96
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 90865c06afe9780ac3116be9e527da9a75944c96
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Nov 14 17:43:27 2022 +0000

    [stable-only] Do not fail making reservation when creating a SG

    Conflicts:
          neutron/db/securitygroups_db.py

    Closes-Bug: #1992161
    Related-Bug: #1858680
    Change-Id: I0f20b17c1b13c3cf56de70588fca4a6956d276df
    (cherry picked from commit 02bdd0470246dd768227affa2d6a8dd8328d3463)

tags: added: in-stable-victoria
tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/864766
Committed: https://opendev.org/openstack/neutron/commit/49980e3d120ca1f0cdd761477a96b27415b3aefe
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 49980e3d120ca1f0cdd761477a96b27415b3aefe
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Nov 14 17:43:27 2022 +0000

    [stable-only] Do not fail making reservation when creating a SG

    Conflicts:
          neutron/db/securitygroups_db.py

    Closes-Bug: #1992161
    Related-Bug: #1858680
    Change-Id: I0f20b17c1b13c3cf56de70588fca4a6956d276df
    (cherry picked from commit 02bdd0470246dd768227affa2d6a8dd8328d3463)
    (cherry picked from commit 90865c06afe9780ac3116be9e527da9a75944c96)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/864767
Committed: https://opendev.org/openstack/neutron/commit/03abe3848bdc3d2c1edb4b26ff5545cbdd5e4bc3
Submitter: "Zuul (22348)"
Branch: stable/train

commit 03abe3848bdc3d2c1edb4b26ff5545cbdd5e4bc3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Nov 14 17:43:27 2022 +0000

    [stable-only] Do not fail making reservation when creating a SG

    Conflicts:
          neutron/db/securitygroups_db.py

    Closes-Bug: #1992161
    Related-Bug: #1858680
    Change-Id: I0f20b17c1b13c3cf56de70588fca4a6956d276df
    (cherry picked from commit 02bdd0470246dd768227affa2d6a8dd8328d3463)
    (cherry picked from commit 90865c06afe9780ac3116be9e527da9a75944c96)

Changed in neutron:
status: New → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron train-eol

This issue was fixed in the openstack/neutron train-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ussuri-eol

This issue was fixed in the openstack/neutron ussuri-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron victoria-eom

This issue was fixed in the openstack/neutron victoria-eom release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom

This issue was fixed in the openstack/neutron wallaby-eom release.
