Performance issue using network rbac rules

Bug #2071374 reported by Roberto Bartzen Acosta
This bug affects 1 person

Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Roberto Bartzen Acosta

Bug Description

Hello everyone.

We noticed strange behavior in Neutron after adding network RBAC rules. We create RBAC rules for subnet pools, address scopes and networks. Once those rules are in place, unfiltered SQL queries on the subnet and network tables take minutes, as does attaching router gateway ports.

The network topology is basically like this:

- rbacs
[project SERVICE-1] address-scope rbac for [address-scope type-1]
[project SERVICE-1] subnetpool rbac for [subnet pool type-1]

- subnet
[project SERVICE-1] subnet using [subnet pool type-1] via project rbac

- inter projects network rbacs...
[project tenant-1] network rbac to (network X of [project SERVICE-1])

It's not so different from common usage, except that we have thousands of projects associated with a few address scopes and subnet pools.

openstack address scope list | grep -v "+" | wc -l
8
openstack subnet pool list | grep pool | wc -l
9
openstack subnet list | wc -l
5343
openstack network list | wc -l
6813
openstack router list | wc -l
3808
openstack network rbac list | wc -l
6770
openstack network rbac list | grep address_scope | wc -l
2254
openstack network rbac list | grep subnetpool | wc -l
2254
openstack network rbac list | grep network | wc -l
2258
openstack project list | wc -l
3804

We enabled the slow query log in mysql and started seeing huge queries like this:

# Time: 240627 15:11:37
# User@Host: neutron[neutron] @ srv-0001 [10.1.2.3]
# Thread_id: 353060 Schema: neutron QC_hit: No
# Query_time: 58.846345 Lock_time: 0.000528 Rows_sent: 583345 Rows_examined: 1750117
# Rows_affected: 0 Bytes_sent: 412285556
SET timestamp=1719501097;
SELECT subnets.project_id AS subnets_project_id, subnets.id AS subnets_id, subnets.in_use AS subnets_in_use, subnets.name AS subnets_name, subnets.network_id AS subnets_network_id, subnets.segment_id AS subnets_segment_id, subnets.subnetpool_id AS subnets_subnetpool_id, subnets.ip_version AS subnets_ip_version, subnets.cidr AS subnets_cidr, subnets.gateway_ip AS subnets_gateway_ip, subnets.enable_dhcp AS subnets_enable_dhcp, subnets.ipv6_ra_mode AS subnets_ipv6_ra_mode, subnets.ipv6_address_mode AS subnets_ipv6_address_mode, subnets.standard_attr_id AS subnets_standard_attr_id, subnetpools_1.shared AS subnetpools_1_shared, subnetpoolrbacs_1.project_id AS subnetpoolrbacs_1_project_id, subnetpoolrbacs_1.id AS subnetpoolrbacs_1_id, subnetpoolrbacs_1.target_project AS subnetpoolrbacs_1_target_project, subnetpoolrbacs_1.action AS subnetpoolrbacs_1_action, subnetpoolrbacs_1.object_id AS subnetpoolrbacs_1_object_id, standardattributes_1.id AS standardattributes_1_id, standardattributes_1.resource_type AS standardattributes_1_resource_type, standardattributes_1.description AS standardattributes_1_description, standardattributes_1.revision_number AS standardattributes_1_revision_number, standardattributes_1.created_at AS standardattributes_1_created_at, standardattributes_1.updated_at AS standardattributes_1_updated_at, tags_1.standard_attr_id AS tags_1_standard_attr_id, tags_1.tag AS tags_1_tag, subnetpools_1.project_id AS subnetpools_1_project_id, subnetpools_1.id AS subnetpools_1_id, subnetpools_1.name AS subnetpools_1_name, subnetpools_1.ip_version AS subnetpools_1_ip_version, subnetpools_1.default_prefixlen AS subnetpools_1_default_prefixlen, subnetpools_1.min_prefixlen AS subnetpools_1_min_prefixlen, subnetpools_1.max_prefixlen AS subnetpools_1_max_prefixlen, subnetpools_1.is_default AS subnetpools_1_is_default, subnetpools_1.default_quota AS subnetpools_1_default_quota, subnetpools_1.hash AS subnetpools_1_hash, subnetpools_1.address_scope_id AS 
subnetpools_1_address_scope_id, subnetpools_1.standard_attr_id AS subnetpools_1_standard_attr_id, networkrbacs_1.project_id AS networkrbacs_1_project_id, networkrbacs_1.id AS networkrbacs_1_id, networkrbacs_1.target_project AS networkrbacs_1_target_project, networkrbacs_1.action AS networkrbacs_1_action, networkrbacs_1.object_id AS networkrbacs_1_object_id, standardattributes_2.id AS standardattributes_2_id, standardattributes_2.resource_type AS standardattributes_2_resource_type, standardattributes_2.description AS standardattributes_2_description, standardattributes_2.revision_number AS standardattributes_2_revision_number, standardattributes_2.created_at AS standardattributes_2_created_at, standardattributes_2.updated_at AS standardattributes_2_updated_at, tags_2.standard_attr_id AS tags_2_standard_attr_id, tags_2.tag AS tags_2_tag, subnet_dns_publish_fixed_ips_1.subnet_id AS subnet_dns_publish_fixed_ips_1_subnet_id, subnet_dns_publish_fixed_ips_1.dns_publish_fixed_ip AS subnet_dns_publish_fixed_ips_1_dns_publish_fixed_ip
FROM subnets LEFT OUTER JOIN subnetpools AS subnetpools_1 ON subnets.subnetpool_id = subnetpools_1.id LEFT OUTER JOIN subnetpoolrbacs AS subnetpoolrbacs_1 ON subnetpools_1.id = subnetpoolrbacs_1.object_id LEFT OUTER JOIN standardattributes AS standardattributes_1 ON standardattributes_1.id = subnetpools_1.standard_attr_id LEFT OUTER JOIN tags AS tags_1 ON standardattributes_1.id = tags_1.standard_attr_id LEFT OUTER JOIN networkrbacs AS networkrbacs_1 ON subnets.network_id = networkrbacs_1.object_id LEFT OUTER JOIN standardattributes AS standardattributes_2 ON standardattributes_2.id = subnets.standard_attr_id LEFT OUTER JOIN tags AS tags_2 ON standardattributes_2.id = tags_2.standard_attr_id LEFT OUTER JOIN subnet_dns_publish_fixed_ips AS subnet_dns_publish_fixed_ips_1 ON subnets.id = subnet_dns_publish_fixed_ips_1.subnet_id ORDER BY subnets.id ASC, subnets.standard_attr_id ASC;

Executing this query directly in the mysql CLI takes approximately 10 seconds:

$ time mysql --database neutron -e "SELECT subnets.project_id AS subnets_project_id, subnets.id AS subnets_id, subnets.in_use AS subnets_in_use, subnets.name AS subnets_name, subnets.network_id AS subnets_network_id, subnets.segment_id AS subnets_segment_id, subnets.subnetpool_id AS subnets_subnetpool_id, subnets.ip_version AS subnets_ip_version, subnets.cidr AS subnets_cidr, subnets.gateway_ip AS subnets_gateway_ip, subnets.enable_dhcp AS subnets_enable_dhcp, subnets.ipv6_ra_mode AS subnets_ipv6_ra_mode, subnets.ipv6_address_mode AS subnets_ipv6_address_mode, subnets.standard_attr_id AS subnets_standard_attr_id, subnetpools_1.shared AS subnetpools_1_shared, subnetpoolrbacs_1.project_id AS subnetpoolrbacs_1_project_id, subnetpoolrbacs_1.id AS subnetpoolrbacs_1_id, subnetpoolrbacs_1.target_project AS subnetpoolrbacs_1_target_project, subnetpoolrbacs_1.action AS subnetpoolrbacs_1_action, subnetpoolrbacs_1.object_id AS subnetpoolrbacs_1_object_id, standardattributes_1.id AS standardattributes_1_id, standardattributes_1.resource_type AS standardattributes_1_resource_type, standardattributes_1.description AS standardattributes_1_description, standardattributes_1.revision_number AS standardattributes_1_revision_number, standardattributes_1.created_at AS standardattributes_1_created_at, standardattributes_1.updated_at AS standardattributes_1_updated_at, subnetpools_1.project_id AS subnetpools_1_project_id, subnetpools_1.id AS subnetpools_1_id, subnetpools_1.name AS subnetpools_1_name, subnetpools_1.ip_version AS subnetpools_1_ip_version, subnetpools_1.default_prefixlen AS subnetpools_1_default_prefixlen, subnetpools_1.min_prefixlen AS subnetpools_1_min_prefixlen, subnetpools_1.max_prefixlen AS subnetpools_1_max_prefixlen, subnetpools_1.is_default AS subnetpools_1_is_default, subnetpools_1.default_quota AS subnetpools_1_default_quota, subnetpools_1.hash AS subnetpools_1_hash, subnetpools_1.address_scope_id AS subnetpools_1_address_scope_id, subnetpools_1.standard_attr_id 
AS subnetpools_1_standard_attr_id, networkrbacs_1.project_id AS networkrbacs_1_project_id, networkrbacs_1.id AS networkrbacs_1_id, networkrbacs_1.target_project AS networkrbacs_1_target_project, networkrbacs_1.action AS networkrbacs_1_action, networkrbacs_1.object_id AS networkrbacs_1_object_id, standardattributes_2.id AS standardattributes_2_id, standardattributes_2.resource_type AS standardattributes_2_resource_type, standardattributes_2.description AS standardattributes_2_description, standardattributes_2.revision_number AS standardattributes_2_revision_number, standardattributes_2.created_at AS standardattributes_2_created_at, standardattributes_2.updated_at AS standardattributes_2_updated_at, subnet_dns_publish_fixed_ips_1.subnet_id AS subnet_dns_publish_fixed_ips_1_subnet_id, subnet_dns_publish_fixed_ips_1.dns_publish_fixed_ip AS subnet_dns_publish_fixed_ips_1_dns_publish_fixed_ip FROM subnets LEFT OUTER JOIN subnetpools AS subnetpools_1 ON subnets.subnetpool_id = subnetpools_1.id LEFT OUTER JOIN subnetpoolrbacs AS subnetpoolrbacs_1 ON subnetpools_1.id = subnetpoolrbacs_1.object_id LEFT OUTER JOIN standardattributes AS standardattributes_1 ON standardattributes_1.id = subnetpools_1.standard_attr_id LEFT OUTER JOIN networkrbacs AS networkrbacs_1 ON subnets.network_id = networkrbacs_1.object_id LEFT OUTER JOIN standardattributes AS standardattributes_2 ON standardattributes_2.id = subnets.standard_attr_id LEFT OUTER JOIN subnet_dns_publish_fixed_ips AS subnet_dns_publish_fixed_ips_1 ON subnets.id = subnet_dns_publish_fixed_ips_1.subnet_id ORDER BY subnets.id ASC, subnets.standard_attr_id ASC;" > /tmp/output

real 0m8.150s
user 0m4.325s
sys 0m1.381s

However, the result contains the same subnet values repeated many times: the large query generated by the ORM returns almost 600k rows.

cat /tmp/output | wc -l
583346

This looks like a classic "cartesian" product problem, since we are using joined eager loading nested inside joined eager loading (subnets -> subnetpool -> rbac_rules).
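The row multiplication described above can be reproduced at small scale. The following is a minimal sketch using Python's built-in sqlite3 module, not Neutron code; the three tables are reduced to the columns used in the join, and the counts (50 subnets, 40 RBAC entries) are arbitrary assumptions chosen for illustration.

```python
import sqlite3

# Reduced, hypothetical versions of the tables involved in the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subnetpools (id TEXT PRIMARY KEY);
    CREATE TABLE subnetpoolrbacs (id INTEGER PRIMARY KEY, object_id TEXT);
    CREATE TABLE subnets (id INTEGER PRIMARY KEY, subnetpool_id TEXT);
""")

N_SUBNETS = 50  # subnets allocated from one shared pool (assumed value)
N_RBACS = 40    # RBAC entries sharing that pool with other projects (assumed value)

conn.execute("INSERT INTO subnetpools VALUES ('pool-1')")
conn.executemany(
    "INSERT INTO subnetpoolrbacs (object_id) VALUES (?)",
    [("pool-1",)] * N_RBACS,
)
conn.executemany(
    "INSERT INTO subnets (subnetpool_id) VALUES (?)",
    [("pool-1",)] * N_SUBNETS,
)

# Same join shape as the slow query: subnets -> subnetpools -> subnetpoolrbacs.
joined_rows = conn.execute("""
    SELECT subnets.id, subnetpoolrbacs_1.id
    FROM subnets
    LEFT OUTER JOIN subnetpools AS subnetpools_1
        ON subnets.subnetpool_id = subnetpools_1.id
    LEFT OUTER JOIN subnetpoolrbacs AS subnetpoolrbacs_1
        ON subnetpools_1.id = subnetpoolrbacs_1.object_id
""").fetchall()

# Every subnet is repeated once per RBAC entry of its pool.
distinct_subnets = {row[0] for row in joined_rows}
print(len(joined_rows), len(distinct_subnets))
```

Each subnet row comes back once per RBAC entry on its pool, so the result size grows with the product of the two counts rather than with the number of subnets.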

From the user's perspective this is very bad, as it takes minutes to return a list of networks or subnets.

$ time openstack network list | wc -l
6813

real 2m17.439s
user 0m6.258s
sys 0m0.140s

time openstack subnet list | wc -l
5343

real 1m51.134s
user 0m4.932s
sys 0m0.116s

Roberto Bartzen Acosta (rbartzen) wrote :

My proposal is to replace the joined relationship for rbac_entries in the SubnetPool DB model, because in my view we don't need a many-to-many relationship here. We can use selectin eager loading to make this relationship one-to-many and build the model with only the necessary steps, without the row explosion caused by the "left outer join", which is unnecessary in this model.

The "total" queries of this process would be split into a series of smaller queries (the selectin design) with much better performance, and the huge resulting query would be reduced to something like this:

$ time mysql --database neutron -e "SELECT subnets.project_id AS subnets_project_id, subnets.id AS subnets_id, subnets.in_use AS subnets_in_use, subnets.name AS subnets_name, subnets.network_id AS subnets_network_id, subnets.segment_id AS subnets_segment_id, subnets.subnetpool_id AS subnets_subnetpool_id, subnets.ip_version AS subnets_ip_version, subnets.cidr AS subnets_cidr, subnets.gateway_ip AS subnets_gateway_ip, subnets.enable_dhcp AS subnets_enable_dhcp, subnets.ipv6_ra_mode AS subnets_ipv6_ra_mode, subnets.ipv6_address_mode AS subnets_ipv6_address_mode, subnets.standard_attr_id AS subnets_standard_attr_id, subnetpools_1.shared AS subnetpools_1_shared, standardattributes_1.id AS standardattributes_1_id, standardattributes_1.resource_type AS standardattributes_1_resource_type, standardattributes_1.description AS standardattributes_1_description, standardattributes_1.revision_number AS standardattributes_1_revision_number, standardattributes_1.created_at AS standardattributes_1_created_at, standardattributes_1.updated_at AS standardattributes_1_updated_at, tags_1.standard_attr_id AS tags_1_standard_attr_id, tags_1.tag AS tags_1_tag, subnetpools_1.project_id AS subnetpools_1_project_id, subnetpools_1.id AS subnetpools_1_id, subnetpools_1.name AS subnetpools_1_name, subnetpools_1.ip_version AS subnetpools_1_ip_version, subnetpools_1.default_prefixlen AS subnetpools_1_default_prefixlen, subnetpools_1.min_prefixlen AS subnetpools_1_min_prefixlen, subnetpools_1.max_prefixlen AS subnetpools_1_max_prefixlen, subnetpools_1.is_default AS subnetpools_1_is_default, subnetpools_1.default_quota AS subnetpools_1_default_quota, subnetpools_1.hash AS subnetpools_1_hash, subnetpools_1.address_scope_id AS subnetpools_1_address_scope_id, subnetpools_1.standard_attr_id AS subnetpools_1_standard_attr_id, networkrbacs_1.project_id AS networkrbacs_1_project_id, networkrbacs_1.id AS networkrbacs_1_id, networkrbacs_1.target_project AS networkrbacs_1_target_project, 
networkrbacs_1.action AS networkrbacs_1_action, networkrbacs_1.object_id AS networkrbacs_1_object_id, standardattributes_2.id AS standardattributes_2_id, standardattributes_2.resource_type AS standardattributes_2_resource_type, standardattributes_2.description AS standardattributes_2_description, standardattributes_2.revision_number AS standardattributes_2_revision_number, standardattributes_2.created_at AS standardattributes_2_created_at, standardattributes_2.updated_at AS standardattributes_2_updated_at, tags_2.standard_attr_id AS tags_2_standard_attr_id, tags_2.tag AS tags_2_tag, subnet_dns_p...

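To illustrate the idea behind the proposal: SQLAlchemy's selectin strategy loads a related collection with a second SELECT ... WHERE ... IN (...) query instead of a join. The sketch below imitates that pattern with plain sqlite3; it is not Neutron or SQLAlchemy code, the tables are reduced to the relevant columns, the counts are assumed values, and the helper name load_subnets_selectin_style is hypothetical.

```python
import sqlite3
from collections import defaultdict

# Reduced, hypothetical versions of the tables involved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subnetpoolrbacs (id INTEGER PRIMARY KEY, object_id TEXT);
    CREATE TABLE subnets (id INTEGER PRIMARY KEY, subnetpool_id TEXT);
""")
conn.executemany(
    "INSERT INTO subnetpoolrbacs (object_id) VALUES (?)", [("pool-1",)] * 40
)
conn.executemany(
    "INSERT INTO subnets (subnetpool_id) VALUES (?)", [("pool-1",)] * 50
)


def load_subnets_selectin_style(conn):
    """Load subnets, then their pools' RBAC entries with one IN query."""
    # Query 1: one row per subnet, no join against the RBAC table.
    subnet_rows = conn.execute(
        "SELECT id, subnetpool_id FROM subnets ORDER BY id"
    ).fetchall()

    # Collect the distinct pool ids referenced by the subnets.
    pool_ids = sorted({pool for _, pool in subnet_rows if pool is not None})

    # Query 2: fetch the RBAC entries of all those pools in a single query.
    rbac_rows = []
    if pool_ids:
        placeholders = ", ".join("?" for _ in pool_ids)
        rbac_rows = conn.execute(
            "SELECT object_id, id FROM subnetpoolrbacs "
            f"WHERE object_id IN ({placeholders})",
            pool_ids,
        ).fetchall()

    # Group the RBAC entries by pool in Python instead of in the SQL join.
    rbacs_by_pool = defaultdict(list)
    for pool_id, rbac_id in rbac_rows:
        rbacs_by_pool[pool_id].append(rbac_id)

    return subnet_rows, rbacs_by_pool, len(subnet_rows) + len(rbac_rows)


subnet_rows, rbacs_by_pool, rows_fetched = load_subnets_selectin_style(conn)
print(len(subnet_rows), len(rbacs_by_pool["pool-1"]), rows_fetched)
```

With these assumed counts the two queries transfer 90 rows in total, whereas a nested outer join over the same data would return 50 x 40 = 2000 rows.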

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/922972

Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → High
assignee: nobody → Roberto Bartzen Acosta (rbartzen)

Roberto Bartzen Acosta (rbartzen) wrote :

Regarding LP#2037107: I looked at that bug report to find similarities. However, the way to reproduce it is a little different, because I don't have RBACs on the subnets of public (provider) networks. That is, we don't have thousands of RBACs created for the same big public network.

The way we reproduced this bug here is relatively simple. We created 2500 tenant projects and 2500 service projects. Each project has its router, network and subnet.

Tenant projects: network -> subnet (internal subnet with DHCP)
Service projects: network -> subnet using subnet pool (subnet pool using address scope)

The RBAC is configured so that tenant project X accesses the network of service project X, tenant project Y accesses the network of service project Y, and so on. In this way we create communication between networks of different "types" of projects.

Therefore, I think this is different from what is reported in LP#2037107. There, the RBAC relationship within the 'subnet model' creates the performance issue, whereas with small per-project networks, separated and delimited by subnet pools, the performance issue occurs in the RBAC relationship within the subnetpool model. The origin of the issue seems to be the same: when the joined relationship is configured, the ORM explodes the combinations in the resulting query.

The solution may be to change all RBAC relationships in the network, subnet, subnetpool, qos, address scope, and address group models to selectin. At least the subnetpool part seems to be solved when I changed it from "joined" to "selectin"; the other parts can potentially be resolved the same way.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

As commented in IRC, this bug is similar to [1]. I've replicated the environment you have and the only query that takes more time than expected is the "subnet list" query. All the subnets are using a subnet pool and the subnet pool an address scope. Each network is shared to a project.

As commented in [1], the cardinality of the subnet query can be solved by decoupling the network RBAC from the subnet query. But I still don't know what could be the best approach for this.

Executing the "subnet list" API call step by step, I realized that the issue is not in the query itself but in the policy hook. As in your tests, the query can take up to 10 seconds, but the "subnet list" (or "network list") command takes 2 minutes. This is due to the "get_subnet" policy, which has the following checks:
* base.ADMIN_OR_NET_OWNER_MEMBER
* base.PROJECT_READER
* 'rule:shared'

The first one checks whether the user is an admin or the owner of the parent resource (the network). The ``OwnerCheck`` policy check class uses a cache to store the network information, but that cache is per call. In this case, we need to retrieve all networks, one by one, when checking the parent ownership of each subnet. That takes much more time than the "subnet list" query itself.

Having said that, the issue above happens when:
* One single project has created all the resources and shared them with other projects. That means the resources (networks, subnets, RBACs, etc.) belong to this project.
* The CLI requests are done with a non-admin user, which makes no sense as this is the "creator" project. If the queries are done in the other projects, the list returned should be very limited. If the queries are done inside the "creator" project with the admin user, the query won't filter by RBAC and the policy hook will return immediately.

I don't see any improvement using [2]. Actually, this patch will negatively affect the subnetpool queries when filtering their own RBAC entries.

Actions:
* I'm going to propose a change in the order of the "get_subnet" policies. Moving the "PROJECT_READER" and "rule:shared" rules first improves the policy hook a lot. The admin or net owner check will be done last. Actually, the admin user is tested before any policy check is done; the admin user does not need to pass any.
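The effect of reordering can be sketched as follows. This is a simplified model, not the oslo.policy or Neutron implementation: it assumes the checks are OR-combined and evaluated left to right, stopping at the first one that passes. All names (is_net_owner, is_project_reader, count_lookups) and the in-memory data are hypothetical.

```python
# Hypothetical data: 400 subnets, one per network, all owned by one
# "creator" project.
NETWORKS = {f"net-{i}": {"project_id": "creator"} for i in range(400)}
SUBNETS = [
    {"id": f"subnet-{i}", "network_id": f"net-{i}", "project_id": "creator"}
    for i in range(400)
]

db_lookups = 0  # counts simulated database reads of parent networks


def is_net_owner(user, subnet):
    """Expensive check: reads the parent network to get its project ID."""
    global db_lookups
    db_lookups += 1
    network = NETWORKS[subnet["network_id"]]
    return network["project_id"] == user["project_id"]


def is_project_reader(user, subnet):
    """Cheap check: uses only data already present on the subnet."""
    return (
        "reader" in user["roles"]
        and subnet["project_id"] == user["project_id"]
    )


def count_lookups(checks, user):
    """Apply OR-combined checks to every subnet; return (visible, lookups)."""
    global db_lookups
    db_lookups = 0
    # any() stops at the first check that returns True.
    visible = [
        s for s in SUBNETS if any(check(user, s) for check in checks)
    ]
    return len(visible), db_lookups


user = {"project_id": "creator", "roles": ["reader"]}
before = count_lookups([is_net_owner, is_project_reader], user)
after = count_lookups([is_project_reader, is_net_owner], user)
print(before, after)
```

In this model both orderings return the same 400 subnets, but with the cheap check first the parent network is never read for a user who already passes as a project reader.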

Regards.

[1] https://bugs.launchpad.net/neutron/+bug/2037107
[2] https://review.opendev.org/c/openstack/neutron/+/922972

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/923195

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/2024.1)

Related fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/neutron/+/923488

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/2023.2)

Related fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/923494

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/neutron/+/923495

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/923496

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/923497

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923195
Committed: https://opendev.org/openstack/neutron/commit/729920da5e836fa7a27b1b85b3b2999146d905ba
Submitter: "Zuul (22348)"
Branch: master

commit 729920da5e836fa7a27b1b85b3b2999146d905ba
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Jul 2 07:29:44 2024 +0000

    Reorder subnet RBAC policy check strings

    The subnet policy rule ``ADMIN_OR_NET_OWNER_MEMBER`` requires
    retrieving the network object from the database to read the project ID.
    When retrieving a list of subnets, this operation can slow down the
    API call. This patch reorders the subnet RBAC policy checks so that
    this check is made last.

    As reported in the related LP bug, it is usual to have a "creator"
    project where different resources are created and then shared with
    others; in this case networks and subnets. All these subnets will belong
    to the same project. If a non-admin user from this project lists all the
    subnets, with the code prior to this patch it would be necessary to
    retrieve all the networks to read the project ID. With the current code
    it is only necessary to check that the user is a project reader.

    The following benchmark has been done in a VM running a standalone
    OpenStack deployment. One project has created 400 networks and 400
    subnets (one per network). Each network has been shared with another
    project. API time to process "GET /networking/v2.0/subnets":
    * Without this patch: 5.5 seconds (average)
    * With this patch: 0.25 seconds (average)

    Related-Bug: #2071374
    Related-Bug: #2037107
    Change-Id: Ibca174213bba3c56fc18ec2732d80054ac95e859

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/922972
Committed: https://opendev.org/openstack/neutron/commit/46edf255bde0603fe88b2dd9f4e482590e384382
Submitter: "Zuul (22348)"
Branch: master

commit 46edf255bde0603fe88b2dd9f4e482590e384382
Author: Roberto Bartzen Acosta <email address hidden>
Date: Thu Jun 27 18:29:37 2024 +0000

    Change to use selectin for RBACs in SubnetPool DB load strategy

    To solve a performance issue when using network RBACs with thousands
    of entries in the subnets, networks, and network RBAC tables, it's
    necessary to change the eager loading strategy so that it does not
    create and process a "cartesian" product of thousands of unnecessary
    combinations for the relationship between RBAC rules and the
    subnetpool database model.

    We don't need a many-to-many relationship here. So, we can use
    selectin eager loading to make this relationship one-to-many and create
    the model with only the necessary steps, without exploding into
    thousands of rows caused by the "left outer join" cascade.

    The "total" queries from this process would be divided into a series of
    smaller queries with much better performance, and the resulting huge
    select query will be resolved much faster without joined cascade,
    representing significant performance gains.

    Closes-bug: #2071374
    Change-Id: I2e4fa0ffd2ad091ab6928bdf0d440b082c37def2

Changed in neutron:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923488
Committed: https://opendev.org/openstack/neutron/commit/f25cc2f503573e2288b61e262bcc3900c62c1a04
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit f25cc2f503573e2288b61e262bcc3900c62c1a04
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Jul 2 07:29:44 2024 +0000

    Reorder subnet RBAC policy check strings

    (cherry picked from commit 729920da5e836fa7a27b1b85b3b2999146d905ba)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923495
Committed: https://opendev.org/openstack/neutron/commit/4c70d3ea61cf6f3580e7b5e9cfbe69b2e976e810
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 4c70d3ea61cf6f3580e7b5e9cfbe69b2e976e810
Author: Roberto Bartzen Acosta <email address hidden>
Date: Thu Jun 27 18:29:37 2024 +0000

    Change to use selectin for RBACs in SubnetPool DB load strategy

    (cherry picked from commit 46edf255bde0603fe88b2dd9f4e482590e384382)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923496
Committed: https://opendev.org/openstack/neutron/commit/a73ef23e3fca625b82d59c4e2aa10b8d8736d78b
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit a73ef23e3fca625b82d59c4e2aa10b8d8736d78b
Author: Roberto Bartzen Acosta <email address hidden>
Date: Thu Jun 27 18:29:37 2024 +0000

    Change to use selectin for RBACs in SubnetPool DB load strategy

    (cherry picked from commit 46edf255bde0603fe88b2dd9f4e482590e384382)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923497
Committed: https://opendev.org/openstack/neutron/commit/46dbdcf3f24acea2f4bbc08835471ffa16c03e81
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 46dbdcf3f24acea2f4bbc08835471ffa16c03e81
Author: Roberto Bartzen Acosta <email address hidden>
Date: Thu Jun 27 18:29:37 2024 +0000

    Change to use selectin for RBACs in SubnetPool DB load strategy

    (cherry picked from commit 46edf255bde0603fe88b2dd9f4e482590e384382)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/923494
Committed: https://opendev.org/openstack/neutron/commit/0019e448d821441e4cf49332bc710ef3470f3fa9
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 0019e448d821441e4cf49332bc710ef3470f3fa9
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Jul 2 07:29:44 2024 +0000

    Reorder subnet RBAC policy check strings

    (cherry picked from commit 729920da5e836fa7a27b1b85b3b2999146d905ba)
    (cherry picked from commit f25cc2f503573e2288b61e262bcc3900c62c1a04)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 24.0.1

This issue was fixed in the openstack/neutron 24.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.2.0

This issue was fixed in the openstack/neutron 22.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.2.0

This issue was fixed in the openstack/neutron 23.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 25.0.0.0rc1

This issue was fixed in the openstack/neutron 25.0.0.0rc1 release candidate.
