Slownesses on neutron API with many RBAC rules
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
neutron |
Fix Released
|
Medium
|
Rodolfo Alonso |
Bug Description
* Summary: Slownesses on neutron API with many RBAC rules
* High level description: Sharing several networks or security groups to project drastically increase API response time on some routes (/networks or /server/detail).
For quite some time we have observing that reponse times are increasing (slowly fur surely) on /networks calls. We have increased the number of Neutron workers, but in vain.
Lately, we're observing that it's getting worse (reponse time form 5 to 370 seconds). We discarded possible bottlenecks one by one (our service endpoint performance, neutron API configuration, etc).
But we have found that some calls in the DB takes a lot of time. It seems they are stuck in the mariadb database (10.3.10). So we have captured a slow queries in mysql.
An example of for /server/detail:
-------
http://
We can see that there are more than 2 millions of rows examinated, and around 1657 returned.
An example of for /networks:
-------
http://
Rows_sent: 517 Rows_examined: 223519
* Pre-conditions:
Database tables size:
table:
- networkrbacs 16928 rows
- securitygrouprbacs 1691 rows
- keystone.project 1713 rows
Control plane nodes are shared with some others services:
- RMQ
- mariadb
- Openstack APIs
- DHCP agents
It seems the code of those lines are based on https:/
* Step-by-step reproduction steps:
- Create a lot of projects (at least 1000)
- Create a SG in admin account
- Create fake networks (vlan, vxlan) with associated
- Share the SG and all networks with all projects
* Expected output: lower response time, less than 5 seconds (approximatively).
* Actual output: May lead to gateway timeout.
* Version:
** OpenStack version Stein releases for all components (neutron 14.2.0).
** CentOS 7.4 with kolla containers
** kolla-ansible for stein release
* Environment: We operate all services in Openstack except for Cinder.
* Perceived severity: Medium
description: | updated |
Changed in neutron: | |
status: | New → Confirmed |
importance: | Undecided → Medium |
tags: | added: db |
tags: | added: loadimpact |
LEDUC Florian (leducflorian) wrote : | #1 |
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #2 |
Hello Leduc:
I can't reproduce those queries in my env. I tried creating 300 projects, 300 networks (one per project) and sharing each network with each project (not a full 300-to-300 matrix, but I have more than 2000 rbac objects). When I try to reproduce [1], I have [2]. My query only examines 10 rows.
Some questions:
- Do you have [3] in your code?
- How much time does your system take to execute "openstack network list"?
Regards.
[1]http://
[2]http://
[3]https:/
Slawek Kaplonski (slaweq) wrote : | #3 |
I'm marking this as incomplete until we will get info requested by Rodolfo in the comment #2.
Changed in neutron: | |
status: | Confirmed → Incomplete |
LEDUC Florian (leducflorian) wrote : | #4 |
Hi,
thanks for the response.
After executing mysqlcheck -o -A -p (OPTIMIZE, ANALYZE) we did not manage to decrease the load and the number of rows examined.
- Yes we have [3].
- Here's a little of context for openstack network list:
Region running Stein release:
project: 1909
rbac rules: 22241
networks: 57
security groups: 2847
REPONSE TIMES:
AS ADMIN CONTEXT:
-----------------
real 0m8.596s
user 0m1.616s
sys 0m0.184s
AS USER CONTEXT:
----------------
real 0m16.632s
user 0m1.588s
sys 0m0.175s
We have successfully migrated to Ussuri one of our region, here's some figures:
project: 770
rbac rules: 29528
networks: 94
security groups: 1071
REPONSE TIMES:
AS ADMIN CONTEXT:
-----------------
real 0m5.163s
user 0m1.611s
sys 0m0.184s
AS USER CONTEXT:
----------------
real 0m6.398s
user 0m1.602s
sys 0m0.174s
I've noticed that somes route are taking a lot of time to respond. As we did not deploy the OSprofiler in production env, we do not know is they are stuck in python, or the DB, or even elsewhere:
- openstack security group list (56s)
- openstack network rbac list (18s)
LEDUC Florian (leducflorian) wrote : | #5 |
Hi,
any update regarding my last comment ?
LEDUC Florian (leducflorian) wrote : | #6 |
Hello,
after another attempts to add several networks, we have noticed that during the performance issue we do full table SCAN.
MariaDB [neutron]> EXPLAIN SELECT networkrbacs.
-> FROM (SELECT networks.id AS networks_id
-> FROM networks LEFT OUTER JOIN networkrbacs ON networks.id = networkrbacs.
-> WHERE networkrbacs.action = 'access_
-> ;
+------
| id | select_type | table | type | possible_keys | key | key_len | ref |rows | Extra |
+------
| 1 | SIMPLE | networkrbacs | ALL | object_id | NULL | NULL | NULL |22857 | |
| 1 | SIMPLE | networks | eq_ref | PRIMARY,
| 1 | SIMPLE | networkrbacs | ref | object_id | object_id | 110 | neutron.
+------
After removing some of them, it seems we're under a certain threshold and we do use the index again.
MariaDB [neutron]> EXPLAIN SELECT networkrbacs.
-> FROM (SELECT networks.id AS networks_id
-> FROM networks LEFT OUTER JOIN networkrbacs ON networks.id = networkrbacs.
-> WHERE networkrbacs.action = 'access_
+------
Changed in neutron: | |
status: | Incomplete → Confirmed |
Tyler Stachecki (tstachecki) wrote : | #7 |
Hi, we've also run into this issue and have been running a fix for our cloud sometime now with success.
I can work on upstreaming the fix we're using for review/inclusion.
Changed in neutron: | |
assignee: | nobody → Tyler Stachecki (tstachecki) |
LEDUC Florian (leducflorian) wrote : | #8 |
Hi Tyler,
after some searching on mariadb database tuning, we found that setting "join_cache_level = 3" on mariadb is speeding the query time by 5, but it still does a full table scan and consume 100% during the processing of the request.
Another clue is that index statistics values are not well updated in that case:
-------
MariaDB [nova]> select * from mysql.innodb_
+------
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+------
| neutron | networkrbacs | PRIMARY | 2021-05-07 15:32:09 | n_diff_pfx01 | 22865 | 267 | id |
| neutron | networkrbacs | PRIMARY | 2021-05-07 15:32:09 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index |
| neutron | networkrbacs | PRIMARY | 2021-05-07 15:32:09 | size | 289 | NULL | Number of pages in the index |
| neutron | networkrbacs | ix_networkrbacs
| neutron | networkrbacs | ix_networkrbacs
| neutron | networkrbacs | ix_networkrbacs
| neutron | networkrbacs | ix_networkrbacs
| neutron | networkrbacs | object_id | 2021-05-07 15:32:09 | n_diff_pfx01 | 48 | 163 | object_id |
| neutron | networkrbacs | object_id | 2021-05-07 15:32:09 | n_diff_pfx02 | 22865 | 163 | object_id,id |
| neutron | networkrbacs | object_id | 2021-05-07 15:32:09 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index |
| neutron | networkrbacs | object_id | 2021-05-07 15:32:09 | size | 225 | NULL | Number of pages in the index |
| neutron | networkrbacs | uniq_networkrba
| ne...
Tyler Stachecki (tstachecki) wrote : | #9 |
Sorry for the delay.
In writing up our fix, I realized that the fix I deployed locally to our cloud is not comprehensive and does not cover all possible use cases for networks/RBACs. That's OK, though - I'd still like to work on a full solution and have a good grasp of the problem at hand.
The problem is as Florian describes. By default, the SQLalchemy-
SELECT
networkrbac
networkrbacs.id AS networkrbacs_id,
networkrbac
networkrbac
networkrbac
subnets_
FROM (
SELECT
networks.id AS networks_id FROM networks
LEFT OUTER JOIN networkrbacs
ON networks.id = networkrbacs.
WHERE
)) AS anon_1
INNER JOIN subnets AS subnets_1
ON anon_1.networks_id = subnets_
INNER JOIN networkrbacs
ON subnets_
ORDER BY subnets_
EXPLAIN-ing the query, we can see the first part of the query plan has a table scan for object_id in networkrbacs. This is why as the cloud deployment accumulates a large amount of RBACs, the performance gets very slow.
*******
id: 1
select_type: SIMPLE
table: networkrbacs
partitions: NULL
type: ALL
possible_keys: object_id
key: NULL
key_len: NULL
ref: NULL
rows: *************
filtered: 100.00
Extra: Using temporary; Using filesort
*******
...
<snip>
...
The painpoint of the query is here:
WHERE
...
If the networks.project_id part of the expression is dropped or changed such that an index is used, the query completes very quickly regardless how many RBACs are in play.
This part of the query is generated in `python3-
https:/
I'll check if an index is present and see if adding one helps, though we also got by for our specific use case by adjusting the code slightly around this point as an intern measure.
Tyler Stachecki (tstachecki) wrote (last edit ): | #10 |
Sorry, been backlogged and have not been able to get around to this.
The table scan seems to be attributable to the LEFT OUTER JOIN in the subquery. Replacing that with an INNER JOIN will result in the indexes used (or, at least it does on MySQL).
On first glance, the OUTER JOIN seems necessary as the subquery contains a "OR networks.project_id = 'xxxxxxxxxxxxxx
But, at least in the case of the outer query at hand here, we see:
INNER JOIN subnets AS subnets_1
ON anon_1.networks_id = subnets_
INNER JOIN networkrbacs
ON subnets_
i.e., we INNER JOIN the subquery (anon_1) back to networkrbacs anyways, which would filter out the rows that the OUTER JOIN in the subquery affords. Thus, I wonder if the OUTER JOIN is truly needed or not.
Tyler Stachecki (tstachecki) wrote : | #11 |
I was able to come up with one solution, which is to change the subquery from:
SELECT
networks.id AS networks_id FROM networks
LEFT OUTER JOIN networkrbacs
ON networks.id = networkrbacs.
WHERE
to:
SELECT
networks.id AS networks_id FROM networks
INNER JOIN networkrbacs
ON networks.id = networkrbacs.
WHERE
UNION
SELECT
networks.id AS networks_ID FROM networks
WHERE
Thinking a little more to see if there is a better way... this feels a bit kludgy.
LEDUC Florian (leducflorian) wrote (last edit ): | #12 |
Hello Tyler,
thanks for the explanation.
Tomorrow will be pushing a fix in our production to force the index to be used for the RBAC model, securitygrouprbacs is also impacted by those slowness which kill our control plane.
Any chance we could share our findings in this thread ?
Tyler Stachecki (tstachecki) wrote : | #13 |
Gladly, the more feedback the merrier. I'm quite confident it will work out, as we have deployed something very similar to our production clouds for a few weeks now with success.
Which method are you using to force the index to be used?
LEDUC Florian (leducflorian) wrote : | #14 |
- db_model_query.diff.txt Edit (801 bytes, text/plain)
Hi,
We used SQLalchemy hints to force the use of indexes. Here's a diff of our internal pull request. We know that we cannot submit this patch upstream today, mostly because of our lack of understanding on how neutron is built. A little bit of help from the community would be appreciated.
How did you fix it on your side ?
Regards,
Tyler Stachecki (tstachecki) wrote : | #15 |
Interesting... I was unaware of the FORCE INDEX clause in MySQL. I'm not sure how much upstream will take a liking to that (the use of it would minimally have to be guarded for MySQL dialects, as things like postgres and other SQL servers don't support it AFAIK).
Similar to my last post on the subquery and changing it from a LEFT OUTER JOIN to INNER JOIN + UNION, we got along by refactoring the generated query such that the optimizer could leverage the index.
Tyler Stachecki (tstachecki) wrote (last edit ): | #16 |
Attached a rough working patch (albeit, against Rocky for now) that demonstrates where my thinking is going.
Unfortunately, after a lot of thinking today, I could not come up with a solution that does not ultimately touch both neutron-lib and neutron to some degree. The crux of the problem always seems to be that I need a way to operate on the RBAC part of the query filter separately from the model side. For performance reasons, they really need to be treated separately -- that OUTER JOIN appearing in the subqueries is a killer, and splitting them up allows us to remove it.
I tried the force index patch just now, but on Percona MySQL 5.7, I still do not see the index being used and thus no performance improvements.
OTOH, this 'intrusive' patch speeds up API calls by over 2x in my lab environment with a couple thousand RBACs. The delay is mostly all in OSC/neutron-server and the database query takes msec instead of >1sec.
rocky/master: 3.927s~4.084s
preliminary RBAC query patch: 1.474s~1.629s
LEDUC Florian (leducflorian) wrote : | #17 |
After several months of good performance with the FORCE INDEX patch. Here we go again with performance issues on RBAC model and it occured during a night without any intervention from us.
We will give it a try with your patch. Here's some figures about our status:
MariaDB [neutron]> select count(*) from networkrbacs;
+----------+
| count(*) |
+----------+
| 75512 |
+----------+
1 row in set (0.019 sec)
MariaDB [neutron]>
MariaDB [neutron]> select count(*) from keystone.project;
+----------+
| count(*) |
+----------+
| 3203 |
+----------+
1 row in set (0.001 sec)
MariaDB [neutron]> select count(*) from networks;
+----------+
| count(*) |
+----------+
| 103 |
+----------+
1 row in set (0.000 sec)
MariaDB [neutron]> select count(*) from networksegments;
+----------+
| count(*) |
+----------+
| 194 |
+----------+
1 row in set (0.000 sec)
Slawek / Rodolfo ? Can you have a closer on that issue if you can ?
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #19 |
Hello Leduc/Tyler:
In c#2 I asked if you had https:/
I don't understand the explanation in c#9 and c#10 about the "networkrbacs.
In order to speed up the current query we can set "networkrbacs.
I'm going to propose a WIP patch in master. However, it could be much better if you manually set those two columns "networkrbacs.
Regards.
Tyler Stachecki (tstachecki) wrote : | #20 |
Rodolfo, yes, we have that commit/PR -- specifically, we're now on a very recent Ussuri and still have slow queries without applying workarounds to the code.
Although c#9 and c#10 may be confusing, I can absolutely attest that working around that saved us a bunch of heartache at scale - we saw OpenStack API calls that were running for upwards of one minute or more get cut down to 100s of milliseconds.
I am tied up early this week triaging some other issues, but ack your request to add an index and report back.
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #21 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
status: | Confirmed → In Progress |
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #22 |
Hello Tyler:
https:/
1) Creating manually the indexes in the DB.
2) Applying https:/
Regards.
LEDUC Florian (leducflorian) wrote : | #23 |
LEDUC Florian (leducflorian) wrote : | #24 |
Thanks guys for your quick answer. We have dumped our production database in our small lab and we will give a try to your suggestions and patch.
LEDUC Florian (leducflorian) wrote (last edit ): | #25 |
Hi,
I've backported the patch provided in c#22 to Ussuri . It ended up with an error duplicate keys as follows:
-------
(neutron-
INFO [alembic.
INFO [alembic.
Running upgrade for neutron ...
INFO [alembic.
INFO [alembic.
INFO [alembic.
Traceback (most recent call last):
File "/var/lib/
cursor, statement, parameters, context
File "/var/lib/
cursor.
File "/var/lib/
result = self._query(query)
File "/var/lib/
conn.query(q)
File "/var/lib/
self.
File "/var/lib/
result.read()
File "/var/lib/
first_packet = self.connection
File "/var/lib/
packet.
File "/var/lib/
err.
File "/var/lib/
raise errorclass(errno, errval)
pymysql.
-------
All the indexes that were attempted to be created by the alembic migration ended up with this error, so I guess they were already there.
Sorry Rodolfo, I did not respond to you message, yes patch #720051 is present on our system.
Tyler, you mentionned an migration to ussuri ealier, did you backported your patch ? Does it work well for your need?
LEDUC Florian (leducflorian) wrote (last edit ): | #26 |
Hi,
I've tested out your patch on ussuri. I've got better results:
- before patch GET /v2.0/networks -> around 40-50s
- after patch GET /v2.0/networks -> around 15s
I don't know what are the pros and cons to have this patch in our deployment. But still, thanks for sharing it with us Tyler.
Next steps, I'll the run the OS profiler in my lab to understand where all these times are spent.
Tyler Stachecki (tstachecki) wrote : | #27 |
Florian,
Yes, unfortunately, the patch does not work as well on Ussuri as it did Rocky - there are some changes to the SQLAlchemy query logic or indexes somewhere. On Rocky, we saw the latency of a "normal" Neutron call instead of 15s in your example. Also, for the same reason, please be careful -- as I am not sure about the correctness of the patch as it pertains to Ussuri given that there were underlying changes.
Unfortunately, I am on PTO this week and still have not gotten a chance to try Rodolfo's changes as there were some internal cases that I was working on for my employer.
Rodolfo,
I do intend to give your patch/suggestions a whirl next week as well.
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #28 |
Hi Leduc:
Sorry, I don't know what patch are you talking about. Is it [1]?
In any case, [1] should not be necessary if [2] is in place. If all RBAC registers have "target_tenant" and "action" as indexes, the query speed in "query_with_hooks" should be improved because all comparisons are against indexed columns.
BTW, you can't backport this patch to Ussuri but create manually the indexes in the DB and add the definitions manually in the code. It is not possible to backport a DB migration in Neutron.
Regards.
[1]https:/
[2]https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #29 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: master
commit f8c879ddbf7628e
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Mon Sep 20 15:55:04 2021 +0000
Add new indexes to RBAC DB models
Added two new indexes to all RBAC DB models: "target_tenant" and
"action".
The DB models affected are "networkrbacs", "qospolicyrbacs",
"securitygr
"addressgro
The goal of this patch is to speed up the model query if RBAC apply to
this object. If the object query scope is a project, [1] will be added
to the DB query. If "action" and "target_tenant" are indexed, the exact
match filtering will be faster.
Change-Id: I0a70a1a500fad5
Closes-Bug: #1918145
Changed in neutron: | |
status: | In Progress → Fix Released |
Tyler Stachecki (tstachecki) wrote : | #30 |
Apologies, finally got around to getting some data.
In a vanilla Ussuri cluster, I ran `openstack network list` with ~1k projects and ~2k projects.
@ 1k: runtime was ~6.4 sec
@ 2k: runtime was ~17.3 sec
...
I then applied:
https:/
*Before* the patch, sanity check on indexes:
mysql> SHOW INDEX FROM networkrbacs;
+------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------
| networkrbacs | 0 | PRIMARY | 1 | id | A | 1813 | NULL | NULL | | BTREE | | |
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 1 | object_id | 1 | object_id | A | 7 | NULL | NULL | | BTREE | | |
| networkrbacs | 1 | ix_networkrbacs
+------
mysql> SHOW INDEX FROM qospolicyrbacs;
+------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------
| qospolicyrbacs | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
| qospolicyr...
Tyler Stachecki (tstachecki) wrote (last edit ): | #31 |
Sorry, typo in the last update:
After the patch I see indexes ix_<TABLENAME>
Of course not all tables have index ix_networkrbacs_...
.e.g., for networkrbacs, though --
mysql> SHOW INDEX FROM networkrbacs;
+------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------
| networkrbacs | 0 | PRIMARY | 1 | id | A | 1977 | NULL | NULL | | BTREE | | |
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 0 | uniq_networkrba
| networkrbacs | 1 | object_id | 1 | object_id | A | 7 | NULL | NULL | | BTREE | | |
| networkrbacs | 1 | ix_networkrbacs
| networkrbacs | 1 | ix_networkrbacs
| networkrbacs | 1 | ix_networkrbacs
I believe I should have some time to continue chipping away at this issue this week.
Tyler Stachecki (tstachecki) wrote (last edit ): | #32 |
I've come up with a patch for Ussuri that seems to work in all cases.
I'm still doing regression testing of this patch from both a performance and correctness perspective, but it seems to work on both accounts so far.
Comparing changes with the patch to upstream in my development build:
osc network list: 0m21.095s -> 0m2.608s
osc network show <name>: 0m19.178s -> 0m2.953s
osc port create --network <name>: 0m35.204s -> 0m5.635s
osc port delete <uuid>: 0m11.437s -> 0m3.863s
Tyler Stachecki (tstachecki) wrote : | #33 |
Tyler Stachecki (tstachecki) wrote : | #34 |
Some more performance data:
Before the patch, on an internal test cluster, we observed that the database would go underwater (~0% host CPU idle) when creating/deleting even 100 VMs.
With the patch, I was able to create/delete 1k+ VMs faster than I could 100 VMs before the patch. Following that, the database load dropped from 4000% CPU (pre-patch) to only 1000% after... while doing 10x the amount of work (1k VMs vs. 100).
The creation/deletion of 1k VMs probably could have gone faster yet, but the bottleneck shifted somewhere else in the control plane; the database is now no longer a limiter.
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #35 |
Hello Tyler:
Do you mind to send this patch to https:/
https:/
Regards.
Tyler Stachecki (tstachecki) wrote : | #36 |
Sure, I'll do that shortly.
One blocker to getting this PR ready at present is that there are both changes to neutron and neutron-lib. There are only a few lines of changes in the former, but I'm not sure that I'll be able to break that co-dependency.
Tyler Stachecki (tstachecki) wrote : | #37 |
After speaking with some peers, it was suggested that we start with a blueprint and go from there. I cannot break the co-dependency on the patch between the two projects (neutron, neutron-lib), and the changes to the model query class are non-trivial.
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #38 |
Hello Tyler:
I don't think we need a BP here. In any case, feel free to at least push patches both for n-lib and neutron (depending on the n-lib one). That will be easier to read and test.
Regards.
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-lib (master) | #39 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.0.0.0rc1 | #40 |
This issue was fixed in the openstack/neutron 20.0.0.0rc1 release candidate.
Slawek Kaplonski (slaweq) wrote : auto-abandon-script | #41 |
This bug has had a related patch abandoned and has been automatically un-assigned due to inactivity. Please re-assign yourself if you are continuing work or adjust the state as appropriate if it is no longer valid.
Changed in neutron: | |
assignee: | Tyler Stachecki (tstachecki) → nobody |
tags: | added: timeout-abandon |
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master) | #42 |
Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
Slawek Kaplonski (slaweq) wrote : auto-abandon-script | #43 |
This bug has had a related patch abandoned and has been automatically un-assigned due to inactivity. Please re-assign yourself if you are continuing work or adjust the state as appropriate if it is no longer valid.
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master) | #44 |
Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
Changed in neutron: | |
assignee: | nobody → Tyler Stachecki (tstachecki) |
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master) | #46 |
Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
Sahid Orentino (sahid-ferdjaoui) wrote : | #47 |
I think one of the first step that we can have is to remove the ORDER BY as it creates the temporary filesort that you have mentioned in #9.
I may missing something, an order by UUID does not bring any kind value?
A second step would be to understand why the possible key object_id is not used.
There is also another point, we can notice that we do filter per action, but I think that we do not have an index on it, maybe we could also investigate that point.
Changed in neutron: | |
status: | Fix Released → Confirmed |
Tyler Stachecki (tstachecki) wrote : | #48 |
Hello,
I already have a fix for this (that we have used in production for >1 year now...), but it got a -1 in code review as it makes quite impactful changes to the code. It is unclear to me what needs to be done in order to make it a candidate for merging at this time...
https:/
https:/
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #49 |
Hello Tyler:
The patch was abandoned. Please restore it again, add the needed tests and fix any rebase issue you can have.
About the tests, I will insist on this here and in the patches: you need to ensure this patch is not affecting the existing code (we are not introducing new bugs), you'll need to propose a Neutron patch based on the n-lib one including a set of tests checking this new code. This code is particularly sensitive and we need to be extremely careful when modifying it. I'm 100% on improving the SQL performance but not without testing.
Please, include in the n-lib patch a documentation describing what this patch is doing, with SQL examples (that will help a lot).
Regards.
Sahid Orentino (sahid-ferdjaoui) wrote (last edit ): | #50 |
Hello Tyler,
Thank you I have noticed your patch and looks interesting, it comes with a different approach. My feeling is that the current queries look good but we need to help MySQL to select the right indexes
1/ Removing the ORDER BY will improve the performance on requests that return lot of rows as this will avoid that extra filesort.
2/ We can notice that the index object_id is mentioned but not used by MySQL for some reason. I have removed it and MySQL started to use a totally different set of indexes.
This reduce time on my env from 6,5 to 1,5 seconds.
Then I have noticed that removing the filter on securitygroup.
It's I guess the kind of metrics that we should have if MySQL would use the right indexes.
That is said removing the filter on project_id is not what we want, I'm currently investigate the issue on it.
@Tyler, anychance that you try that in your env? the significant change is removing index on object_id.
Thanks a lot,
s.
Tyler Stachecki (tstachecki) wrote : | #51 |
Rodolfo: Thank you for the direction. It may take me some time to get to everything, but I will work on it. The patch was formerly passing all existing TCs and I tested it to the best of my knowledge outside of our particular production use cases, but I revisit and add more TCs as needed to make sure there is full coverage for the change.
Sahid: Can you clarify on (1) and (2) in the context of the patch? The ORDER BY... in the sample queries above come from the SQLalchemy generated code in the context of the existing machinery - e.g. if you look at get_collection_
On your notes about the non-use of the index leading to the slowness - that is the crux of the problem here. If you look at LEDUC's former comments in this bug, there was mention of FORCE INDEX fixing the problem in some cases (I was unable to replicate a fix in Percona MySQL 5.7, though).
Going back to existing use cases of the code - I do not think removing the filter for project_id is a viable solution because it breaks existing use cases of query_with_hooks. It may make it faster, yes, but the question becomes how do we fix it without breaking existing use cases? (or, am I missing something -- if you have any code/patch to share that you'd like tested, I am more than glad to test).
Sahid Orentino (sahid-ferdjaoui) wrote : | #52 |
> On your notes about the non-use of the index leading to the slowness - that is the crux of the problem here. If you look at LEDUC's former comments in this bug, there was mention of FORCE INDEX fixing the problem in some cases (I was unable to replicate a fix in Percona MySQL 5.7, though).
It's basically the point, looks like if we remove the index on object_id MySQL will use a different approach to build and execute the query which looks to significantly improve performance without the need of changing everything. from 6s to 1s.
> Going back to existing use cases of the code - I do not think removing the filter for project_id is a viable solution because it breaks existing use cases of query_with_hooks. It may make it faster, yes, but the question becomes how do we fix it without breaking existing use cases? (or, am I missing something -- if you have any code/patch to share that you'd like tested, I am more than glad to test).
I'm not saying we should remove this filter. Just mentioning that removing it improves performance. From 1s to 0.05s looks like we should investigate the reason of that slowness.
Tyler Stachecki (tstachecki) wrote : | #53 |
On (1) - OK, interesting observation. I will consider why the object_id index was added/what may use it. But even 1s is quite slow (vs. the 0.05s you mention... there is still a big performance crunch).
On (2) - Handwaving a bit and speaking from memory - but I think the project_id filter is used extensively in the RBAC code. Its role is to select RBACs which are applicable to the context - that being RBACs which are either global *or* which are scoped to the project(_id). Any time Neutron networks, subnets, etc. are filtered or collected, the code generally flows through this query logic which itself operates on RBACs relative to the context, which is where this "global RBAC or context/
IOW, I don't think it can be removed as its use is integral to how all the Neutron code is written. This is what led to my PS which splits the logic for these two cases and does a late union of that either/or case.
Sahid Orentino (sahid-ferdjaoui) wrote : | #54 |
An other interesting point.
I have noticed a query that takes about 50s, looks like it's to retrieve networks. With a simple change this one now takes 0.5s.
For some reasons that query returns 27840 rows, but let's not focus too much on that.
```
SELECT networks.project_id AS networks_
Changed in neutron: | |
status: | Confirmed → In Progress |
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master) | #56 |
Related fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master) | #57 |
Related fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master) | #58 |
Change abandoned by "Sahid Orentino Ferdjaoui <email address hidden>" on branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master) | #59 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: master
commit 4e71675ddbb1ad0
Author: Sahid Orentino Ferdjaoui <email address hidden>
Date: Mon Feb 6 16:21:40 2023 +0100
rbacs: add functionl test that asserts behavior on networks/users
This is adding functional test to validate shared/not shared behavior
with networks and users.
Related-bug: #1918145
Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
Change-Id: Ic1fc820f407ef9
OpenStack Infra (hudson-openstack) wrote : | #60 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: master
commit e6de524555db92c
Author: Sahid Orentino Ferdjaoui <email address hidden>
Date: Thu Jan 19 17:40:11 2023 +0100
rbacs: filter out model that are already owned by context
Taking example of a network that have multiple rbacs. In a situation
of selecting networks that are shared to a project.
If we could could already match the one that are owned by the context
we can expect les rbacs to scan.
https:/
In an environement with about 200 00 rbacs and 200 networks this
reduce time of the request from more than 50s to less than a second.
Related-bug: #1918145
Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
Change-Id: I54808cbd4cdccf
Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
Change-Id: Ib155fbb3f6b325
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master) | #61 |
Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #62 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
assignee: | Tyler Stachecki (tstachecki) → Rodolfo Alonso (rodolfo-alonso-hernandez) |
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master) | #63 |
Related fix proposed to branch: master
Review: https:/
LEDUC Florian (leducflorian) wrote : | #64 |
Regarding your last comment, we have used this approach for two ORM objects for stable/ussuri back then:
-------
--- a/neutron/
+++ b/neutron/
@@ -69,11 +69,9 @@ class SegmentHostMapp
# Add a relationship to the NetworkSegment model in order to instruct
# SQLAlchemy to eagerly load this association
- # Monkey-patch : change the lazy strategy from 'subquery' to 'joined'
- # as MariaDB 10.3 don't use an optimal execution plan with 'subquery'
network_
- lazy='joined',
+ lazy='subquery',
revises_
-------
--- a/neutron/
+++ b/neutron/
@@ -49,7 +49,7 @@ class NetworkSegment(
network = orm.relationshi
- lazy='subquery',
+ lazy='joined',
api_
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote : | #65 |
The model change by itself is not enough, we need to group the results by the resource ID. This is what is done in the n-lib patch. To improve the performance, we need both.
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (master) | #66 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: master
commit 829e97024c2b73d
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Related-Bug: #1918145
Change-Id: I800e0356714d59
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/2023.1) | #67 |
Related fix proposed to branch: stable/2023.1
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/zed) | #68 |
Related fix proposed to branch: stable/zed
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/yoga) | #69 |
Related fix proposed to branch: stable/yoga
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/xena) | #70 |
Related fix proposed to branch: stable/xena
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/wallaby) | #71 |
Related fix proposed to branch: stable/wallaby
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/2023.1) | #72 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/2023.1
commit 86ea887de64bba3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Related-Bug: #1918145
Change-Id: I800e0356714d59
(cherry picked from commit 829e97024c2b73d
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/zed) | #73 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/zed
commit fdbece1bac3d4d4
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Related-Bug: #1918145
Change-Id: I800e0356714d59
(cherry picked from commit 829e97024c2b73d
tags: | added: in-stable-zed |
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/wallaby) | #74 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/wallaby
commit 619c0fe5345af66
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Conflicts:
Related-Bug: #1918145
Change-Id: I800e0356714d59
(cherry picked from commit 829e97024c2b73d
(cherry picked from commit 4a952047823c3c3
tags: | added: in-stable-wallaby |
tags: | added: in-stable-yoga |
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/yoga) | #75 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/yoga
commit 467eafd20a229e2
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Related-Bug: #1918145
Change-Id: I800e0356714d59
(cherry picked from commit 829e97024c2b73d
tags: | added: in-stable-xena |
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/xena) | #76 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/xena
commit 4a952047823c3c3
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:43:30 2023 +0200
Add a "GROUP BY" clause on queries with RBAC entries
As reported in the Neutron patch [1], this change introduce a
"GROUP BY" clause on the SQL queries with RBAC entries. With [1],
all resouces with RBAC entries ('network', 'qospolicy',
'securitygr
load the RBAC entries with "joined" strategy.
Because of the low cardinality of the RBAC query when all the RBAC
registers are in one single project, this patch groups the resource
queries by the resource ID. That will reduce the results returned by
the SQL engine to only the singular registers required.
[1]https:/
Conflicts:
Related-Bug: #1918145
Change-Id: I800e0356714d59
(cherry picked from commit 829e97024c2b73d
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1) | #77 |
Fix proposed to branch: stable/2023.1
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed) | #78 |
Fix proposed to branch: stable/zed
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/yoga) | #79 |
Fix proposed to branch: stable/yoga
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena) | #80 |
Fix proposed to branch: stable/xena
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby) | #81 |
Fix proposed to branch: stable/wallaby
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #82 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: master
commit e9da29d16c47482
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
Changed in neutron: | |
status: | In Progress → Fix Released |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1) | #83 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/2023.1
commit 593939911c8c360
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
(cherry picked from commit e9da29d16c47482
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed) | #84 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/zed
commit d3b403bfee17267
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
(cherry picked from commit e9da29d16c47482
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena) | #85 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/xena
commit 4846e1939b7b22c
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Conflicts:
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
(cherry picked from commit e9da29d16c47482
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby) | #86 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/wallaby
commit de2ad80a3925ba0
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Conflicts:
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
(cherry picked from commit e9da29d16c47482
(cherry picked from commit 4846e1939b7b22c
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga) | #87 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/yoga
commit 1a66962ee338e21
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Sun May 28 17:28:03 2023 +0200
Change RBAC relationship loading method to "joined"
This patch changes all RBAC relationship method to "joined". This change
enforces that the RBAC associated registers are loaded along with the
parent resource. The rationale of this change is to be able to control
the SQL query executed; the subquery cannot be directly managed by
Neutron.
It is very usual to create the RBAC rules from one single project that
is usually the adminitrator project. That means all RBAC rules will
belong to it. Before this change, the SQL subquery performed to
retrieve the RBAC entries was this (from a network query):
SELECT networks.id AS networks_id
FROM networks LEFT OUTER JOIN networkrbacs ON networks.id =
networkrb
WHERE networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action = 'access_
AND networkrbacs.
OR networkrbacs.
OR networks.project_id = 'bd133e2c499c4b
OR networkrbacs.action IN ('access_
AND (networkrbacs.
OR networkrbacs.
This SQL result has a very low cardinality; that means there are many
duplicated registers. For example, with 10 external network, 1000
projects and 2500 RBAC rules, this query returns 1.4 million rows.
Instead if a "GROUP BY resource_id" (in this case network_id) clause is
added, the number of rows is reduced to 10 (considering this project
has a RBAC per network).
In order to introduce this "GROUP BY" clause, this patch is changing
the loading method. The clause is added in a neutron-lib patch [1].
This change by itself does not improve the query performance. The
neutron-lib patch is needed too. Although this patch does not modify
que SQL query results, the tests added will prove that the neutron-lib
patch does not introduce any regression.
[1]https:/
Closes-Bug: #1918145
Change-Id: Ic6001bd5a57493
(cherry picked from commit e9da29d16c47482
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.0.1 | #88 |
This issue was fixed in the openstack/neutron 22.0.1 release.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.1.1 | #89 |
This issue was fixed in the openstack/neutron 21.1.1 release.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.3.1 | #90 |
This issue was fixed in the openstack/neutron 20.3.1 release.
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master) | #91 |
Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.0.0.0b3 | #92 |
This issue was fixed in the openstack/neutron 23.0.0.0b3 development milestone.
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/ussuri) | #93 |
Related fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri) | #94 |
Fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (stable/ussuri) | #95 |
Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https:/
Reason: This branch of the project transitioned to End of Life. Open patches need to be abandoned.
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/ussuri) | #96 |
Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https:/
Reason: This branch of the project transitioned to End of Life. Open patches need to be abandoned.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron wallaby-eom | #97 |
This issue was fixed in the openstack/neutron wallaby-eom release.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron xena-eom | #98 |
This issue was fixed in the openstack/neutron xena-eom release.
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/2023.1) | #99 |
Related fix proposed to branch: stable/2023.1
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/2023.1) | #100 |
Reviewed: https:/
Committed: https:/
Submitter: "Zuul (22348)"
Branch: stable/2023.1
commit 271bb48936ef415
Author: Sahid Orentino Ferdjaoui <email address hidden>
Date: Thu Jan 19 17:40:11 2023 +0100
rbacs: filter out model that are already owned by context
Taking example of a network that have multiple rbacs. In a situation
of selecting networks that are shared to a project.
If we could could already match the one that are owned by the context
we can expect les rbacs to scan.
https:/
In an environement with about 200 00 rbacs and 200 networks this
reduce time of the request from more than 50s to less than a second.
Related-bug: #1918145
Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
Change-Id: I54808cbd4cdccf
Signed-off-by: Sahid Orentino Ferdjaoui <email address hidden>
Change-Id: Ib155fbb3f6b325
(cherry picked from commit e6de524555db92c
Hi,
We're currently investigating issue on the database related to EXPLAIN or ANALYZE output. It may be due to non-optimized indexes on table.
We will let you know.