stein regression listing security group rules

Bug #1863201 reported by Sam Morrison
This bug affects 2 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Rodolfo Alonso
Milestone: (none)

Bug Description

After upgrading neutron from Rocky to Stein, we get a considerable slowdown when listing all security group rules for a project: it goes from ~2 seconds to almost 2 minutes. Looking into the code, it appears to be very inefficient because it gets all rules from the DB and then filters them after the fact.
We have around 7000 rules in our QA env.

We are very keen to get this sorted, but I don't know the neutron code base that well, so I can offer to test patches if any already exist.

It looks like something similar happened with listing ports in Stein (see https://bugzilla.redhat.com/show_bug.cgi?id=1737012), so I wonder whether this is related.

With Rocky:
time openstack security group rule list
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+
| ID                                   | IP Protocol | Ethertype | IP Range           | Port Range | Remote Security Group                | Security Group                       |
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+
| 01b877cc-1621-44cd-8e69-1345ab01a1ef | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 3dcbd4fa-d017-4361-b0b0-b7508e923087 |
| 0c744788-6319-42e5-931a-5e7b0df166c4 | None        | IPv6      | ::/0               |            | None                                 | 3dcbd4fa-d017-4361-b0b0-b7508e923087 |
| 0fc6b79d-d211-4201-ac76-60fb8ea40c9c | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| 17d6c8a3-7894-42a6-92f2-1bd56a30ef1d | tcp         | IPv4      | 0.0.0.0/0          | 80:80      | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 19d3ba79-65f1-4c89-a1c2-b32049ceb25a | None        | IPv6      | ::/0               |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 21f1d173-b99f-47a7-9983-6926f7bc58f3 | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 3321d5ff-11c3-4104-be13-107c789e4bf8 | None        | IPv6      | ::/0               |            | None                                 | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 |
| 381c6816-9b5c-42b7-9dd3-dae12a49c08b | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| 3886ad10-99ea-4f60-a36c-ffbe80d92907 | None        | IPv6      | ::/0               |            | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 5be4853a-75d1-435c-87ca-56c54a243f70 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 |
| 71656249-4454-410e-8e7d-24910df127ba | None        | IPv6      | ::/0               |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| 783324ac-6844-4d4d-985c-936015bcb66e | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| 7ca7f0cc-b4df-401f-aaa4-662f17afcfb0 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 825a33ff-b693-456d-811e-a0b494e8e308 | None        | IPv6      | ::/0               |            | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 89fd2d18-45d3-4a86-a020-09d240912e5c | tcp         | IPv4      | 128.250.116.173/32 | 22:22      | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 8a1f45f1-e4c8-41e4-b6f3-80ab48b7e38d | None        | IPv6      | ::/0               |            | None                                 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| 9ebc6d15-e3eb-4d20-88d4-6737367ffc08 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 9f29f539-a80a-4a8d-89cc-f714224b5f8c | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| a1bc8f05-3a20-48c2-bae5-a60f4ffed514 | None        | IPv4      | 0.0.0.0/0          |            | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| bef999d6-669a-47f6-988c-e69bab6df87a | tcp         | IPv4      | 0.0.0.0/0          | 22:22      | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| c5ce339b-cd92-492c-9af4-6eab875027ce | tcp         | IPv4      | 0.0.0.0/0          | 80:80      | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| d9ec0eba-d80d-4331-a588-e4f8c1c75533 | None        | IPv6      | ::/0               |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| de760e03-92a9-4183-8acc-1d82addc3604 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| f4bc4616-1d18-4488-84bb-546516c053bc | tcp         | IPv4      | 0.0.0.0/0          | 443:443    | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+

real 0m2.499s
user 0m0.642s
sys 0m0.053s

With Stein:

time openstack security group rule list
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+
| ID                                   | IP Protocol | Ethertype | IP Range           | Port Range | Remote Security Group                | Security Group                       |
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+
| 01b877cc-1621-44cd-8e69-1345ab01a1ef | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 3dcbd4fa-d017-4361-b0b0-b7508e923087 |
| 0c744788-6319-42e5-931a-5e7b0df166c4 | None        | IPv6      | ::/0               |            | None                                 | 3dcbd4fa-d017-4361-b0b0-b7508e923087 |
| 0fc6b79d-d211-4201-ac76-60fb8ea40c9c | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| 17d6c8a3-7894-42a6-92f2-1bd56a30ef1d | tcp         | IPv4      | 0.0.0.0/0          | 80:80      | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 19d3ba79-65f1-4c89-a1c2-b32049ceb25a | None        | IPv6      | ::/0               |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 21f1d173-b99f-47a7-9983-6926f7bc58f3 | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 3321d5ff-11c3-4104-be13-107c789e4bf8 | None        | IPv6      | ::/0               |            | None                                 | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 |
| 381c6816-9b5c-42b7-9dd3-dae12a49c08b | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| 3886ad10-99ea-4f60-a36c-ffbe80d92907 | None        | IPv6      | ::/0               |            | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 5be4853a-75d1-435c-87ca-56c54a243f70 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 |
| 71656249-4454-410e-8e7d-24910df127ba | None        | IPv6      | ::/0               |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| 783324ac-6844-4d4d-985c-936015bcb66e | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| 7ca7f0cc-b4df-401f-aaa4-662f17afcfb0 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 825a33ff-b693-456d-811e-a0b494e8e308 | None        | IPv6      | ::/0               |            | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 89fd2d18-45d3-4a86-a020-09d240912e5c | tcp         | IPv4      | 128.250.116.173/32 | 22:22      | None                                 | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| 8a1f45f1-e4c8-41e4-b6f3-80ab48b7e38d | None        | IPv6      | ::/0               |            | None                                 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| 9ebc6d15-e3eb-4d20-88d4-6737367ffc08 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
| 9f29f539-a80a-4a8d-89cc-f714224b5f8c | icmp        | IPv4      | 0.0.0.0/0          |            | None                                 | 8f55c18b-cd8c-4d84-afef-f8b83d5eb128 |
| a1bc8f05-3a20-48c2-bae5-a60f4ffed514 | None        | IPv4      | 0.0.0.0/0          |            | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| bef999d6-669a-47f6-988c-e69bab6df87a | tcp         | IPv4      | 0.0.0.0/0          | 22:22      | 57cb14de-dd5f-4f0c-b0cf-a7effc36fca5 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| c5ce339b-cd92-492c-9af4-6eab875027ce | tcp         | IPv4      | 0.0.0.0/0          | 80:80      | 008510a7-d176-4ee5-87e2-e74da06c55ba | 008510a7-d176-4ee5-87e2-e74da06c55ba |
| d9ec0eba-d80d-4331-a588-e4f8c1c75533 | None        | IPv6      | ::/0               |            | None                                 | 3f63cfbb-87ee-4aa2-8193-7e86cb542881 |
| de760e03-92a9-4183-8acc-1d82addc3604 | None        | IPv4      | 0.0.0.0/0          |            | None                                 | bf7abb53-e5ca-428d-9fce-6a2e37a25ee0 |
| f4bc4616-1d18-4488-84bb-546516c053bc | tcp         | IPv4      | 0.0.0.0/0          | 443:443    | None                                 | ed257fd7-d825-4014-96a8-c16adfea70f0 |
+--------------------------------------+-------------+-----------+--------------------+------------+--------------------------------------+--------------------------------------+

real 1m51.921s
user 0m0.624s
sys 0m0.077s

Brian Haley (brian-haley) wrote :

Hi Sam - do you have this fix? https://review.opendev.org/#/c/670075/

If so this is a duplicate of https://bugs.launchpad.net/neutron/+bug/1830679

Sam Morrison (sorrison) wrote :

Yeah, we have this commit; we are running the latest stable/stein branch, so sadly it must be something else.

Sam Morrison (sorrison) wrote :

We're running python3, in case this is related.

Sam Morrison (sorrison) wrote :

Just realised #1830679 is about listing security groups, whereas our issue is with listing security group rules, so maybe a similar fix is required?

Akihiro Motoki (amotoki)
tags: added: sg-fw
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Sam:

I can't reproduce this issue. If I'm not wrong, the example you sent corresponds to a system with the default SGs and rules. I'm testing with both versions and the result is the same: around 1.7 seconds to retrieve the rule list.

Do you have any service plugin configured?

Just for documentation, versions I'm using:
- Rocky: f38aee9555a0f4ee820f18d3ebcb5e343e61c1df
- Stein: 6bae6789e24f1cb35413b5d89af6ad491b75f161

I can see a bit of performance degradation in the port creation operation, due to [1]. However I don't see it in the rule list operation.

Regards.

[1] https://review.opendev.org/#/c/635311/22/neutron/db/securitygroups_db.py

Sam Morrison (sorrison) wrote :

How many security groups and rules do you have in your deployment (total across all projects/users)?

Try doing it on a deployment with 10,000 groups and 100,000 rules not owned by your test user.

I've traced the code and it looks like it iterates over every single group and rule belonging to all projects/users.

See https://github.com/openstack/neutron/blob/stable/stein/neutron/db/securitygroups_db.py#L683
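
To illustrate why the cost grows with the total number of rules in the deployment rather than with the caller's own rules, here is a minimal sketch using Python's stdlib sqlite3. It is not neutron code: the single table, its columns and both helper functions are simplified, hypothetical stand-ins that only contrast "fetch everything, filter in Python" with "filter in the query".

```python
import sqlite3


def make_db(num_projects=50, rules_per_project=20):
    """Build an in-memory table of rules spread across many projects."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE securitygrouprules "
        "(id INTEGER PRIMARY KEY, project_id TEXT)"
    )
    rows = [
        (f"project-{p}",)
        for p in range(num_projects)
        for _ in range(rules_per_project)
    ]
    conn.executemany(
        "INSERT INTO securitygrouprules (project_id) VALUES (?)", rows
    )
    return conn


def list_rules_filter_in_python(conn, project_id):
    """Slow pattern: load every rule, then discard other projects' rows."""
    all_rows = conn.execute(
        "SELECT id, project_id FROM securitygrouprules ORDER BY id"
    ).fetchall()
    return [row for row in all_rows if row[1] == project_id]


def list_rules_filter_in_db(conn, project_id):
    """Fast pattern: the database returns only the caller's rows."""
    return conn.execute(
        "SELECT id, project_id FROM securitygrouprules "
        "WHERE project_id = ? ORDER BY id",
        (project_id,),
    ).fetchall()


conn = make_db()
slow = list_rules_filter_in_python(conn, "project-7")
fast = list_rules_filter_in_db(conn, "project-7")
print(len(slow), len(fast), slow == fast)
```

Both functions return the same 20 rows, but the first one materialises all 1,000 rows before throwing 980 of them away; with 100,000 foreign rules the wasted work grows accordingly.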

Sam Morrison (sorrison) wrote :

OK I have found the culprit!

https://review.opendev.org/#/c/688716/ causes the performance regression

Changed in neutron:
importance: Undecided → High
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Sam:

I was asking you for a quick way to reproduce it. Now that you have pointed out the possible culprit, I think I can reproduce the error and try to fix it.

Regards.

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

The solution for this bug should be backported up to Rocky.

Regards.

Slawek Kaplonski (slaweq) wrote :

@Rodolfo - even up to Queens :/

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/708695

Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/708695
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d874c46bff7045ba25f5dd6e790f7ddb209cb224
Submitter: Zuul
Branch: master

commit d874c46bff7045ba25f5dd6e790f7ddb209cb224
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Feb 18 17:08:22 2020 +0000

    Filter by owner SGs when retrieving the SG rules

    Retrieving the SG rules now is used the admin context. This allows to
    get all possible rules, independently of the user calling. The filters
    passed and the RBAC policies filter those results, returning only:
    - The SG rules belonging to the user.
    - The SG rules belonging to a SG owned by the user.

    However, if the SG list is too long, the query can take a lot of time.
    Instead of this, the filtering is done in the DB query. If no filters
    are passed to "get_security_group_rules" and the context is not the
    admin context, only the rules specified in the first paragraph will
    be retrieved.

    Because overwriting the method "get_objects" is too complex, an
    intermediate query is done to retrieve the SG rule IDs. Those IDs
    will be used as a filter in the "get_objects" call.

    Closes-Bug: #1863201

    Change-Id: I25d3da929f8d0b6ee15d7b90ec59b9d58a4ae6a5
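
The two-step approach described in the commit message (an intermediate query for the visible SG rule IDs, then a lookup filtered by those IDs) can be sketched roughly as follows. This is only an illustration with stdlib sqlite3 under assumed names: the `sgs` and `sg_rules` tables and the `visible_rule_ids` and `get_rules` helpers are hypothetical and do not reproduce the actual neutron objects or the real "get_objects" call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sgs (id TEXT PRIMARY KEY, project_id TEXT);
CREATE TABLE sg_rules (id TEXT PRIMARY KEY, project_id TEXT, sg_id TEXT);
INSERT INTO sgs VALUES ('sg-a', 'alice'), ('sg-b', 'bob');
INSERT INTO sg_rules VALUES
    ('r1', 'alice', 'sg-a'),  -- owned by alice, in alice's SG
    ('r2', 'bob',   'sg-a'),  -- owned by bob, but in alice's SG
    ('r3', 'alice', 'sg-b'),  -- owned by alice, in bob's SG
    ('r4', 'bob',   'sg-b');  -- unrelated to alice
""")


def visible_rule_ids(conn, project_id):
    """Step 1: fetch only the IDs of rules the caller is allowed to see.

    A rule is visible if the caller owns it or owns the SG it belongs to.
    """
    rows = conn.execute(
        "SELECT r.id FROM sg_rules AS r "
        "JOIN sgs AS g ON g.id = r.sg_id "
        "WHERE r.project_id = ? OR g.project_id = ? "
        "ORDER BY r.id",
        (project_id, project_id),
    ).fetchall()
    return [row[0] for row in rows]


def get_rules(conn, rule_ids):
    """Step 2: load the full rows, using the IDs from step 1 as a filter."""
    if not rule_ids:
        return []
    placeholders = ",".join("?" for _ in rule_ids)
    return conn.execute(
        "SELECT id, project_id, sg_id FROM sg_rules "
        f"WHERE id IN ({placeholders}) ORDER BY id",
        rule_ids,
    ).fetchall()


ids = visible_rule_ids(conn, "alice")
rules = get_rules(conn, ids)
print(ids)  # ['r1', 'r2', 'r3']
```

The point of the split is that the existing lookup stays untouched and simply receives an extra ID filter, so rows belonging to other projects never leave the database.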

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/720049

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/720051

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/720137

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.opendev.org/720137
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1afe935de81bbfca6ea29c239c55f5768d74410d
Submitter: Zuul
Branch: stable/rocky

commit 1afe935de81bbfca6ea29c239c55f5768d74410d
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Feb 18 17:08:22 2020 +0000

    Filter by owner SGs when retrieving the SG rules

    Retrieving the SG rules now is used the admin context. This allows to
    get all possible rules, independently of the user calling. The filters
    passed and the RBAC policies filter those results, returning only:
    - The SG rules belonging to the user.
    - The SG rules belonging to a SG owned by the user.

    However, if the SG list is too long, the query can take a lot of time.
    Instead of this, the filtering is done in the DB query. If no filters
    are passed to "get_security_group_rules" and the context is not the
    admin context, only the rules specified in the first paragraph will
    be retrieved.

    Because overwriting the method "get_objects" is too complex, an
    intermediate query is done to retrieve the SG rule IDs. Those IDs
    will be used as a filter in the "get_objects" call.

    Conflicts:
          neutron/objects/securitygroup.py
          neutron/tests/unit/db/test_securitygroups_db.py
          neutron/tests/unit/objects/test_securitygroup.py

    Closes-Bug: #1863201

    Change-Id: I25d3da929f8d0b6ee15d7b90ec59b9d58a4ae6a5
    (cherry picked from commit d874c46bff7045ba25f5dd6e790f7ddb209cb224)
    (cherry picked from commit d3905264b7659b1d10a68e3629861d5f0ba13568)
    (cherry picked from commit 61dc621c1ba40efcedabdfb9f3a1854cea227d2c)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/720049
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d3905264b7659b1d10a68e3629861d5f0ba13568
Submitter: Zuul
Branch: stable/train

commit d3905264b7659b1d10a68e3629861d5f0ba13568
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Feb 18 17:08:22 2020 +0000

    Filter by owner SGs when retrieving the SG rules

    Retrieving the SG rules now is used the admin context. This allows to
    get all possible rules, independently of the user calling. The filters
    passed and the RBAC policies filter those results, returning only:
    - The SG rules belonging to the user.
    - The SG rules belonging to a SG owned by the user.

    However, if the SG list is too long, the query can take a lot of time.
    Instead of this, the filtering is done in the DB query. If no filters
    are passed to "get_security_group_rules" and the context is not the
    admin context, only the rules specified in the first paragraph will
    be retrieved.

    Because overwriting the method "get_objects" is too complex, an
    intermediate query is done to retrieve the SG rule IDs. Those IDs
    will be used as a filter in the "get_objects" call.

    Conflicts:
          neutron/db/securitygroups_db.py
          neutron/objects/securitygroup.py

    Closes-Bug: #1863201

    Change-Id: I25d3da929f8d0b6ee15d7b90ec59b9d58a4ae6a5
    (cherry picked from commit d874c46bff7045ba25f5dd6e790f7ddb209cb224)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/720686

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.opendev.org/720686
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=093b861bb4edcd2af0cdd31351158ba4e2fa2435
Submitter: Zuul
Branch: stable/queens

commit 093b861bb4edcd2af0cdd31351158ba4e2fa2435
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Feb 18 17:08:22 2020 +0000

    Filter by owner SGs when retrieving the SG rules

    Retrieving the SG rules now is used the admin context. This allows to
    get all possible rules, independently of the user calling. The filters
    passed and the RBAC policies filter those results, returning only:
    - The SG rules belonging to the user.
    - The SG rules belonging to a SG owned by the user.

    However, if the SG list is too long, the query can take a lot of time.
    Instead of this, the filtering is done in the DB query. If no filters
    are passed to "get_security_group_rules" and the context is not the
    admin context, only the rules specified in the first paragraph will
    be retrieved.

    Because overwriting the method "get_objects" is too complex, an
    intermediate query is done to retrieve the SG rule IDs. Those IDs
    will be used as a filter in the "get_objects" call.

    Conflicts:
          neutron/objects/securitygroup.py
          neutron/tests/unit/db/test_securitygroups_db.py
          neutron/tests/unit/objects/test_securitygroup.py

    Closes-Bug: #1863201

    Change-Id: I25d3da929f8d0b6ee15d7b90ec59b9d58a4ae6a5
    (cherry picked from commit d874c46bff7045ba25f5dd6e790f7ddb209cb224)
    (cherry picked from commit d3905264b7659b1d10a68e3629861d5f0ba13568)
    (cherry picked from commit 61dc621c1ba40efcedabdfb9f3a1854cea227d2c)

tags: added: in-stable-queens
tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein)

Reviewed: https://review.opendev.org/720051
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=61dc621c1ba40efcedabdfb9f3a1854cea227d2c
Submitter: Zuul
Branch: stable/stein

commit 61dc621c1ba40efcedabdfb9f3a1854cea227d2c
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Feb 18 17:08:22 2020 +0000

    Filter by owner SGs when retrieving the SG rules

    Retrieving the SG rules now is used the admin context. This allows to
    get all possible rules, independently of the user calling. The filters
    passed and the RBAC policies filter those results, returning only:
    - The SG rules belonging to the user.
    - The SG rules belonging to a SG owned by the user.

    However, if the SG list is too long, the query can take a lot of time.
    Instead of this, the filtering is done in the DB query. If no filters
    are passed to "get_security_group_rules" and the context is not the
    admin context, only the rules specified in the first paragraph will
    be retrieved.

    Because overwriting the method "get_objects" is too complex, an
    intermediate query is done to retrieve the SG rule IDs. Those IDs
    will be used as a filter in the "get_objects" call.

    Conflicts:
          neutron/objects/securitygroup.py

    Closes-Bug: #1863201

    Change-Id: I25d3da929f8d0b6ee15d7b90ec59b9d58a4ae6a5
    (cherry picked from commit d874c46bff7045ba25f5dd6e790f7ddb209cb224)
    (cherry picked from commit d3905264b7659b1d10a68e3629861d5f0ba13568)

tags: added: in-stable-stein
tags: removed: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron queens-eol

This issue was fixed in the openstack/neutron queens-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol

This issue was fixed in the openstack/neutron rocky-eol release.
