Deleting a port on a system with 1K ports takes too long

Bug #1779882 reported by Daniel Alvarez on 2018-07-03
This bug affects 3 people
Affects: neutron
Importance: High
Assigned to: Rodolfo Alonso

Bug Description

When attempting to delete a port on a system with 1K ports, it takes around 35 seconds to complete:

$ time openstack port delete port60_2

real 0m34.367s
user 0m3.497s
sys 0m0.187s

The log is *full* of the following message when I run the command:

neutron-server[324]: DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [None req-a936bb85-d881-441b-aa07-74c4779d1771 demo demo] Attributes excluded by policy engine: [u'binding:profile', u'binding:vif_details', u'binding:vif_type', u'binding:host_id'] {{(pid=342) _exclude_attributes_by_policy /opt/stack/neutron/neutron/pecan_wsgi/hooks/policy_enforcement.py:256}}

To be precise: 896 messages like this ^

$ sudo journalctl -u devstack@q-svc | grep "Attributes excluded by policy engine" | wc -l
33626

$ time openstack port delete port60_2

real 0m34.367s
user 0m3.497s
sys 0m0.187s

$ sudo journalctl -u devstack@q-svc | grep "Attributes excluded by policy engine" | wc -l
34522

I'm using networking-ovn as the mechanism driver, but this looks unrelated to the backend :?

Daniel Alvarez (dalvarezs) wrote :

Deleting the port as the admin user doesn't cause those messages, and the operation takes around 5 seconds instead of 35.

zhaobo (zhaobo6) wrote :

Thank you.

You mentioned that you created 1K "free" ports, just DB records, right?

If so, I don't think the policy engine or the mechanism driver is the issue.
Port deletion is a very complex operation. Every step (L3 pre-check, callbacks, port pre/postcommit calls to the mechanism drivers, deleting the port record in the DB) takes time, so this bug sounds like a very broad problem.

It would be better to measure which step takes the most time, so that we know where to focus.
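
One way to do that measurement is sketched below: wrap each step in a timer and print the steps sorted by elapsed time. This is only an illustration. The step names mirror the list above, but delete_port_stub() and its sleeps are hypothetical stand-ins, not Neutron code.

```python
import time
from contextlib import contextmanager

timings = {}  # step name -> accumulated seconds


@contextmanager
def timed(step):
    """Accumulate the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = timings.get(step, 0.0) + time.perf_counter() - start


def delete_port_stub():
    # Hypothetical stand-ins for the real steps; the sleeps simulate work.
    with timed("l3_pre_check"):
        time.sleep(0.01)
    with timed("callbacks"):
        time.sleep(0.02)
    with timed("md_pre_post_commit"):
        time.sleep(0.01)
    with timed("db_delete"):
        time.sleep(0.05)


delete_port_stub()
slowest = max(timings, key=timings.get)
for step, secs in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{step:20s} {secs:.3f}s")
print("slowest step:", slowest)
```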

Changed in neutron:
status: New → Opinion
importance: Undecided → Wishlist
tags: added: api db
Changed in neutron:
status: Opinion → Incomplete
Miguel Angel Ajo (mangelajo) wrote :

Sorry folks, I disagree with the importance. I just checked and I can confirm the problem.

It doesn't happen if you are admin. It seems to be related to the policy checking, but we need somebody to dig in and find the real culprit, and see whether it's a bug or whether it needs optimization. 30s to delete a port is unacceptable.

Changed in neutron:
status: Incomplete → Confirmed
importance: Wishlist → High
Daniel Alvarez (dalvarezs) wrote :

| You mentioned that you created 1K "free" ports, just DB records, right?
Yes, not bound.

I think the log shows clearly where the time's spent.
Why do you say it's incomplete? Let me know what else you need.

I agree with @ajo here, 35 seconds is unacceptable.

Hello:

When a show/delete operation is executed using OpenStack Client, the OpenStack SDK first tries to retrieve the object.

If the identifier given is the ID, the Neutron server calls plugin.NeutronDBPluginV2.get_port(id) and returns either the object or the NotFound exception.

If the identifier given is the name, the Neutron server always returns the NotFound exception. The SDK then retrieves all ports (a long query plus the subsequent processing) and selects those with the requested name. Retrieving the whole port list is inefficient and takes a lot of time when the list is big. In my case, 1K ports --> 40 secs.

I'll propose a list of patches to solve this problem.
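
The lookup described above can be modelled with a small in-memory sketch. It is not the SDK or Neutron code: FakeServer, find_port and the "id-N"/"portN" naming are hypothetical, and the sketch only counts how many port records the server has to send back in each case.

```python
class NotFound(Exception):
    pass


class FakeServer:
    """In-memory stand-in for the API; counts the port records it returns."""

    def __init__(self, ports):
        self.ports = ports            # {port_id: name}
        self.records_returned = 0

    def get_port(self, port_id):
        if port_id not in self.ports:
            raise NotFound(port_id)
        self.records_returned += 1
        return {"id": port_id, "name": self.ports[port_id]}

    def list_ports(self):
        self.records_returned += len(self.ports)
        return [{"id": i, "name": n} for i, n in self.ports.items()]


def find_port(server, name_or_id):
    # First attempt: treat the argument as an ID.
    try:
        return server.get_port(name_or_id)
    except NotFound:
        pass
    # Fallback: fetch every port and filter by name on the client side.
    matches = [p for p in server.list_ports() if p["name"] == name_or_id]
    if not matches:
        raise NotFound(name_or_id)
    return matches[0]


server = FakeServer({f"id-{i}": f"port{i}" for i in range(1000)})

find_port(server, "id-60")
by_id = server.records_returned

server.records_returned = 0
find_port(server, "port60")
by_name = server.records_returned

print("records returned, lookup by ID:  ", by_id)
print("records returned, lookup by name:", by_name)
```

In this toy model the lookup by name costs one record per existing port, which is why the cost grows with the size of the port list.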

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)

Fix proposed to branch: master
Review: https://review.openstack.org/637237

Changed in neutron:
status: Confirmed → In Progress

Change abandoned by Rodolfo Alonso Hernandez (<email address hidden>) on branch: master
Review: https://review.openstack.org/637235
Reason: Abandoned in favor of a more generic modification in OSclient to retrieve an element using the ID or the name. In the second case, a "list" (GET -> get_objects) command with filters (name=id_or_name) will be sent instead of a generic "list" command, to reduce the number of elements retrieved.

Change abandoned by Rodolfo Alonso Hernandez (<email address hidden>) on branch: master
Review: https://review.openstack.org/637237
Reason: Abandoned in favor of a more generic modification in OSclient to retrieve an element using the ID or the name. In the second case, a "list" (GET -> get_objects) command with filters (name=id_or_name) will be sent instead of a generic "list" command, to reduce the number of elements retrieved.
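
The difference between a generic "list" and a "list" with a name filter can be illustrated with a toy in-memory model. This is an assumption-laden sketch, not the OSclient change itself: FakeServer and its list_ports(**filters) method are hypothetical and simply count the records the server would return.

```python
class FakeServer:
    """In-memory stand-in for the API; counts the port records it returns."""

    def __init__(self, ports):
        self.ports = ports            # {port_id: name}
        self.records_returned = 0

    def list_ports(self, **filters):
        rows = [{"id": i, "name": n} for i, n in self.ports.items()]
        # Server-side filtering: only matching rows are sent to the client.
        for key, value in filters.items():
            rows = [r for r in rows if r.get(key) == value]
        self.records_returned += len(rows)
        return rows


server = FakeServer({f"id-{i}": f"port{i}" for i in range(1000)})

# Generic list, filtered on the client.
unfiltered = [p for p in server.list_ports() if p["name"] == "port60"]
sent_unfiltered = server.records_returned

# List with a name filter, filtered on the server.
server.records_returned = 0
filtered = server.list_ports(name="port60")
sent_filtered = server.records_returned

print("generic list: ", sent_unfiltered, "records sent")
print("filtered list:", sent_filtered, "records sent")
```

Both calls find the same port; only the number of records transferred and processed differs.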
