excessive SQL query fanout on port list with many trunk ports

Bug #1842150 reported by Dirk Mueller
This bug affects 1 person
Affects   Status   Importance   Assigned to     Milestone
neutron   New      Undecided    Nate Johnston   -

Bug Description

When running something like

openstack port list --project=$uuid

in a project that has many (~500) trunk ports defined, the API request takes a very long
time to complete (> 60s) and returns an excessively large JSON document.

This is due to the following code in

neutron/services/trunk/plugin.py

    def _extend_port_trunk_details(port_res, port_db):
        """Add trunk details to a port."""
        if port_db.trunk_port:
            subports = {
                x.port_id: {'segmentation_id': x.segmentation_id,
                            'segmentation_type': x.segmentation_type,
                            'port_id': x.port_id}
                for x in port_db.trunk_port.sub_ports
            }
            core_plugin = directory.get_plugin()
            ports = core_plugin.get_ports(
                context.get_admin_context(), filters={'id': subports})
            for port in ports:
                subports[port['id']]['mac_address'] = port['mac_address']
            trunk_details = {'trunk_id': port_db.trunk_port.id,
                             'sub_ports': [x for x in subports.values()]}
            port_res['trunk_details'] = trunk_details

        return port_res

This issues one SQL query (via core_plugin.get_ports()) for each trunk port (!) in order to return its list of subports in the trunk_details section.
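The fanout described here can be reproduced without Neutron at all. The following is a minimal sketch under stated assumptions: every class and function name in it is a hypothetical stand-in (none are Neutron APIs), and each call to the fake get_ports() is taken to model one SQL round trip.

```python
# Illustrative stand-ins only: none of these classes or functions are Neutron APIs.

class CountingCorePlugin:
    """Fake core plugin that counts get_ports() calls (one call ~ one SQL query)."""

    def __init__(self, ports_by_id):
        self._ports_by_id = ports_by_id
        self.query_count = 0

    def get_ports(self, filters):
        self.query_count += 1
        return [self._ports_by_id[pid] for pid in filters['id']
                if pid in self._ports_by_id]


def extend_port_trunk_details(core_plugin, port_res, subport_ids):
    """Per-port hook with the same shape as the function quoted in the report."""
    subports = {pid: {'port_id': pid} for pid in subport_ids}
    ports = core_plugin.get_ports(filters={'id': list(subports)})
    for port in ports:
        subports[port['id']]['mac_address'] = port['mac_address']
    port_res['trunk_details'] = {'sub_ports': list(subports.values())}
    return port_res


def list_ports(core_plugin, trunks):
    """Build a port listing; `trunks` maps parent port id -> subport ids."""
    return [extend_port_trunk_details(core_plugin, {'id': parent_id}, sub_ids)
            for parent_id, sub_ids in trunks.items()]


def build_fixture(num_trunks, subports_per_trunk):
    """Create fake subport rows and the trunk -> subports mapping."""
    ports_by_id, trunks = {}, {}
    for t in range(num_trunks):
        sub_ids = ['sub-%d-%d' % (t, s) for s in range(subports_per_trunk)]
        for sid in sub_ids:
            ports_by_id[sid] = {'id': sid, 'mac_address': 'mac-' + sid}
        trunks['parent-%d' % t] = sub_ids
    return ports_by_id, trunks


if __name__ == '__main__':
    ports_by_id, trunks = build_fixture(num_trunks=500, subports_per_trunk=2)
    plugin = CountingCorePlugin(ports_by_id)
    listing = list_ports(plugin, trunks)
    # The lookup count grows linearly with the number of trunk ports listed.
    print(len(listing), plugin.query_count)
```

In this model the number of lookups equals the number of trunk parent ports in the listing, regardless of how few subports each trunk has.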

This trunk_details section is not actually shown by the openstack client, so the extra work is done for no good reason. (It might appear in --long output; I didn't check.)

It isn't entirely clear to me how to solve this, but I would suggest that we skip the _extend_port*details callouts for each port type when "port list" is used and *only* do those in "port show $uuid" type queries (where the overhead of looking up details on a single port is negligible). The fanout of those subqueries (which can in turn fan out to further SQL queries) for listings (especially in projects with 10000+ active ports) is excessive and clogs the database and the neutron service processes for a long time.

Another way would be to add a flag to the API to skip those lookups, and make sure Nova (which triggers those listings) does not request the detailed lookups (it ignores the result entirely anyway).

Lajos Katona (lajos-katona) wrote :

Hi, thanks for the bug report.
Some questions/suggestions: which version of Neutron are you using?
There are some original measurements for trunk ports: https://wiki.openstack.org/wiki/Neutron/TrunkPort#Performance_.2F_Scaling
They may be useful to read.

Generally, trunk ports have scaling issues if you have many subports per trunk, but I don't think that is your case.

tags: added: trunk
Changed in neutron:
assignee: nobody → Nate Johnston (nate-johnston)
Nate Johnston (nate-johnston) wrote :

What version of OpenStack are you running?

The _extend_port_trunk_details function is part of a general-purpose mechanism called resource_extend that allows plugins like trunk ports to enrich any calls for a specific piece of data. In this case, this function extends the concept of a "port" to include trunk port information.

I can see how this would not scale well; it would be better to have a single database transaction that loads all trunk ports at once and then collates that data with the set of ports. But because the resource_extend framework is called separately for each port, there is little opportunity to optimize this for large listings.
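The "load everything once, then collate" idea can be sketched as follows. This is only an illustration under stated assumptions: list_ports_batched, make_counting_lookup and the get_ports callable are hypothetical names, not Neutron code, and the sketch deliberately sidesteps the per-port resource_extend hook, which is exactly the part that makes this hard to do in practice.

```python
# Illustrative sketch only: not Neutron code, and not wired into resource_extend.

def list_ports_batched(get_ports, trunks):
    """Collate trunk details using a single bulk lookup.

    `get_ports` is a hypothetical callable taking filters={'id': [...]};
    `trunks` maps parent port id -> list of subport ids.
    """
    all_subport_ids = [sid for sub_ids in trunks.values() for sid in sub_ids]
    # One lookup for every subport of every trunk in the listing.
    mac_by_id = {p['id']: p['mac_address']
                 for p in get_ports(filters={'id': all_subport_ids})}
    listing = []
    for parent_id, sub_ids in trunks.items():
        sub_ports = [{'port_id': sid, 'mac_address': mac_by_id.get(sid)}
                     for sid in sub_ids]
        listing.append({'id': parent_id,
                        'trunk_details': {'sub_ports': sub_ports}})
    return listing


def make_counting_lookup(ports_by_id):
    """Return (lookup, counter); counter['calls'] tracks bulk lookups made."""
    counter = {'calls': 0}

    def get_ports(filters):
        counter['calls'] += 1
        return [ports_by_id[pid] for pid in filters['id'] if pid in ports_by_id]

    return get_ports, counter


if __name__ == '__main__':
    ports_by_id = {
        's1': {'id': 's1', 'mac_address': 'm1'},
        's2': {'id': 's2', 'mac_address': 'm2'},
        's3': {'id': 's3', 'mac_address': 'm3'},
    }
    trunks = {'p1': ['s1', 's2'], 'p2': ['s3']}
    get_ports, counter = make_counting_lookup(ports_by_id)
    print(list_ports_batched(get_ports, trunks), counter['calls'])
```

Here the number of lookups stays at one no matter how many trunks are listed. A real query would likely need to chunk very large id lists, which this sketch does not attempt.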

I will raise this with the upstream community and see if we can brainstorm some other ways we can address this concern.

Dirk Mueller (dmllr) wrote :

I'm looking at stable/pike (sorry) and stable/rocky; the performance characteristics are similar there.

Agreed, the fundamental issue here is the fanout of SQL queries. I was wondering if we could change this to not run _extend_port_trunk_details in "port list" type API calls and only run it on port show. Alternatively, we need an extra flag to suppress this by default.

Nova making these calls is especially problematic from a scale perspective (and it completely ignores the result).

Slawek Kaplonski (slaweq) wrote :

Hi,

Your proposal to not return these extended port details in the GET /ports call is not a good one. Please keep in mind that this is a defined and stable API, documented at https://docs.openstack.org/api-ref/network/v2/?expanded=list-ports-detail#ports, and we have to ensure that the API keeps working this way.

tags: added: loadimpact