neutron API list calls taking lot of time

Bug #1236704 reported by Ravi Chunduru
This bug affects 3 people
Affects   Status    Importance   Assigned to         Milestone
neutron   Invalid   Undecided    Unassigned
Grizzly   Invalid   High         Salvatore Orlando
Havana    Invalid   High         Salvatore Orlando

Bug Description

Neutron API calls take a lot of time compared to nova or keystone service APIs.
In our deployment the delay is large enough that we had to increase neutron_url_timeout in nova.conf to 120s; this was required for nova list to succeed.

In our analysis we found that DB access was quick enough, but considerable time is spent in the following code:

https://github.com/openstack/neutron/blob/master/neutron/api/v2/base.py#L236

Here is the code for reference
if do_authz:
    # FIXME(salvatore-orlando): obj_getter might return references to
    # other resources. Must check authZ on them too.
    # Omit items from list that should not be visible
    obj_list = [obj for obj in obj_list
                if policy.check(request.context,
                                self._plugin_handlers[self.SHOW],
                                obj,
                                plugin=self._plugin)]

There is a clear comment from Salvatore to fix the above code.

# FIXME(salvatore-orlando): obj_getter might return references to
# other resources. Must check authZ on them too.
# Omit items from list that should not be visible

We need to either fix it or otherwise improve the neutron API response time for list calls.
Commenting out the above code reduced the port list time in my devstack setup from 18 seconds to 6 seconds for about 500 ports.
This issue is reproduced in Grizzly, and I am sure it is an issue for Havana too.
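
One way to separate DB time from authorization-filter time, as the analysis above does, is to time the two phases independently. The following is only a minimal sketch: `fetch_ports` and `check_policy` are hypothetical stand-ins, not Neutron functions, and the timings it prints say nothing about a real deployment.

```python
import time


def fetch_ports(count):
    """Hypothetical stand-in for the plugin's DB query."""
    return [{"id": i, "tenant_id": "t1"} for i in range(count)]


def check_policy(context, action, obj):
    """Hypothetical stand-in for policy.check(): allow the tenant's own objects."""
    return obj["tenant_id"] == context["tenant_id"]


def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def list_ports(context, count):
    """Return the visible ports plus the time spent in each phase."""
    ports, db_seconds = timed(fetch_ports, count)
    visible, authz_seconds = timed(
        lambda: [p for p in ports if check_policy(context, "get_port", p)]
    )
    return visible, {"db": db_seconds, "authz": authz_seconds}


if __name__ == "__main__":
    visible, timings = list_ports({"tenant_id": "t1"}, 500)
    print(len(visible), timings)
```

Swapping the stand-ins for the real query and the real policy call would show which phase dominates for a given number of ports.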

Aaron Rosen (arosen) wrote :

We actually made a number of DB-related performance improvements in Havana. Any chance you can try with the Havana code base? I've run port-list with hundreds of ports, and it returns in a few seconds.

Ravi Chunduru (ravivsn) wrote : Re: [Bug 1236704] Re: neutron API calls taking lot of time

Hi Aaron,
We have a production OpenStack deployment on Grizzly, so any
improvements must be backported.
From my observations, the problem is not DB-related; it comes from the
additional policy checks done on references to other resources in the response.

Thanks,
-Ravi.

On Mon, Oct 7, 2013 at 11:04 PM, Aaron Rosen <email address hidden> wrote:

> We actually made a number of db related performance improvements in
> havana. Any chance you can try with the havana code base? I've run
> port-list with 100's of ports which return in a few seconds.

Salvatore Orlando (salvatore-orlando) wrote : Re: neutron API calls taking lot of time

This issue indeed does not affect Havana - the root cause is the fact that the policy engine makes a db access for every item returned by a get.
The fix for Grizzly (and Folsom) is rather extensive, and possibly not acceptable as a backport.

I can try and work out a smaller fix for reducing the number of db accesses performed by this operation.
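
The pattern described above (one DB access per returned item) and one possible way to reduce it can be sketched as follows. This is an illustration under stated assumptions, not the actual Neutron fix: `CountingDB`, `check_port`, and the rule itself are hypothetical, and the cache is assumed to live only for a single request.

```python
class CountingDB:
    """Hypothetical in-memory 'database' that counts lookups."""

    def __init__(self, networks):
        self._networks = networks
        self.lookups = 0

    def get_network(self, network_id):
        self.lookups += 1
        return self._networks[network_id]


def check_port(port, network, tenant_id):
    """Hypothetical rule: the owner of the port or of its network may see it."""
    return tenant_id in (port["tenant_id"], network["tenant_id"])


def filter_naive(ports, db, tenant_id):
    """One lookup per returned item."""
    return [p for p in ports
            if check_port(p, db.get_network(p["network_id"]), tenant_id)]


def filter_cached(ports, db, tenant_id):
    """One lookup per distinct network, reused for the rest of the request."""
    cache = {}
    visible = []
    for port in ports:
        net_id = port["network_id"]
        if net_id not in cache:
            cache[net_id] = db.get_network(net_id)
        if check_port(port, cache[net_id], tenant_id):
            visible.append(port)
    return visible


if __name__ == "__main__":
    networks = {"n1": {"tenant_id": "t1"}, "n2": {"tenant_id": "t2"}}
    ports = [{"id": i, "tenant_id": "t1", "network_id": "n1" if i % 2 else "n2"}
             for i in range(500)]
    naive_db, cached_db = CountingDB(networks), CountingDB(networks)
    filter_naive(ports, naive_db, "t1")
    filter_cached(ports, cached_db, "t1")
    print(naive_db.lookups, cached_db.lookups)
```

Both filters return the same ports; the naive one performs one lookup per port, while the cached one performs one per distinct network.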

Changed in neutron:
assignee: nobody → Salvatore Orlando (salvatore-orlando)
status: New → Invalid
Ravi Chunduru (ravivsn) wrote :

I think there is still a valid bug in Havana. We need to understand the reason for doing policy checks again on the response object. There is definitely a need for optimization. The check could have been done in the plugin itself rather than duplicated with a general approach on all list API calls.

Looking at the response time for neutron port-list, IMHO it should not be ignored and must be optimized further.

Ravi Chunduru (ravivsn)
summary: - neutron API calls taking lot of time
+ neutron API list calls taking lot of time
Raj Geda (rgeda) wrote :

The code works perfectly when we run under 400-500 ports. With more ports we see this issue. The response time increases with the number of ports listed. For example, response times:
 30s for 440 ports
 40s for 600 ports
 1m for 900 ports
 1m30s for 1440 ports
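
As a rough check on the figures listed above, dividing each response time by its port count gives the cost per port. The short sketch below does only that arithmetic; the variable and function names are chosen here for illustration.

```python
# Timings reported above, as (number of ports, response time in seconds).
measurements = [(440, 30), (600, 40), (900, 60), (1440, 90)]


def ms_per_port(ports, seconds):
    """Average response time per listed port, in milliseconds."""
    return 1000.0 * seconds / ports


per_port = [round(ms_per_port(p, s), 1) for p, s in measurements]
print(per_port)  # [68.2, 66.7, 66.7, 62.5]
```

The per-port cost stays between roughly 62 and 68 ms across all four measurements, so the response time grows approximately linearly with the number of ports.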

We definitely need to do some code optimization. When I run the SQL query that the code issues to get the port list, the response takes less than 1s for 1440 ports.

SELECT ports.tenant_id AS ports_tenant_id, ports.id AS ports_id, ports.name AS ports_name,
       ports.network_id AS ports_network_id, ports.mac_address AS ports_mac_address,
       ports.admin_state_up AS ports_admin_state_up, ports.status AS ports_status,
       ports.device_id AS ports_device_id, ports.device_owner AS ports_device_owner
FROM ports;

Changed in neutron:
status: Invalid → Confirmed
Raj Geda (rgeda) wrote :

One thing I am not clear about: the code comment is marked FIXME, which means this issue has to be fixed or addressed, and that the author intended to improve it. Reference code below.

# FIXME(salvatore-orlando): obj_getter might return references to
# other resources. Must check authZ on them too.
# Omit items from list that should not be visible
obj_list = [obj for obj in obj_list
            if policy.check(request.context,
                            self._plugin_handlers[self.SHOW],
                            obj,
                            plugin=self._plugin)]

Rodney Peck (ropeck) wrote :

Commenting to emphasize the need for this fix in both Grizzly and Havana. From our debugging, this isn't a database performance issue. It's making too many needless calls to the db when doing the policy checks as shown in the code snippets above. If this isn't corrected, the load on the db from a nova list or nova instance booting will make the entire cluster unresponsive.

Again, just making the db calls more efficient or improving the db isn't the complete solution. The number of db calls made needs to be drastically reduced to just the ones that make sense.

Salvatore Orlando (salvatore-orlando) wrote :

I did not realize this bug was still open.
Every performance issue deserves high priority.

In this case I have no idea about what would be deemed acceptable; I will update this bug report soon.

As far as I know there are no DB accesses in policy checks.

Changed in neutron:
importance: Undecided → High
Changed in neutron:
assignee: Salvatore Orlando (salvatore-orlando) → Anirudh Vedantam (anirudh-vedantam)
Anirudh Vedantam (anirudh-vedantam) wrote :

What I want to know is why the policy check is necessary in the first place for the object list that is retrieved from the database.

Salvatore Orlando (salvatore-orlando) wrote :

An issue concerning slowness of list operations has been recently identified and fixed: https://bugs.launchpad.net/neutron/+bug/1302467

The fix for the above issue will be part of the upcoming Icehouse release and will be backported to Havana.
The fix for the issue reported in https://bugs.launchpad.net/neutron/+bug/1302611, less important than the previous one, will instead be available in the first Icehouse stable release, and will likely be backported to Havana too.

The policy checks are still performed after the DB objects are retrieved, but they're now much faster.
We are working on further improvements, but it is not yet clear whether they will be backportable.

This bug is now going to be marked as invalid, as we have new trackers for this issue.
Please reopen it if you wish.

Changed in neutron:
status: Confirmed → Invalid
importance: High → Undecided
assignee: Anirudh Vedantam (anirudh-vedantam) → nobody
status: Invalid → Incomplete
assignee: nobody → Salvatore Orlando (salvatore-orlando)
assignee: Salvatore Orlando (salvatore-orlando) → Anirudh Vedantam (anirudh-vedantam)
Salvatore Orlando (salvatore-orlando) wrote :

Keeping it incomplete as the bug has a current assignee, and perhaps the assignee is working on something related as well.

Changed in neutron:
assignee: Anirudh Vedantam (anirudh-vedantam) → nobody
Armando Migliaccio (armando-migliaccio) wrote :

This bug is > 240 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Bernard Cafarelli (bcafarel) wrote :

Updating the status for the bug and also the branch-specific statuses: these branches are EOL, and many performance-related changes have been merged since.

Changed in neutron:
status: Incomplete → Invalid