findall in novaclient/base.py is inefficient

Bug #1202179 reported by Christian Berendt
This bug affects 1 person
Affects           | Status       | Importance | Assigned to | Milestone
python-novaclient | Fix Released | Undecided  | Joe Gordon  |

Bug Description

As noted in the code comments ("This isn't very efficient: it loads the entire list then filters on the Python side."), findall() (and therefore find()) in novaclient/base.py is inefficient. Running "nova show HOSTNAME" against a tenant holding a lot of instances (> 500) takes too long. I wrote a simple patch for findall() and hope that it solves the issue without breaking anything else.

Changed in python-novaclient:
assignee: nobody → Christian Berendt (berendt)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-novaclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/37487

Changed in python-novaclient:
status: New → In Progress
Christian Berendt (berendt) wrote :

The proposed patch doesn't solve the issue of filtering on the Python side. It only reduces the overhead of fetching the details of every resource (by using /resources instead of /resources/detail), which improves run-time performance.

Christian Berendt (berendt) wrote :

That is, run-time performance improves when a lot of resources are assigned to a tenant/project.
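
For illustration, the two-step lookup described here can be sketched roughly as follows. This is not the novaclient code: FakeApi, list_brief, get and find_by_name are hypothetical names made up for this sketch, and the fake API only mimics the difference between a brief listing and a per-resource detail request.

```python
from dataclasses import dataclass, field


@dataclass
class FakeApi:
    """Hypothetical stand-in for the compute API (not a novaclient class)."""
    servers: dict                      # id -> full record
    calls: list = field(default_factory=list)

    def list_brief(self):
        # Analogous to GET /resources: only id and name per resource.
        self.calls.append("list_brief")
        return [{"id": sid, "name": rec["name"]}
                for sid, rec in self.servers.items()]

    def get(self, server_id):
        # Analogous to GET /resources/<id>: full details of one resource.
        self.calls.append("get")
        return self.servers[server_id]


def find_by_name(api, name):
    """Resolve name -> ID from the brief list, then load details separately."""
    # The name filtering still happens on the client side.
    matching_ids = [s["id"] for s in api.list_brief() if s["name"] == name]
    return [api.get(sid) for sid in matching_ids]


api = FakeApi({
    "1": {"name": "bob", "status": "ACTIVE", "flavor": "42"},
    "2": {"name": "carl", "status": "ACTIVE", "flavor": "42"},
})
result = find_by_name(api, "bob")
```

Details are fetched only for the matches, so the listing itself stays small, but every resource is still listed once.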

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-novaclient (master)

Reviewed: https://review.openstack.org/37487
Committed: http://github.com/openstack/python-novaclient/commit/5fe9408d2e53ac24a578d0ed568a4cede303fe58
Submitter: Jenkins
Branch: master

commit 5fe9408d2e53ac24a578d0ed568a4cede303fe58
Author: Christian Berendt <email address hidden>
Date: Wed Jul 17 16:48:04 2013 +0200

    make findall in novaclient/base.py more efficient

    Use /resources instead of /resources/detail to resolve
    the resource ID by the name and load the details of the
    resource in a separate step. This reduces the overhead
    to resolve the resource ID and results in a better runtime
    performance.

    This patch does not solve the issue that the name resolving
    takes place on the client side. For solving this issue new
    Nova API methods are necessary.

    fixes bug #1202179

    Change-Id: Ib753b1d090cb74b2d137c68f6899dad4ae2ec1ca

Changed in python-novaclient:
status: In Progress → Fix Committed
Changed in python-novaclient:
status: Fix Committed → Fix Released
Joe Gordon (jogo) wrote :

findall is still very inefficient and still does a full list. The Nova API supports optional request parameters to limit the results. For example, this request limits the results to servers with the name bob:

http://23.253.253.59:8774/v2/09a25d8076494398bd16e77863194f50/servers?name=bob

There is a bug, though: it appears to be doing substring matching.
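
To make the substring-matching remark concrete, here is a small sketch. It assumes the server applies the name filter as an unanchored regular-expression search; that is an assumption made for this example, not something confirmed here. The helper names (server_side_match, anchored_name_query) are hypothetical.

```python
import re
from urllib.parse import urlencode

names = ["bob", "bobby", "not-bob"]


def server_side_match(pattern, candidates):
    # Assumption: the filter behaves like an unanchored regex search.
    return [n for n in candidates if re.search(pattern, n)]


# A bare name also matches every server whose name merely contains it.
substring_hits = server_side_match("bob", names)

# Anchoring the pattern restricts the result to the exact name.
exact_hits = server_side_match("^bob$", names)


def anchored_name_query(name):
    # Build the query string for an exact-name lookup. Names containing
    # regex metacharacters would need escaping, which is not handled here.
    return "/servers?" + urlencode({"name": "^%s$" % name})


query = anchored_name_query("bob")
```

The "^" and "$" characters are percent-encoded as %5E and %24 in the resulting query string.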

Joe Gordon (jogo) wrote :

Results from 'nova --timing show carl', showing that we are doing a full list of all servers and filtering client side:

http://paste.ubuntu.com/8228743/

Joe Gordon (jogo) wrote :

Re-opening, as we should use server-side filtering instead of filtering on the client side.

Changed in python-novaclient:
status: Fix Released → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-novaclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/118953

Changed in python-novaclient:
assignee: Christian Berendt (berendt) → Joe Gordon (jogo)
status: Triaged → In Progress
Joe Gordon (jogo) wrote :

Infra gave us some numbers on a detailed nova list (a 'servers/detail' request) in rax and HP:

460 instances: 10.6 seconds
184 instances: 4.7 seconds
84 instances: 2.2 seconds

So not doing a full list whenever possible should help considerably (assuming the regex query isn't too bad).

Will run further tests in a fake-virt devstack VM
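
As a rough check on the figures quoted above, the time per instance can be worked out from the three samples. This uses only the quoted numbers; the variable names are made up for the example.

```python
# (instance count, seconds for a detailed list), as quoted above
samples = [(460, 10.6), (184, 4.7), (84, 2.2)]

# Milliseconds spent per instance in each sample, rounded to one decimal.
per_instance_ms = [round(seconds / count * 1000, 1)
                   for count, seconds in samples]

for (count, seconds), ms in zip(samples, per_instance_ms):
    print("%4d instances: %5.1f s total, about %.1f ms per instance"
          % (count, seconds, ms))
```

The per-instance cost is similar across all three samples, so the cost of a detailed list grows roughly in proportion to the number of instances.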

Joe Gordon (jogo) wrote :

Using the fake virt driver with 202 ACTIVE instances, here is the output of 'nova --timing show bob'.

Without the fix (https://review.openstack.org/#/c/118953/3): http://paste.ubuntu.com/8294432/

+------------------------------------------------------------------------------------------------------------------+----------------+
| url                                                                                                              | seconds        |
+------------------------------------------------------------------------------------------------------------------+----------------+
| POST http://104.130.138.241:5000/v2.0/tokens                                                                     | 0.163534879684 |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers                                      | 0.85037112236  |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers/4d345b05-e7c4-4dbd-8398-1a3ff27710c3 | 0.055340051651 |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/flavors/42                                   | 0.015331029892 |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/images/777afb5e-e30b-4598-95da-896829ccba9b  | 0.412953853607 |
| Total                                                                                                            | 1.49753093719  |
+------------------------------------------------------------------------------------------------------------------+----------------+

GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers takes 0.85 seconds.

With the fix (https://review.openstack.org/#/c/118953/3): http://paste.ubuntu.com/8294428/

+------------------------------------------------------------------------------------------------------------------+-----------------+
| url                                                                                                              | seconds         |
+------------------------------------------------------------------------------------------------------------------+-----------------+
| POST http://104.130.138.241:5000/v2.0/tokens                                                                     | 0.157011032104  |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers?name=%5Eone%24                       | 0.201536178589  |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers/4d345b05-e7c4-4dbd-8398-1a3ff27710c3 | 0.0527729988098 |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/flavors/42                                   | 0.0162160396576 |
| GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/images/777afb5e-e30b-4598-95da-896829ccba9b  | 0.251439809799  |
| Total                                                                                                            | 0.67897605896   |
+------------------------------------------------------------------------------------------------------------------+-----------------+

GET http://104.130.138.241:8774/v2/754a4f21a49642ab9e912d6d16038b8e/servers?name=%5Eone%24 takes 0.2 seconds.
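
For reference, the two timing runs above can be compared with a few lines of arithmetic. The numbers are copied from the two tables; the dictionary keys are labels chosen for this sketch.

```python
# Seconds per request, copied from the two "nova --timing" outputs above.
before = {"tokens": 0.163534879684, "list": 0.85037112236,
          "show": 0.055340051651, "flavor": 0.015331029892,
          "image": 0.412953853607}
after = {"tokens": 0.157011032104, "list": 0.201536178589,
         "show": 0.0527729988098, "flavor": 0.0162160396576,
         "image": 0.251439809799}

total_before = sum(before.values())
total_after = sum(after.values())

# Speedup of the server listing call alone, and of the whole command.
list_speedup = round(before["list"] / after["list"], 1)
total_speedup = round(total_before / total_after, 1)

print("list call: %.1fx faster, whole command: %.1fx faster"
      % (list_speedup, total_speedup))
```

Part of the difference in the totals comes from the image request, which the change does not touch, so the listing call is the more meaningful comparison.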

OpenStack Infra (hudson-openstack) wrote : Fix merged to python-novaclient (master)

Reviewed: https://review.openstack.org/118953
Committed: https://git.openstack.org/cgit/openstack/python-novaclient/commit/?id=07260236ab2179579e0d0d2f9b7fb8027652dc32
Submitter: Jenkins
Branch: master

commit 07260236ab2179579e0d0d2f9b7fb8027652dc32
Author: Joe Gordon <email address hidden>
Date: Thu Sep 4 03:45:13 2014 +0000

    Make findall support server side filtering

    Instead of listing all servers and doing client-side filtering, use the
    server-side filtering on name. The servers list already supports filtering,
    so just pass a search_opts dictionary into list().

    This should speed up nova commands when a user has large numbers of servers.

    Change-Id: I6deea8523754ff213f43bd059fb00f34fc0e1a12
    Closes-Bug: #1202179
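
A minimal sketch of the idea in this commit message, passing the search criteria into list() as a search_opts dictionary, might look like the following. It is not the merged code: FakeServerManager is a hypothetical class, and its list() only imitates a server that filters by substring, as observed earlier in this report.

```python
class FakeServerManager:
    """Hypothetical manager used only to illustrate the approach."""

    def __init__(self, servers):
        self._servers = servers
        self.last_search_opts = None

    def list(self, search_opts=None):
        # Stand-in for the HTTP request. The fake "server" filters by
        # substring, so it may return more than the exact matches.
        self.last_search_opts = search_opts
        opts = search_opts or {}
        return [s for s in self._servers
                if all(str(v) in str(s.get(k, "")) for k, v in opts.items())]

    def findall(self, **kwargs):
        # Hand the criteria to list() so most filtering happens server side.
        listing = self.list(search_opts=kwargs)
        # Keep an exact check on the client, since the server match is looser.
        return [s for s in listing
                if all(s.get(k) == v for k, v in kwargs.items())]


manager = FakeServerManager([
    {"id": "1", "name": "bob"},
    {"id": "2", "name": "bobby"},
    {"id": "3", "name": "carl"},
])
found = manager.findall(name="bob")
```

Here the listing returns only the two servers whose names contain "bob", and the remaining client-side check narrows that to the exact match.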

Changed in python-novaclient:
status: In Progress → Fix Committed
Michael Still (mikal)
Changed in python-novaclient:
milestone: none → 2.20.0
status: Fix Committed → Fix Released