Too many lazy-loads in predictable situations

Bug #1540526 reported by Dan Smith
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Expired
Medium
Unassigned

Bug Description

During a normal tempest run, way (way) too many object lazy-loads are being triggered, which causes extra RPC and database traffic. In a given tempest run, we should be able to pretty much prevent any lazy-loads in that predictable situation. The only case where we might want to have some is where we are iterating objects and conditionally taking action that needs to load extra information.

On a random devstack-tempest job run sampled on 1-Feb-2016, a lot of lazy loads were seen:

  grep 'Lazy-loading' screen-n-cpu.txt.gz -c
  624

We should be able to vastly reduce this number without much work.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/274869

Changed in nova:
assignee: nobody → Dan Smith (danms)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/274870

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/274887

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/275288

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/274869
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2e96f7f2d7912f9c1b84d86065aa55a00da4b4ed
Submitter: Jenkins
Branch: master

commit 2e96f7f2d7912f9c1b84d86065aa55a00da4b4ed
Author: Dan Smith <email address hidden>
Date: Mon Feb 1 10:56:05 2016 -0800

    Prevent _heal_instance_info_cache() periodic lazy-loads

    This fixes _heal_instance_info_cache() so that it speculatively loads
    the flavor information for an instance during initial query, instead of
    always depending on lazy-load to get it (which triggers an extra RPC
    and DB call for every instance).

    Change-Id: Ia1d78fb48d3f131487bd7a98d3a50774c85a1fa2
    Partial-Bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/274870
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7a9942c5924d7ccfe321c54d0b5e70cfb3aede57
Submitter: Jenkins
Branch: master

commit 7a9942c5924d7ccfe321c54d0b5e70cfb3aede57
Author: Dan Smith <email address hidden>
Date: Mon Feb 1 11:55:15 2016 -0800

    Improve efficiency of Migration.instance property

    In the resource tracker, we get a list of migrations and rely on the
    Migration.instance property to look up the related instance. This
    doesn't actually cache the instance, which means repeated calls will
    continue to trigger RPC and DB hits. Further, in resource tracker,
    we have already pulled a list of instances and then proceed to re-query
    them through lazy-loads from the migration objects, even though we
    already have them. To add insult to injury, neither this first list
    of instances, nor the ones pulled from Migration.instance contain
    instances with the 'flavor' or 'migration_context' properties
    populated, which means we trigger *two* more lazy loads to pull those.

    This patch does a few things:
     - Makes the Migration object cache its instance property
     - Makes the resource tracker request instances with flavors and the
       migration_context populated
     - Stitches the instance list into the migration list to avoid the
       two subsequent lazy loads of Instance and Instance.flavor

    Change-Id: Ifc7dcde8a659710acecb1967da15c632c69d675c
    Partial-Bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/274887
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=06784253b186b5f79aa09c3abbf8ae41513e1ab0
Submitter: Jenkins
Branch: master

commit 06784253b186b5f79aa09c3abbf8ae41513e1ab0
Author: Dan Smith <email address hidden>
Date: Mon Feb 1 13:13:29 2016 -0800

    Optimize the instance fetched by floating_ips API

    The floating_ips API fetches an instance without flavor information
    and then passes it to the Network API, which eventually calls
    _get_instrance_nw_info() which requires access to the flavor. This
    triggers a lazy load in 100% of the situations, which is really not
    necessary and wasteful.

    This patch adds flavor to the list of pre-queried attributes on the
    instances fetched by this API to help reduce that waste.

    Change-Id: I15eae4e8c03739a34cef9c90989ad0615a96a963
    Partial-Bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/276860

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/276861

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/275288
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=50bcf101f3daf7ef4d3de11c7451c50fe78c69eb
Submitter: Jenkins
Branch: master

commit 50bcf101f3daf7ef4d3de11c7451c50fe78c69eb
Author: Dan Smith <email address hidden>
Date: Tue Feb 2 07:50:38 2016 -0800

    Optimize servers path by pre-joining numa_topology

    Paths like server delete pass the instance down to the compute node,
    which expects numa_topology to be joined. This currently causes a
    lazy-load on every such operation. In a recent tempest run, numa_topology
    was lazy-loaded 134 times (not all of which are in server delete,
    necessarily).

    This patch adds it to the list of attributes we pre-query when
    starting an operation like delete, update, etc.

    Change-Id: Ia4317fb1690535f467e1b295fbab64c9ec2f6f76
    Partial-Bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/276860
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3fe9771ad27233dd82625adc6d4cd7a396c0974f
Submitter: Jenkins
Branch: master

commit 3fe9771ad27233dd82625adc6d4cd7a396c0974f
Author: Dan Smith <email address hidden>
Date: Fri Feb 5 09:55:09 2016 -0800

    Join flavor when re-querying instance for floating ip association

    The network api code re-queries the instance to get the "current" version of
    it before doing a floating ip association. This query does not include flavor,
    which causes us to almost immediately lazy-load it in the call to
    update_instance_with_nw_info(). This patch fixes that by asking for it
    in the first place.

    Change-Id: I4606ca5ec280a191697f1e100d8f09e5fb03260a
    Partial-bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/276861
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3a761270581d1ac61a3b4669c130d211f1ad5a17
Submitter: Jenkins
Branch: master

commit 3a761270581d1ac61a3b4669c130d211f1ad5a17
Author: Dan Smith <email address hidden>
Date: Fri Feb 5 10:04:39 2016 -0800

    Avoid lazy-loads in metadata requests

    The metadata server currently doesn't pre-query for metadata and system_metadata,
    which ends up generating *two* lazy-loads on many requests. Since especially
    user metadata is almost definitely one of the things an instance is going to fetch
    from the metadata server, this is fairly inefficient.

    This patch makes us always pre-query the metadata twins when looking up (and
    caching) the instance.

    Change-Id: I8e565b6ce976fcbe6194b5960a3ebf2b76a45416
    Partial-bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/284839

Changed in nova:
assignee: Dan Smith (danms) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
Revision history for this message
Matt Riedemann (mriedem) wrote :

Dan has a test patch for sniffing these out here:

https://review.openstack.org/#/c/274714/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/284945

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/284945
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7fbd9ec0c6bcda33c77f67169403939e7befff5d
Submitter: Jenkins
Branch: master

commit 7fbd9ec0c6bcda33c77f67169403939e7befff5d
Author: Matt Riedemann <email address hidden>
Date: Thu Feb 25 16:42:09 2016 -0500

    Add specific method to lazy-load instance.pci_devices

    The pci_devices field is generically lazy-loaded today by
    getting the instance record from the database with a join
    on the pci_devices table. We can make this smarter by just
    getting the PciDeviceList via the instance uuid directly
    so we don't need to do the join and query on the instances
    table.

    Change-Id: I90f6c7a17bb773798ae77d19d744dcac02de215c
    Related-Bug: #1540526

Changed in nova:
assignee: Matt Riedemann (mriedem) → Dan Smith (danms)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/291898
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c5a5a7f322e2b2f76dab2647af2f15f88c2abac3
Submitter: Jenkins
Branch: master

commit c5a5a7f322e2b2f76dab2647af2f15f88c2abac3
Author: Dan Smith <email address hidden>
Date: Fri Mar 11 12:31:01 2016 -0800

    Avoid lazy-loads of ec2_ids on Instance

    Right now, we only create the ec2_ids for an Instance when they are first
    accessed, not at instance creation time. However, we always touch them
    later for building metadata, configdrive, etc, which means we end up with
    an expensive lazy-load later.

    This makes us always create ec2_ids at instance create time. Since it's in
    instance.create(), it runs on conductor, straight to the database, avoiding
    a round-trip later (assuming this instance is passed to the compute node
    for create and/or that the field is specified in a later call under
    expected_attrs).

    Partial-Bug: #1540526
    Change-Id: I29bc317f990bfa0f01b4f9615699debcc3a3c2c6

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/284839

Revision history for this message
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing
the status back to the previous state and unassigning. If
there are active reviews related to this bug, please include
links in comments.

Changed in nova:
status: In Progress → New
assignee: Dan Smith (danms) → nobody
Revision history for this message
Sean Dague (sdague) wrote :

Is this still considered an active bug to work against. Most of what I see in the gate is pci_devices, which seems appropriate.

We should at minimum narrow what the criteria for acceptable vs. not for lazy load, as in the current state the bug isn't very actionable.

Changed in nova:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/689846

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/689846
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c15e36e5849b6baddebb4b39475a6bf03ec5908b
Submitter: Zuul
Branch: master

commit c15e36e5849b6baddebb4b39475a6bf03ec5908b
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 21 13:00:59 2019 -0400

    Join migration_context and flavor in Migration.instance

    This builds on Ifc7dcde8a659710acecb1967da15c632c69d675c
    by joining the Migration.instance migration_context and
    flavor to avoid lazy-loading those later.

    When tracking an incoming migration, the ResourceTracker
    _pair_instances_to_migrations can hit a KeyError since it's
    not yet tracking the instance on that dest host yet. Then
    _update_usage_from_migrations will lazy-load the Migration.instance
    field and access the migration_context and flavor fields on the
    instance, which get lazy-loaded, which kind of defeats part of
    the purpose of that optimization.

    Change-Id: I613ad054f77b1a0a9d2e7718c0c531d11525283c
    Related-Bug: #1540526

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/694408

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/694408
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e2b4e3346e20615473328e7ae90b5083500961ca
Submitter: Zuul
Branch: stable/train

commit e2b4e3346e20615473328e7ae90b5083500961ca
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 21 13:00:59 2019 -0400

    Join migration_context and flavor in Migration.instance

    This builds on Ifc7dcde8a659710acecb1967da15c632c69d675c
    by joining the Migration.instance migration_context and
    flavor to avoid lazy-loading those later.

    When tracking an incoming migration, the ResourceTracker
    _pair_instances_to_migrations can hit a KeyError since it's
    not yet tracking the instance on that dest host yet. Then
    _update_usage_from_migrations will lazy-load the Migration.instance
    field and access the migration_context and flavor fields on the
    instance, which get lazy-loaded, which kind of defeats part of
    the purpose of that optimization.

    Change-Id: I613ad054f77b1a0a9d2e7718c0c531d11525283c
    Related-Bug: #1540526
    (cherry picked from commit c15e36e5849b6baddebb4b39475a6bf03ec5908b)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/696083

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/696083
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=603171bd5c5ce354086ce32e980f9bb8383069bf
Submitter: Zuul
Branch: stable/stein

commit 603171bd5c5ce354086ce32e980f9bb8383069bf
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 21 13:00:59 2019 -0400

    Join migration_context and flavor in Migration.instance

    This builds on Ifc7dcde8a659710acecb1967da15c632c69d675c
    by joining the Migration.instance migration_context and
    flavor to avoid lazy-loading those later.

    When tracking an incoming migration, the ResourceTracker
    _pair_instances_to_migrations can hit a KeyError since it's
    not yet tracking the instance on that dest host yet. Then
    _update_usage_from_migrations will lazy-load the Migration.instance
    field and access the migration_context and flavor fields on the
    instance, which get lazy-loaded, which kind of defeats part of
    the purpose of that optimization.

    Change-Id: I613ad054f77b1a0a9d2e7718c0c531d11525283c
    Related-Bug: #1540526
    (cherry picked from commit c15e36e5849b6baddebb4b39475a6bf03ec5908b)
    (cherry picked from commit e2b4e3346e20615473328e7ae90b5083500961ca)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/696572

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/696572
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e48fb84e1246aae8c7f7fd22a9c72ccbfa92d9ee
Submitter: Zuul
Branch: stable/rocky

commit e48fb84e1246aae8c7f7fd22a9c72ccbfa92d9ee
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 21 13:00:59 2019 -0400

    Join migration_context and flavor in Migration.instance

    This builds on Ifc7dcde8a659710acecb1967da15c632c69d675c
    by joining the Migration.instance migration_context and
    flavor to avoid lazy-loading those later.

    When tracking an incoming migration, the ResourceTracker
    _pair_instances_to_migrations can hit a KeyError since it's
    not yet tracking the instance on that dest host yet. Then
    _update_usage_from_migrations will lazy-load the Migration.instance
    field and access the migration_context and flavor fields on the
    instance, which get lazy-loaded, which kind of defeats part of
    the purpose of that optimization.

    Change-Id: I613ad054f77b1a0a9d2e7718c0c531d11525283c
    Related-Bug: #1540526
    (cherry picked from commit c15e36e5849b6baddebb4b39475a6bf03ec5908b)
    (cherry picked from commit e2b4e3346e20615473328e7ae90b5083500961ca)
    (cherry picked from commit 603171bd5c5ce354086ce32e980f9bb8383069bf)

tags: added: in-stable-rocky
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.