Stacks have reference loops

Bug #1454873 reported by Zane Bitter
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
OpenStack Heat  Fix Released  Medium      Zane Bitter
Kilo            Fix Released  Medium      Unassigned

Bug Description

Python objects that make up a Stack have reference loops that prevent their reference counts from ever reaching 0. As a result, substantially all of the memory allocated during a stack operation remains in use until the first garbage collector run after the operation completes. This is bad for memory use and fragmentation, and likely also bad for performance, because the garbage collector then has to do a large amount of cleanup work.

The two cycles I've been able to detect from inspecting the code are caused by references from Resource -> Stack and Function -> Stack. It's possible that analysis with Python's gc debugging tools could reveal other cycles. Please leave a comment if you find others.
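
For anyone who wants to hunt for further cycles, here is a minimal sketch of the problem and of one way to inspect it with the standard gc module. The Stack and Resource classes are hypothetical stand-ins, not Heat's real classes, and the behaviour described assumes CPython's reference counting.

```python
import gc
import weakref


class Stack(object):
    """Hypothetical stand-in for a Heat stack; not Heat's real class."""

    def __init__(self):
        self.resources = []


class Resource(object):
    """Hypothetical stand-in holding a strong reference to its stack."""

    def __init__(self, stack):
        self.stack = stack  # strong back-reference closes the loop
        stack.resources.append(self)


def cycle_survives_until_gc():
    """Return (alive_before_gc, alive_after_gc) for a Stack in a loop."""
    gc.disable()  # keep the collector from running on its own
    try:
        stack = Stack()
        Resource(stack)
        probe = weakref.ref(stack)  # observes the stack without owning it
        del stack  # drop the last external reference
        alive_before = probe() is not None
        gc.collect()  # the cycle detector finds and frees the loop
        alive_after = probe() is not None
    finally:
        gc.enable()
    return alive_before, alive_after


def unreachable_type_names():
    """List the type names of objects that only the collector could free."""
    gc.collect()  # clear out any garbage that already exists
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        stack = Stack()
        Resource(stack)
        del stack
        gc.collect()
        names = sorted({type(obj).__name__ for obj in gc.garbage})
    finally:
        gc.set_debug(0)
        del gc.garbage[:]
    return names
```

Any type name that appears in the result of unreachable_type_names() belongs to an object that was part of, or reachable only from, a reference loop. The same approach could be pointed at a real stack operation instead of the toy classes.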

Changed in heat:
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to heat (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/183214

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/183215

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/183216

Changed in heat:
assignee: nobody → Zane Bitter (zaneb)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/183217

OpenStack Infra (hudson-openstack) wrote : Related fix merged to heat (master)

Reviewed: https://review.openstack.org/183214
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=2032548c4e501ebf90604bf60a69fdf9f64b981d
Submitter: Jenkins
Branch: master

commit 2032548c4e501ebf90604bf60a69fdf9f64b981d
Author: Zane Bitter <email address hidden>
Date: Wed May 13 18:56:32 2015 -0400

    Reference the parent stack, not parent resource in Stack

    In order to eliminate circular references, we need to define the Stack as
    the top of any reference hierarchy. This means we can't hold a reference to
    a Resource directly without also holding a reference to its Stack.

    Change-Id: I7430b109bbe1c5d6d64be9b8c778b394e9cff269
    Related-Bug: #1454873

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/183215
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=203b8e8ecf91ad7c75dd7072c85930de4f8c1c42
Submitter: Jenkins
Branch: master

commit 203b8e8ecf91ad7c75dd7072c85930de4f8c1c42
Author: Zane Bitter <email address hidden>
Date: Thu May 14 16:44:58 2015 -0400

    Retain references to stacks in all unit tests

    This will allow Resources to hold weak references to Stacks (so as to avoid
    circular references) without causing the Stack object to be prematurely
    deleted.

    Change-Id: Ia76da7bc51042fb3598ef2a660d6fbf78137a37b
    Related-Bug: #1454873
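
The reason the tests need to keep their own reference can be sketched as follows. This is an illustration only: the classes and the stack_or_none helper are hypothetical, not Heat's API, and the immediate deletion shown relies on CPython's reference counting.

```python
import weakref


class Stack(object):
    """Hypothetical stand-in for a Heat stack."""

    def __init__(self):
        self.resources = []


class Resource(object):
    """Hypothetical stand-in holding only a weak reference to its stack."""

    def __init__(self, stack):
        self._stackref = weakref.ref(stack)
        stack.resources.append(self)

    def stack_or_none(self):
        # Returns None once the Stack has been freed.
        return self._stackref()


def make_resource_dropping_stack():
    # Nothing keeps the temporary Stack alive once __init__ returns,
    # so the returned Resource is left pointing at a dead stack.
    return Resource(Stack())


def make_resource_keeping_stack():
    # Returning the Stack alongside the Resource lets the caller retain it.
    stack = Stack()
    return stack, Resource(stack)
```

A test written in the first style would have passed while Resource held a strong reference, and would break as soon as the reference became weak.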

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/183216
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=d29bb1927489d5590f17d2f3ef2904b6f5edd0d4
Submitter: Jenkins
Branch: master

commit d29bb1927489d5590f17d2f3ef2904b6f5edd0d4
Author: Zane Bitter <email address hidden>
Date: Thu May 14 16:50:36 2015 -0400

    Pass stack to thread in resource_signal

    This is the only thread that gets started without an explicit reference to
    the stack. This prevents us from using a weakref in the resource, as it
    will cause the Stack's reference count to hit 0 before the thread is run.
    Now we explicitly pass a reference to the stack.

    Change-Id: Ie51be7b54d97ef184e401e395a7e7e3a26ce003b
    Related-Bug: #1454873
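
A rough sketch of the thread problem described in this commit, using the standard threading module as a stand-in for whatever Heat's service actually uses to start the thread. All names here (Stack, Resource, signal, run_signal) are hypothetical, and the result assumes CPython's reference counting.

```python
import threading
import weakref


class Stack(object):
    """Hypothetical stand-in for a Heat stack."""

    def __init__(self):
        self.resources = []


class Resource(object):
    """Hypothetical stand-in holding only a weak reference to its stack."""

    def __init__(self, stack):
        self._stackref = weakref.ref(stack)
        stack.resources.append(self)

    def stack_or_none(self):
        return self._stackref()


def signal(resource, results, stack=None):
    # `stack` is otherwise unused: passing it keeps the Stack alive
    # for as long as the thread holds its arguments.
    results.append(resource.stack_or_none() is not None)


def run_signal(pass_stack):
    """Return True if the Stack was still alive when the thread ran."""
    stack = Stack()
    resource = Resource(stack)
    results = []
    if pass_stack:
        args = (resource, results, stack)
    else:
        args = (resource, results)
    thread = threading.Thread(target=signal, args=args)
    del stack, args  # the caller drops its own references before the thread runs
    thread.start()
    thread.join()
    return results[0]
```

With pass_stack=False the Stack is already gone by the time the thread runs, which is the failure this commit avoids by passing the stack explicitly.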

OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/183723

Zane Bitter (zaneb) wrote :

In Ryan's testing with TripleO, the fix reduced the peak memory usage after the update of a TripleO overcloud stack from ~540MB to ~133MB. That suggests that we found all the loops. The excessive memory usage before was likely due to all of the extra stacks loaded as a result of bug 1455589: most of them remained in memory long enough to completely fragment Python's memory pools, because they were not garbage collected in time.

tags: added: kilo-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/183217
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=60300b014952f5b2fa4c2a9010876f5b871f0f66
Submitter: Jenkins
Branch: master

commit 60300b014952f5b2fa4c2a9010876f5b871f0f66
Author: Zane Bitter <email address hidden>
Date: Fri May 15 17:18:12 2015 -0400

    Get rid of circular references in Resource and Function

    Circular references cause practically every bit of data that Heat uses to
    remain in memory until the completion of an operation, and even then to
    only be freed once the loop is detected by the garbage collector. By
    breaking all of the loops using weak references, we can ensure that things
    will get freed when they are no longer referenced without the need to wait
    for garbage collection (which should also take a lot less time). This
    change removes the loops from Resource and Function objects back to the
    Stack.

    Change-Id: Ibf80e95e69a2f27ed29754a2e0f1125e8eed0775
    Closes-Bug: #1454873
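
The weak-reference approach this commit describes can be sketched as below. This is a simplified illustration with hypothetical classes, not the code from the change itself, and it assumes CPython's reference counting.

```python
import gc
import weakref


class Stack(object):
    """Hypothetical stand-in for a Heat stack; the top of the hierarchy."""

    def __init__(self):
        self.resources = []


class Resource(object):
    """Hypothetical stand-in that refers back to its stack weakly."""

    def __init__(self, stack):
        self._stackref = weakref.ref(stack)  # does not keep the stack alive
        stack.resources.append(self)

    @property
    def stack(self):
        stack = self._stackref()
        if stack is None:
            raise RuntimeError("Stack was freed before its Resource")
        return stack


def freed_without_gc():
    """Return (back_reference_works, freed_immediately)."""
    gc.disable()  # prove that the cycle collector is not involved
    try:
        stack = Stack()
        resource = Resource(stack)
        works = resource.stack is stack
        probe = weakref.ref(stack)
        del stack  # the last strong reference goes away here
        freed = probe() is None
    finally:
        gc.enable()
    return works, freed
```

The trade-off is that whoever uses a Resource must also hold a strong reference to its Stack, which is what the related changes to the unit tests and to resource_signal arrange.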

Changed in heat:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (stable/kilo)

Reviewed: https://review.openstack.org/183723
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=a5297fec8a470d2aef8e41c1c4bee16f986aff74
Submitter: Jenkins
Branch: stable/kilo

commit a5297fec8a470d2aef8e41c1c4bee16f986aff74
Author: Zane Bitter <email address hidden>
Date: Wed May 13 18:56:32 2015 -0400

    Get rid of circular references in Resource and Function

    Circular references cause practically every bit of data that Heat uses to
    remain in memory until the completion of an operation, and even then to
    only be freed once the loop is detected by the garbage collector. By
    breaking all of the loops using weak references, we can ensure that things
    will get freed when they are no longer referenced without the need to wait
    for garbage collection (which should also take a lot less time). This
    change removes the loops from Resource and Function objects back to the
    Stack.

    Change-Id: Ibf80e95e69a2f27ed29754a2e0f1125e8eed0775
    Closes-Bug: #1454873
    (cherry-picked from commits 2032548c4e501ebf90604bf60a69fdf9f64b981d,
                                203b8e8ecf91ad7c75dd7072c85930de4f8c1c42,
                                d29bb1927489d5590f17d2f3ef2904b6f5edd0d4,
                                60300b014952f5b2fa4c2a9010876f5b871f0f66)

Thierry Carrez (ttx)
Changed in heat:
milestone: none → liberty-1
status: Fix Committed → Fix Released
tags: added: in-stable-kilo
removed: kilo-backport-potential
Thierry Carrez (ttx)
Changed in heat:
milestone: liberty-1 → 5.0.0