max_resources_per_stack causes scaling issues on large stacks

Bug #1489548 reported by Steven Hardy
This bug affects 1 person
Affects: OpenStack Heat
Status: Fix Released
Importance: High
Assigned to: Steve Baker
Milestone: 5.0.0

Bug Description

I've been doing some testing, trying to mimic some performance issues observed on TripleO, which makes heavy use of ResourceGroups containing nested stacks.

Here's my minimal reproducer, which I've been testing on devstack:

heat_template_version: 2015-04-30
description: >
  Stress test, create many stacks in a RG
resources:
  NovaComputes:
    type: OS::Heat::ResourceGroup
    properties:
      count: 400
      resource_def:
        type: dummy_node.yaml

-bash-4.3$ cat dummy_node.yaml
heat_template_version: 2015-04-30
description: >
  Single cirros node
parameters:
resources:
  server:
    type: OS::Heat::RandomString

All it does is create 400 nested stacks inside a ResourceGroup, each containing one RandomString resource.

Initially, this took around 13 minutes (!) on my devstack (Core i7 laptop with 6 heat-engine workers).

$ heat event-list twostress2
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+
| resource_name | id | resource_status_reason | resource_status | event_time |
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+
| twostress2 | 0fa22280-f3c8-4b16-b231-dfa37952203e | Stack CREATE started | CREATE_IN_PROGRESS | 2015-08-27T16:30:39 |
| NovaComputes | 491cf434-2340-49d6-bb05-49a4aed82f0d | state changed | CREATE_IN_PROGRESS | 2015-08-27T16:30:39 |
| NovaComputes | edeaa3ff-266c-40a0-a484-b54d02c3073e | state changed | CREATE_COMPLETE | 2015-08-27T16:43:42 |
| twostress2 | d8127910-138e-4cc9-b4b0-ae6e04415e9f | Stack CREATE completed successfully | CREATE_COMPLETE | 2015-08-27T16:43:42 |
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+

I then set max_resources_per_stack = -1 to avoid the _validate_nested_resources check in stack_resource.py, which I'd identified as a bottleneck via some basic profiling. Creating the exact same stack now takes only 2.5 minutes!!
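
For anyone reproducing this, the setting is a one-line change in heat.conf; a minimal sketch, assuming the usual /etc/heat/heat.conf location and the [DEFAULT] section, with a heat-engine restart to pick it up:

# /etc/heat/heat.conf (path assumed)
[DEFAULT]
# -1 disables the per-stack resource count check entirely
max_resources_per_stack = -1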

$ heat event-list twostress2
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+
| resource_name | id | resource_status_reason | resource_status | event_time |
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+
| twostress2 | 19a65c8f-c1cb-4deb-a438-1ae5aed8cab0 | Stack CREATE started | CREATE_IN_PROGRESS | 2015-08-27T16:52:38 |
| NovaComputes | 02663989-a504-4494-aff7-f67ccfd18461 | state changed | CREATE_IN_PROGRESS | 2015-08-27T16:52:38 |
| NovaComputes | 2aceaa54-481b-47bb-adfb-931ad710165b | state changed | CREATE_COMPLETE | 2015-08-27T16:55:08 |
| twostress2 | 4b00e49f-4acb-4b5a-ac41-162cb3e59769 | Stack CREATE completed successfully | CREATE_COMPLETE | 2015-08-27T16:55:08 |
+---------------+--------------------------------------+-------------------------------------+--------------------+---------------------+

I really think we should consider setting the default to -1, and including a warning in the option documentation that enabling it is potentially very costly. The services called by heat implement their own quotas, so I think this resource count is mostly arbitrary anyway.

Revision history for this message
Steven Hardy (shardy) wrote :

OK, I know this has been discussed and rejected before, but I think we should revisit that decision: this makes our performance suck by default for non-trivial deployments, so we should at least consider disabling the check unless we can make it much less expensive.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/217821

Changed in heat:
assignee: nobody → Steven Hardy (shardy)
status: New → In Progress
Revision history for this message
Zane Bitter (zaneb) wrote : Re: max_resources_per_stack should be disabled by default

IMHO the problem is that we are trying to count the max resources *per stack*. That's a pretty meaningless figure anyway, as well as being fairly expensive to calculate. We should be counting the max resources *per tenant*, which would be both meaningful and super quick to calculate from the DB assuming we had the right indices.
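
For illustration, a per-tenant total could be computed with a single aggregate query along these lines; this is a sketch only, using SQLAlchemy with simplified stand-in table and column names rather than Heat's actual schema or DB API:

from sqlalchemy import text
from sqlalchemy.engine import Connection

def count_resources_for_tenant(conn: Connection, tenant_id: str) -> int:
    # One join plus aggregate; no walking of any stack tree is required.
    # Table/column names (resource, stack, stack.tenant) are illustrative.
    return conn.execute(
        text(
            "SELECT COUNT(r.id) "
            "FROM resource r JOIN stack s ON r.stack_id = s.id "
            "WHERE s.tenant = :tenant"
        ),
        {"tenant": tenant_id},
    ).scalar()

With an index on stack.tenant (and resource.stack_id), that stays cheap no matter how deeply stacks are nested.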

Changed in heat:
importance: Undecided → High
Revision history for this message
Steve Baker (steve-stevebaker) wrote :

I think Zane is right and we should deprecate max_resources_per_stack and move to a tenant limit. And we should consider db schema changes to make that calculation as efficient as possible.

Having said that, the last round of optimisation for the stack resource count was aimed at memory use only, and avoided schema changes to lower the backporting impact. Now that we know it is causing different scaling issues, we still have a window to optimise *with* schema changes before liberty, and chances are this would benefit a tenant resource count too.

The above template is likely slow purely due to database round trips. With the current implementation, each resource creation incurs:
1) a db select per parent stack to navigate to the root stack
2) a db select per stack to discover all stack ids in the tree
3) a db count query of all resources in the tree

If the resource table had a root_stack_id column, the above would be replaced with a single count query. And if the stack table had a root_stack_id column then populating each resource.root_stack_id would not require extra queries.
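
In other words, with the proposed column the whole check collapses to something like the following; again a sketch with illustrative names, not the actual Heat DB API:

from sqlalchemy import text
from sqlalchemy.engine import Connection

def count_total_resources(conn: Connection, root_stack_id: str) -> int:
    # With resource.root_stack_id populated, the per-stack limit check is a
    # single aggregate query instead of one select per parent/nested stack.
    return conn.execute(
        text("SELECT COUNT(id) FROM resource WHERE root_stack_id = :root"),
        {"root": root_stack_id},
    ).scalar()

An index on resource.root_stack_id keeps this a cheap lookup even for very large stack trees.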

Revision history for this message
Ryan Brown (sb-p) wrote :

I don't agree that moving the limit to "no limit" by default is the best strategy here. Of course, it sucks a lot that the default (having a limit) is such a performance killer.

I'm categorically opposed to "unlimited by default", but given the performance impact an exception is probably warranted: I'd rather people hit external resource limits than complain that Heat is too slow or doesn't scale to large stacks.

Revision history for this message
Steven Hardy (shardy) wrote :

@Ryan: My opinion basically aligns with Zane's: *a* limit makes sense, but the per-stack limit is basically pointless. We should probably deprecate the resources-per-stack and number-of-stacks limits in favour of a quota on total resources per tenant.

If there are ways to optimise, such as those mentioned by stevebaker, then we can try them, but I think it's worth stepping back and remembering why these limits exist at all.

IIRC they were added because folks wanted a way to stop unbounded DoS attacks via users maliciously creating too many stacks, or too many resources (or both). We somehow picked some really arbitrary numbers like:

max_resources_per_stack = 1000
max_stacks_per_tenant = 100

We then realized that recursion is actually far worse and more expensive than the sheer number of resources/stacks, so we did:

max_nested_stack_depth = 5

My position is that a limit of 100,000 resources per tenant (1000 resources per stack × 100 stacks per tenant), combined with the potential for very expensive levels of nesting, is effectively the same as "unlimited": if even a tiny number of malicious tenants attempted to DoS heat, they would probably succeed, and one of the main reasons for that is this hugely expensive but ultimately pointless limit check.

summary: - max_resources_per_stack should be disabled by default
+ max_resources_per_stack causes scaling issues on large stacks
Changed in heat:
milestone: none → liberty-rc1
Changed in heat:
assignee: Steven Hardy (shardy) → Steve Baker (steve-stevebaker)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/226134

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/226135

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/226134
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=1b2cd7495dd91b0dc9a7edd4e2f946c99e3a460f
Submitter: Jenkins
Branch: master

commit 1b2cd7495dd91b0dc9a7edd4e2f946c99e3a460f
Author: Steve Baker <email address hidden>
Date: Mon Sep 21 16:18:57 2015 +1200

    Add resource.root_stack_id column

    This change adds a root_stack_id column to the resource
    record to allow a subsequent change to enforce
    max_resources_per_stack with a single query instead of the
    many it currently requires.

    This change includes the following:
    - Data migration to add the resource.root_stack_id column
      and populate all existing resources with their calculated
      root stack
    - Make new resources acquire and set their root_stack_id on
      store or update.
    - StackResource._validate_nested_resources use the stored
      root_stack_id resulting in a ~15% performance improvement
      for the creation time of a test stack containing 40 nested
      stacks.

    Change-Id: I2b00285514235834131222012408d2b5b2b37d30
    Partial-Bug: 1489548

Changed in heat:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/226135
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=5576ffd249e251abab87fc954f778e2f9b003cfe
Submitter: Jenkins
Branch: master

commit 5576ffd249e251abab87fc954f778e2f9b003cfe
Author: Steve Baker <email address hidden>
Date: Tue Sep 22 10:50:23 2015 +1200

    Use root_stack_id for stack_count_total_resources

    The current implementation of stack_count_total_resources
    scales very poorly for stacks with many nested stacks.

    This change replaces the implementation with a single count
    query filtering on the resource.root_stack_id column.

    The following template was used for performance testing:

    heat_template_version: 2015-04-30
    description: >
      Stress test, create many stacks in a RG
    parameters:
      count:
        type: number
        default: 4
    resources:
      dummies:
        type: OS::Heat::ResourceGroup
        properties:
          count: {get_param: count}
          resource_def:
            type: dummy_node.yaml

     # dummy_node.yaml
    heat_template_version: 2015-04-30
    parameters:
    resources:
      random:
        type: OS::Heat::RandomString
      randoms:
        type: OS::Heat::ResourceGroup
        properties:
          count: 1
          resource_def:
            type: OS::Heat::RandomString

    This template was used to time the stack creation with
    count=40 and count=80 against heat-master, the previous
    change I2b00285514235834131222012408d2b5b2b37d30
    and this change. Here are the results:

     Stack heat-master-40 CREATE_COMPLETE
    real 1m9.103s

     Stack root-stack-id-40 CREATE_COMPLETE
    real 0m59.233s

     Stack count-total-resources-40 CREATE_COMPLETE
    real 0m43.308s

     Stack heat-master-80 CREATE_COMPLETE
    real 2m47.190s

     Stack root-stack-id-80 CREATE_COMPLETE
    real 2m16.743s

     Stack count-total-resources-80 CREATE_COMPLETE
    real 1m15.288s

    Also, the test template in bug #1489548 took 3 minutes
    to create (vs the originally reported 13 minutes).

    Change-Id: Iab3eaaba3ece16e14db3231f1c725bca3c8985c2
    Closes-Bug: 1489548

Thierry Carrez (ttx)
Changed in heat:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on heat (master)

Change abandoned by Steven Hardy (<email address hidden>) on branch: master
Review: https://review.openstack.org/217821
Reason: stevebaker proposed an alternative patch improving the efficiency somewhat which has now merged

Thierry Carrez (ttx)
Changed in heat:
milestone: liberty-rc1 → 5.0.0