Autoscaling does not respect min_size and max_size

Bug #1439754 reported by Giuseppe Civitella
This bug affects 1 person
Affects: OpenStack Heat
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

When testing an autoscaling template I found that min_size and max_size are not respected.
In my template I have an AutoScalingGroup of web servers with min_size=1 and max_size=3.
When I deploy the stack, I initially get up to 6 web instances.
This is a devstack environment running inside a large VM, so the web instances get a high CPU load while they boot and trigger the high CPU alarm.
When things calm down, the scale-down policy resizes the web group to 3 instances instead of 1.
In the heat engine log I find the following:
http://paste.openstack.org/show/197943/
It seems the engine does not notice that it has to scale down two more instances.
This is my template:
http://paste.openstack.org/show/197942/

My devstack was pointing to the master repositories.

Revision history for this message
Rabi Mishra (rabi) wrote :

The template has one mysql server and one nfs server in addition to the AutoScalingGroup for the web servers. Therefore, after the web server group scales down, you would see 3 instances.

Changed in heat:
status: New → Invalid
status: Invalid → Incomplete
Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

After the scale down of the instances I can see 5 instances: 3 web instances, a mysql instance and an nfs instance:
http://img.ctrlv.in/img/15/04/02/551d6eb01b7f9.png

Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

The systems do not seem loaded:
http://paste.openstack.org/show/197956/
and there is a scale down policy in progress:
http://paste.openstack.org/show/197958/

Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

In the Heat engine's log I can see this:
http://paste.openstack.org/show/197961/
It seems to be on the point of reducing the number of instances, but nothing happens.

Revision history for this message
Rabi Mishra (rabi) wrote :

I'll investigate a little more on this.

Changed in heat:
status: Incomplete → New
Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

In my template I was using a random string to populate instances metadata:
...
  metadata_common_id:
    type: OS::Heat::RandomString
...
  web_group:
    ...
          database_password: { get_attr: [mysql, database_password] }
          metadata_common_id: {get_resource: metadata_common_id}
          web_security_group: {get_resource: web_security_group}
...
  cpu_alarm_high:
    ...
    matching_metadata: {'metadata.user_metadata.stack': {get_resource: metadata_common_id}}

I replaced the random string with the OS::stack_id value and the autoscaling group began to work properly. Now min_size and max_size are respected.
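
For readers following along, the change described above might look roughly like the sketch below. This is not the reporter's actual template (that is only available at the paste link). The nested template name web_server.yaml, the policy name web_scaleup_policy, the "metering.stack" metadata key, and all alarm thresholds are assumptions chosen for illustration; only web_group, cpu_alarm_high, min_size=1, max_size=3, and the matching_metadata key come from this thread.

```yaml
heat_template_version: 2013-05-23

resources:
  web_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 3
      resource:
        # Hypothetical nested template; it is assumed to pass its
        # "metadata" parameter through to the server's metadata.
        type: web_server.yaml
        properties:
          # Tag every group member with the stack ID instead of a
          # random string.
          metadata: {"metering.stack": {get_param: "OS::stack_id"}}

  # Hypothetical policy name; values are illustrative only.
  web_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: web_group}
      cooldown: 60
      scaling_adjustment: 1

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      comparison_operator: gt
      alarm_actions:
        - {get_attr: [web_scaleup_policy, alarm_url]}
      # Match only samples from instances tagged with this stack's ID.
      matching_metadata: {'metadata.user_metadata.stack': {get_param: "OS::stack_id"}}
```

The point of the sketch is that the value used to tag the instances and the value the alarm matches on both come from the same OS::stack_id pseudo parameter.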

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :

As this is in fact the "official" way of doing alarms in Heat - Giuseppe, can we close the bug already?

Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

Hi Pavlo,

I'm doing a few more tests; I'll let you know ASAP.

Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

Hi,

I guess I spoke too soon. I did some more testing and the problem shows up again: with min_size=1 and max_size=3, the stack-create operation ends with 4 web instances, because the scale-up policy is triggered by the high load of the booting VMs and the long time they take to boot.
The scale-down policy resizes the autoscaling group to 2 web instances instead of 1.
When I try to delete the stack I receive this error:
http://paste.openstack.org/show/198053/
It seems that Heat thinks one of the web instances was added manually to the security group, as if it had lost track of it at some point.

Revision history for this message
Giuseppe Civitella (gcivitella) wrote :

When I already have 4 web instances I can read in the log:
"truncating growth to 3 _calculate_new_capacity"

http://paste.openstack.org/show/198144/

Revision history for this message
Steve Baker (steve-stevebaker) wrote :

It looks like you may need to tune your cooldown values to match the "long time they take to boot".
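
As a rough illustration of this suggestion, the cooldown is set per scaling policy, in seconds. The fragment below is only a sketch: the policy names and the 300-second value are assumptions, not values taken from the reporter's template, and the thread does not confirm that a longer cooldown resolves the problem. It assumes the group is the web_group resource mentioned earlier in the thread.

```yaml
  # Hypothetical policy names; 300 is an illustrative value that should
  # be at least as long as the instances take to boot and settle.
  web_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: web_group}
      cooldown: 300
      scaling_adjustment: 1

  web_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: web_group}
      cooldown: 300
      scaling_adjustment: -1
```

The idea is that a policy does not act again until its cooldown has elapsed, so repeated high-CPU alarms raised while instances are still booting do not each trigger another adjustment.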

Changed in heat:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for heat because there has been no activity for 60 days.]

Changed in heat:
status: Incomplete → Expired