Bug #1224453 “min_count ignored for instance create” : Bugs : OpenStack Compute (nova)

Mike BRIGHT (k5-openstack) on 2013-09-12

Changed in nova:
assignee:	nobody → Mike BRIGHT (k5-openstack)

Matt Riedemann (mriedem) on 2013-09-12

tags:

added: api

Revision history for this message

Christopher Yeoh (cyeoh-0) wrote on 2013-09-17:

#1

Did you see this problem with Havana? Do you have a testcase?

It looks like _check_num_instances_quota does adjust the number of instances if the quota reserveration fails and then recursively call itself with the lower number.

Changed in nova:
status:	New → Incomplete

Revision history for this message

Mike BRIGHT (k5-openstack) wrote on 2013-09-17:

#2

I was not able to recreate the problem using DevStack (so current trunk).

It would be useful if the submitter could tell us how to reproduce.

Revision history for this message

Phil Day (philip-day) wrote on 2013-09-17:

#3

Point taken - let me see if I can reproduce in DevStack. We hit the problem in our production system and I thought I'd spotted the flaw in the code, but I'd missed the recurrsion Chris has pointed out.

Revision history for this message

Phil Day (philip-day) wrote on 2013-09-25:

#4

To reproduce in DevStack:

i) Set the default cores quota to -1 (unlimited). i.e add "quota_cores=-1" to nova.conf
ii) Restart the Nova API server
iii) Request a number of instances that spans the allowed quota of instances (default 10)

ubuntu@list-all:/mnt/devstack$ nova boot --image cd4e2b6c-8df9-4f31-a553-b83082dc6977 --num-instances 12 --flavor 1 philx
ERROR: Quota exceeded for instances: Requested 1-12, but already used 0 of 10 instances (HTTP 413) (Request-ID: req-42d0f729-c360-452b-b92f-ff02e673e9d4)

The issue seems to be that the code in compute/api.py _check_num_instances_quota() when it detects an over quota request doesn't correctly handle "-1" values for quotas.

By contrast the code in quota.py limit_check has specific checks for quota values to be >=0 when making checks.

Revision history for this message

Mike BRIGHT (mjbrightfr) wrote on 2013-09-25: Re: [Bug 1224453] Re: min_count ignored for instance create

#5

Cool I got a bug to fix!

Thanks Phil, I'll have a look.

On 25 September 2013 18:15, Phil Day <email address hidden> wrote:

> To reproduce in DevStack:
>
> i) Set the default cores quota to -1 (unlimited). i.e add
> "quota_cores=-1" to nova.conf
> ii) Restart the Nova API server
> iii) Request a number of instances that spans the allowed quota of
> instances (default 10)
>
> ubuntu@list-all:/mnt/devstack$ nova list
> +----+------+--------+------------+-------------+----------+
> | ID | Name | Status | Task State | Power State | Networks |
> +----+------+--------+------------+-------------+----------+
> +----+------+--------+------------+-------------+----------+
>
> ubuntu@list-all:/mnt/devstack$ nova boot --image
> cd4e2b6c-8df9-4f31-a553-b83082dc6977 --num-instances 12 --flavor 1 philx
> ERROR: Quota exceeded for instances: Requested 1-12, but already used 0 of
> 10 instances (HTTP 413) (Request-ID:
> req-42d0f729-c360-452b-b92f-ff02e673e9d4)
>
> The issue seems to be that the code in compute/api.py
> _check_num_instances_quota() when it detects an over quota request
> doesn't correctly handle "-1" values for quotas.
>
> By contrast the code in quota.py limit_check has specific checks for
> quota values to be >=0 when making checks.
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/1224453
>
> Title:
> min_count ignored for instance create
>
> Status in OpenStack Compute (Nova):
> Incomplete
>
> Bug description:
> The server create API takes min_count and max_count values for the
> number of instances to be created, where the actual number to be
> created should be the highest value allowed by quota between these
> limits.
>
> However the code in compute/api.py does a single check against
> max_count and then treats the exeception as a failure - resulting in
> messages such as:
>
> min_count=1
> max_count= (quota+1)
>
> "Quota exceeded for instances: Requested 1, but already used 13 of 40
> instances"
>
>
> The code in _check_num_instances_quota() looks like it has most of the
> logic for adjusting the values when it gets an OverQuota exception
> from the initial reservation request based on max_count - but always
> ends up raising TooManyInstances .
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/nova/+bug/1224453/+subscriptions
>

Cool I got a bug to fix!

Thanks Phil, I'll have a look.

On 25 September 2013 18:15, Phil Day <1224453@bugs.launchpad.net> wrote:

> To reproduce in DevStack:
>
> i) Set the default cores quota to -1 (unlimited).  i.e add
>  "quota_cores=-1"  to nova.conf
> ii) Restart the Nova API server
> iii) Request a number of instances that spans the allowed quota of
> instances (default 10)
>
> ubuntu@list-all:/mnt/devstack$ nova list
>  +----+------+--------+------------+-------------+----------+
>  | ID | Name | Status | Task State | Power State | Networks |
>  +----+------+--------+------------+-------------+----------+
>  +----+------+--------+------------+-------------+----------+
>
> ubuntu@list-all:/mnt/devstack$ nova boot --image
> cd4e2b6c-8df9-4f31-a553-b83082dc6977 --num-instances 12 --flavor 1 philx
> ERROR: Quota exceeded for instances: Requested 1-12, but already used 0 of
> 10 instances (HTTP 413) (Request-ID:
> req-42d0f729-c360-452b-b92f-ff02e673e9d4)
>
> The issue seems to be that the code in compute/api.py
> _check_num_instances_quota()  when it detects an over quota request
> doesn't correctly handle "-1" values for quotas.
>
> By contrast the code in quota.py limit_check has specific checks for
> quota values to be >=0  when making checks.
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/1224453
>
> Title:
>   min_count ignored for instance create
>
> Status in OpenStack Compute (Nova):
>   Incomplete
>
> Bug description:
>   The server create API takes min_count and max_count values for the
>   number of instances to  be created, where the actual number to be
>   created should be the highest value allowed by quota between these
>   limits.
>
>   However the code in compute/api.py does a single check against
>   max_count and then treats the exeception as a failure - resulting in
>   messages such as:
>
>   min_count=1
>   max_count= (quota+1)
>
>   "Quota exceeded for instances: Requested 1, but already used 13 of 40
>   instances"
>
>
>   The code in _check_num_instances_quota() looks like it has most of the
>   logic for adjusting the values when it gets an OverQuota exception
>   from the initial reservation request based on max_count - but always
>   ends up raising TooManyInstances .
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/nova/+bug/1224453/+subscriptions
>

Revision history for this message

Phil Day (philip-day) wrote on 2013-09-26:

#6

Instead of trying to fix OverQuota logic in _check_num_instances_quota(), why not just repeatedly call QUOTAS.reserve with a decrementing valued until it works ? That would leave all of the quota checking logic inside the quota class - which is better from an abstraction perspective.

Revision history for this message

Mike BRIGHT (k5-openstack) wrote on 2013-10-02:

#7

OK, I can see the value of having the OverQuota logic completely in QUOTAS.reserve, but that seems incongruent to me with the idea of calling QUOTAS.reserve more than once as done at the moment, and more times as you suggest.
(Maybe I just need to get with the program and accept that call/Exception is a nice way of doing things)

I note that very similar logic is employed in compute/api.py for the resize function.

It seems to me that currently when quota_cores or quota_ram is set to unlimited (-1) headroom doesn't reflect this.
Wouldn't it be better to preserve the semantics of headroom and make it aware of unlimited quotas?

How about creating a headroom function something like:

    def _calculate_headroom(quotas, instance_type):
            headroom = dict((res, quotas[res] -
                             (usages[res]['in_use'] + usages[res]['reserved']))
                            for res in quotas.keys())

            # If quota_cores is unlimited, set cores headroom based on instances headroom:
            if (quotas['cores'] == -1):
                if instance_type['vcpus']:
                    headroom['cores'] = headroom['instances'] * instance_type['vcpus']
                 else:
                    headroom['cores'] = headroom['instances'] # or just 0?

            # If quota_ram is unlimited, set ram headroom based on instances headroom:
            if (quotas['ram'] == -1):
                if instance_type['memory_mb']:
                    headroom['ram'] = headroom['instances'] * instance_type['memory_mb']
                else:
                    headroom['ram'] = headroom['instances'] # or just 0?

return headroom

Which would be called by _check_num_instances_quota() and also resize().

I’m not sure if headroom should be set to zero or not if the associated instance type is not set.
It would only really matter if resource is set to ‘ram’ or ‘cores’ in the line(s):
used = quotas[resource] - headroom[resource]

Thoughts?

OK, I can see the value of having the OverQuota logic completely in QUOTAS.reserve, but that seems incongruent to me with the idea of calling QUOTAS.reserve more than once as done at the moment, and more times as you suggest.
(Maybe I just need to get with the program and accept that call/Exception is a nice way of doing things)

I note that very similar logic is employed in compute/api.py for the resize function.

It seems to me that currently when quota_cores or quota_ram is set to unlimited (-1) headroom doesn't reflect this.
Wouldn't it be better to preserve the semantics of headroom and make it aware of unlimited quotas?

How about creating a headroom function something like:

def _calculate_headroom(quotas, instance_type):
            headroom = dict((res, quotas[res] -
                             (usages[res]['in_use'] + usages[res]['reserved']))
                            for res in quotas.keys())

# If quota_cores is unlimited, set cores headroom based on instances headroom:
            if (quotas['cores'] == -1):
                if instance_type['vcpus']:
                    headroom['cores'] = headroom['instances'] * instance_type['vcpus']
                 else:
                    headroom['cores'] = headroom['instances'] # or just 0?
                    
            # If quota_ram is unlimited, set ram headroom based on instances headroom:
            if (quotas['ram'] == -1):
                if instance_type['memory_mb']:
                    headroom['ram'] = headroom['instances'] * instance_type['memory_mb']        
                else:
                    headroom['ram'] = headroom['instances'] # or just 0?

return headroom

Which would be called by _check_num_instances_quota() and also resize().

I’m not sure if headroom should be set to zero or not if the associated instance type is not set.
It would only really matter if resource is set to ‘ram’ or ‘cores’ in the line(s):
                 used = quotas[resource] - headroom[resource]

Thoughts?

Mike BRIGHT (k5-openstack) on 2013-10-02

Changed in nova:
status:	Incomplete → Confirmed
tags:	added: quotas

Revision history for this message

Stanislaw Pitucha (stanislaw-pitucha) wrote on 2013-10-02:

#8

Unless I'm misunderstanding the "repeatedly call QUOTAS.reserve with a decrementing valued until it works" - wouldn't that enable DoS scenarios where user requests between 1 and 5738219047392 instances?

Since the db layer doesn't know anything about instance flavors, only resources, it looks like it still requires some logic in the calling site to translate OverQuota back into the number of instances that didn't fit. Although I do agree that trying to reserve the max first and then only adjusting on error may be a good idea. Hopefully the user is reasonable and most of the requests will be fulfilled.

Revision history for this message

Mike BRIGHT (k5-openstack) wrote on 2013-10-02:

#9

It's a good point about DoS, if repeatedly calling QUOTAS.reserve then it would be necessary to apply some reasonable limits to prevent DoS.

But what about what I proposed which would not require more than the current 2nd call to QUOTAS.reserve?

Does it make sense?

Revision history for this message

Phil Day (philip-day) wrote on 2013-10-03:

#10

Stan's point about DoS is valid - I guess we could always start from min(max_count, quota_value), but for a user with a high quota that might still be quite a lot of iterations.

My main concern with the current approach is the way the logic for interpreting quota values is split between the Quota class and the headroom calculation in _check_num_instances_quota() - which is of course the source of this bug.

What about if the headroom information was returned as part of the OverQuota exception (quotas. usage, and overs already are) - and a lack of a specific value in headroom means unlimited for that resource ?

Then the headroom calculation could be encapsulated in the Quota class

Revision history for this message

Mike BRIGHT (k5-openstack) wrote on 2013-10-07:

#11

I've been looking at this in the debugger and I see that it's in quota.py: quota_reserve() that the OverQuota exception is thrown.
I could add the current headroom calculation code in that function and return headroom as one of the "exc.kwargs" arguments.

HOWEVER, as we don't have instance_type available in that function, headroom would still require to be corrected for the case of quota_vcpus == -1, from outside the function (at two places in compute/api.py).
Because if quota_vcpus is -1 and instance_type['vcpus'] is set then we should set headroom as:
headroom['cores'] = headroom['instances'] * instance_type['vcpus']

So can someone advise here?
Is there a way of passing instance_type into QUOTAS.reserve ... down to quota_reserve() ?
I'm unsure of the plumbing.

Or should I correct headroom (yuk!) after receiving the OverQuota exception (in 2 places in compute/api.py).
In that case I still suggest the _calculate_headroom() function previously proposed here.

NOTE: As a separate issue, I have not understood Phils' suggestion that QUOTAS.reserve be called several times ...

Revision history for this message

Phil Day (philip-day) wrote on 2013-10-08:

#12

I don't really see why you need instance type passed in, you can calculate the number of vcps that are being requested for each instance as "req_cores/instances" can't you ?

Ignore my earlier suggestion about calling Quote.reserve multiple times - doing the headroom calculation inside reserve would avoid that.

Revision history for this message

Mike BRIGHT (mjbrightfr) wrote on 2013-10-08:

#13

Yes, you're right - thanks for pointing this out.

On 8 October 2013 11:16, Phil Day <email address hidden> wrote:

> I don't really see why you need instance type passed in, you can
> calculate the number of vcps that are being requested for each instance
> as "req_cores/instances" can't you ?
>
>
> Ignore my earlier suggestion about calling Quote.reserve multiple times -
> doing the headroom calculation inside reserve would avoid that.
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/1224453
>
> Title:
> min_count ignored for instance create
>
> Status in OpenStack Compute (Nova):
> Confirmed
>
> Bug description:
> The server create API takes min_count and max_count values for the
> number of instances to be created, where the actual number to be
> created should be the highest value allowed by quota between these
> limits.
>
> However the code in compute/api.py does a single check against
> max_count and then treats the exeception as a failure - resulting in
> messages such as:
>
> min_count=1
> max_count= (quota+1)
>
> "Quota exceeded for instances: Requested 1, but already used 13 of 40
> instances"
>
>
> The code in _check_num_instances_quota() looks like it has most of the
> logic for adjusting the values when it gets an OverQuota exception
> from the initial reservation request based on max_count - but always
> ends up raising TooManyInstances .
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/nova/+bug/1224453/+subscriptions
>

Mike BRIGHT (k5-openstack) on 2013-10-09

Changed in nova:
status:	Confirmed → In Progress

Mike BRIGHT (k5-openstack) on 2013-11-12

Changed in nova:
status:	In Progress → Fix Committed

Nikola Đipanov (ndipanov) on 2013-11-13

Changed in nova:
milestone:	none → icehouse-1
importance:	Undecided → Medium

Revision history for this message

Thierry Carrez (ttx) wrote on 2013-11-18:

#14

@Mike: could you add a link to the review/change that fixed it ?

Revision history for this message

Mike BRIGHT (k5-openstack) wrote on 2013-11-18:

#15

Here are links to the review/changes:

on master - https://review.openstack.org/#/c/51263/

on stable/havana - https://review.openstack.org/#/c/56202/

Thierry Carrez (ttx) on 2013-12-04

Changed in nova:
status:	Fix Committed → Fix Released

Thierry Carrez (ttx) on 2014-04-17

Changed in nova:
milestone:	icehouse-1 → 2014.1

OpenStack Compute (nova)

min_count ignored for instance create

Bug Description

Other bug subscribers

Remote bug watches

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Compute (nova)	Fix Released	Medium	Mike BRIGHT	OpenStack Compute (nova) 2014.1 "icehouse"
	Havana	Fix Released	Medium	Nikola Đipanov	OpenStack Compute (nova) 2013.2.1