min_count ignored for instance create

Bug #1224453 reported by Phil Day
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Mike BRIGHT
Havana
Fix Released
Medium
Nikola Đipanov

Bug Description

The server create API takes min_count and max_count values for the number of instances to be created, where the actual number to be created should be the highest value allowed by quota between these limits.

However the code in compute/api.py does a single check against max_count and then treats the exeception as a failure - resulting in messages such as:

min_count=1
max_count= (quota+1)

"Quota exceeded for instances: Requested 1, but already used 13 of 40 instances"

The code in _check_num_instances_quota() looks like it has most of the logic for adjusting the values when it gets an OverQuota exception from the initial reservation request based on max_count - but always ends up raising TooManyInstances .

Tags: api quotas
Changed in nova:
assignee: nobody → Mike BRIGHT (k5-openstack)
Matt Riedemann (mriedem)
tags: added: api
Revision history for this message
Christopher Yeoh (cyeoh-0) wrote :

Did you see this problem with Havana? Do you have a testcase?

It looks like _check_num_instances_quota does adjust the number of instances if the quota reserveration fails and then recursively call itself with the lower number.

Changed in nova:
status: New → Incomplete
Revision history for this message
Mike BRIGHT (k5-openstack) wrote :

I was not able to recreate the problem using DevStack (so current trunk).

It would be useful if the submitter could tell us how to reproduce.

Revision history for this message
Phil Day (philip-day) wrote :

Point taken - let me see if I can reproduce in DevStack. We hit the problem in our production system and I thought I'd spotted the flaw in the code, but I'd missed the recurrsion Chris has pointed out.

Revision history for this message
Phil Day (philip-day) wrote :

To reproduce in DevStack:

i) Set the default cores quota to -1 (unlimited). i.e add "quota_cores=-1" to nova.conf
ii) Restart the Nova API server
iii) Request a number of instances that spans the allowed quota of instances (default 10)

ubuntu@list-all:/mnt/devstack$ nova list
 +----+------+--------+------------+-------------+----------+
 | ID | Name | Status | Task State | Power State | Networks |
 +----+------+--------+------------+-------------+----------+
 +----+------+--------+------------+-------------+----------+

ubuntu@list-all:/mnt/devstack$ nova boot --image cd4e2b6c-8df9-4f31-a553-b83082dc6977 --num-instances 12 --flavor 1 philx
ERROR: Quota exceeded for instances: Requested 1-12, but already used 0 of 10 instances (HTTP 413) (Request-ID: req-42d0f729-c360-452b-b92f-ff02e673e9d4)

The issue seems to be that the code in compute/api.py _check_num_instances_quota() when it detects an over quota request doesn't correctly handle "-1" values for quotas.

By contrast the code in quota.py limit_check has specific checks for quota values to be >=0 when making checks.

Revision history for this message
Mike BRIGHT (mjbrightfr) wrote : Re: [Bug 1224453] Re: min_count ignored for instance create

Cool I got a bug to fix!

Thanks Phil, I'll have a look.

On 25 September 2013 18:15, Phil Day <email address hidden> wrote:

> To reproduce in DevStack:
>
> i) Set the default cores quota to -1 (unlimited). i.e add
> "quota_cores=-1" to nova.conf
> ii) Restart the Nova API server
> iii) Request a number of instances that spans the allowed quota of
> instances (default 10)
>
> ubuntu@list-all:/mnt/devstack$ nova list
> +----+------+--------+------------+-------------+----------+
> | ID | Name | Status | Task State | Power State | Networks |
> +----+------+--------+------------+-------------+----------+
> +----+------+--------+------------+-------------+----------+
>
> ubuntu@list-all:/mnt/devstack$ nova boot --image
> cd4e2b6c-8df9-4f31-a553-b83082dc6977 --num-instances 12 --flavor 1 philx
> ERROR: Quota exceeded for instances: Requested 1-12, but already used 0 of
> 10 instances (HTTP 413) (Request-ID:
> req-42d0f729-c360-452b-b92f-ff02e673e9d4)
>
> The issue seems to be that the code in compute/api.py
> _check_num_instances_quota() when it detects an over quota request
> doesn't correctly handle "-1" values for quotas.
>
> By contrast the code in quota.py limit_check has specific checks for
> quota values to be >=0 when making checks.
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/1224453
>
> Title:
> min_count ignored for instance create
>
> Status in OpenStack Compute (Nova):
> Incomplete
>
> Bug description:
> The server create API takes min_count and max_count values for the
> number of instances to be created, where the actual number to be
> created should be the highest value allowed by quota between these
> limits.
>
> However the code in compute/api.py does a single check against
> max_count and then treats the exeception as a failure - resulting in
> messages such as:
>
> min_count=1
> max_count= (quota+1)
>
> "Quota exceeded for instances: Requested 1, but already used 13 of 40
> instances"
>
>
> The code in _check_num_instances_quota() looks like it has most of the
> logic for adjusting the values when it gets an OverQuota exception
> from the initial reservation request based on max_count - but always
> ends up raising TooManyInstances .
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/nova/+bug/1224453/+subscriptions
>

Revision history for this message
Phil Day (philip-day) wrote :

Instead of trying to fix OverQuota logic in _check_num_instances_quota(), why not just repeatedly call QUOTAS.reserve with a decrementing valued until it works ? That would leave all of the quota checking logic inside the quota class - which is better from an abstraction perspective.

Revision history for this message
Mike BRIGHT (k5-openstack) wrote :

OK, I can see the value of having the OverQuota logic completely in QUOTAS.reserve, but that seems incongruent to me with the idea of calling QUOTAS.reserve more than once as done at the moment, and more times as you suggest.
(Maybe I just need to get with the program and accept that call/Exception is a nice way of doing things)

I note that very similar logic is employed in compute/api.py for the resize function.

It seems to me that currently when quota_cores or quota_ram is set to unlimited (-1) headroom doesn't reflect this.
Wouldn't it be better to preserve the semantics of headroom and make it aware of unlimited quotas?

How about creating a headroom function something like:

    def _calculate_headroom(quotas, instance_type):
            headroom = dict((res, quotas[res] -
                             (usages[res]['in_use'] + usages[res]['reserved']))
                            for res in quotas.keys())

            # If quota_cores is unlimited, set cores headroom based on instances headroom:
            if (quotas['cores'] == -1):
                if instance_type['vcpus']:
                    headroom['cores'] = headroom['instances'] * instance_type['vcpus']
                 else:
                    headroom['cores'] = headroom['instances'] # or just 0?

            # If quota_ram is unlimited, set ram headroom based on instances headroom:
            if (quotas['ram'] == -1):
                if instance_type['memory_mb']:
                    headroom['ram'] = headroom['instances'] * instance_type['memory_mb']
                else:
                    headroom['ram'] = headroom['instances'] # or just 0?

            return headroom

Which would be called by _check_num_instances_quota() and also resize().

I’m not sure if headroom should be set to zero or not if the associated instance type is not set.
It would only really matter if resource is set to ‘ram’ or ‘cores’ in the line(s):
                 used = quotas[resource] - headroom[resource]

Thoughts?

Changed in nova:
status: Incomplete → Confirmed
tags: added: quotas
Revision history for this message
Stanislaw Pitucha (stanislaw-pitucha) wrote :

Unless I'm misunderstanding the "repeatedly call QUOTAS.reserve with a decrementing valued until it works" - wouldn't that enable DoS scenarios where user requests between 1 and 5738219047392 instances?

Since the db layer doesn't know anything about instance flavors, only resources, it looks like it still requires some logic in the calling site to translate OverQuota back into the number of instances that didn't fit. Although I do agree that trying to reserve the max first and then only adjusting on error may be a good idea. Hopefully the user is reasonable and most of the requests will be fulfilled.

Revision history for this message
Mike BRIGHT (k5-openstack) wrote :

It's a good point about DoS, if repeatedly calling QUOTAS.reserve then it would be necessary to apply some reasonable limits to prevent DoS.

But what about what I proposed which would not require more than the current 2nd call to QUOTAS.reserve?

Does it make sense?

Revision history for this message
Phil Day (philip-day) wrote :

Stan's point about DoS is valid - I guess we could always start from min(max_count, quota_value), but for a user with a high quota that might still be quite a lot of iterations.

My main concern with the current approach is the way the logic for interpreting quota values is split between the Quota class and the headroom calculation in _check_num_instances_quota() - which is of course the source of this bug.

What about if the headroom information was returned as part of the OverQuota exception (quotas. usage, and overs already are) - and a lack of a specific value in headroom means unlimited for that resource ?

Then the headroom calculation could be encapsulated in the Quota class

Revision history for this message
Mike BRIGHT (k5-openstack) wrote :

I've been looking at this in the debugger and I see that it's in quota.py: quota_reserve() that the OverQuota exception is thrown.
I could add the current headroom calculation code in that function and return headroom as one of the "exc.kwargs" arguments.

HOWEVER, as we don't have instance_type available in that function, headroom would still require to be corrected for the case of quota_vcpus == -1, from outside the function (at two places in compute/api.py).
Because if quota_vcpus is -1 and instance_type['vcpus'] is set then we should set headroom as:
    headroom['cores'] = headroom['instances'] * instance_type['vcpus']

So can someone advise here?
Is there a way of passing instance_type into QUOTAS.reserve ... down to quota_reserve() ?
I'm unsure of the plumbing.

Or should I correct headroom (yuk!) after receiving the OverQuota exception (in 2 places in compute/api.py).
In that case I still suggest the _calculate_headroom() function previously proposed here.

NOTE: As a separate issue, I have not understood Phils' suggestion that QUOTAS.reserve be called several times ...

Revision history for this message
Phil Day (philip-day) wrote :

I don't really see why you need instance type passed in, you can calculate the number of vcps that are being requested for each instance as "req_cores/instances" can't you ?

Ignore my earlier suggestion about calling Quote.reserve multiple times - doing the headroom calculation inside reserve would avoid that.

Revision history for this message
Mike BRIGHT (mjbrightfr) wrote :

Yes, you're right - thanks for pointing this out.

On 8 October 2013 11:16, Phil Day <email address hidden> wrote:

> I don't really see why you need instance type passed in, you can
> calculate the number of vcps that are being requested for each instance
> as "req_cores/instances" can't you ?
>
>
> Ignore my earlier suggestion about calling Quote.reserve multiple times -
> doing the headroom calculation inside reserve would avoid that.
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/1224453
>
> Title:
> min_count ignored for instance create
>
> Status in OpenStack Compute (Nova):
> Confirmed
>
> Bug description:
> The server create API takes min_count and max_count values for the
> number of instances to be created, where the actual number to be
> created should be the highest value allowed by quota between these
> limits.
>
> However the code in compute/api.py does a single check against
> max_count and then treats the exeception as a failure - resulting in
> messages such as:
>
> min_count=1
> max_count= (quota+1)
>
> "Quota exceeded for instances: Requested 1, but already used 13 of 40
> instances"
>
>
> The code in _check_num_instances_quota() looks like it has most of the
> logic for adjusting the values when it gets an OverQuota exception
> from the initial reservation request based on max_count - but always
> ends up raising TooManyInstances .
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/nova/+bug/1224453/+subscriptions
>

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → icehouse-1
importance: Undecided → Medium
Revision history for this message
Thierry Carrez (ttx) wrote :

@Mike: could you add a link to the review/change that fixed it ?

Revision history for this message
Mike BRIGHT (k5-openstack) wrote :

Here are links to the review/changes:

on master - https://review.openstack.org/#/c/51263/

on stable/havana - https://review.openstack.org/#/c/56202/

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-1 → 2014.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.