server_group_members quota check failure with multi-create

Bug #1780373 reported by Mike Chen
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   High        Matt Riedemann
  Pike                    Fix Committed  High        Matt Riedemann
  Queens                  Fix Committed  High        Matt Riedemann

Bug Description

Circumstance:
When a single multi-create request asks for more instances in a server group than the server_group_members quota allows, the request still passes the server_group_members quota check.

Actual result:
All servers are created successfully.

Expected result:
The API raises a QuotaExceeded exception.

Steps to reproduce (Queens):
1 nova server-group-create sg affinity (the policy shouldn't matter)
2 In nova.conf, set server_group_members=2 (so we don't need to create too many servers to test)
3 nova boot --flavor flavor --net net-name=netname --image image --max-count 3 server-name
All 3 servers are then created successfully, violating the server_group_members quota.

Mike Chen (chenn2)
Changed in nova:
assignee: nobody → Chen (chenn2)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/580684

Changed in nova:
status: New → In Progress
melanie witt (melwitt) wrote :

First, I assume your boot request included the instance group, for example:

  nova boot --flavor flavor --net net-name=netname --image image --max-count 3 --hint group=<sg uuid> server-name

I was able to reproduce this with devstack by setting quota to 2 and using the above command ^ and found the reason for the bug is that the resource count for server_group_members during the quota check counts instance records for a user, and we're not creating instance records until much later on, in conductor. So on a fresh install and a multi-create request, the quota check repeatedly counts 0 group members for a user as we add members to the instance_group_members table in the API database.

I added several comments on the patch review, but to summarize, it seems like we need to consider counting build requests in addition to instance records for multi-create scenarios (and maybe more) while de-duping instance uuids for the small window where a build request and instance record can co-exist for the same instance uuid, to avoid over-counting.
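To make the counting problem and the proposed direction concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not nova's actual code: `ToyDeployment`, `count_instances_only`, `count_with_build_requests` and `multi_create` are hypothetical names, the per-user scoping of the real count is ignored, and no cleanup of a failed request is modeled.

```python
from dataclasses import dataclass, field


class OverQuota(Exception):
    """Raised when adding one more member would exceed the limit."""


@dataclass
class ToyDeployment:
    # Hypothetical stand-ins for the API database and a cell database.
    group_members: list = field(default_factory=list)  # uuids added to the group
    build_requests: set = field(default_factory=set)   # uuids not yet scheduled
    instances: set = field(default_factory=set)        # uuids with instance records

    def count_instances_only(self):
        # Buggy count: only members that already have an instance record.
        return len(set(self.group_members) & self.instances)

    def count_with_build_requests(self):
        # Proposed count: members with a build request or an instance record.
        # The set union de-dupes a uuid that briefly has both.
        known = self.build_requests | self.instances
        return len(set(self.group_members) & known)


def multi_create(deployment, uuids, limit, count_fn):
    """Check the quota once per requested server, then record the member."""
    for uuid in uuids:
        if count_fn() + 1 > limit:
            raise OverQuota(f"limit {limit} reached before adding {uuid}")
        deployment.build_requests.add(uuid)
        deployment.group_members.append(uuid)


if __name__ == "__main__":
    buggy = ToyDeployment()
    multi_create(buggy, ["a", "b", "c"], 2, buggy.count_instances_only)
    print("instances-only count let through:", buggy.group_members)

    fixed = ToyDeployment()
    try:
        multi_create(fixed, ["a", "b", "c"], 2, fixed.count_with_build_requests)
    except OverQuota as exc:
        print("build-request-aware count rejected the request:", exc)
```

With the instances-only count every check sees 0 members, because no instance record exists yet, so all three servers pass a limit of 2. Counting build requests as well makes the third check fail.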

Changed in nova:
importance: Undecided → High
tags: added: quotas
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/580755

Changed in nova:
assignee: Chen (chenn2) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
tags: added: api
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/581845

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/581846

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/581866

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/581867

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/580755
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f9874e059df50dc81803fcfdfd1045cc09624894
Submitter: Zuul
Branch: master

commit f9874e059df50dc81803fcfdfd1045cc09624894
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 6 16:10:48 2018 -0400

    Add functional regressions tests for server_group_members OverQuota

    Since we started counting quotas in Pike, it is possible to bypass
    the server_group_members quota check if either creating multiple
    servers in a single request or creating one server each in multiple
    concurrent requests. This is because the server_group_members
    count is based on existing server group members in the cell database
    and those group members (instances) don't get created in a cell until
    we get to conductor and after the scheduler picks a host. In other
    words, the server_group_members quota check in the API does not account
    for build requests.

    Change-Id: Icb268ca2f792bfcefd152ba4c6aa13270d9a7900
    Related-Bug: #1780373

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/580684
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bbee9a26a5c64a1463bd9a9f82d735ec17c62d52
Submitter: Zuul
Branch: master

commit bbee9a26a5c64a1463bd9a9f82d735ec17c62d52
Author: Chen <email address hidden>
Date: Fri Jul 6 22:47:12 2018 +0800

    Fix server_group_members quota check

    For example there are 3 instances in a server group (quota is 5).
    When doing multi-creating of 3 more instances in this group
    (would have 6 members), current quota checking scheme will fail to
    prevent this happening, which is not expected.

    This is due to the server_group_members quota check previously
    only counting group members that existed as instance records in
    cell databases and not accounting for build requests which are
    the temporary representation of the instance in the API database
    before the instance is scheduled to a cell.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Change-Id: If439f4486b8fe157c436c47aa408608e639a3e15
    Closes-Bug: #1780373
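
The arithmetic in the commit message's example (3 existing members, quota of 5, 3 more requested) can be traced with a short sketch. It assumes the check runs once per requested server with a delta of 1. `trace_checks` and its parameters are hypothetical names for illustration, not nova functions.

```python
def trace_checks(existing, requested, limit, count_pending):
    """Return (counted_members, allowed) for each per-server quota check.

    existing:      members that already have instance records
    requested:     servers asked for in one multi-create request
    limit:         the server_group_members quota
    count_pending: whether not-yet-scheduled servers from this request
                   are included in the count
    """
    pending = 0
    trace = []
    for _ in range(requested):
        counted = existing + (pending if count_pending else 0)
        allowed = counted + 1 <= limit
        trace.append((counted, allowed))
        if not allowed:
            break
        pending += 1
    return trace


if __name__ == "__main__":
    # Pending servers ignored: every check sees 3 members, so all pass.
    print(trace_checks(3, 3, 5, count_pending=False))
    # Pending servers counted: the checks see 3, 4, then 5 members.
    print(trace_checks(3, 3, 5, count_pending=True))
```

When pending servers are ignored, the count never moves past 3, so the group ends up with 6 members against a quota of 5. When they are counted, the third check sees 5 members and is rejected.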

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b3

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/581845
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c7b0779632a8df7d988e9b52ae1f341af0b0df30
Submitter: Zuul
Branch: stable/queens

commit c7b0779632a8df7d988e9b52ae1f341af0b0df30
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 6 16:10:48 2018 -0400

    Add functional regressions tests for server_group_members OverQuota

    Since we started counting quotas in Pike, it is possible to bypass
    the server_group_members quota check if either creating multiple
    servers in a single request or creating one server each in multiple
    concurrent requests. This is because the server_group_members
    count is based on existing server group members in the cell database
    and those group members (instances) don't get created in a cell until
    we get to conductor and after the scheduler picks a host. In other
    words, the server_group_members quota check in the API does not account
    for build requests.

    Change-Id: Icb268ca2f792bfcefd152ba4c6aa13270d9a7900
    Related-Bug: #1780373
    (cherry picked from commit f9874e059df50dc81803fcfdfd1045cc09624894)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/581846
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1aa81ebfdc585451cbae9c9bbde8adfe339cb0dc
Submitter: Zuul
Branch: stable/queens

commit 1aa81ebfdc585451cbae9c9bbde8adfe339cb0dc
Author: Chen <email address hidden>
Date: Fri Jul 6 22:47:12 2018 +0800

    Fix server_group_members quota check

    For example there are 3 instances in a server group (quota is 5).
    When doing multi-creating of 3 more instances in this group
    (would have 6 members), current quota checking scheme will fail to
    prevent this happening, which is not expected.

    This is due to the server_group_members quota check previously
    only counting group members that existed as instance records in
    cell databases and not accounting for build requests which are
    the temporary representation of the instance in the API database
    before the instance is scheduled to a cell.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Change-Id: If439f4486b8fe157c436c47aa408608e639a3e15
    Closes-Bug: #1780373
    (cherry picked from commit bbee9a26a5c64a1463bd9a9f82d735ec17c62d52)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.6

This issue was fixed in the openstack/nova 17.0.6 release.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/581866
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7858a84ee4d74e40adeecc0cf463e8937a1c5ada
Submitter: Zuul
Branch: stable/pike

commit 7858a84ee4d74e40adeecc0cf463e8937a1c5ada
Author: Matt Riedemann <email address hidden>
Date: Fri Jul 6 16:10:48 2018 -0400

    Add functional regressions tests for server_group_members OverQuota

    Since we started counting quotas in Pike, it is possible to bypass
    the server_group_members quota check if either creating multiple
    servers in a single request or creating one server each in multiple
    concurrent requests. This is because the server_group_members
    count is based on existing server group members in the cell database
    and those group members (instances) don't get created in a cell until
    we get to conductor and after the scheduler picks a host. In other
    words, the server_group_members quota check in the API does not account
    for build requests.

    Change-Id: Icb268ca2f792bfcefd152ba4c6aa13270d9a7900
    Related-Bug: #1780373
    (cherry picked from commit f9874e059df50dc81803fcfdfd1045cc09624894)
    (cherry picked from commit c7b0779632a8df7d988e9b52ae1f341af0b0df30)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/581867
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=64f2104bd104638dce081d2f37ed2bd040ef7068
Submitter: Zuul
Branch: stable/pike

commit 64f2104bd104638dce081d2f37ed2bd040ef7068
Author: Chen <email address hidden>
Date: Fri Jul 6 22:47:12 2018 +0800

    Fix server_group_members quota check

    For example there are 3 instances in a server group (quota is 5).
    When doing multi-creating of 3 more instances in this group
    (would have 6 members), current quota checking scheme will fail to
    prevent this happening, which is not expected.

    This is due to the server_group_members quota check previously
    only counting group members that existed as instance records in
    cell databases and not accounting for build requests which are
    the temporary representation of the instance in the API database
    before the instance is scheduled to a cell.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Change-Id: If439f4486b8fe157c436c47aa408608e639a3e15
    Closes-Bug: #1780373
    (cherry picked from commit bbee9a26a5c64a1463bd9a9f82d735ec17c62d52)
    (cherry picked from commit 1aa81ebfdc585451cbae9c9bbde8adfe339cb0dc)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.8

This issue was fixed in the openstack/nova 16.1.8 release.
