[placement] attempt to put allocation to resource provider that does not host class of resource causes 500

Bug #1704574 reported by Chris Dent
This bug affects 2 people
Affects                   Status         Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released   Medium      Chris Dent
Ocata                     Fix Committed  Undecided   Chris Dent

Bug Description

I made a typo while writing some gabbi tests and uncovered a 500 in the placement service. If you try to allocate to a resource provider that does not host the requested class of resource, capacity checking can raise a KeyError. Given the following gabbi in microversion 1.10:

- name: create a resource provider
  POST: /resource_providers
  data:
      name: an rp
  status: 201

- name: get resource provider
  GET: $LOCATION
  status: 200

- name: create a resource class
  PUT: /resource_classes/CUSTOM_GOLD
  status: 201

- name: add inventory to an rp
  PUT: /resource_providers/$HISTORY['get resource provider'].$RESPONSE['$.uuid']/inventories
  data:
      resource_provider_generation: 0
      inventories:
          VCPU:
              total: 24
          CUSTOM_GOLD:
              total: 5
  status: 200

- name: allocate some of it
  PUT: /allocations/6d9f83db-6eb5-49f6-84b0-5d03c6aa9fc8
  data:
      allocations:
          - resource_provider:
                uuid: $HISTORY['get resource provider'].$RESPONSE['$.uuid']
            resources:
                DISK_GB: 5
                CUSTOM_GOLD: 1
      project_id: 42a32c07-3eeb-4401-9373-68a8cdca6784
      user_id: 66cb2f29-c86d-47c3-8af5-69ae7b778c70
  status: 204

When DISK_GB, for which the provider has no inventory, is checked for capacity, we get:

2017-07-15 17:41:47,224 ERROR [nova.api.openstack.placement.handler] Uncaught exception
Traceback (most recent call last):
  File "nova/api/openstack/placement/handler.py", line 215, in __call__
    return dispatch(environ, start_response, self._map)
  File "nova/api/openstack/placement/handler.py", line 144, in dispatch
    return handler(environ, start_response)
  File "/home/cdent/src/nova/.tox/cover/local/lib/python2.7/site-packages/webob/dec.py", line 131, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "nova/api/openstack/placement/wsgi_wrapper.py", line 29, in call_func
    super(PlacementWsgify, self).call_func(req, *args, **kwargs)
  File "/home/cdent/src/nova/.tox/cover/local/lib/python2.7/site-packages/webob/dec.py", line 196, in call_func
    return self.func(req, *args, **kwargs)
  File "nova/api/openstack/placement/microversion.py", line 268, in decorated_func
    return _find_method(f, version)(req, *args, **kwargs)
  File "nova/api/openstack/placement/util.py", line 138, in decorated_function
    return f(req)
  File "nova/api/openstack/placement/handlers/allocation.py", line 285, in set_allocations
    return _set_allocations(req, ALLOCATION_SCHEMA_V1_8)
  File "nova/api/openstack/placement/handlers/allocation.py", line 249, in _set_allocations
    allocations.create_all()
  File "nova/objects/resource_provider.py", line 1851, in create_all
    self._set_allocations(self._context, self.objects)
  File "/home/cdent/src/nova/.tox/cover/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 979, in wrapper
    return fn(*args, **kwargs)
  File "nova/objects/resource_provider.py", line 1811, in _set_allocations
    before_gens = _check_capacity_exceeded(conn, allocs)
  File "nova/objects/resource_provider.py", line 1615, in _check_capacity_exceeded
    usage = usage_map[key]
KeyError: ('14930a42-78df-4038-aafa-c959e18111e5', 2)
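The failure mode can be illustrated with a minimal sketch. This is not the nova code: the dictionary layout, the `find_missing` helper and the resource class ids are assumptions made for illustration. The only parts taken from the traceback are the provider uuid and the `(provider uuid, resource class id)` key shape; the id 2 is assumed to stand for DISK_GB because that is the class missing from the inventory.

```python
# Hypothetical, simplified stand-in for the usage_map lookup in
# _check_capacity_exceeded. Keys are (resource provider uuid, resource
# class id), matching the shape of the KeyError in the traceback.
RP_UUID = "14930a42-78df-4038-aafa-c959e18111e5"

# Assumed ids, for illustration only.
VCPU, DISK_GB, CUSTOM_GOLD = 0, 2, 10000

# The provider has inventory for VCPU and CUSTOM_GOLD, but not DISK_GB.
usage_map = {
    (RP_UUID, VCPU): {"total": 24, "used": 0},
    (RP_UUID, CUSTOM_GOLD): {"total": 5, "used": 0},
}

# The allocation request from the gabbi test: DISK_GB: 5, CUSTOM_GOLD: 1.
requested = [(RP_UUID, DISK_GB, 5), (RP_UUID, CUSTOM_GOLD, 1)]


def find_missing(usage_map, requested):
    """Return the keys on which an unguarded usage_map[key] would fail."""
    return [
        (rp_uuid, rc_id)
        for rp_uuid, rc_id, _amount in requested
        if (rp_uuid, rc_id) not in usage_map
    ]


missing = find_missing(usage_map, requested)
```

Here `missing` contains the single key `(RP_UUID, DISK_GB)`, which is the key an unguarded `usage_map[key]` lookup would raise on.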

Tags: placement
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/484162

Changed in nova:
assignee: nobody → Chris Dent (cdent)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/484162
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=913149249cc00f50a6219d3ddc86f3600a610c00
Submitter: Jenkins
Branch: master

commit 913149249cc00f50a6219d3ddc86f3600a610c00
Author: Chris Dent <email address hidden>
Date: Sat Jul 15 18:49:57 2017 +0100

    [placement] fix 500 error when allocating to bad class

    Adjust exception handling when calling set_allocations so that a
    KeyError in the usage_map raises an InvalidInventory. When making
    allocations against a resource provider with >1 resource classes
    and where one of those resource classes does not have inventory on the
    provider, we can attempt to get info out of the usage_map that is not
    there, and get a KeyError. This catches the KeyError and turns it into
    an InvalidInventory which eventually results in a 409 response,
    consistent with other responses to bad allocations. Since this is fixing
    a 500, no microversion required.

    Change-Id: I52fa02b56f8e62dfa206a3969a99fab250508760
    Closes-Bug: #1704574
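The approach described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the merged change: the `InvalidInventory` class here is a local stand-in for nova's exception, `check_capacity` and `status_for` are hypothetical helpers, and the capacity arithmetic ignores details of the real check such as reserved amounts and allocation ratios.

```python
class InvalidInventory(Exception):
    """Local stand-in for nova's InvalidInventory exception."""


def check_capacity(usage_map, requested):
    """Hypothetical capacity check over (rp_uuid, rc_id, amount) tuples."""
    for rp_uuid, rc_id, amount in requested:
        key = (rp_uuid, rc_id)
        try:
            usage = usage_map[key]
        except KeyError:
            # The provider has no inventory for this resource class:
            # report it as invalid inventory instead of letting the
            # KeyError bubble up as an uncaught exception (a 500).
            raise InvalidInventory(
                "no inventory of class %s on provider %s" % (rc_id, rp_uuid))
        if usage["used"] + amount > usage["total"]:
            raise InvalidInventory(
                "not enough capacity of class %s on provider %s"
                % (rc_id, rp_uuid))


def status_for(usage_map, requested):
    """Map the outcome to the status codes named in the commit message."""
    try:
        check_capacity(usage_map, requested)
    except InvalidInventory:
        return 409
    return 204


# Assumed ids for illustration: 0 = VCPU, 2 = DISK_GB.
example_usage = {("rp-1", 0): {"total": 24, "used": 0}}
bad_status = status_for(example_usage, [("rp-1", 2, 5)])
good_status = status_for(example_usage, [("rp-1", 0, 4)])
```

With these assumptions, a request for a class the provider does not host yields 409 rather than an unhandled KeyError, while a request that fits the existing inventory yields 204.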

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/485263

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b3

This issue was fixed in the openstack/nova 16.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/485263
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=dc85c089702a9819fdab7c951d427de838a3dcc6
Submitter: Jenkins
Branch: stable/ocata

commit dc85c089702a9819fdab7c951d427de838a3dcc6
Author: Chris Dent <email address hidden>
Date: Sat Jul 15 18:49:57 2017 +0100

    [placement] fix 500 error when allocating to bad class

    Adjust exception handling when calling set_allocations so that a
    KeyError in the usage_map raises an InvalidInventory. When making
    allocations against a resource provider with >1 resource classes
    and where one of those resource classes does not have inventory on the
    provider, we can attempt to get info out of the usage_map that is not
    there, and get a KeyError. This catches the KeyError and turns it into
    an InvalidInventory which eventually results in a 409 response,
    consistent with other responses to bad allocations. Since this is fixing
    a 500, no microversion required.

    NOTE(mriedem): The functional test required some tweaks for Ocata since
    we didn't have the 1.7 microversion in Ocata. In addition, inventories
    in early days required that max_unit be set, otherwise it would default
    to 0 and no allocations could be made. We might wish to consider that a
    bug in older versions that we should fix?

    Change-Id: I52fa02b56f8e62dfa206a3969a99fab250508760
    Closes-Bug: #1704574
    (cherry picked from commit 913149249cc00f50a6219d3ddc86f3600a610c00)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.7

This issue was fixed in the openstack/nova 15.0.7 release.
