RFE: allow replacing the QoS policy of bound port

Bug #1882804 reported by Balazs Gibizer
Affects: neutron
Status: Fix Released
Importance: Wishlist
Assigned to: Lajos Katona

Bug Description

Problem
=======

Neutron and Nova support creating servers with a port that has a QoS minimum bandwidth policy rule. Such a server create request results in a bandwidth resource allocation in placement, which ensures that the server is placed on a compute host where enough bandwidth is available.

However, changing the minimum bandwidth guarantee of a bound port is not supported yet.

The end user can update the bound port to point to a different QoS policy, and Neutron will reflect that change in the resource_request of the port, but the bandwidth allocation in placement is not updated.

Steps to reproduce
=================

# Set up net, subnet, QoS policies with different min bw rules
# and a port with the first policy (qp0)
#
openstack network create net0 \
  --provider-network-type vlan \
  --provider-physical-network physnet0 \
  --provider-segment 100

openstack subnet create subnet0 \
  --network net0 \
  --subnet-range 10.0.4.0/24

openstack network qos policy create qp0
openstack network qos rule create qp0 \
  --type minimum-bandwidth \
  --min-kbps 1000 \
  --egress
openstack network qos rule create qp0 \
  --type minimum-bandwidth \
  --min-kbps 1000 \
  --ingress

openstack port create port-normal-qos \
  --network net0 \
  --vnic-type normal \
  --qos-policy qp0

openstack network qos policy create qp1
openstack network qos rule create qp1 \
  --type minimum-bandwidth \
  --min-kbps 2000 \
  --egress
openstack network qos rule create qp1 \
  --type minimum-bandwidth \
  --min-kbps 2000 \
  --ingress

# Create a nova server with the port and check the resource_request
# of the port and the resulting allocation in placement
#
openstack --os-compute-api-version 2.72 \
  server create vm1 \
  --flavor c1 \
  --image cirros-0.4.0-x86_64-disk \
  --nic port-id=port-normal-qos \
  --wait

openstack port show port-normal-qos \
  -f table \
  -c binding_profile -c resource_request -c status -c device_owner

openstack --os-placement-api-version 1.30 \
  resource provider usage show 1110cf59-cabf-526c-bacc-08baabbac692

# Change the QoS policy from qp0 to qp1 on the bound port
#
openstack port set port-normal-qos \
  --qos-policy qp1

# The resource request of the port is updated according to qp1
#
openstack port show port-normal-qos \
  -f table \
  -c binding_profile -c resource_request -c status \
  -c device_owner -c device_id
# But the resource allocation in placement is not changed
# according to qp1
openstack --os-placement-api-version 1.30 \
  resource provider usage show 1110cf59-cabf-526c-bacc-08baabbac692

Proposed Solution
=================

This use case was discussed in the Neutron-Nova cross-project session of the Victoria PTG [1]. The outcome was to try to implement the placement update on the Neutron side. This could be achieved by the following high-level sequence (a rough sketch of these steps in Python follows the notes below):

1) Neutron receives the PUT /ports/{port_id} API request for a bound port where the qos_policy_id is requested to be changed

2) Neutron calculates the difference in the resource_request of the port due to the requested policy change. If there is no change then the port update continues as today. If there is a change then

3) Neutron calls the Placement GET /allocations/{consumer_uuid} API [2], where the consumer_uuid is the device_id of the port. In the Placement response Neutron updates the requested resources under the "allocations".{rp_uuid} key, where the rp_uuid is the value of the binding:profile["allocation"] key of the port. The resources subdictionary is updated by applying the resource request difference calculated at #2). The rest of the structure is left unchanged, including the generation keys and the consumer_generation key.

4) Neutron calls the Placement PUT /allocations/{consumer_uuid} API[3] with the updated allocation structure.

4a) If Placement returns a success response then the port update continues as today.

4b) If Placement returns HTTP 409 with the error code "placement.concurrent_update" then Neutron needs to repeat steps #3) and #4).

4c) If Placement returns HTTP 409 with another error code then there is not enough bandwidth available on the device, so the QoS policy change should be rejected.

Notes:
* use Placement API version 1.28 or higher, as from that version the GET response and the PUT request have a symmetric structure

* ports that are not bound can be skipped as they will not have resource allocation in placement

* if the Nova server is deleted during the QoS policy update then Neutron might query placement with a non-existing consumer_uuid at #3). Placement returns an empty allocation in this case, and Neutron needs to fail at #3) when trying to modify the received allocation locally, because Placement would otherwise blindly accept PUT-ing back a new allocation at #4) without any error, resulting in a resource leak.

* A single port only allocates from one resource provider but multiple ports bound to the same Nova server might allocate from the same resource provider. So it is important to apply the resource request difference at #3) instead of copying the new resource request of the port into the allocation.
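
Below is a minimal, illustrative sketch (in Python) of steps 2)-4) and the notes above; it is not the actual neutron implementation. It assumes "session" is a requests-compatible HTTP session already authenticated against the placement endpoint, and the helper names (resource_request_diff, update_allocation) are made up for this example.

import time

PLACEMENT_API = {'OpenStack-API-Version': 'placement 1.28'}


def resource_request_diff(old_resources, new_resources):
    """Per-resource-class difference (new - old) of the port's request."""
    classes = set(old_resources) | set(new_resources)
    return {rc: new_resources.get(rc, 0) - old_resources.get(rc, 0)
            for rc in classes}


def update_allocation(session, consumer_uuid, rp_uuid, diff, max_retries=10):
    for _ in range(max_retries):
        # 3) fetch the consumer's current allocations
        resp = session.get('/allocations/%s' % consumer_uuid,
                           headers=PLACEMENT_API)
        body = resp.json()
        if not body.get('allocations'):
            # the server may have been deleted in the meantime; do not PUT
            # back a new allocation as that would leak resources
            raise RuntimeError('no allocation for consumer %s' % consumer_uuid)

        # apply the diff only on the provider the port allocates from and
        # leave the generations and the other providers untouched
        resources = body['allocations'][rp_uuid]['resources']
        for rc, delta in diff.items():
            value = resources.get(rc, 0) + delta
            if value:
                resources[rc] = value
            else:
                resources.pop(rc, None)  # placement rejects 0 values

        # 4) write the whole structure back
        resp = session.put('/allocations/%s' % consumer_uuid, json=body,
                           headers=PLACEMENT_API)
        if resp.status_code in (200, 204):
            return True   # 4a) success, the port update continues
        if resp.status_code == 409 and 'placement.concurrent_update' in resp.text:
            time.sleep(0.1)
            continue      # 4b) allocation changed concurrently, retry
        if resp.status_code == 409:
            return False  # 4c) not enough bandwidth, reject the policy change
        resp.raise_for_status()
    raise RuntimeError('too many concurrent allocation updates')

The caller would compute the diff from the port's resource_request before and after the policy change, and reject the QoS policy update if update_allocation() returns False.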

[1] https://etherpad.opendev.org/p/nova-victoria-ptg
[2] https://docs.openstack.org/api-ref/placement/?expanded=list-allocations-detail#list-allocations
[3] https://docs.openstack.org/api-ref/placement/?expanded=list-resource-provider-allocations-detail#list-resource-provider-allocations

tags: added: rfe
description: updated
description: updated
Changed in neutron:
status: New → Triaged
status: Triaged → New
tags: added: rfe-triaged
removed: rfe
Changed in neutron:
assignee: nobody → Lajos Katona (lajos-katona)
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

We discussed this RFE at the last drivers team meeting http://eavesdrop.openstack.org/meetings/neutron_drivers/2020/neutron_drivers.2020-06-26-14.00.log.html#l-18 and approved it. Please continue with the implementation now.

tags: added: rfe-approved
removed: rfe-triaged
Changed in neutron:
milestone: none → victoria-2
Changed in neutron:
importance: Undecided → Wishlist
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/741452

Changed in neutron:
milestone: victoria-2 → victoria-3
Revision history for this message
Slawek Kaplonski (slaweq) wrote :
Changed in neutron:
milestone: victoria-3 → none
Revision history for this message
Lajos Katona (lajos-katona) wrote :

I'll try to summarize the current approach (a rough sketch of it follows at the end of this comment):

* In case the port has a QoS policy with a minimum bandwidth rule and the update changes the QoS policy to a different min-kbps value:

** neutron calls placement (see: https://review.opendev.org/741452 ) to fetch the allocation and update the related allocation in placement's db.

** As Gibi wrote above, due to parallel placement allocation updates it is possible that
*** the call fails because the allocation was moved to another RP (server move)
*** the call must be repeated because the allocation was changed (e.g. vCPU allocation for the same device/consumer)

** The proposal is to make the placement call outside the db transaction, as a PORT AFTER_UPDATE callback.

** It is possible that the allocation change is still in progress while the other operations related to the port update are already finished, so a new field "update_status" should be used to show the user the status of the port update.

*** The "port_update_status" extension should now be supported by qos_plugin.

*** The "update_status" should be "ACTIVE" for a freshly created port.

*** For updates that do not affect the QoS policy the "update_status" changes to ACTIVE (effectively no change).

*** In case the port update affects a QoS policy with a min-kbps minimum bandwidth rule, the current _validate_update_port_callback (https://opendev.org/openstack/neutron/src/branch/master/neutron/services/qos/qos_plugin.py#L175 ) changes the "update_status" field to PORT_STATUS_BUILD (it is a good question whether that is the right status for this case).

*** The "update_status" field can be changed to "ACTIVE" once the AFTER_UPDATE callback succeeds and the placement allocation update is done.

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Another approach, proposed by Rodolfo (ralonsoh), is to make the placement call to update the allocations (PUT /allocations/{consumer_uuid}) before the db transaction for the port update begins (a BEFORE_UPDATE callback for Port; see the sketch after the cons below).

Pros:
* no change in the current port update API (synchronous behaviour: when the API returns, the user can expect that all operations are finished)
* the possibly time-consuming REST call to placement is outside (before) the db transaction.
* as nothing is written to the DB in case of failure, there is no need for rollback.

Cons:
* if the allocations in placement need to be updated, the user experiences an extra time cost for the port update (this is true anyway)
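
A matching hypothetical sketch of this BEFORE_UPDATE variant, with the same placeholder helper names as in the previous sketch; here a placement failure simply aborts the port update before anything is written to the neutron db.

from neutron_lib.callbacks import events
from neutron_lib.callbacks import exceptions
from neutron_lib.callbacks import registry
from neutron_lib.callbacks import resources


def _port_before_update(resource, event, trigger, payload=None):
    diff = _minbw_diff(payload.states[0], payload.latest_state)
    if not diff:
        return
    if not _update_allocation_in_placement(payload.latest_state, diff):
        # raising aborts the API request; nothing has been written to the
        # neutron db yet, so no rollback is needed on the neutron side
        raise exceptions.CallbackFailure(
            errors=['placement rejected the requested bandwidth change'])


registry.subscribe(_port_before_update, resources.PORT, events.BEFORE_UPDATE)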

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/747774

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/748279

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/748363

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/748507

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Short summary of latest discussions with Rodolfo and experiences with PoC code:

1) Alternative one (see: https://review.opendev.org/747774 ): the idea is to do the placement allocation update before the db transaction by subscribing to the PORT BEFORE_UPDATE event.
1.1) Pros: as PUT /allocations/* happens before the db transaction, if the placement side fails the port update fails, so no inconsistency.
1.2) Cons:
* if the placement PUT /allocations/* succeeds but later there is an error in neutron (like a db issue or similar) which forces a rollback, the change in placement cannot be rolled back, causing neutron & placement inconsistency.

* The update can be slow due to reaching out to the Placement REST API (and retries, etc.).

2) Alternative two is to call placement after the db transaction, by subscribing to the PORT AFTER_UPDATE event.
2.1) Pros: if neutron fails during the port update we do not call PUT /allocations/*, so no inconsistency from this side.
2.2) Cons:
* if PUT /allocations/* fails there is no way to roll back the neutron db (or my lack of knowledge is the limit here).
** The proposed solution (which doesn't solve the issue of db rollback in case PUT allocations fails) is to add a new port attribute "update_status". This would be set to UPDATE_PENDING when a QoS min_bw change is happening and there is a need to call the placement REST API. If the PUT allocations succeeds, the update_status changes back to ACTIVE, otherwise to ERROR. This way the user has feedback on whether all aspects of the update were successful.

* it can take a while until the port's update_status goes to ACTIVE, as the placement REST API is called (with retries, etc.)

3) Another option would be (which will not solve the full consistency between the neutron db and placement, and as I see it now is no better in this area than the previous ones) to use a batch notifier for this. In case of a port update where a QoS policy (with min_bw rules) is affected, the notifier notifies placement and, if the PUT allocations operation succeeds, changes the previously set (to UPDATE_PENDING) update_status of the port back to ACTIVE, otherwise to ERROR.

3.1) Pros: no speed change in the port update, as the notification is decoupled from the port update (similar to 2)).

3.2) Cons: really no way to roll back the neutron db in case the placement REST call fails.

Revision history for this message
Lajos Katona (lajos-katona) wrote :

This morning rubasov added a really good extra viewpoint to this:
In case 1) (placement allocation update on the BEFORE_UPDATE event notification), the possibility of a db issue or similar is smaller than the possibility of a REST PUT issue towards placement. If a db error (or any issue after PUT allocations) happens, it would most likely mean over-allocation, and that is a "safer" problem than under-allocation from an admin perspective.

Revision history for this message
Bence Romsics (bence-romsics) wrote :

Yep, I believe we have to accept that some inconsistency will happen occasionally (due to the distributed and non-transactional nature of the neutron-placement system) and we have to choose which side we want to fail on: over- or underallocation. In previous cases around the base feature I believe we chose to occasionally fail by overallocation (therefore wasting some resource but still able to provide guarantees).

It seems to me the cleanup after the overallocation is simple: delete the instance.

Also consider which failure is more likely:
1) neutron db transaction succeeds but the placement transaction after is interrupted
2) the reverse order

My hunch is that (1) is more likely therefore (2) is a better choice (unless we choose (1) AND start tracking the status of the placement transaction).

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

Do we have failure scenarios other than a db failure when data plane enforcement is turned on? I guess in that case the neutron server needs to talk to the neutron agents, which can also fail.

I agree that it is more likely that the placement update fails than that the neutron db update fails, from the neutron perspective. I can also accept that a placement failure is more likely than a neutron agent failure (in case of data plane enforcement). However, I disagree that only resource over-allocation can happen due to a failure after the placement update.

Case A:
* the end user updates the qos policy of a bound port to a new policy that requests _more_ resources than the current one.
* neutron increases the qos resource allocation in placement but then fails to update the db. I guess in this case the client's request has been rejected with an error code (I guess HTTP 500 in case the db is dead).
* the end user waits a bit and retries (maybe it was a transient error); placement is then updated a second time, and as the placement update logic works based on differences, not on absolute values, the resource allocation is increased again. Now the db is updated too. In this case we have a resource over-allocation situation.

But in Case B, the end user updates the qos policy of a bound port to a new policy that requests _less_ resources than the current one. If the placement update succeeds but the neutron db update fails (or the neutron agent fails to update the data plane enforcement config on the interface), then there is resource under-allocation and possible resource overuse. (A small worked example of both cases follows below.)
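
A small worked example with made-up numbers to illustrate both cases:

# hypothetical numbers only, illustrating Case A and Case B
old_min_kbps, new_min_kbps = 1000, 2000
placement_allocation = 1000            # allocation for the old policy

# Case A: placement is updated first, then the neutron db update fails
placement_allocation += new_min_kbps - old_min_kbps   # 2000
# the port still has the old policy in the db, so retrying the same
# request computes the same +1000 diff and applies it again:
placement_allocation += new_min_kbps - old_min_kbps   # 3000 for a 2000 kbps port
# -> over-allocation: capacity is wasted but the guarantee still holds

# Case B: policy change 2000 -> 1000 kbps; placement is lowered to 1000 but
# the neutron db / data plane update fails, so the port still guarantees
# 2000 kbps -> under-allocation and possible overuse of the device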

I agree that deleting the instance will recover from the resource allocation problem, as instance delete works in absolute terms. However, we need a tool that can detect such an allocation problem in the first place. Detecting the allocation problem can only be done by checking every port of a given instance to calculate the absolute amount of resources the instance, and its ports, need. This is because multiple ports can allocate from the same resource provider.

I think the existing 'nova-manage placement heal_allocation' [1] admin CLI command could be extended for detection and healing of such allocations. However today it can only detect and heal totally missing port allocations, to avoid iterating every port of the instance. See logic at [2].

Please note that detaching the problematic port will not fix the resource allocation problem as detach also works based on differences, not based on absolute value. So it will only de-allocate the amount of resource the port _should_ have allocated, not what the port actually has.

[1] https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement
[2] https://github.com/openstack/nova/blob/b5d48043466b53fbdfe7b93c2e4efd449904e593/nova/cmd/manage.py#L1533

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

Additionally, I think migration, resize, live-migration and shelve-offload-then-unshelve operations can also be used to re-sync the resource allocation and therefore resolve the problem. During these operations nova runs the scheduler to select the target host, and such scheduling uses the latest resource_request of the neutron ports.

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Thanks for thinking about it.
Summary: the approach of doing the placement allocation update before the db transaction for the port update starts (BEFORE_UPDATE event) seems to be the safer way to keep consistency issues under control.

To give admins a tool, nova-manage must be extended to handle allocation issues.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master)

Change abandoned by Lajos Katona (<email address hidden>) on branch: master
Review: https://review.opendev.org/748363
Reason: For details see lp:
https://bugs.launchpad.net/neutron/+bug/1882804 from comment #10

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Lajos Katona (<email address hidden>) on branch: master
Review: https://review.opendev.org/748507
Reason: For details see lp:
https://bugs.launchpad.net/neutron/+bug/1882804 from comment #10

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (master)

Reviewed: https://review.opendev.org/741452
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=41e6b9bed8e04a21ad9f81057aefce79a70c0aa3
Submitter: Zuul
Branch: master

commit 41e6b9bed8e04a21ad9f81057aefce79a70c0aa3
Author: elajkat <email address hidden>
Date: Fri Jul 10 17:57:36 2020 +0200

    Add placement client methods for allocations

    Added placement client methods for listing and updating allocations
    and for updating QoS minumum bandwidth allocation.

    Change-Id: I9dab2e9e9728183ee209a038a0347eb7b4a83a1c
    Related-Bug: #1882804

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/748279
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=0380f0c860ec12594f43ba35e986001fa415a267
Submitter: Zuul
Branch: master

commit 0380f0c860ec12594f43ba35e986001fa415a267
Author: elajkat <email address hidden>
Date: Wed Aug 26 14:06:09 2020 +0200

    New exception QosPlacementAllocationConflict

    In case the update of placement allocation fails with conflict and with
    message like: "The requested amount would exceed the capacity" this new
    exception can indicate it for neuton users.

    Change-Id: I2460e7d8025cedd2e92abdc9d769d2ad0b84838b
    Related-Bug: #1882804

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/750349

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (master)

Reviewed: https://review.opendev.org/750349
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=a61f9553d2074d160ac384313971421b7f063765
Submitter: Zuul
Branch: master

commit a61f9553d2074d160ac384313971421b7f063765
Author: elajkat <email address hidden>
Date: Tue Sep 8 13:08:45 2020 +0200

    remove rc from allocation dict if value is 0

    update_qos_minbw_allocation now deletes the resource class keys if the
    value for the give resource class is zero, as placement can't handle 0
    value updates, see the referenced story.

    Change-Id: Iaf23eb313507139c2de76146b82d148bcce9b3d6
    Closes-Bug: #1894825
    Related-Bug: #1882804
    Story: 2008111

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/753230

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.opendev.org/747774
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=87e5131432f181459bde78b31b59204c8e5077e2
Submitter: Zuul
Branch: master

commit 87e5131432f181459bde78b31b59204c8e5077e2
Author: elajkat <email address hidden>
Date: Fri Aug 7 16:00:21 2020 +0200

    Allow replacing the QoS policy of bound port

    Change-Id: Iebdfd2b303f47ff9f049cf583ea754b35ca26f78
    Related-Bug: #1882804
    Depends-On: https://review.opendev.org/748279

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/755180

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (stable/victoria)

Reviewed: https://review.opendev.org/753230
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=73d868b0770cc4ba8500e8058cadf66de98d7666
Submitter: Zuul
Branch: stable/victoria

commit 73d868b0770cc4ba8500e8058cadf66de98d7666
Author: elajkat <email address hidden>
Date: Tue Sep 8 13:08:45 2020 +0200

    remove rc from allocation dict if value is 0

    update_qos_minbw_allocation now deletes the resource class keys if the
    value for the give resource class is zero, as placement can't handle 0
    value updates, see the referenced story.

    Change-Id: Iaf23eb313507139c2de76146b82d148bcce9b3d6
    Closes-Bug: #1894825
    Related-Bug: #1882804
    Story: 2008111
    (cherry picked from commit a61f9553d2074d160ac384313971421b7f063765)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/victoria)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: stable/victoria
Review: https://review.opendev.org/755180
Reason: as per discussion on irc, lets keep it only for Wallaby

Changed in neutron:
milestone: none → wallaby-1
Changed in neutron:
milestone: wallaby-1 → wallaby-2
Changed in neutron:
milestone: wallaby-2 → wallaby-3
Revision history for this message
Lajos Katona (lajos-katona) wrote :

The tempest tests have merged: https://review.opendev.org/c/openstack/tempest/+/743695
so there are no open items left for this RFE.

Changed in neutron:
status: Confirmed → Fix Released