Missing allocation for ports with resource request

Bug #1819923 reported by Balazs Gibizer
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Balazs Gibizer

Bug Description

Before I97f06d0ec34cbd75c182caaa686b8de5c777a576 [1] it was possible to create servers with neutron ports that had a resource_request (e.g. a port with a QoS minimum bandwidth policy rule) without allocating the requested resources in placement. Nova did not consider the resource request of a Neutron port before microversion 2.72.

So there could be servers with ports for which the allocation needs to be healed in placement.

[1] https://review.openstack.org/#/c/630721/
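
As a rough illustration of the condition described above, the sketch below picks out ports that carry a resource request but have no recorded allocation. It works on plain dictionaries rather than a real Neutron client. The assumption that a healed port stores the resource provider UUID under an "allocation" key in its binding profile, and all port IDs, trait names and UUIDs, are illustrative and not taken from this report.

```python
def ports_needing_heal(ports):
    """Return the ports that request resources but record no allocation."""
    needing_heal = []
    for port in ports:
        request = port.get("resource_request") or {}
        profile = port.get("binding:profile") or {}
        # Assumption: a healed port stores the resource provider UUID
        # under the "allocation" key of its binding profile.
        if request.get("resources") and not profile.get("allocation"):
            needing_heal.append(port)
    return needing_heal


# Hypothetical sample data: one plain port, one unhealed QoS port and
# one QoS port that already points at a resource provider.
qos_request = {
    "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
    "required": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"],
}
ports = [
    {"id": "plain-port", "resource_request": None, "binding:profile": {}},
    {"id": "qos-port-unhealed", "resource_request": qos_request,
     "binding:profile": {}},
    {"id": "qos-port-healed", "resource_request": qos_request,
     "binding:profile": {"allocation": "hypothetical-rp-uuid"}},
]
unhealed_ids = [port["id"] for port in ports_needing_heal(ports)]
print(unhealed_ids)
```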

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
importance: Undecided → Medium
status: New → In Progress
tags: added: nova-manage
Balazs Gibizer (balazs-gibizer) wrote :
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/655457

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/655458

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/655459

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Balazs Gibizer (<email address hidden>) on branch: master
Review: https://review.opendev.org/637953
Reason: superseded by multiple refactor patches on topic bug/1819923

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/655457
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e6fc3162612e4dff73fc91519269f2318fa4e750
Submitter: Zuul
Branch: master

commit e6fc3162612e4dff73fc91519269f2318fa4e750
Author: Balazs Gibizer <email address hidden>
Date: Wed Apr 24 09:54:30 2019 +0200

    pull out functions from _heal_allocations_for_instance

    There are two separate healing steps going on in
    _heal_allocations_for_instance: one that heals a missing allocation and
    the other that heals the project_id and the user_id in the existing
    allocation. This patch pulls the separate steps out into separate
    functions.

    Related-Bug: #1819923

    Change-Id: I4b85fc22d2e8f57f718cde90bf556384b169d635

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/667994

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/655458
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9adcf53210747d34248702874494537ef2a46781
Submitter: Zuul
Branch: master

commit 9adcf53210747d34248702874494537ef2a46781
Author: Balazs Gibizer <email address hidden>
Date: Wed Apr 24 10:25:19 2019 +0200

    reorder conditions in _heal_allocations_for_instance

    The new order will make it simple to pull the placement update out from
    the different healing steps to a single place.

    Related-Bug: #1819923
    Change-Id: Iff5b73d8e818fb1145690d0eeff880d98424fa1d

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/637954
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=307999c58101a5920e23ef181643d3b46bbf987c
Submitter: Zuul
Branch: master

commit 307999c58101a5920e23ef181643d3b46bbf987c
Author: Balazs Gibizer <email address hidden>
Date: Mon Mar 18 16:49:57 2019 +0100

    Prepare _heal_allocations_for_instance for nested allocations

    When no allocations exist for an instance the current heal code uses a
    report client call that can only handle allocations from a single RP.
    This call is now replaced with a more generic one so in a later patch
    port allocations can be added to this code path too.

    Related-Bug: #1819923
    Change-Id: Ide343c1c922dac576b1944827dc24caefab59b74
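
To illustrate why a single-RP call is not enough once port allocations are involved, here is a minimal sketch of an allocation that spans two resource providers. It assumes the mapping shape {rp_uuid: {"resources": {resource_class: amount}}}; the merge_allocations helper and all UUIDs are hypothetical and are not nova code.

```python
def merge_allocations(base, extra):
    """Merge two {rp_uuid: {"resources": {...}}} mappings, summing amounts.

    Neither input is modified; a new mapping is returned.
    """
    merged = {
        rp_uuid: {"resources": dict(alloc["resources"])}
        for rp_uuid, alloc in base.items()
    }
    for rp_uuid, alloc in extra.items():
        resources = merged.setdefault(rp_uuid, {"resources": {}})["resources"]
        for resource_class, amount in alloc["resources"].items():
            resources[resource_class] = resources.get(resource_class, 0) + amount
    return merged


# Hypothetical example: compute resources come from the compute node RP,
# bandwidth comes from a separate RP in the same provider tree.
compute_allocation = {
    "compute-rp-uuid": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}},
}
port_allocation = {
    "bandwidth-rp-uuid": {"resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}},
}
instance_allocation = merge_allocations(compute_allocation, port_allocation)
print(sorted(instance_allocation))
```

The resulting mapping has one entry per resource provider, which is the shape a generic "put all allocations for this consumer" call needs.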

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/655459
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e2866609bbb4e70e2b781a80ecc1ad0bccf93813
Submitter: Zuul
Branch: master

commit e2866609bbb4e70e2b781a80ecc1ad0bccf93813
Author: Balazs Gibizer <email address hidden>
Date: Wed Apr 24 13:02:01 2019 +0200

    pull out put_allocation call from _heal_*

    Both allocation healing steps call the placement API. This patch pulls
    out the placement updating code to a single place. To do that it changes
    the healing steps to only generate / update the allocation individually
    and then at the end of the healing there will be a single placement
    update with this allocation.

    This will help us to include the port related allocation into the instance
    allocation by modifying a single place in the code.

    Related-Bug: #1819923

    Change-Id: I0e9f9a488141da599c10af8cabb4f6a5d111104f
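
The structure described in this commit message can be sketched roughly as follows: each healing step only returns an updated in-memory allocation (or None when it has nothing to do), and the placement update happens once at the end. Every name here (FakePlacement, heal_instance, the step functions and the dictionary keys of the instance) is hypothetical and only stands in for the real nova and placement code.

```python
class FakePlacement:
    """Hypothetical stand-in that only records allocation updates."""

    def __init__(self):
        self.put_calls = []

    def put_allocations(self, consumer_uuid, body):
        self.put_calls.append((consumer_uuid, body))


def heal_missing_allocation(instance, body):
    """Return a body with the instance's resources if none are allocated."""
    if body["allocations"]:
        return None  # an allocation already exists, nothing to heal
    return dict(body, allocations={
        instance["compute_rp"]: {"resources": dict(instance["resources"])},
    })


def heal_project_and_user(instance, body):
    """Return a body with the right project_id and user_id if they differ."""
    wanted = {"project_id": instance["project_id"],
              "user_id": instance["user_id"]}
    if all(body.get(key) == value for key, value in wanted.items()):
        return None
    return dict(body, **wanted)


def heal_instance(placement, instance, body):
    """Run all healing steps, then update placement at most once."""
    healed = body
    for step in (heal_missing_allocation, heal_project_and_user):
        updated = step(instance, healed)
        if updated is not None:
            healed = updated
    if healed is not body:
        # Single place where placement is updated.
        placement.put_allocations(instance["uuid"], healed)
    return healed


placement = FakePlacement()
instance = {"uuid": "instance-uuid", "compute_rp": "compute-rp-uuid",
            "resources": {"VCPU": 2}, "project_id": "proj", "user_id": "user"}
healed = heal_instance(placement, instance, {"allocations": {}})
print(len(placement.put_calls))
```

With this shape, adding port allocations means adding one more step to the tuple; the placement call itself does not need to change.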

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/667994
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=87365c760ee89493db87183c4c751ba5fc7b921a
Submitter: Zuul
Branch: master

commit 87365c760ee89493db87183c4c751ba5fc7b921a
Author: Matt Riedemann <email address hidden>
Date: Thu Jun 27 13:24:33 2019 -0400

    Add integration testing for heal_allocations

    This adds a simple scenario for the heal_allocations CLI
    to the post_test_hook script run at the end of the nova-next
    job. The functional testing in-tree is pretty extensive but
    it's always good to have real integration testing.

    Change-Id: If86e4796a9db3020d4fdb751e8bc771c6f98aa47
    Related-Bug: #1819923

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/668925

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/669879

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/637955
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=54dea2531c887f77e4b7a8e7edb978d8f1ccfe50
Submitter: Zuul
Branch: master

commit 54dea2531c887f77e4b7a8e7edb978d8f1ccfe50
Author: Balazs Gibizer <email address hidden>
Date: Mon Mar 18 17:24:01 2019 +0100

    nova-manage: heal port allocations

    Before I97f06d0ec34cbd75c182caaa686b8de5c777a576 it was possible to
    create servers with neutron ports which had resource_request (e.g. a
    port with QoS minimum bandwidth policy rule) without allocating the
    requested resources in placement. So there could be servers for which
    the allocation needs to be healed in placement.

    This patch extends the nova-manage heal_allocations CLI to create the
    missing port allocations in placement and update the port in neutron
    with the resource provider uuid that is used for the allocation.

    There are known limitations of this patch. It does not try to reimplement
    Placement's allocation candidate functionality. Therefore it cannot
    handle the situation when there is more than one RP in the compute
    tree which provides the required traits for a port. In this situation
    deciding which RP to use would require 1) the in_tree allocation
    candidate support from placement, which is not available yet, and
    2) information about which PCI PF the VF of an SRIOV port is allocated
    from and which RP represents that PCI device in placement. This
    information is only available on the compute hosts.

    For the unsupported cases the command will fail gracefully. As soon as
    migration support for such servers is implemented in the blueprint
    support-move-ops-with-qos-ports the admin can heal the allocation of
    such servers by migrating them.
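
The limitation above can be pictured with a small sketch: healing proceeds only when exactly one resource provider in the compute tree offers all traits the port requires, and fails cleanly otherwise. This is a simplification that matches on traits only; pick_provider, UnableToHealPort and the sample data are hypothetical names, not the nova implementation.

```python
class UnableToHealPort(Exception):
    """Raised when no unambiguous resource provider exists for a port."""


def pick_provider(port_id, required_traits, providers_in_tree):
    """Return the UUID of the single RP that has all required traits."""
    required = set(required_traits)
    matches = [
        provider["uuid"]
        for provider in providers_in_tree
        if required <= set(provider["traits"])
    ]
    if len(matches) != 1:
        # Zero matches: nothing can serve the port. Several matches:
        # choosing one would need placement's allocation candidate logic.
        raise UnableToHealPort(
            "port %s: expected exactly one matching provider, found %d"
            % (port_id, len(matches))
        )
    return matches[0]


# Hypothetical provider tree of one compute host.
tree = [
    {"uuid": "compute-rp-uuid", "traits": []},
    {"uuid": "bandwidth-rp-uuid",
     "traits": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"]},
]
chosen = pick_provider(
    "qos-port", ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"], tree)
print(chosen)
```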

    During healing both placement and neutron need to be updated. If either
    of those updates fails the code tries to roll back the previous updates
    for the instance to make sure that the healing can be re-run later
    without issue. However, if the rollback fails then the script will
    terminate with an error message pointing to documentation that describes
    how to recover from such a partially healed situation manually.

    Closes-Bug: #1819923
    Change-Id: I4b2b1688822eb2f0174df0c8c6c16d554781af85
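
The rollback behaviour described in the last paragraph of this commit message can be sketched as a list of (apply, undo) pairs. The helper below, the PartialHealError exception and the order of the updates (port first, placement second) are assumptions made for illustration; the commit message does not specify them.

```python
class PartialHealError(Exception):
    """Rollback failed; the instance is left partially healed."""


def apply_with_rollback(steps):
    """Run (apply, undo) pairs in order; undo completed ones on failure."""
    undo_stack = []
    for apply, undo in steps:
        try:
            apply()
        except Exception:
            for previous_undo in reversed(undo_stack):
                try:
                    previous_undo()
                except Exception as rollback_error:
                    raise PartialHealError(
                        "rollback failed, manual recovery is needed"
                    ) from rollback_error
            raise  # rollback succeeded, so report the original failure
        undo_stack.append(undo)


# Hypothetical scenario: the port update succeeds, the placement update fails.
port = {"binding:profile": {}}


def update_port():
    port["binding:profile"]["allocation"] = "hypothetical-rp-uuid"


def revert_port():
    port["binding:profile"].pop("allocation", None)


def update_placement():
    raise RuntimeError("placement update failed")


rolled_back = False
try:
    apply_with_rollback([(update_port, revert_port),
                         (update_placement, lambda: None)])
except RuntimeError:
    rolled_back = "allocation" not in port["binding:profile"]
print(rolled_back)
```

After a successful rollback the port is back in its original state, so the healing can simply be re-run. Only a failing undo step leaves the partially healed state that needs manual recovery.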

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/668925
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7a37940ce9645add3b976a3609014a2b3f8c7193
Submitter: Zuul
Branch: master

commit 7a37940ce9645add3b976a3609014a2b3f8c7193
Author: Balazs Gibizer <email address hidden>
Date: Wed Jul 3 16:35:33 2019 +0200

    Translatable output strings in heal allocation

    This patch fixes the issue with output string composition in nova-manage
    heal allocation that made translation of such strings nearly impossible.

    Change-Id: I0cb5424b0861536e6349c335fda301093d8651e8
    Related-Bug: #1819923

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/669879
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0044702e0dd51814be38818144c675cc8a7617ca
Submitter: Zuul
Branch: master

commit 0044702e0dd51814be38818144c675cc8a7617ca
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 9 16:05:31 2019 +0200

    Test heal port allocations in nova-next

    This patch extends the existing integration test for
    heal_allocations to test the recently implemented port
    allocation healing functionality.

    Change-Id: I993c9661c37da012cc975ee8c04daa0eb9216744
    Related-Bug: #1819923

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/692923

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/692923
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=06caa62441ea977a52a1ed631d8aa49a6a43d32c
Submitter: Zuul
Branch: stable/stein

commit 06caa62441ea977a52a1ed631d8aa49a6a43d32c
Author: Matt Riedemann <email address hidden>
Date: Thu Jun 27 13:24:33 2019 -0400

    Add integration testing for heal_allocations

    This adds a simple scenario for the heal_allocations CLI
    to the post_test_hook script run at the end of the nova-next
    job. The functional testing in-tree is pretty extensive but
    it's always good to have real integration testing.

    NOTE(melwitt): This is different than the Train change and does not
    use the --dry-run or --instance options because the following changes
    are not in Stein:

      Ide31957306602c1f306ebfa48d6e95f48b1e8ead
      Icf57f217f03ac52b1443addc34aa5128661a8554

    Change-Id: If86e4796a9db3020d4fdb751e8bc771c6f98aa47
    Related-Bug: #1819923
    (cherry picked from commit 87365c760ee89493db87183c4c751ba5fc7b921a)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Balazs Gibizer (<email address hidden>) on branch: master
Review: https://review.opendev.org/638207
Reason: Let's reactivate this if there is a report that confirms the need for such a cache
