Nova placement disregards nova aggregate metadata

Bug #1804125 reported by Jeff Albert on 2018-11-20
This bug affects 3 people
Affects                   Status     Importance  Assigned to  Milestone
OpenStack Compute (nova)  Won't Fix  Medium      Unassigned
placement-osc-plugin      Opinion    Wishlist    Unassigned

Bug Description

OpenStack 17.0.8 deployed via OpenStack-Ansible.

We have a compute environment set up with several nova aggregates, across which we wish to configure different CPU allocation ratios. Setting cpu_allocation_ratio as metadata on an aggregate and enabling the AggregateCoreFilter scheduler filter has no effect: instances fail to schedule past the default allocation ratio for a given compute node. Debug scheduler logs suggest that the ignored compute nodes never even make it into the scheduler filter list, and indeed aren't even listed in the host state updates that the scheduler logs.

It appears from this single comment I was able to find in another bug report that placement intentionally disregards aggregate metadata, and so excludes compute nodes that might well be capable of scheduling instances:

https://bugs.launchpad.net/nova/+bug/1742827/comments/13

Manually updating nova.conf on the affected compute nodes to set the intended cpu_allocation_ratio works, but means we need to set up an exception in our OSA config, and that hosts will not automatically obtain new resource allocation ratios when moved between aggregates.

Is this intended placement behavior? If so, what's the function of AggregateCoreFilter, and is there a way to restore its pre-placement functionality?

tags: added: placement
Matt Riedemann (mriedem) wrote :

It's a regression with placement being used by the nova scheduler since Ocata. There is a related nova blueprint in the stein release:

https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/initial-allocation-ratios.html

There is also this which builds on that blueprint but probably won't get done in stein since it's not being worked on: https://review.openstack.org/#/c/544683/

Here is a related ML thread about the regression:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/126283.html
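
To make the regression concrete, here is a minimal sketch of the two-stage flow as described in this report. It is a toy model, not nova or placement code: the Inventory class, the function names and the host data are all hypothetical, and the filter is a simplified stand-in for AggregateCoreFilter. The only thing it is meant to show is that a host rejected by the placement-side capacity check never reaches a filter that would have accepted it under the aggregate's ratio.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical model of one resource class on one resource provider."""
    total: int
    reserved: int
    allocation_ratio: float
    used: int = 0

    def capacity(self) -> float:
        # Capacity depends only on the inventory record, not on aggregates.
        return (self.total - self.reserved) * self.allocation_ratio

    def can_fit(self, requested: int) -> bool:
        return self.used + requested <= self.capacity()


def placement_candidates(inventories, requested_vcpus):
    """Stage 1: hosts that would be handed to the scheduler filters."""
    return [host for host, inv in inventories.items()
            if inv.can_fit(requested_vcpus)]


def aggregate_core_filter(hosts, inventories, aggregate_ratio, requested_vcpus):
    """Stage 2: simplified stand-in for AggregateCoreFilter."""
    passed = []
    for host in hosts:
        inv = inventories[host]
        limit = inv.total * aggregate_ratio
        if inv.used + requested_vcpus <= limit:
            passed.append(host)
    return passed


# Both hosts report a ratio of 1.0; the aggregate metadata asks for 4.0.
inventories = {
    "compute-1": Inventory(total=16, reserved=0, allocation_ratio=1.0, used=16),
    "compute-2": Inventory(total=16, reserved=0, allocation_ratio=1.0, used=8),
}
aggregate_metadata = {"cpu_allocation_ratio": 4.0}

candidates = placement_candidates(inventories, requested_vcpus=4)
survivors = aggregate_core_filter(
    candidates, inventories, aggregate_metadata["cpu_allocation_ratio"], 4)

print(candidates)  # compute-1 is already missing here
print(survivors)
```

In this model compute-1 would pass the filter if it were asked (20 VCPUs used against a limit of 64), but it is dropped in the first stage because its own inventory says it is full. That matches the symptom in the description, where the affected nodes do not show up in the scheduler's filter list at all.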

Matt Riedemann (mriedem) wrote :
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
Matt Riedemann (mriedem) wrote :

At the very least we should probably document the bug in the docs for the aggregate ram/core/disk filters.

Matt Riedemann (mriedem) wrote :

I guess there was a release note in queens: https://review.openstack.org/#/c/541018/ which was backported to pike and ocata, but the problem with release notes is they don't get noticed when you're past that old release and the issue is still not resolved...hence we need this in docs as well.

Jeff Albert (jralbert) wrote :

As an operator, I don't really have a strong opinion one way or the other on whether allocation ratios should be controlled per compute in config files or per aggregate in metadata; in general, I consider the aggregate metadata approach more agile and more in keeping with OpenStack's common pattern of API accessibility to most operations, but I can see an argument on each side.

What's more distressing is that this appears to have produced a schism between the intended, documented functions of Nova scheduler and the actual operation of those functions on several consecutive releases of OpenStack.

If the Aggregate* filters are no longer functional, and are no longer intended to be so, then I would think they should reasonably have been removed from the documentation and from the project so that deployers wouldn't expect to rely on them.

Related fix proposed to branch: master
Review: https://review.openstack.org/622588

Matt Riedemann (mriedem) wrote :

@Jeff, yeah it's gross, and it has taken way too long to deal with (granted, I don't think anyone noticed/appreciated this regression until ~queens, about a year after it happened).

There has been discussion about how to make the aggregate filters with the allocation_ratio metadata *work* again, discussed at the Dublin PTG:

https://etherpad.openstack.org/p/nova-ptg-rocky-placement ~L37

That solution never materialized though...

There is also this proposal:

https://review.openstack.org/#/c/544683/

Which would essentially mirror the allocation ratio metadata from the compute host aggregates API back to placement. If you have input on that spec please leave comments in the review - it's waylaid at this point.
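
As a rough illustration of what "mirroring" could mean, the sketch below works out which per-host allocation ratios would need to change so that placement inventories match the aggregate metadata. It is not the design from the spec under review, which is not reproduced here. The data shapes (plain dicts for aggregates and current ratios) and the function name are hypothetical, taking the minimum when a host sits in several aggregates is an assumption modelled on how the aggregate filters are documented to resolve conflicting values, and actually applying the plan to placement is deliberately left out.

```python
# Aggregate metadata keys used by the aggregate filters, mapped to the
# placement resource classes they would correspond to.
METADATA_TO_RESOURCE_CLASS = {
    "cpu_allocation_ratio": "VCPU",
    "ram_allocation_ratio": "MEMORY_MB",
    "disk_allocation_ratio": "DISK_GB",
}


def plan_ratio_mirroring(aggregates, current_ratios):
    """Return {host: {resource_class: ratio}} for ratios that must change."""
    wanted = {}
    for aggregate in aggregates:
        for key, resource_class in METADATA_TO_RESOURCE_CLASS.items():
            if key not in aggregate["metadata"]:
                continue
            # Aggregate metadata values are strings.
            ratio = float(aggregate["metadata"][key])
            for host in aggregate["hosts"]:
                per_host = wanted.setdefault(host, {})
                # Assumption: the smallest ratio wins across aggregates.
                per_host[resource_class] = min(
                    ratio, per_host.get(resource_class, ratio))

    plan = {}
    for host, ratios in wanted.items():
        existing = current_ratios.get(host, {})
        changes = {rc: r for rc, r in ratios.items() if existing.get(rc) != r}
        if changes:
            plan[host] = changes
    return plan


aggregates = [
    {"name": "high-density", "hosts": ["compute-1", "compute-2"],
     "metadata": {"cpu_allocation_ratio": "8.0"}},
    {"name": "general", "hosts": ["compute-2"],
     "metadata": {"cpu_allocation_ratio": "4.0", "ram_allocation_ratio": "1.5"}},
]
current_ratios = {
    "compute-1": {"VCPU": 16.0, "MEMORY_MB": 1.5},
    "compute-2": {"VCPU": 4.0, "MEMORY_MB": 1.0},
}

plan = plan_ratio_mirroring(aggregates, current_ratios)
print(plan)
```

Computing a plan first, rather than writing ratios directly, keeps the sketch free of any API calls and makes it easy to review what would change before touching real inventories.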

Changed in placement-osc-plugin:
status: New → Opinion
importance: Undecided → Wishlist
Matt Riedemann (mriedem) wrote :

Added osc-placement to this bug given the previous proposal to provide a helper CLI to mirror the allocation ratio aggregate metadata in placement.

Reviewed: https://review.openstack.org/620713
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d65c18a0a9f02e6d37f2b87ff61f1740c8bfc867
Submitter: Zuul
Branch: master

commit d65c18a0a9f02e6d37f2b87ff61f1740c8bfc867
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 28 17:07:11 2018 -0500

    Note the aggregate allocation ratio restriction in scheduler docs

    This borrows from the release note in change
    I01f20f275bbd5451ace5c1e6f41ab38d488dae4e to document the
    regression, introduced in Ocata, where allocation ratio settings
    in the aggregate core/ram/disk filters are not honored because
    of placement being used by the FilterScheduler.

    While there is related work going on around this in
    blueprint initial-allocation-ratios and
    blueprint placement-aggregate-allocation-ratios, it is still
    a limitation in the current code base and needs to be called
    out in the docs.

    Change-Id: Ifaf596a8572637f843f47daf5adce394b0365676
    Related-Bug: #1804125

Reviewed: https://review.openstack.org/622588
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3c50711249549a0abf84d0c0e26a3f3aae4928c4
Submitter: Zuul
Branch: master

commit 3c50711249549a0abf84d0c0e26a3f3aae4928c4
Author: Matt Riedemann <email address hidden>
Date: Tue Dec 4 15:51:51 2018 -0500

    Add docs for (initial) allocation ratio configuration

    This adds a new section to the admin scheduler configuration
    docs devoted to allocation ratios to call out the differences
    between the override config options and the initial ratio
    options, and how they interplay with the resource provider
    inventory allocation ratio override that can be performed
    via the placement REST API directly.

    This moves the note about bug 1804125 into the new section
    and also links to the docs from the initial allocation ratio
    config option help text.

    Part of blueprint initial-allocation-ratios
    Related-Bug: #1804125

    Change-Id: I7d8e822cd40dccaf5244e2cd95fa1af43fa9ed87
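
The interplay that commit message describes can be sketched as a small precedence rule. This is one reading of the linked allocation-ratio docs, not code from nova: the function and parameter names are hypothetical, and the numbers in the example are arbitrary. The assumed rule is that an explicitly configured override always wins, that otherwise a ratio already present in placement (for example one set through the placement REST API) is kept, and that the initial ratio is used only when no inventory exists yet.

```python
from typing import Optional


def effective_ratio(override: Optional[float],
                    initial: float,
                    existing: Optional[float]) -> float:
    """Pick the allocation ratio a compute node would report.

    override: value of an explicitly set *_allocation_ratio option, or None
    initial:  value of the matching initial_*_allocation_ratio option
    existing: ratio already stored in placement, or None for a new inventory
    """
    if override is not None:
        # An explicit override is reapplied on every update.
        return override
    if existing is not None:
        # No override: keep whatever is already in placement.
        return existing
    # Brand-new inventory: seed it with the initial ratio.
    return initial


# Override set in config: API-side changes do not stick.
print(effective_ratio(override=4.0, initial=16.0, existing=2.0))
# No override: a ratio set via the API is preserved.
print(effective_ratio(override=None, initial=16.0, existing=2.0))
# No override and no inventory yet: the initial ratio seeds the record.
print(effective_ratio(override=None, initial=16.0, existing=None))
```

Under this reading, an operator who wants to manage ratios through the API has to leave the override options unset, otherwise the compute node keeps resetting the value.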

Reviewed: https://review.openstack.org/623546
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=899b4ca5a248eba252d3c89e373f8a787d68749c
Submitter: Zuul
Branch: stable/rocky

commit 899b4ca5a248eba252d3c89e373f8a787d68749c
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 28 17:07:11 2018 -0500

    Note the aggregate allocation ratio restriction in scheduler docs

    This borrows from the release note in change
    I01f20f275bbd5451ace5c1e6f41ab38d488dae4e to document the
    regression, introduced in Ocata, where allocation ratio settings
    in the aggregate core/ram/disk filters are not honored because
    of placement being used by the FilterScheduler.

    While there is related work going on around this in
    blueprint initial-allocation-ratios and
    blueprint placement-aggregate-allocation-ratios, it is still
    a limitation in the current code base and needs to be called
    out in the docs.

    Change-Id: Ifaf596a8572637f843f47daf5adce394b0365676
    Related-Bug: #1804125
    (cherry picked from commit d65c18a0a9f02e6d37f2b87ff61f1740c8bfc867)

tags: added: in-stable-rocky

Reviewed: https://review.openstack.org/623547
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5c7638bab40bd6cfdbc199527d3a8c86bde40897
Submitter: Zuul
Branch: stable/queens

commit 5c7638bab40bd6cfdbc199527d3a8c86bde40897
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 28 17:07:11 2018 -0500

    Note the aggregate allocation ratio restriction in scheduler docs

    This borrows from the release note in change
    I01f20f275bbd5451ace5c1e6f41ab38d488dae4e to document the
    regression, introduced in Ocata, where allocation ratio settings
    in the aggregate core/ram/disk filters are not honored because
    of placement being used by the FilterScheduler.

    While there is related work going on around this in
    blueprint initial-allocation-ratios and
    blueprint placement-aggregate-allocation-ratios, it is still
    a limitation in the current code base and needs to be called
    out in the docs.

    Change-Id: Ifaf596a8572637f843f47daf5adce394b0365676
    Related-Bug: #1804125
    (cherry picked from commit d65c18a0a9f02e6d37f2b87ff61f1740c8bfc867)
    (cherry picked from commit 899b4ca5a248eba252d3c89e373f8a787d68749c)

tags: added: in-stable-queens

Reviewed: https://review.openstack.org/623552
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0cb9767e90c0e1121d4e7239a60340ce6f88e413
Submitter: Zuul
Branch: stable/pike

commit 0cb9767e90c0e1121d4e7239a60340ce6f88e413
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 28 17:07:11 2018 -0500

    Note the aggregate allocation ratio restriction in scheduler docs

    This borrows from the release note in change
    I01f20f275bbd5451ace5c1e6f41ab38d488dae4e to document the
    regression, introduced in Ocata, where allocation ratio settings
    in the aggregate core/ram/disk filters are not honored because
    of placement being used by the FilterScheduler.

    While there is related work going on around this in
    blueprint initial-allocation-ratios and
    blueprint placement-aggregate-allocation-ratios, it is still
    a limitation in the current code base and needs to be called
    out in the docs.

    Change-Id: Ifaf596a8572637f843f47daf5adce394b0365676
    Related-Bug: #1804125
    (cherry picked from commit d65c18a0a9f02e6d37f2b87ff61f1740c8bfc867)
    (cherry picked from commit 899b4ca5a248eba252d3c89e373f8a787d68749c)
    (cherry picked from commit 5c7638bab40bd6cfdbc199527d3a8c86bde40897)

tags: added: in-stable-pike
Alvaro Uria (aluria) on 2019-03-26
tags: added: canonical-bootstack
Chris Dent (cdent) wrote :

https://review.openstack.org/#/c/640898/ is a related osc-placement change.

Matt Riedemann (mriedem) wrote :

I think at this point we can mark the nova portion of this as won't fix with these changes to manage allocation ratios:

https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#allocation-ratios

The workaround outside of that is to rely on the osc-placement patch linked in comment 19.

Changed in nova:
status: Triaged → Won't Fix

Reviewed: https://review.opendev.org/673496
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=588194d785b1bb52e53ed159c5d8645bf3a28b7d
Submitter: Zuul
Branch: master

commit 588194d785b1bb52e53ed159c5d8645bf3a28b7d
Author: Sean Mooney <email address hidden>
Date: Tue Jul 30 12:18:13 2019 +0000

    Deprecate Aggregate[Core|Ram|Disk]Filters

    The Aggregate[Core|Ram|Disk]Filters have not worked
    correctly since ocata, this change deprecates them
    for removal next cycle.
    http://lists.openstack.org/pipermail/openstack-dev/2018-January/126283.html

    Related-Bug: #1804125
    Change-Id: Ibfbfdae9e6ec93f772631a84e8969f4e11da8aee

Reviewed: https://review.opendev.org/677472
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=77efc084fc50d38d2a1b42aca864a2a56eae6eba
Submitter: Zuul
Branch: master

commit 77efc084fc50d38d2a1b42aca864a2a56eae6eba
Author: Matt Riedemann <email address hidden>
Date: Tue Aug 20 10:07:38 2019 -0400

    doc: remove confusing docs about aggregate allocation ratios

    Change Ifaf596a8572637f843f47daf5adce394b0365676 added a note
    about the behavior change in Ocata where allocation ratios
    set on host aggregates was ignored because of placement resource
    provider allocation ratios being used.

    Later, change I7d8e822cd40dccaf5244e2cd95fa1af43fa9ed87 added
    a lot more detail about allocation ratios in the scheduler docs
    including the initial* allocation ratio config options. The note
    from the previous change was moved and as a result leads to some
    confusion since the doc starts by saying essentially, "you can
    use these aggregate filters to manage allocation ratios on a set
    of hosts" and then the immediate note says essentially, "oh btw
    that doesn't work since ocata, sorry!".

    To avoid the confusion, this simply removes the part about how
    the aggregate filters can be used to manage allocation ratios.

    Change-Id: I62710b0b8c098cca3f67020f4a6da5e684115414
    Related-Bug: #1804125

Reviewed: https://review.opendev.org/678254
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e5b304af45423e61d5d591e5deb81d3848df1798
Submitter: Zuul
Branch: stable/stein

commit e5b304af45423e61d5d591e5deb81d3848df1798
Author: Matt Riedemann <email address hidden>
Date: Tue Aug 20 10:07:38 2019 -0400

    doc: remove confusing docs about aggregate allocation ratios

    Change Ifaf596a8572637f843f47daf5adce394b0365676 added a note
    about the behavior change in Ocata where allocation ratios
    set on host aggregates was ignored because of placement resource
    provider allocation ratios being used.

    Later, change I7d8e822cd40dccaf5244e2cd95fa1af43fa9ed87 added
    a lot more detail about allocation ratios in the scheduler docs
    including the initial* allocation ratio config options. The note
    from the previous change was moved and as a result leads to some
    confusion since the doc starts by saying essentially, "you can
    use these aggregate filters to manage allocation ratios on a set
    of hosts" and then the immediate note says essentially, "oh btw
    that doesn't work since ocata, sorry!".

    To avoid the confusion, this simply removes the part about how
    the aggregate filters can be used to manage allocation ratios.

    Conflicts:
          doc/source/admin/configuration/schedulers.rst

    NOTE(mriedem): The conflict is due to not having change
    I8a0d332877fbb9794700081e7954f2501b7e7c09 in Stein.

    Change-Id: I62710b0b8c098cca3f67020f4a6da5e684115414
    Related-Bug: #1804125
    (cherry picked from commit 77efc084fc50d38d2a1b42aca864a2a56eae6eba)

tags: added: in-stable-stein