Hardcoded choices for nova scheduler driver

Bug #1704788 reported by Masha Atakova on 2017-07-17
This bug affects 2 people
Affects                   Importance  Assigned to
OpenStack Compute (nova)  High        Matt Riedemann
Ocata                     High        Matt Riedemann

Bug Description

Hi everyone,

There's a driver option in nova.conf which is parsed with configuration from here:

https://github.com/openstack/nova/blob/stable/ocata/nova/conf/scheduler.py#L58

The hardcoded list of choices for that option, specified on lines 60 and 61, prevents the nova scheduler from accepting any custom scheduler driver.
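To illustrate the effect being reported, here is a minimal sketch that uses the standard library's argparse, whose "choices" keyword behaves similarly to the "choices" argument on the config option. It is not Nova's or oslo.config's code; the driver names in IN_TREE_DRIVERS and the helper names are assumptions made for the example.

```python
import argparse

# Assumed in-tree driver names, for illustration only (not copied from the linked file).
IN_TREE_DRIVERS = ("filter_scheduler", "caching_scheduler", "chance_scheduler")


class QuietParser(argparse.ArgumentParser):
    """Raise ValueError instead of printing usage and exiting on bad input."""

    def error(self, message):
        raise ValueError(message)


def parse_driver(value, restrict_choices):
    """Return the accepted driver name, or None if the option rejects it."""
    parser = QuietParser(prog="scheduler-demo")
    # With restrict_choices=True only the hardcoded names are accepted.
    extra = {"choices": IN_TREE_DRIVERS} if restrict_choices else {}
    parser.add_argument("--driver", default="filter_scheduler", **extra)
    try:
        return parser.parse_args(["--driver", value]).driver
    except ValueError:
        return None


if __name__ == "__main__":
    # A custom name is rejected while the choices list is in place...
    print(parse_driver("my_scheduler", restrict_choices=True))
    # ...and accepted once the restriction is dropped.
    print(parse_driver("my_scheduler", restrict_choices=False))
```

With the restriction in place, any name outside the hardcoded tuple is refused before the driver could even be looked up, which matches the behaviour described above.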

Is this intentional, and is there another way to plug in a scheduler driver? Or is it just a mistake?

Thanks for your attention.

Matt Riedemann (mriedem) wrote :

This was introduced in this change in Ocata:

https://review.openstack.org/#/c/349666/

There should have been a release note with that, and this part of the help text was wrong after that change:

"A custom scheduler driver. In this case, you will be responsible for
   creating and maintaining the entry point in your 'setup.cfg' file"

There was an older change made in Newton to stop allowing classloading of the scheduler driver and instead rely on stevedore:

https://review.openstack.org/#/c/254768/

But that meant you could still register your own "nova.scheduler.driver" entry point in setup.cfg to load a custom scheduler driver. That ability was then broken by the use of "choices" for the config option in https://review.openstack.org/#/c/349666/, without warning or a deprecation period for custom scheduler driver entry points. So yes, this seems like a bug: the removal should at least have been communicated better beforehand.
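As a rough sketch of what such an out-of-tree entry point declaration looks like, the snippet below parses a setup.cfg fragment with the standard library and lists the names registered under "nova.scheduler.driver". The package "my_package", the class "MyScheduler" and the entry point name "my_scheduler" are hypothetical. In Nova the actual loading is done by stevedore; this example only reads the declaration.

```python
import configparser

# Hypothetical setup.cfg fragment of an out-of-tree package.
SETUP_CFG = """
[entry_points]
nova.scheduler.driver =
    my_scheduler = my_package.scheduler:MyScheduler
"""


def scheduler_entry_points(setup_cfg_text):
    """Map entry point names to (module, attribute) pairs."""
    parser = configparser.ConfigParser()
    parser.read_string(setup_cfg_text)
    raw = parser.get("entry_points", "nova.scheduler.driver")
    found = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the empty first line of the multi-line value
        name, _, target = line.partition("=")
        module, _, attribute = target.strip().partition(":")
        found[name.strip()] = (module, attribute)
    return found


if __name__ == "__main__":
    print(scheduler_entry_points(SETUP_CFG))
```

The name on the left-hand side ("my_scheduler" here) is the string an operator would put in the [scheduler]/driver option, which is why a fixed "choices" list on that option blocks it.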

Note, however, that Nova has for a while been removing the ability to load out-of-tree drivers and manager classes, so any fix here is still likely to deprecate out-of-tree loading.

Changed in nova:
status: New → Confirmed

Sylvain Bauza (sylvain-bauza) wrote :

Agreed with Matt that this is a communication problem.
The Ocata consensus was that we would stop supporting custom drivers for the scheduler as well, while still accepting custom filters (so people wanting a specific driver could just create a filter for the FilterScheduler that calls the third-party driver).
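The filter-based workaround mentioned above could look roughly like the following sketch. It does not import Nova: DelegatingFilter only mimics the shape of a host filter with a host_passes(host_state, spec_obj) method, and ExternalEngine, its accepts() method and the dict-based host records are hypothetical stand-ins for third-party logic.

```python
class ExternalEngine:
    """Hypothetical third-party scheduling logic."""

    def __init__(self, allowed_hosts):
        self.allowed_hosts = set(allowed_hosts)

    def accepts(self, host_name, request):
        # A real engine would inspect the request; this one checks a whitelist.
        return host_name in self.allowed_hosts


class DelegatingFilter:
    """Filter-shaped wrapper that forwards each decision to the engine."""

    def __init__(self, engine):
        self.engine = engine

    def host_passes(self, host_state, spec_obj):
        return self.engine.accepts(host_state["host"], spec_obj)


def filter_hosts(hosts, spec_obj, host_filter):
    """Keep only the hosts that the filter lets through."""
    return [h for h in hosts if host_filter.host_passes(h, spec_obj)]


if __name__ == "__main__":
    hosts = [{"host": "node-a"}, {"host": "node-b"}]
    chosen = filter_hosts(hosts, {"ram_mb": 512}, DelegatingFilter(ExternalEngine(["node-b"])))
    print(chosen)
```

The trade-off is that the external logic only gets to veto hosts one at a time; it cannot change how the scheduler weighs or orders the remaining candidates.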

That said, what could we do? I'm not sure that changing the [scheduler]/driver option to accept any string instead of a fixed set of choices (the string being the entry point name in setup.cfg) would be acceptable, because it would go against the consensus we had in Ocata.

The real problem is that we previously let people replace our scheduler drivers without asking them why they don't use them. Is it because something that needs to be verified is missing? Is it because they want a different scheduler behaviour (for example, a shared-state scheduler or a two-level scheduler like Mesos)?

Now that we have Placement in Ocata to make sure that the hosts checked by the filters can all accept the resource requests for RAM, disk and CPU, I still think we can explicitly ask people who want a totally separate scheduler logic to discuss with us why they need it. I honestly think there is room for improvement around custom logic.

Matt Riedemann (mriedem) wrote :

"because that would mean we would go against the consensus we had in Ocata."

Where was this consensus documented and communicated? If it was just IRC or a comment in a patch, but didn't have a deprecation period or at least a release note to communicate it, the consensus is invalid, IMO.

Matt Riedemann (mriedem) on 2017-07-18
tags: added: pike-rc-potential

Fix proposed to branch: master
Review: https://review.openstack.org/484828

Changed in nova:
assignee: nobody → Sylvain Bauza (sylvain-bauza)
status: Confirmed → In Progress

Matt Riedemann (mriedem) on 2017-07-18
Changed in nova:
importance: Undecided → High
Changed in nova:
assignee: Sylvain Bauza (sylvain-bauza) → Matt Riedemann (mriedem)

Reviewed: https://review.openstack.org/484828
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1e5c7b52a403e708dba5a069dd86b628a4cb952c
Submitter: Jenkins
Branch: master

commit 1e5c7b52a403e708dba5a069dd86b628a4cb952c
Author: Sylvain Bauza <email address hidden>
Date: Tue Jul 18 16:33:49 2017 +0200

    Accept any scheduler driver entrypoint

    We broke the possibility in Ocata with Icdcf839b6d28893694bfa1355e9dbe8dbb5ea8c3
    to use other scheduler drivers but the ones we provided in tree.

    Unfortunately, that was an incidental change without any communication.

    Removing the choices kwarg will allow operators to run their own scheduler driver.
    Whether Nova would stop supporting custom drivers would require a totally separate
    change which would clearly communicate thru a deprecation notice but that is not
    the intent for that bugfix, which aims only to bring back the capability.

    Change-Id: I346881bc3bc48794b139cc471be1de11c49b8ee3
    Closes-Bug: #1704788

Changed in nova:
status: In Progress → Fix Released

Matt Riedemann (mriedem) on 2017-08-03
tags: removed: pike-rc-potential

This issue was fixed in the openstack/nova 16.0.0.0rc1 release candidate.

Reviewed: https://review.openstack.org/490110
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3eefcce2553856a1635a62fecf73cd9a2d9097cb
Submitter: Jenkins
Branch: stable/ocata

commit 3eefcce2553856a1635a62fecf73cd9a2d9097cb
Author: Sylvain Bauza <email address hidden>
Date: Tue Jul 18 16:33:49 2017 +0200

    Accept any scheduler driver entrypoint

    We broke the possibility in Ocata with Icdcf839b6d28893694bfa1355e9dbe8dbb5ea8c3
    to use other scheduler drivers but the ones we provided in tree.

    Unfortunately, that was an incidental change without any communication.

    Removing the choices kwarg will allow operators to run their own scheduler driver.
    Whether Nova would stop supporting custom drivers would require a totally separate
    change which would clearly communicate thru a deprecation notice but that is not
    the intent for that bugfix, which aims only to bring back the capability.

    Change-Id: I346881bc3bc48794b139cc471be1de11c49b8ee3
    Closes-Bug: #1704788
    (cherry picked from commit 1e5c7b52a403e708dba5a069dd86b628a4cb952c)

This issue was fixed in the openstack/nova 15.0.7 release.
