ImagePropertiesFilter: hypervisor_type matchmaking not compliant with documentation

Bug #1837756 reported by massimo.sgaravatto on 2019-07-24
This bug affects 1 person
Affects                     Importance   Assigned to
OpenStack Compute (nova)    High         Matt Riedemann
Rocky                       High         Matt Riedemann
Stein                       High         Matt Riedemann

Bug Description

All compute nodes of my cloud are configured with [libvirt]/virt_type=kvm.

Please note that this is not exposed in the 'openstack hypervisor list --long' output which reports QEMU as 'Hypervisor Type'.

I have some images with the following property:

hypervisor_type='qemu'

Scheduling instances from these images worked without problems in Ocata.
After updating to Rocky, the ImagePropertiesFilter filters out all the compute nodes, because they offer 'kvm' while the images request 'qemu'.

This is not compliant with the documentation.
Both the Nova (https://docs.openstack.org/nova/rocky/admin/configuration/schedulers.html) and Glance (https://docs.openstack.org/glance/latest/admin/useful-image-properties.html) documentation say that qemu is supposed to be used for both QEMU and KVM hypervisor types.
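
To make the mismatch concrete, here is a minimal sketch of the comparison as described in this report. It is a toy model, not nova's filter code: the function name `host_passes` and the two sets of offered types are assumptions chosen for illustration.

```python
# Toy model of the hypervisor_type check; NOT nova's actual filter code.
# The function name and the example sets below are hypothetical.

def host_passes(image_hypervisor_type, host_hypervisor_types):
    """Return True if the host offers the hypervisor type the image requests."""
    if image_hypervisor_type is None:
        # An image without a hypervisor_type property does not restrict hosts.
        return True
    return image_hypervisor_type.lower() in host_hypervisor_types


# Assumed set for a virt_type=kvm host that behaves as documented:
# an image with hypervisor_type=qemu is accepted.
documented = {"qemu", "kvm"}
# Behavior described in this report: the host offers only 'kvm'.
reported = {"kvm"}

print(host_passes("qemu", documented))  # True
print(host_passes("qemu", reported))    # False
```

With the second set, every image carrying hypervisor_type='qemu' is rejected by every kvm-configured host, which is the symptom seen after the upgrade.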

The issue is discussed in:

http://lists.openstack.org/pipermail/openstack-discuss/2019-July/thread.html#7842

where it is reported that the new behavior is likely caused by this change:

https://review.opendev.org/531347

summary: - ImagePropertiesFilter scheduling problems for images with
- hypervisor_type='qemu' when virt_type is "kvm"
+ ImagePropertiesFilter: hypervisor_type matchmaking not compliant with
+ documentation

Fix proposed to branch: master
Review: https://review.opendev.org/672559

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: New → In Progress
Matt Riedemann (mriedem) on 2019-07-24
Changed in nova:
importance: Undecided → High
tags: added: libvirt scheduler upgrade

Reviewed: https://review.opendev.org/672559
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=743dc083bb5628554d9abfa82665738233ed47e9
Submitter: Zuul
Branch: master

commit 743dc083bb5628554d9abfa82665738233ed47e9
Author: Matt Riedemann <email address hidden>
Date: Wed Jul 24 16:22:52 2019 +0000

    Revert "[libvirt] Filter hypervisor_type by virt_type"

    This reverts commit eaa766ee2093c24fd61c61e52f46bdd9ff9e93d2.

    The change regressed the behavior of the ImagePropertiesFilter
    because existing images with hypervisor_type=QEMU, which would
    match what is reported for hypervisor_type in the API for both
    qemu/kvm virt_type nodes, will now get filtered out for hosts
    where the configured virt_type is kvm.

    Note that both the ImagePropertiesFilter docs [1] and
    hypervisor_type image property docs [2] mention that for both
    qemu and kvm nodes the value to use is qemu since that is the
    actual hypervisor.

    Presumably the change was made for a deployment with some
    hosts configured with virt_type=qemu and other hosts configured
    with virt_type=kvm and there were separate images with
    hypervisor_type=qemu and hypervisor_type=kvm to match those hosts
    for scheduler filter, but as noted this was a regression in
    behavior for something that could have been achieved using
    host aggregates and the AggregateImagePropertiesIsolation filter.

    We could even use traits and a placement request pre-filter these
    days for a more modern approach.

    Also, since the API continues to report hypervisor_type=QEMU it's
    doubly confusing for operators to have to configure their images
    to use hypervisor_type=kvm (despite the docs).

    And finally, any existing instances which have hypervisor_type=qemu
    embedded in their RequestSpec can no longer be migrated to kvm
    hosts without manually fixing the entries in the request_specs
    table in the API DB.

    Note that this is not a clean revert because of change
    I5d95bd50279a6bf903a5793ad5f3ae9d06f085f4 made in Stein.

    [1] https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#imagepropertiesfilter
    [2] https://docs.openstack.org/glance/latest/admin/useful-image-properties.html

    Change-Id: I7d761dc269f8c12c4a76ba14201ccdd82a04d01d
    Closes-Bug: #1837756
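
The commit message above points to host aggregates and the AggregateImagePropertiesIsolation filter as a way to separate qemu and kvm hosts. The sketch below is a simplified model of that idea, not nova's implementation: the function name `aggregate_host_passes` and the example metadata are assumptions, and details such as property namespaces are left out.

```python
# Simplified model of aggregate-based image isolation; NOT nova's code.
# All names below are hypothetical.

def aggregate_host_passes(image_props, aggregate_metadata):
    """Check image properties against a host's merged aggregate metadata.

    aggregate_metadata maps each key to the set of values allowed by the
    aggregates the host belongs to.
    """
    for key, allowed_values in aggregate_metadata.items():
        requested = image_props.get(key)
        # Only properties the image actually sets can exclude a host.
        if requested is not None and requested not in allowed_values:
            return False
    return True


# Hypothetical aggregates: one for kvm hosts, one for qemu hosts.
kvm_hosts = {"hypervisor_type": {"kvm"}}
qemu_hosts = {"hypervisor_type": {"qemu"}}

print(aggregate_host_passes({"hypervisor_type": "kvm"}, kvm_hosts))   # True
print(aggregate_host_passes({"hypervisor_type": "kvm"}, qemu_hosts))  # False
print(aggregate_host_passes({}, qemu_hosts))                          # True
```

In this model the separation is driven by metadata the operator sets on each aggregate, so it does not depend on which hypervisor type the compute driver reports.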

Changed in nova:
status: In Progress → Fix Released

I tried the fix on some compute nodes of my Rocky cloud and it works, both for new instances and for migration of previously created VMs. Thanks!

Reviewed: https://review.opendev.org/672723
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=45f290e36c797a271a20a3055ee866fa5b7997a8
Submitter: Zuul
Branch: stable/stein

commit 45f290e36c797a271a20a3055ee866fa5b7997a8
Author: Matt Riedemann <email address hidden>
Date: Wed Jul 24 16:22:52 2019 +0000

    Revert "[libvirt] Filter hypervisor_type by virt_type"

    Change-Id: I7d761dc269f8c12c4a76ba14201ccdd82a04d01d
    Closes-Bug: #1837756
    (cherry picked from commit 743dc083bb5628554d9abfa82665738233ed47e9)

Reviewed: https://review.opendev.org/672747
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5f0497e5952eab583346c96b56e4110041ab4f27
Submitter: Zuul
Branch: stable/rocky

commit 5f0497e5952eab583346c96b56e4110041ab4f27
Author: Matt Riedemann <email address hidden>
Date: Wed Jul 24 16:22:52 2019 +0000

    Revert "[libvirt] Filter hypervisor_type by virt_type"

    Change-Id: I7d761dc269f8c12c4a76ba14201ccdd82a04d01d
    Closes-Bug: #1837756
    (cherry picked from commit 743dc083bb5628554d9abfa82665738233ed47e9)
    (cherry picked from commit 45f290e36c797a271a20a3055ee866fa5b7997a8)

This issue was fixed in the openstack/nova 19.0.2 release.

This issue was fixed in the openstack/nova 18.2.2 release.
