Don't assume the guest machine type to be of 'pc'

Bug #1780138 reported by Kashyap Chamarthy
This bug affects 1 person
Affects:     OpenStack Compute (nova)
Status:      Confirmed
Importance:  Medium
Assigned to: Unassigned

Bug Description

Background
----------

QEMU supports two main variants of "machine type" (a virtual chipset)
for x86 hosts: (a) 'pc', which corresponds to Intel's 'i440FX' chipset,
is 22 years old, and is based on PCI and IDE; and (b) 'q35', which
corresponds to Intel's 82Q35 chipset and is relatively modern, though
still 11 years old. (For AArch64 hosts, the machine type is called
'virt'.)

The 'q35' machine type provides some advanced features by default:
native PCIe hotplug (which is faster than the ACPI-based hotplug that
the older 'pc' machine type uses), IOMMU, faster SATA emulation, Secure
Boot and so forth. (Details: https://wiki.qemu.org/images/4/4e/Q35.pdf)

Proposed change
---------------

QEMU plans to change the default machine type to 'q35', so that they can
get rid of the legacy machine type 'pc'. Nova should be prepared to not
break when that happens. (Refer to the "What will break?" section below.)

How does Nova handle machine types today?
-----------------------------------------

By default, Nova does not hard-code any machine type for x86 arch; but for non-x86 architectures, Nova *does* assume the machine type to be of 'pc' — which is the bug we're aiming to fix here.

(NB: For x86, Nova just uses whatever libvirt provides by default. This is not desirable in the long term; a separate blueprint / specification will address it by picking the guest machine type based on its capabilities.)

Nova allows configuring machine type in two ways:

  (1) Disk image metadata property, so that when you boot a guest from
      that disk image, it gets the configured machine type:

        $ openstack image set \
            --property hw_machine_type=x86_64=pc-i440fx-2.9 Fedora-28-Template

  (2) Per-Compute host configuration file, so that _all_ guests launched
      on that host get the configured machine type:

       [libvirt]
       ...
       hw_machine_type=x86_64=q35
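
The two settings above can be combined into a single lookup. The following is a minimal sketch, not Nova's actual code: the function names parse_hw_machine_type and resolve_machine_type are hypothetical, both settings are assumed to use the "arch=machine_type" form shown above, and the precedence (image property first, then host configuration) is an assumption made for illustration.

```python
from typing import Dict, Iterable, Optional


def parse_hw_machine_type(entries: Iterable[str]) -> Dict[str, str]:
    """Turn ['x86_64=q35', 'aarch64=virt'] into {'x86_64': 'q35', ...}."""
    mapping = {}
    for entry in entries:
        arch, sep, machine_type = entry.partition('=')
        if not sep or not arch.strip() or not machine_type.strip():
            raise ValueError('expected arch=machine_type, got %r' % entry)
        mapping[arch.strip()] = machine_type.strip()
    return mapping


def resolve_machine_type(arch: str,
                         image_entries: Iterable[str],
                         host_entries: Iterable[str]) -> Optional[str]:
    """Return the configured machine type for arch, or None if unset."""
    # Assumed precedence: the image property wins over the host setting.
    for entries in (image_entries, host_entries):
        machine_type = parse_hw_machine_type(entries).get(arch)
        if machine_type is not None:
            return machine_type
    # Nothing configured: the caller must not treat this as 'pc'.
    return None


if __name__ == '__main__':
    print(resolve_machine_type('x86_64',
                               ['x86_64=pc-i440fx-2.9'],
                               ['x86_64=q35']))
    print(resolve_machine_type('x86_64', [], ['x86_64=q35']))
```

A None result here corresponds to the case this bug is about: no machine type was configured, so the effective type is whatever libvirt and QEMU choose.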

What will break?
----------------

From a discussion with libvirt and QEMU developers (thanks: Eduardo
Habkost, Daniel Berrangé), management applications like Nova will
break _only_ if we have a code pattern like:

   if guest_machine_type == 'q35':
       ... do something 'q35' related ...
   else:
       ... do something 'pc' related ...

This pattern assumes that not providing a machine type will result in
'pc', so we should avoid it.

Auditing the Nova code[*], we have precisely the above pattern when
configuring PCIe ports (from nova/virt/libvirt/driver.py,
_get_guest_config() function):

        [...]
        # Add PCIe root port controllers for PCI Express machines
        # but only if their amount is configured
        if (CONF.libvirt.num_pcie_ports and
                ((caps.host.cpu.arch == fields.Architecture.AARCH64 and
                guest.os_mach_type.startswith('virt')) or
                (caps.host.cpu.arch == fields.Architecture.X86_64 and
                guest.os_mach_type is not None and
                'q35' in guest.os_mach_type))):
            self._guest_add_pcie_root_ports(guest)
        [...]

The above code assumes that when 'guest.os_mach_type' is None, you have
the 'pc' machine type -- which is _not_ going to be valid in the
future.

To fix this, Nova needs to make sure 'guest.os_mach_type' is always set.

[*] http://git.openstack.org/cgit/openstack/nova/commit/?id=a234bbf8 --
    Allow to configure amount of PCIe ports
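
One way to avoid the implicit 'pc' assumption in the PCIe check above is to refuse to decide when the machine type is unset. This is a minimal sketch under stated assumptions, not a proposed Nova patch: needs_pcie_root_ports is a hypothetical helper, the architecture names are plain strings rather than Nova's fields.Architecture constants, and raising ValueError is just one possible policy.

```python
from typing import Optional


def needs_pcie_root_ports(arch: str, mach_type: Optional[str],
                          num_pcie_ports: int) -> bool:
    """Decide whether PCIe root port controllers should be added."""
    if mach_type is None:
        # Refuse to guess: an unset machine type no longer implies 'pc'.
        raise ValueError('machine type must be set explicitly for %s' % arch)
    if not num_pcie_ports:
        # Ports are only added when their number is configured.
        return False
    if arch == 'aarch64':
        return mach_type.startswith('virt')
    if arch == 'x86_64':
        return 'q35' in mach_type
    return False


if __name__ == '__main__':
    print(needs_pcie_root_ports('x86_64', 'pc-q35-2.11', 4))
    print(needs_pcie_root_ports('x86_64', 'pc-i440fx-2.9', 4))
```

With a check like this, a caller that forgets to resolve the machine type fails loudly instead of silently taking the 'pc' branch.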

tags: added: libvirt
description: updated
melanie witt (melwitt)
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Changed in nova:
assignee: nobody → Kashyap Chamarthy (kashyapc)
summary: - Gracefully handle when QEMU switches its default machine to 'q35'
+ Gracefully handle when QEMU switches its default machine type to 'q35'
description: updated
summary: - Gracefully handle when QEMU switches its default machine type to 'q35'
+ Don't assume the guest machine type to be of 'pc'
Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Related to this problem, upstream libvirt has merged this patch:

https://www.redhat.com/archives/libvir-list/2018-August/msg00135.html -- "qemu: ensure
"pc" machine is always used as default if available"

Kashyap Chamarthy (kashyapc) wrote :

To expand on comment#1: from libvirt v4.7.0 onwards, thanks to this
libvirt commit[*], libvirt will ensure that the default machine type for
"x86" stays 'pc', *if* it is available. So the assumption we have today
in Nova's libvirt driver (that "guest.os_mach_type is None" means you
have the 'pc' machine type) will hold. However, relying on such
assumptions is still not good.

Nova needs to explicitly configure a machine type for "x86", either by
using libosinfo (/me needs to learn how to use this API), which
recommends a machine type based on what is supported by the QEMU on your
host; or by getting the default machine type from libvirt (from its
getCapabilities() API).
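
The second option could look roughly like the sketch below, which reads a machine type out of capabilities XML of the kind getCapabilities() returns. Several things here are assumptions: SAMPLE_CAPS is a hand-written, heavily trimmed example rather than real output; default_machine_type is a hypothetical helper; and treating the first <machine> element listed for an architecture as the default is an assumed convention that should be checked against the libvirt documentation.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hand-written, trimmed sample; real capabilities XML has far more content.
SAMPLE_CAPS = """
<capabilities>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <machine canonical='pc-i440fx-2.11' maxCpus='255'>pc</machine>
      <machine maxCpus='255'>pc-i440fx-2.11</machine>
      <machine canonical='pc-q35-2.11' maxCpus='288'>q35</machine>
      <machine maxCpus='288'>pc-q35-2.11</machine>
    </arch>
  </guest>
</capabilities>
"""


def default_machine_type(caps_xml: str, arch: str) -> Optional[str]:
    """Return the first machine type listed for arch, or None."""
    root = ET.fromstring(caps_xml)
    for arch_el in root.findall('./guest/arch'):
        if arch_el.get('name') != arch:
            continue
        # Assumption: the first <machine> listed is the default.
        first = arch_el.find('machine')
        if first is None:
            return None
        # Prefer the versioned name behind an alias such as 'pc'.
        return first.get('canonical') or first.text
    return None


if __name__ == '__main__':
    print(default_machine_type(SAMPLE_CAPS, 'x86_64'))
```

Recording the versioned name rather than the 'pc' alias means the stored value does not change meaning if the alias later points at a newer machine type.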

[*] https://libvirt.org/git/?p=libvirt.git;a=commit;h=26cfb1a3 -- "qemu:
ensure default machine types don't change if QEMU changes"

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/663677

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/663011
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=527c452a6fc05a316eb8121796f761af97522772
Submitter: Zuul
Branch: master

commit 527c452a6fc05a316eb8121796f761af97522772
Author: Lee Yarwood <email address hidden>
Date: Tue Jun 4 11:57:18 2019 +0100

    libvirt: Use SATA bus for cdrom devices when using Q35 machine type

    The Q35 machine type no longer provides an IDE bus and will need to use
    a SATA bus to attach legacy devices such as cdroms. More details can be
    found in the following related bug:

    Don't assume the guest machine type to be of 'pc'
    https://bugs.launchpad.net/nova/+bug/1780138

    This change now ensures the blockinfo.get_disk_bus_for_device_type
    method will now return "sata" as the bus type when the Q35 machine type
    is used for cdrom devices on QEMU or KVM hosts that are not PPC, S390 or
    AArch64 based.

    To enable this the _get_machine_type method has been extracted from the
    Libvirt driver into the Libvirt utils module. This method has also been
    simplified through the removal of the caps parameter, replaced with
    calls to the get_arch utility method and additional extraction of
    architecture specific defaults into the existing
    get_default_machine_type utility method.

    Related-bug: 1780138
    Closes-bug: 1831538
    Change-Id: Id97f4baddcf2caff91599773d9b5de5181b7fdf6
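
The bus selection described in this commit message can be sketched roughly as follows. This is not the code from the commit: cdrom_bus_hint and NON_SATA_ARCHES are hypothetical names, the architecture strings are assumed spellings, and a None result only means that this particular rule does not apply and some other default has to be chosen.

```python
from typing import Optional

# Architectures the commit message excludes from the SATA rule
# (the exact strings are assumptions made for this sketch).
NON_SATA_ARCHES = ('ppc', 'ppc64', 'ppc64le', 's390', 's390x', 'aarch64')


def cdrom_bus_hint(virt_type: str, arch: str,
                   machine_type: Optional[str]) -> Optional[str]:
    """Return 'sata' for cdroms on Q35 guests, else None (no opinion)."""
    if virt_type not in ('qemu', 'kvm'):
        return None
    if arch in NON_SATA_ARCHES:
        return None
    if machine_type is not None and 'q35' in machine_type:
        # Q35 no longer provides an IDE bus, so use SATA instead.
        return 'sata'
    return None


if __name__ == '__main__':
    print(cdrom_bus_hint('kvm', 'x86_64', 'pc-q35-2.11'))
    print(cdrom_bus_hint('kvm', 'x86_64', 'pc-i440fx-2.9'))
```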

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/663677
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=db40cc44cbad32353510819c0aee1d0a3c9e4357
Submitter: Zuul
Branch: stable/stein

commit db40cc44cbad32353510819c0aee1d0a3c9e4357
Author: Lee Yarwood <email address hidden>
Date: Tue Jun 4 11:57:18 2019 +0100

    libvirt: Use SATA bus for cdrom devices when using Q35 machine type

    The Q35 machine type no longer provides an IDE bus and will need to use
    a SATA bus to attach legacy devices such as cdroms. More details can be
    found in the following related bug:

    Don't assume the guest machine type to be of 'pc'
    https://bugs.launchpad.net/nova/+bug/1780138

    This change now ensures the blockinfo.get_disk_bus_for_device_type
    method will now return "sata" as the bus type when the Q35 machine type
    is used for cdrom devices on QEMU or KVM hosts that are not PPC, S390 or
    AArch64 based.

    To enable this the _get_machine_type method has been extracted from the
    Libvirt driver into the Libvirt utils module. This method has also been
    simplified through the removal of the caps parameter, replaced with
    calls to the get_arch utility method and additional extraction of
    architecture specific defaults into the existing
    get_default_machine_type utility method.

    Related-bug: 1780138
    Closes-bug: 1831538
    Change-Id: Id97f4baddcf2caff91599773d9b5de5181b7fdf6
    (cherry picked from commit 527c452a6fc05a316eb8121796f761af97522772)

tags: added: in-stable-stein
Changed in nova:
assignee: Kashyap Chamarthy (kashyapc) → nobody
Tobias Urdin (tobias-urdin) wrote :

Can't this be closed now that Nova records the machine type per instance and saves it? Or does this also cover the actual removal of the 'pc' machine type?
