f28: tempest fails Exhausted all hosts available for retrying build failures, due to hw_machine_type=x86_64=pc-i440fx-rhel7.6.0

Bug #1819452 reported by Sorin Sbarnea
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

Five tempest tests are failing on Fedora 28 containerized deployments:

tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_preserve_preexisting_port
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state

See https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/af35f7c/logs/undercloud/home/zuul/tempest.log.txt.gz

https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-fedora-28-standalone-master/af35f7c/logs/undercloud/home/zuul/tempest/tempest.html.gz

The root cause is a hardcoded machine type in the nova configuration (hw_machine_type=x86_64=pc-i440fx-rhel7.6.0) that is outdated and not supported on fedora-28.

For a full list of supported machine types, see https://gist.github.com/ssbarnea/33f905e541a300426761797993f240c3

After a few emails on openstack-discuss we now know more about which values should be used. "pc" is a generic alias that points to the latest supported pc version on the host operating system, which makes it a much better pick than a fully qualified name.

In addition, the nova developers told us it would be even better to use the newer "q35" machine type where it is available, since it is more recent than "pc". As the list shows, "q35" is also supported by both centos-7 and fedora-28.

Our plan is to adopt "pc" first because we know it is very low risk: being an alias, it does not really change the machine type used on centos-7.
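The switch from a fully qualified name to its alias can be sketched as a small string rewrite. This is only an illustration, not the actual patch: the helper name `unversion_machine_type` is hypothetical, and it assumes that versioned x86 machine types are named `pc-i440fx-<version>` or `pc-q35-<version>` and that the setting is a comma-separated list of `arch=machine` pairs.

```python
import re

# Assumption: versioned x86 machine types follow these two naming patterns,
# and their unversioned aliases are "pc" and "q35" respectively.
VERSIONED_TO_ALIAS = (
    (re.compile(r"^pc-i440fx-.+$"), "pc"),
    (re.compile(r"^pc-q35-.+$"), "q35"),
)


def unversion_machine_type(value):
    """Rewrite 'arch=machine' pairs so versioned machines become aliases.

    Hypothetical helper for illustration; pairs it does not recognise
    are passed through unchanged.
    """
    rewritten = []
    for pair in value.split(","):
        pair = pair.strip()
        arch, sep, machine = pair.partition("=")
        if not sep:
            # Not an 'arch=machine' pair; keep it as it was.
            rewritten.append(pair)
            continue
        for pattern, alias in VERSIONED_TO_ALIAS:
            if pattern.match(machine):
                machine = alias
                break
        rewritten.append(f"{arch}={machine}")
    return ",".join(rewritten)


print(unversion_machine_type("x86_64=pc-i440fx-rhel7.6.0"))
```

For the value in this bug's title the sketch prints `x86_64=pc`; a value that already uses an alias is left untouched.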

Later, we want to propose switching to "q35" as recommended, but this should be done with more care because it changes the machine type everywhere.

PS. To see the list of supported machine types on a host, run:
qemu-kvm -machine help
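
To check programmatically whether a given machine type exists on a host, the output of that command can be parsed. The following is a rough sketch: the sample text is illustrative only (it mimics the layout QEMU prints but was not captured from the hosts in this bug), and the function names are made up for the example.

```python
import re
import subprocess

# Illustrative sample only: the layout mirrors what QEMU prints, but these
# exact lines are an assumption, not output from the CI hosts in this bug.
SAMPLE_HELP = """\
Supported machines are:
pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-2.11)
pc-i440fx-2.11       Standard PC (i440FX + PIIX, 1996) (default)
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.11)
pc-q35-2.11          Standard PC (Q35 + ICH9, 2009)
"""

# Matches lines such as "pc   <description> (alias of pc-i440fx-2.11)".
ALIAS_RE = re.compile(r"^(\S+)\s+.*\(alias of ([^)]+)\)\s*$")


def parse_machine_help(text):
    """Return (set of machine names, dict mapping alias -> versioned name)."""
    names = set()
    aliases = {}
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.endswith(":"):
            # Skip blank lines and the "Supported machines are:" header.
            continue
        names.add(line.split()[0])
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
    return names, aliases


def machine_help_from_host(binary="qemu-kvm"):
    """Run '<binary> -machine help' on this host and return its stdout."""
    result = subprocess.run(
        [binary, "-machine", "help"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


names, aliases = parse_machine_help(SAMPLE_HELP)
print("pc-i440fx-rhel7.6.0" in names)
print(aliases.get("pc"))
```

On a real host, pass `machine_help_from_host()` to `parse_machine_help()` instead of the sample; the path to the `qemu-kvm` binary may differ between distributions.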

Current patches related to this issue: https://review.openstack.org/#/q/topic:nova-arch

Sorin Sbarnea (ssbarnea)
Changed in tripleo:
status: New → Confirmed
importance: Undecided → High
description: updated
wes hayutin (weshayutin)
Changed in tripleo:
status: Confirmed → Incomplete
wes hayutin (weshayutin)
summary: - f28: tempest failures
+ f28: tempest fails Exhausted all hosts available for retrying build
+ failures, due to hw_machine_type=x86_64=pc-i440fx-rhel7.6.0
tags: added: promotion-blocker
Sorin Sbarnea (ssbarnea) wrote :
Sorin Sbarnea (ssbarnea)
description: updated
description: updated
wes hayutin (weshayutin)
Changed in tripleo:
status: Incomplete → Triaged
wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → In Progress
wes hayutin (weshayutin)
Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-quickstart-extras 2.1.1

This issue was fixed in the openstack/tripleo-quickstart-extras 2.1.1 release.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/785364
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/9b8413e79658989a7217c2e9a887af4d30eb0e64
Submitter: "Zuul (22348)"
Branch: master

commit 9b8413e79658989a7217c2e9a887af4d30eb0e64
Author: Lee Yarwood <email address hidden>
Date: Thu Apr 8 11:02:42 2021 +0100

    nova: Remove versioned default machine types

    Instead rely on the unversioned defaults provided by Nova [1].

    The use of versioned machine types should be limited to specific
    downstream releases to ensure the corner case of live migrations between
    X.Y+1 and X.Y releases of RHEL will work.

    Removal of these versioned machine types also allows TripleO to test
    with both EL based and Fedora based compute hosts that version their
    machine types differently as discussed previously in bug #1819452.

    [1] https://github.com/openstack/nova/blob/50fdbc752a9ca9c31488140ef2997ed59d861a41/nova/virt/libvirt/utils.py#L566-L575

    Change-Id: I808c033c34dfe1068ebe17dc72fdee5ef63613d8
    Related-Bug: #1819452
