Instance sticks in BUILD when scheduling with a port assigned invalid MAC

Bug #2015092 reported by Alexey Stupnikov
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Alexey Stupnikov
Milestone: (not set)

Bug Description

This is a copy of the downstream bug https://bugzilla.redhat.com/show_bug.cgi?id=1949703. It is caused by the combination of a Neutron bug (addressed in https://bugs.launchpad.net/neutron/+bug/1926273) and Nova's inability to handle the ValueError raised by oslo_versionedobjects.

Steps to reproduce (before the Neutron bug was fixed):

1. Create bad port
$ openstack port create --mac-address 40:28:0:00:2:6 --disable-port-security --network ess-dpdk-1370 test-port-bad-mac_user-dpdk2

2. Create server using bad port
$ openstack server create --flavor m1.tiny --image cirros --nic port-id=023abee9-a896-4678-b6fe-d4abae9066f0 dpdk-instance-bad-mac2

Expected outcome:
Instance is in ERROR state

Actual outcome:
Instance is stuck in BUILD state
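
The MAC address in step 1 (40:28:0:00:2:6) has single-digit octets, which is why coercing it later raises ValueError. The check below is a minimal stand-alone sketch of that kind of validation. The pattern and the coerce_mac name are assumptions for illustration, not the actual oslo_versionedobjects code.

```python
import re

# Assumed rule: exactly six two-digit hex octets separated by colons.
MAC_RE = re.compile(r"[0-9a-f]{2}(:[0-9a-f]{2}){5}")


def coerce_mac(value: str) -> str:
    """Return the lower-cased MAC address or raise ValueError if malformed.

    Hypothetical helper; it only mirrors the behaviour described in this
    report (a ValueError for a malformed MAC address).
    """
    normalized = value.lower()
    if not MAC_RE.fullmatch(normalized):
        raise ValueError(f"Malformed MAC {value}")
    return normalized
```

With this rule, "40:28:00:00:02:06" is accepted, while the address from step 1 is rejected because the octets "0", "2" and "6" are only one digit long.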

Tags: compute
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/879350

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/879350
Committed: https://opendev.org/openstack/nova/commit/dacae335e4c538f54a32f431469bd7059cba48c3
Submitter: "Zuul (22348)"
Branch: master

commit dacae335e4c538f54a32f431469bd7059cba48c3
Author: Alexey Stupnikov <email address hidden>
Date: Mon Apr 3 16:19:35 2023 +0200

    Process unlimited exceptions raised by unplug_vifs

    Currently compute manager's _cleanup_allocated_networks() method
    expects NotImplementedError or exception.NovaException when
    calling self.driver.unplug_vifs.

    In reality, other class of exception could be raised. It could break
    the Nova Compute flow and leave instance in inconsistent state. This
    patch switches from exception.NovaException to all kinds of exceptions.

    Closes-bug: #2015092
    Change-Id: Icaf3cc93edfea97ee4fa497bdeb5f7d631c8ae55
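
The effect of the change described in the commit message can be sketched as follows. This is a simplified illustration, not the real Nova code: the function signature, the FailingDriver class, the NovaException stand-in and the steps list are all assumptions made for the example. Only the idea (catch every exception from unplug_vifs so the rest of the cleanup still runs) comes from the commit message.

```python
import logging

LOG = logging.getLogger(__name__)


class NovaException(Exception):
    """Stand-in for nova.exception.NovaException (assumption)."""


class FailingDriver:
    """Hypothetical driver whose unplug_vifs raises a non-Nova exception."""

    def unplug_vifs(self, instance, network_info):
        raise ValueError("Malformed MAC 40:28:0:00:2:6")


def cleanup_allocated_networks(driver, instance, network_info, steps, caught):
    """Unplug VIFs, then run the remaining cleanup.

    `caught` is the exception class tolerated from unplug_vifs:
    NovaException models the old behaviour, Exception the patched one.
    """
    try:
        driver.unplug_vifs(instance, network_info)
    except NotImplementedError:
        pass  # the driver does not implement unplugging; nothing to do
    except caught as exc:
        LOG.warning("Cleaning up VIFs failed for %s: %s", instance, exc)
    # Reached only if no exception escaped the block above.
    steps.append("deallocate_network")
```

Called with caught=NovaException, the ValueError propagates and "deallocate_network" is never recorded, which mirrors the interrupted cleanup described in this report. Called with caught=Exception, the error is logged and the cleanup completes.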

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 28.0.0.0rc1

This issue was fixed in the openstack/nova 28.0.0.0rc1 release candidate.

Doug Szumski (dszumski) wrote :

It's common to hit this when using Ironic with networking-generic-switch (NGS).

For example, if an exception is raised when re-configuring baremetal ports, the deploy fails, the baremetal node is cleaned up, but the instance remains stuck in build because the last few lines of the cleanup don't run due to the port configure exception.

I think it's worth back-porting given the simplicity of the fix.
