nova-compute service failed to start up
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Expired | Undecided | Unassigned |
Bug Description
Description
===========
The nova-compute service crashed and cannot start up.
Steps to reproduce
==================
N/A
Expected result
===============
The nova-compute service starts successfully.
Actual result
=============
The nova-compute service failed to start. The error log looks like:
7fa7cc4856fc
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
2021-03-31 15:32:52.056 59251 ERROR oslo_service.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://
$ dpkg -l | grep nova
ii nova-api-metadata 2:21.1.
ii nova-common 2:21.1.
ii nova-compute 2:21.1.
ii nova-compute-kvm 2:21.1.
ii nova-compute-
ii python3-nova 2:21.1.
ii python3-novaclient 2:17.0.
2. Which storage type did you use?
ceph
3. Which networking type did you use?
Neutron with OpenVSwitch
Based on the fact that the error is in .plug_hw_veb with plugin='ovs' and port_profile=VIFPortProfileOpenVSwitch, this would imply that the port is vnic_type direct and was bound by either the ml2/ovs or ml2/ovn mechanism driver, which means the network backend is hardware-offloaded OVS.
The trusted VF feature is only implemented for standard SR-IOV and is not supported with hardware-offloaded OVS, but the error in this case seems to be caused by the lack of the PCI address in the binding profile. Can you provide the output of openstack port show and confirm that this is hardware-offloaded OVS?
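To make that check concrete, here is a minimal sketch that summarizes the binding fields of a port from the JSON printed by `openstack port show <port> -f json`. The key names (`binding_vnic_type`, `binding_vif_type`, `binding_profile`) and the `switchdev` capability marker are assumptions about what your client version prints; adjust them if your output differs. The function name is hypothetical.

```python
import json


def summarize_port_binding(port_json: str) -> dict:
    """Summarize the binding details of a port.

    `port_json` is assumed to be the output of
    `openstack port show <port> -f json`; the key names below are
    assumptions and may differ between client versions.
    """
    port = json.loads(port_json)
    profile = port.get("binding_profile") or {}
    capabilities = profile.get("capabilities") or []
    return {
        "vnic_type": port.get("binding_vnic_type"),
        "vif_type": port.get("binding_vif_type"),
        # A missing pci_slot is the suspected cause of the failure.
        "has_pci_slot": bool(profile.get("pci_slot")),
        # Hardware-offloaded OVS ports are usually created with the
        # "switchdev" capability in the binding profile.
        "switchdev": "switchdev" in capabilities,
    }
```

A port that reports vnic_type direct, vif_type ovs, and `has_pci_slot` false would match the situation described above.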
The failure is happening because we try to plug the VIFs on agent start, and if the port data is corrupted, as it appears to be in this case, it will fail. I believe the way to fix this would be to identify which VF is claimed in the pci_devices table for the instance and update the binding profile manually.
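As a rough illustration of that manual fix, the sketch below takes rows already read from the nova pci_devices table (as dicts) and derives the binding profile fields for the instance. The column names (`address`, `instance_uuid`, `status`, `deleted`, `vendor_id`, `product_id`), the status values, and the profile keys `pci_slot` and `pci_vendor_info` are assumptions; verify them against your database schema and a healthy port before applying anything. The function name is hypothetical.

```python
def profile_from_pci_rows(rows, instance_uuid):
    """Build binding profile fields from pci_devices rows.

    `rows` is assumed to be a list of dicts read from the nova
    pci_devices table. Column names and status values here are
    assumptions, not a confirmed schema.
    """
    matches = [
        row for row in rows
        if row["instance_uuid"] == instance_uuid
        and not row["deleted"]
        and row["status"] in ("claimed", "allocated")
    ]
    # Refuse to guess: an instance with several VFs needs each
    # device matched to its port by hand.
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one device, found {len(matches)}"
        )
    device = matches[0]
    return {
        "pci_slot": device["address"],
        "pci_vendor_info": f"{device['vendor_id']}:{device['product_id']}",
    }
```

The returned values would then be written into the port's binding profile by an admin. The helper deliberately raises when zero or several devices match, since picking the wrong VF would corrupt the port data further.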