Nova services take several minutes to stabilize

Bug #1823275 reported by Maria Guadalupe Perez Ibara
This bug affects 1 person
Affects   | Status       | Importance | Assigned to | Milestone
StarlingX | Fix Released | High       | Jim Gauld   |

Bug Description

Brief Description
-----------------
An issue was found across all bare-metal configurations: nova services appear to take several minutes to stabilize, and instances created during that window end up in ERROR status.

Severity
--------
Critical

Steps to Reproduce
------------------
openstack server create --flavor m1.tiny --image cirros --nic net-id=net vm1

Expected Behavior
------------------
VMs are created successfully.

Actual Behavior
----------------
status: Error

+--------------------------------------+------+--------+----------+--------+----------+
| ID                                   | Name | Status | Networks | Image  | Flavor   |
+--------------------------------------+------+--------+----------+--------+----------+
| 41d660b4-cb4f-405f-b9dc-e390abef3e42 | vm2  | ERROR  |          | centos | m1.small |
| a7d3bf34-fd54-434a-901e-b076b74aeb7b | vm1  | ERROR  |          | cirros | m1.tiny  |
+--------------------------------------+------+--------+----------+--------+----------+
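
When re-running the reproduction step, it can help to record the state the instance finally settles in rather than checking once by hand. The following is only a sketch: it assumes the `openstack` CLI is installed with admin credentials sourced, and the function names, the 300-second timeout and the 5-second interval are illustrative choices, not values taken from this report.

```python
import subprocess
import time


def cli_status(server):
    """Return the server's status via the openstack CLI.

    Assumes the CLI is on PATH and credentials are already sourced.
    """
    result = subprocess.run(
        ["openstack", "server", "show", server, "-f", "value", "-c", "status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_status(server, get_status=cli_status, timeout=300.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the server reaches ACTIVE or ERROR, or the timeout expires.

    Returns the last status seen. timeout and interval are hypothetical defaults.
    """
    deadline = clock() + timeout
    while True:
        status = get_status(server)
        if status in ("ACTIVE", "ERROR"):
            return status
        if clock() >= deadline:
            return status  # still in a transitional state, e.g. BUILD
        sleep(interval)
```

Calling `wait_for_status("vm1")` right after the `openstack server create` command would return "ERROR" for the instances shown above; `get_status`, `sleep` and `clock` are injectable so the loop can be exercised without a live cloud.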

Reproducibility
---------------
Reproducible 100% of the time in all bare-metal configurations

System Configuration
--------------------
simplex, duplex, Standard Local Storage(2+2), Standard Dedicated Storage(2+2+2)

Branch/Pull Time/Commit
-----------------------
http://mirror.starlingx.cengn.ca/mirror/starlingx/master/centos/20190404T013000Z/outputs/iso/

Timestamp/Logs
--------------
evidence.log
controller-0_20190404.162328.tar

Test Activity
-------------
[Sanity]

Ghada Khalil (gkhalil) wrote :

Based on the evidence.log attachment, nova is returning this error:
2019-04-04 16:22:30,902.902 168396 ERROR nova.compute.manager [req-b3bffaba-a62a-4e56-be63-3362c51a36df - - - - -] Error updating resources for node compute-0.: PciDeviceNotFoundById: PCI device 0000:b3:02.3 not found

Please confirm whether the node has any non-network PCI devices on it:
system host-device-list <node>

Changed in starlingx:
status: New → Incomplete
Maria Guadalupe Perez Ibara (maria-gp) wrote :

[wrsroot@controller-0 ~(keystone_admin)]$ system host-device-list compute-0
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+
| name             | address      | class id | vendor id | device id | class name                | vendor name             | device name                         | numa_node | enabled |
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+
| pci_0000_02_00_0 | 0000:02:00.0 | 030000   | 1a03      | 2000      | VGA compatible controller | ASPEED Technology, Inc. | ASPEED Graphics Family              | 0         | True    |
| pci_0000_b1_00_0 | 0000:b1:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
| pci_0000_b3_00_0 | 0000:b3:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
| pci_0000_b5_00_0 | 0000:b5:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+

[wrsroot@controller-0 ~(keystone_admin)]$ system host-device-list compute-1
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+
| name             | address      | class id | vendor id | device id | class name                | vendor name             | device name                         | numa_node | enabled |
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+
| pci_0000_02_00_0 | 0000:02:00.0 | 030000   | 1a03      | 2000      | VGA compatible controller | ASPEED Technology, Inc. | ASPEED Graphics Family              | 0         | True    |
| pci_0000_b1_00_0 | 0000:b1:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
| pci_0000_b3_00_0 | 0000:b3:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
| pci_0000_b5_00_0 | 0000:b5:00.0 | 0b4000   | 8086      | 37c8      | Co-processor              | Intel Corporation       | C62x Chipset QuickAssist Technology | 1         | True    |
+------------------+--------------+----------+-----------+-----------+---------------------------+-------------------------+-------------------------------------+-----------+---------+
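
The comparison between the nova error and the compute-0 device listing can be made mechanical. The sketch below uses only values quoted in this report (the PciDeviceNotFoundById message and the four listed addresses); the helper names are illustrative, and it makes no claim about why the address is missing, only that it is not in the listing and which listed devices share its bus.

```python
import re

# Tail of the nova error line quoted earlier in this report.
LOG_LINE = (
    "Error updating resources for node compute-0.: "
    "PciDeviceNotFoundById: PCI device 0000:b3:02.3 not found"
)

# Addresses copied from the compute-0 `system host-device-list` output.
LISTED_ADDRESSES = [
    "0000:02:00.0",
    "0000:b1:00.0",
    "0000:b3:00.0",
    "0000:b5:00.0",
]

# domain:bus:slot.function, for example 0000:b3:02.3
PCI_ADDR = re.compile(r"\b[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\b")


def missing_pci_address(log_line):
    """Return the PCI address named in a PciDeviceNotFoundById message, or None."""
    if "PciDeviceNotFoundById" not in log_line:
        return None
    match = PCI_ADDR.search(log_line)
    return match.group(0) if match else None


def same_bus(address, candidates):
    """Return the candidates that share the domain and bus of `address`."""
    prefix = address.rsplit(":", 1)[0]  # "0000:b3" for "0000:b3:02.3"
    return [c for c in candidates if c.rsplit(":", 1)[0] == prefix]


addr = missing_pci_address(LOG_LINE)
is_listed = addr in LISTED_ADDRESSES
bus_neighbours = same_bus(addr, LISTED_ADDRESSES)
print(addr, is_listed, bus_neighbours)
```

For the values in this report, the address nova complains about is absent from the listing, while one listed QuickAssist co-processor sits on the same bus (b3).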

Ghada Khalil (gkhalil) wrote :

This is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1821938
A nova fix is available and will be picked up in the next starlingx docker image build. The fix should be available on April 9.

tags: added: stx.2.0 stx.distro.openstack
Changed in starlingx:
assignee: nobody → Jim Gauld (jgauld)
importance: Undecided → High
status: Incomplete → Fix Committed
Ghada Khalil (gkhalil) wrote :

Marking as Fix Released to align with the duplicate bug.
The fix should be included in the stable docker images, built on April 9 (afternoon/evening).

Changed in starlingx:
status: Fix Committed → Fix Released