overcloud nodes stuck in build

Bug #1779295 reported by Matthias Runge
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Won't Fix | High | Unassigned |

Bug Description

I'm trying master:

bash devmode.sh --no-gate --ovb -d -w /var/tmp/tripleo_local

During overcloud deploy, the process gets stuck:

(undercloud) [stack@localhost ~]$ openstack server list
+--------------------------------------+-------------------------+--------+------------------------+----------------+-----------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-------------------------+--------+------------------------+----------------+-----------+
| 5d67039b-ce3e-4b15-afeb-d6f4c3fd4a73 | overcloud-controller-0 | ACTIVE | ctlplane=192.168.24.16 | overcloud-full | baremetal |
| ae6edb22-1edd-48d9-8566-f0b5cce18864 | overcloud-novacompute-0 | BUILD | | overcloud-full | baremetal |
+--------------------------------------+-------------------------+--------+------------------------+----------------+-----------+

At the same time, on the host cloud I see:
[root@kepler ~(keystone_admin)]# openstack server list --all
+--------------------------------------+-------------------+---------+----------------------------------------------------------------------------------------------------------+---------------------------------+
| ID | Name | Status | Networks | Image Name |
+--------------------------------------+-------------------+---------+----------------------------------------------------------------------------------------------------------+---------------------------------+
| f12cdfb7-e58e-4710-a1f1-7e08dbf53777 | bmc-14381 | ACTIVE | private-14381=10.0.1.10 | bmc-base |
| ea6d2248-2a93-4884-8043-42b3ee3e8ffa | baremetal-14381_1 | ACTIVE | overcloud_internal-14381=172.17.0.3; overcloud_storage_mgmt-14381=172.19.0.9; | |
| | | | provision-14381=192.168.24.11; public-14381=10.0.0.8, 10.0.0.7; overcloud_tenant-14381=172.16.0.2; | |
| | | | overcloud_storage-14381=172.18.0.5 | |
| d2effba7-4187-4d25-b2cf-967daf5fb18b | baremetal-14381_0 | SHUTOFF | overcloud_internal-14381=172.17.0.2; overcloud_storage_mgmt-14381=172.19.0.6; | |
| | | | provision-14381=192.168.24.9; public-14381=10.0.0.4, 10.0.0.10; overcloud_tenant-14381=172.16.0.1; | |
| | | | overcloud_storage-14381=172.18.0.9 | |
| 02308b95-7eff-4d6e-85c2-295a1458a31f | undercloud-14381 | ACTIVE | private-14381=10.0.1.12, 192.168.36.82; provision-14381=192.168.24.8; public-14381=10.0.0.1 | |
+--------------------------------------+-------------------+---------+----------------------------------------------------------------------------------------------------------+---------------------------------+

The states do not match: the undercloud reports overcloud-novacompute-0 as BUILD, while the host cloud reports one of the baremetal instances (baremetal-14381_0) as SHUTOFF.

It stays that way until the deploy times out. How can I debug this further?
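As a starting point for comparing the two listings, here is a minimal sketch that filters the table printed by `openstack server list` down to the servers that are not ACTIVE. The helper name `non_active_servers` is hypothetical and not part of any OpenStack tooling; it only assumes the default ASCII table layout shown above, with Name and Status as the second and third columns.

```shell
# Hypothetical helper: print "name status" for every server that is not ACTIVE.
# Reads the ASCII table produced by `openstack server list` on stdin.
non_active_servers() {
    awk -F'|' '
        NF >= 4 {
            name = $3; status = $4
            # Trim the padding around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            # Skip the header row, wrapped continuation rows, and ACTIVE servers.
            if (name != "" && name != "Name" && status != "ACTIVE")
                print name, status
        }
    '
}
```

It could be run as `openstack server list | non_active_servers` on the undercloud and as `openstack server list --all | non_active_servers` on the host cloud. On the undercloud, `openstack baremetal node list` additionally shows the power state and provisioning state that Ironic has recorded for each node, which may help narrow down where the two views diverge.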

Matthias Runge (mrunge)
Changed in tripleo:
importance: Undecided → High
milestone: none → stein-1
milestone: stein-1 → rocky-3
status: New → Triaged
wes hayutin (weshayutin) wrote :

@Matthias
Hey brotha, the supported way to recreate upstream jobs has changed a bit...
The logs of each upstream or RDO job include a file called reproducer-quickstart.sh, which replaces devmode.

Not sure which job you are trying here, but FYI. Ping me on IRC with questions.

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-master-upload/db23aa3/reproducer-quickstart.sh

wes hayutin (weshayutin) wrote :
Changed in tripleo:
status: Triaged → Won't Fix