Standard system compute nodes failed on install with /etc/pki error

Bug #1999588 reported by Reinildes Oliveira
This bug affects 1 person
Affects     Status         Importance   Assigned to          Milestone
StarlingX   Fix Released   High         Reinildes Oliveira

Bug Description

Brief Description
-----------------
Standard system installation failed because the compute nodes' availability status was marked as failed.

Severity
--------
Critical

Steps to Reproduce
------------------
Install a standard system with compute nodes.

TC-name:

Expected Behavior
------------------
Compute nodes are unlocked, enabled, and available.

Actual Behavior
----------------
Compute nodes are unlocked but disabled, with availability status failed.

Reproducibility
---------------
This is the first time this issue has been seen.

System Configuration
--------------------
Multi-node system

Branch/Pull Time/Commit
-----------------------
debian 2022-12-12_22-00-13

Last Pass
---------
debian 2022-12-06_22-00-09

Timestamp/Logs
--------------
[2022-12-13 13:50:59,201] 4815 DEBUG Thread-79 system_helper.__hosts_in_states:: At least one host from ['compute-1'] has operational state(s) in disabled instead of ['enabled']
[2022-12-13 13:50:59,216] 473 DEBUG Thread-78 ssh.expect :: Output:
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | disabled | failed |
| 3 | compute-1 | worker | unlocked | disabled | failed |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
[sysadmin@controller-0 ~(keystone_admin)]$
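The table above can be checked mechanically when triaging a failed install. The following is a minimal sketch, not part of the test framework that produced the log: `parse_host_table` and `hosts_not_ready` are hypothetical helper names, and the sample text is copied from the output above.

```python
# Hypothetical helpers for reading a host table like the one in the log above.
SAMPLE = """
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-0 | worker | unlocked | disabled | failed |
| 3 | compute-1 | worker | unlocked | disabled | failed |
| 4 | controller-1 | controller | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+
"""


def parse_host_table(text):
    """Parse an ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def hosts_not_ready(rows):
    """Return hostnames that are not both enabled and available."""
    return [
        row["hostname"]
        for row in rows
        if row["operational"] != "enabled" or row["availability"] != "available"
    ]


if __name__ == "__main__":
    print(hosts_not_ready(parse_host_table(SAMPLE)))
```

For the output captured in this report, the helper lists the two worker nodes and neither controller.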

DM log:
ERROR controller.host user error {"request": "deployment/controller-1", "error": "failed to create OSD: Bad request with: [POST http://192.168.144.1:6385/v1/istors], error message: {\"error_message\": \"{\\\"faultcode\\\": \\\"Client\\\", \\\"faultstring\\\": \\\"Only 2 storage monitor available. At least 2 unlocked and enabled hosts with monitors are required. Please ensure hosts with monitors are unlocked and enabled.\

Test Activity
-------------
installation

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/867582

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Reinildes Oliveira (rjosemat)
Ghada Khalil (gkhalil)
summary: - Standard system installation failed by DM config compute nodes failed
+ Standard system compute nodes failed with /etc/pki error
summary: - Standard system compute nodes failed with /etc/pki error
+ Standard system compute nodes failed on install with /etc/pki error
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/867582
Committed: https://opendev.org/starlingx/stx-puppet/commit/b4ab0829e1aad8ae23b1b36eb7621271c2aee887
Submitter: "Zuul (22348)"
Branch: master

commit b4ab0829e1aad8ae23b1b36eb7621271c2aee887
Author: Rei Oliveira <email address hidden>
Date: Tue Dec 13 21:50:15 2022 -0300

    Add '/etc/pki' item for puppet ensure directory

    The ssl_ca installation is failing for worker nodes because it does not
    find the /etc/pki folder created already.

    This commit adds /etc/pki individually as an item to a puppet ensure
    directory statement. This will make sure that puppet creates this
    directory first when it does not exist. If the dir already exists
    puppet will also be satisfied and execute with success.

    Test plan:

    PASS: Add a ssl_ca certificate with system certificate-install and
          verify that certificates were added to
          /etc/pki/ca-trust/source/anchors/ca-cert.crt in a compute node.
    PASS: Run a full deploy of a standard lab and verify that compute nodes
          become unlocked and available.

    Closes-Bug: 1999588

    Signed-off-by: Rei Oliveira <email address hidden>
    Change-Id: Ib59ab88a9d4d1112e35f98d92aef72cbac01af07
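The reasoning in the commit message relies on the fact that a Puppet directory resource manages only the path it names and does not create missing parent directories. The sketch below imitates that behaviour in Python rather than Puppet, purely as an illustration. `ensure_directories` and `demo` are hypothetical names, the directory lists are assumptions and not the contents of the stx-puppet manifest, and everything runs under a temporary scratch root instead of the real `/etc`.

```python
import os
import tempfile


def ensure_directories(root, directories):
    """Create each listed directory under root, one level at a time.

    os.mkdir (unlike os.makedirs) never creates missing parents, which
    imitates a directory resource that only manages the exact path it
    names. Directories that already exist are treated as satisfied.
    """
    for directory in directories:
        target = os.path.join(root, directory.lstrip("/"))
        if os.path.isdir(target):
            continue  # already present: nothing to do
        os.mkdir(target)  # raises FileNotFoundError if the parent is missing


def demo():
    """Return (failed_without_pki, succeeded_with_pki) using scratch roots."""
    # Assumed lists for illustration only.
    without_pki = ["/etc", "/etc/pki/ca-trust"]
    with_pki = ["/etc", "/etc/pki", "/etc/pki/ca-trust"]

    with tempfile.TemporaryDirectory() as root:
        try:
            ensure_directories(root, without_pki)
            failed = False
        except FileNotFoundError:
            failed = True  # the /etc/pki parent was never created

    with tempfile.TemporaryDirectory() as root:
        ensure_directories(root, with_pki)
        ensure_directories(root, with_pki)  # second run changes nothing
        succeeded = os.path.isdir(os.path.join(root, "etc", "pki", "ca-trust"))

    return failed, succeeded


if __name__ == "__main__":
    print(demo())
```

Listing the parent explicitly keeps the operation idempotent: the first run creates it, and later runs find it in place and move on, which matches the behaviour the commit message describes.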

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
tags: added: stx.8.0 stx.config