Unable to unlock storage node after running the restore successfully (wipe_ceph_osds=true)

Bug #2043412 reported by Felipe Sanches Zanoni
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Felipe Sanches Zanoni

Bug Description

Brief Description
-----------------
Unable to unlock storage node after running the restore successfully (wipe_ceph_osds=true)

Severity
--------
Critical

Steps to Reproduce
------------------
1. Perform a backup from controller-0.
2. SCP the backup file from /opt/backups to a workstation.
3. Reinstall the ISO and do not bootstrap.
4. Log in to controller-0.
5. SCP the backup file from the workstation back to controller-0.
6. Run the restore playbook, passing "wipe_ceph_osds=true" (see the example invocation below).
7. After the Ansible playbook completes, unlock controller-0.
8. Unlock controller-1.
9. Ceph becomes broken (run "ceph -s" to check).
10. Unlock the storage-0 node and get the message:

Can not unlock storage node. Only 1 storage monitor available. At least 2 unlocked and enabled hosts with monitors are required. Please ensure hosts with monitors are unlocked and enabled.
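
For reference, the invocation and follow-up commands look roughly like the sketch below. This is an illustrative sketch only: the playbook path and extra-vars follow the usual StarlingX platform restore procedure and may differ by release, and <backup_file> is a placeholder for the actual backup archive name.

    # Step 6: run the platform restore playbook with OSD wiping enabled
    ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml \
        -e "initial_backup_dir=/home/sysadmin backup_filename=<backup_file>.tgz wipe_ceph_osds=true"

    # Steps 7-10: unlock the controllers, check Ceph, then attempt the storage unlock
    system host-unlock controller-0
    system host-unlock controller-1
    ceph -s                        # cluster health is broken at this point
    system host-unlock storage-0   # rejected with the monitor-count error above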

Expected Behavior
------------------
Should be able to unlock the storage node successfully.

Actual Behavior
----------------
Unable to unlock the storage node.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Storage (2+2+2)

Branch/Pull Time/Commit
-----------------------
master branch

Last Pass
---------
N/A

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Regression

Workaround
----------
N/A

Changed in starlingx:
assignee: nobody → Felipe Sanches Zanoni (fsanches)
tags: added: stx.9.0
OpenStack Infra (hudson-openstack) wrote: Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote: Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/900820
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/7a4aad2fdd5e5b62300a2e37b560a027c703e80c
Submitter: "Zuul (22348)"
Branch: master

commit 7a4aad2fdd5e5b62300a2e37b560a027c703e80c
Author: Felipe Sanches Zanoni <email address hidden>
Date: Mon Nov 13 15:39:43 2023 -0300

    Do not restore Ceph crush map when wiping Ceph OSD disks

    When running the restore playbook, the Ceph crush map was being
    restored even if the flag wipe_ceph_osds was set to true.

    The restore playbook was not checking the status of the wipe_ceph_osds
    flag. A check is added to the whole block. Now the Ceph crush map is
    only restored if the flag is set to false and there is Ceph backend
    configured.

    Ceph will be reconfigured during the unlock process of each node.

    Test Plan:
      PASS: B&R AIO-DX with wipe_ceph_osds=false
      PASS: B&R AIO-DX with wipe_ceph_osds=true
      PASS: B&R Standard with wipe_ceph_osds=false
      PASS: B&R Standard with wipe_ceph_osds=true
      PASS: B&R Storage with wipe_ceph_osds=false
      PASS: B&R Storage with wipe_ceph_osds=true

    Closes-bug: 2043412

    Change-Id: Ib4e4c6933bf7ff6b8f23d994f19a6e79c01dd2b1
    Signed-off-by: Felipe Sanches Zanoni <email address hidden>
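
The commit description above suggests a guard of the following shape. This is a hedged, illustrative YAML sketch, not the actual content of the restore playbook: the task names and the "ceph_backend_configured" variable are hypothetical placeholders, while "wipe_ceph_osds" is the real flag this bug is about.

    # Illustrative only: restore the saved crush map only when OSDs are NOT being
    # wiped and a Ceph backend is configured; otherwise skip the whole block and
    # let Ceph be reconfigured during each node's unlock.
    - name: Restore Ceph crush map            # placeholder task name
      block:
        - name: Load saved crush map          # placeholder task name
          command: ceph osd setcrushmap -i /tmp/crushmap.bin   # hypothetical path
      when:
        - not (wipe_ceph_osds | bool)
        - ceph_backend_configured | default(false) | bool      # hypothetical variable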

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.storage
Changed in starlingx:
importance: Undecided → High