Unable to unlock storage node after running the restore successfully (wipe_ceph_osds=true)

Bug #2043412 reported by Felipe Sanches Zanoni
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Felipe Sanches Zanoni

Bug Description

Brief Description
-----------------
Unable to unlock storage node after running the restore successfully (wipe_ceph_osds=true)

Severity
--------
Critical

Steps to Reproduce
------------------
1. Perform a backup from controller-0.
2. SCP the backup file from /opt/backups to a workstation.
3. Reinstall the ISO and do not bootstrap.
4. Log in to controller-0.
5. SCP the backup file from the workstation back to controller-0.
6. Run the restore playbook, passing "wipe_ceph_osds=true" (see the example invocation below).
7. After the Ansible playbook completes, unlock controller-0.
8. Unlock controller-1.
9. Ceph becomes broken (run "ceph -s" to check).
10. Unlock the storage-0 node and get the message:

Can not unlock storage node. Only 1 storage monitor available. At least 2 unlocked and enabled hosts with monitors are required. Please ensure hosts with monitors are unlocked and enabled.
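
For reference, the invocation and follow-up commands look roughly like the sketch below. This is an illustrative sketch only: the playbook path and extra-vars follow the usual StarlingX platform restore procedure and may differ by release, and <backup_file> is a placeholder for the actual backup archive name.

    # Step 6: run the platform restore playbook with OSD wiping enabled
    ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml \
        -e "initial_backup_dir=/home/sysadmin backup_filename=<backup_file>.tgz wipe_ceph_osds=true"

    # Steps 7-10: unlock the controllers, check Ceph, then attempt the storage unlock
    system host-unlock controller-0
    system host-unlock controller-1
    ceph -s                        # cluster health is broken at this point
    system host-unlock storage-0   # rejected with the monitor-count error above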

Expected Behavior
------------------
Should be able to unlock the storage node successfully.

Actual Behavior
----------------
Unable to unlock the storage node.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Storage (2+2+2)

Branch/Pull Time/Commit
-----------------------
master branch

Last Pass
---------
N/A

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Regression

Workaround
----------
N/A

Changed in starlingx:
assignee: nobody → Felipe Sanches Zanoni (fsanches)
tags: added: stx.9.0
OpenStack Infra (hudson-openstack) wrote: Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote: Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/900820
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/7a4aad2fdd5e5b62300a2e37b560a027c703e80c
Submitter: "Zuul (22348)"
Branch: master

commit 7a4aad2fdd5e5b62300a2e37b560a027c703e80c
Author: Felipe Sanches Zanoni <email address hidden>
Date: Mon Nov 13 15:39:43 2023 -0300

    Do not restore Ceph crush map when wiping Ceph OSD disks

    When running the restore playbook, the Ceph crush map was being
    restored even if the flag wipe_ceph_osds was set to true.

    The restore playbook was not checking the status of the wipe_ceph_osds
    flag. A check is added to the whole block. Now the Ceph crush map is
    only restored if the flag is set to false and there is Ceph backend
    configured.

    Ceph will be reconfigured during the unlock process of each node.

    Test Plan:
      PASS: B&R AIO-DX with wipe_ceph_osds=false
      PASS: B&R AIO-DX with wipe_ceph_osds=true
      PASS: B&R Standard with wipe_ceph_osds=false
      PASS: B&R Standard with wipe_ceph_osds=true
      PASS: B&R Storage with wipe_ceph_osds=false
      PASS: B&R Storage with wipe_ceph_osds=true

    Closes-bug: 2043412

    Change-Id: Ib4e4c6933bf7ff6b8f23d994f19a6e79c01dd2b1
    Signed-off-by: Felipe Sanches Zanoni <email address hidden>
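
The commit description above suggests a guard of the following shape. This is a hedged, illustrative YAML sketch, not the actual content of the restore playbook: the task names and the "ceph_backend_configured" variable are hypothetical placeholders, while "wipe_ceph_osds" is the real flag this bug is about.

    # Illustrative only: restore the saved crush map only when OSDs are NOT being
    # wiped and a Ceph backend is configured; otherwise skip the whole block and
    # let Ceph be reconfigured during each node's unlock.
    - name: Restore Ceph crush map            # placeholder task name
      block:
        - name: Load saved crush map          # placeholder task name
          command: ceph osd setcrushmap -i /tmp/crushmap.bin   # hypothetical path
      when:
        - not (wipe_ceph_osds | bool)
        - ceph_backend_configured | default(false) | bool      # hypothetical variable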

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.storage
Changed in starlingx:
importance: Undecided → High