Bootstrap-failed by "cannot open /dev/sdb2: No such file or directory"

Bug #2044806 reported by Erickson Silva de Oliveira
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Erickson Silva de Oliveira

Bug Description

Brief Description
-----------------
Subcloud installation ended in the bootstrap-failed state. The Ansible log showed:
blockdev: cannot open /dev/sdb2: No such file or directory

Severity
--------
Major

Steps to Reproduce
------------------
Install a subcloud.

TC-name:

Expected Behavior
------------------
Subcloud installation succeeds.

Actual Behavior
----------------
Subcloud installation fails.

Reproducibility
---------------
This is the first time this issue has been seen.

System Configuration
--------------------
DC subcloud SX

Branch/Pull Time/Commit
-----------------------
wrcp-master-debian 2023-11-11_19-00-08

Last Pass
---------
wrcp-master-debian 2023-11-07_23-24-02

Timestamp/Logs
--------------
[2023-11-13 08:58:29,582] 349 DEBUG MainThread ssh.send :: Send 'dcmanager --os-endpoint-type internalURL subcloud list'
[2023-11-13 08:58:29,632] 551 DEBUG MainThread ssh.exec_cmd:: Expecting [.@controller-[01] .(keystone_admin)]\$ in prompt
[2023-11-13 08:58:31,554] 471 DEBUG MainThread ssh.expect :: Output:
+----+-----------+------------+--------------+---------------+---------+---------------+-----------------+
| id | name      | management | availability | deploy status | sync    | backup status | backup datetime |
+----+-----------+------------+--------------+---------------+---------+---------------+-----------------+
| 1  | subcloud1 | unmanaged  | offline      | bootstrapping | unknown | None          | None            |
| 2  | subcloud2 | unmanaged  | offline      | bootstrapping | unknown | None          | None            |
+----+-----------+------------+--------------+---------------+---------+---------------+-----------------+

[2023-11-13 08:59:31,637] 349 DEBUG MainThread ssh.send :: Send 'dcmanager --os-endpoint-type internalURL subcloud list'
[2023-11-13 08:59:31,687] 551 DEBUG MainThread ssh.exec_cmd:: Expecting [.@controller-[01] .(keystone_admin)]\$ in prompt
[2023-11-13 08:59:33,607] 471 DEBUG MainThread ssh.expect :: Output:
+----+-----------+------------+--------------+------------------+---------+---------------+-----------------+
| id | name      | management | availability | deploy status    | sync    | backup status | backup datetime |
+----+-----------+------------+--------------+------------------+---------+---------------+-----------------+
| 1  | subcloud1 | unmanaged  | offline      | bootstrapping    | unknown | None          | None            |
| 2  | subcloud2 | unmanaged  | offline      | bootstrap-failed | unknown | None          | None            |
+----+-----------+------------+--------------+------------------+---------+---------------+-----------------+

Changed in starlingx:
assignee: nobody → Erickson Silva de Oliveira (esilvade)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
tags: added: stx stx.9.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/901991
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/e016a153db291685d3e7bb20e683eb495c41892e
Submitter: "Zuul (22348)"
Branch: master

commit e016a153db291685d3e7bb20e683eb495c41892e
Author: Erickson Silva de Oliveira <email address hidden>
Date: Mon Nov 27 12:13:09 2023 -0300

    Fix wipe_osds.sh script

    An inconsistency was found in the wipe_osds.sh script, where it
    reversed the ceph data and ceph journal partitions. This happened
    because the partition number was obtained through a counter that
    is incremented on each iteration of the for loop. The lsblk command
    used to obtain the partitions returned the /dev/sdX2 partition
    first in the list, so the script obtained information from one
    partition while it was processing another.

    To resolve this, sfdisk was used to obtain disk information and
    udevadm was used to know exactly which partition number is being
    processed. These commands were used based on the sysinv's
    partition_info.sh script.

    Finally, some other points in the script were improved according
    to analysis by the shellcheck tool.

    Test Plan:
      - PASS: (AIO-SX) Replace wipe_osds.sh with changes,
              run and check script output
      - PASS: (AIO-SX with multipath) Replace wipe_osds.sh
              with changes, run and check script output
      - PASS: (AIO-SX) Optimized restore with wipe ceph
      - PASS: (AIO-DX) Legacy restore with wipe ceph
      - PASS: (AIO-DX) Upgrade with Ceph storage backend

    Closes-Bug: 2044806

    Change-Id: If1629e53f9a298d8202fb881bd13b6772da9ee38
    Signed-off-by: Erickson Silva de Oliveira <email address hidden>
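
The mismatch described in the commit message can be illustrated with a small sketch. It is not the wipe_osds.sh code (that is a shell script, and the actual fix uses sfdisk and udevadm). The listing order, the function names, and the use of the device name's trailing digits as a stand-in for the udevadm lookup are all assumptions made for illustration.

```python
import re

# Hypothetical listing order: per the commit message, /dev/sdX2 came back first.
listed = ["/dev/sdb2", "/dev/sdb1"]


def numbers_by_counter(partitions):
    """Buggy approach: assume the Nth listed partition is partition number N."""
    return {part: index for index, part in enumerate(partitions, start=1)}


def number_from_name(partition):
    """Stand-in for the udevadm lookup: read the trailing digits of the name."""
    match = re.search(r"(\d+)$", partition)
    if match is None:
        raise ValueError(f"no partition number in {partition!r}")
    return int(match.group(1))


def numbers_by_name(partitions):
    """Order-independent approach: derive each number from the device itself."""
    return {part: number_from_name(part) for part in partitions}


print(numbers_by_counter(listed))  # {'/dev/sdb2': 1, '/dev/sdb1': 2}
print(numbers_by_name(listed))     # {'/dev/sdb2': 2, '/dev/sdb1': 1}
```

With the counter, /dev/sdb2 is treated as partition 1 and /dev/sdb1 as partition 2, which is the kind of swap the commit describes. Deriving the number from the device being processed gives the same answer regardless of listing order.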

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.storage
removed: stx
Changed in starlingx:
importance: Undecided → High