Storage node failure state seen after unlock
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Daniel Badea |
Bug Description
Brief Description
-----------------
After unlocking a storage node, it was observed to go into the failed state. After a reboot, it came up as available.
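For context, a minimal sketch of how the state transition can be observed from the active controller, assuming the standard StarlingX `system` CLI; the host name storage-0 is an assumption of a default deployment, not a detail from this report:

    source /etc/platform/openrc    # load admin credentials on the active controller

    # List all hosts with their administrative/operational/availability states
    system host-list

    # Inspect one storage node in detail; per this bug it reports
    # availability=failed after unlock until it is rebooted
    system host-show storage-0 | grep -E 'administrative|operational|availability'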
Severity
--------
Major
Steps to Reproduce
------------------
1. Ceph was added as a backend via system storage-backend-add
2. Storage nodes were added to the system
3. After the storage nodes were in locked/online state, the nodes were unlocked
4. On unlock, both storage nodes were observed to go into failed state.
5. They were rebooted, and after the reboot they became available.
6. Looking at the logs, I see a number of failures in ceph-osd-prepare as follows:
2018-10-
2018-10-
It should be noted that a 'sudo wipedisk' was run prior to the storage nodes being provisioned. A command-level sketch of this sequence is given below.
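As referenced above, a sketch of the reproduction sequence from the active controller, assuming the standard StarlingX `system` CLI; the host names and flag usage reflect a typical deployment and are assumptions rather than details taken from this report:

    # Step 1: add Ceph as the storage backend (--confirmed skips the
    # interactive confirmation prompt)
    system storage-backend-add ceph --confirmed

    # Before provisioning, each storage node's disks were wiped:
    #   sudo wipedisk        (run on the storage node itself)

    # Steps 3-4: once the storage nodes are locked/online, unlock them
    system host-unlock storage-0
    system host-unlock storage-1

    # Step 5: watch the availability column; the nodes go failed,
    # then become available only after a reboot
    watch -n 10 'system host-list'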
Expected Behavior
------------------
The storage nodes unlock successfully and become available without entering the failed state.
Actual Behavior
----------------
Storage nodes go into failed state after unlock; after a reboot, they become available.
Reproducibility
---------------
Tried once so far but seen on both storage nodes.
System Configuration
--------------------
Storage
Branch/Pull Time/Commit
-----------------------
master as of 2018-10-24_21-18-00
Timestamp/Logs
--------------
See above
tags: added: stx.2019.05; removed: stx.2019.03
tags: added: stx.2.0; removed: stx.2019.05
Duplicate of https://bugs.launchpad.net/starlingx/+bug/1800889