ceph-disk times out when preparing an NVMe based OSD
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Bob Church |
Bug Description
Brief Description
-----------------
When ceph-disk is used to prepare a new OSD on an NVMe drive, an intermittent failure is observed.
2020-12-
Severity
--------
Critical: System/Feature is not usable due to the defect.
Steps to Reproduce
------------------
Add the Ceph storage backend and then add an OSD to the system:
$ system storage-backend-add ceph --confirmed
watch -n 10 system storage-
$ system host-disk-list controller-0 | awk '/\/dev\
$ system host-disk-list controller-1 | awk '/\/dev\
$ watch "system host-stor-list controller-0; system host-stor-list controller-1"
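The final watch step can also be scripted so that a reproduction run ends on its own. The following is a minimal sketch, not part of the original report: wait_for_pattern is a hypothetical helper, and the 'configured' pattern is an assumption about how the stor state appears in the command output. Note that a plain pattern match succeeds as soon as any one stor shows the state, so it does not prove that every OSD is configured.

```shell
#!/bin/sh
# wait_for_pattern TIMEOUT_SECONDS INTERVAL_SECONDS PATTERN COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until its output matches PATTERN
# (extended regex) or roughly TIMEOUT_SECONDS have elapsed.
# Returns 0 on a match and 1 on timeout.
wait_for_pattern() {
    wfp_timeout=$1
    wfp_interval=$2
    wfp_pattern=$3
    shift 3
    wfp_elapsed=0
    while :; do
        # Exit status of the pipeline is that of grep: 0 only on a match.
        if "$@" 2>/dev/null | grep -Eq -- "$wfp_pattern"; then
            return 0
        fi
        if [ "$wfp_elapsed" -ge "$wfp_timeout" ]; then
            return 1
        fi
        sleep "$wfp_interval"
        wfp_elapsed=$((wfp_elapsed + wfp_interval))
    done
}

# Example (assumed pattern; adjust to the real table output):
# wait_for_pattern 600 10 'configured' system host-stor-list controller-0
```

A non-zero return after the timeout corresponds to the failure case described under Actual Behavior.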
Expected Behavior
------------------
The stors (OSDs) become "configured".
Actual Behavior
----------------
The stors (OSDs) end up in "configuration-
Reproducibility
---------------
Intermittent.
System Configuration
--------------------
AIO-DX
Branch/Pull Time/Commit
-----------------------
N/A
Last Pass
---------
N/A
Timestamp/Logs
--------------
Debugged on a live system; no corresponding collect logs are available.
Test Activity
-------------
Developer Testing
Workaround
----------
Manually execute the ceph-disk command outside of the Puppet context to prepare the OSD, then lock/unlock the host(s).
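Because the failure is intermittent, a manual run may need more than one attempt. The sketch below is an illustration only and is not the released fix: run_with_retries is a hypothetical helper that wraps any command with the coreutils timeout utility and retries it a fixed number of times. The exact ceph-disk arguments are not given in this report, so they are left as a placeholder.

```shell
#!/bin/sh
# run_with_retries ATTEMPTS PER_TRY_TIMEOUT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under timeout(1), retrying up to
# ATTEMPTS times. Returns 0 on the first success, otherwise the exit
# code of the last attempt (124 means the last attempt timed out).
run_with_retries() {
    rwr_attempts=$1
    rwr_timeout=$2
    shift 2
    rwr_try=1
    rwr_rc=1
    while [ "$rwr_try" -le "$rwr_attempts" ]; do
        rwr_rc=0
        timeout "$rwr_timeout" "$@" || rwr_rc=$?
        if [ "$rwr_rc" -eq 0 ]; then
            return 0
        fi
        # timeout(1) exits with 124 when the command ran out of time.
        if [ "$rwr_rc" -eq 124 ]; then
            echo "attempt $rwr_try/$rwr_attempts timed out after ${rwr_timeout}s" >&2
        else
            echo "attempt $rwr_try/$rwr_attempts failed with exit code $rwr_rc" >&2
        fi
        rwr_try=$((rwr_try + 1))
    done
    return "$rwr_rc"
}

# Example (placeholder arguments, not taken from this report):
# run_with_retries 3 300 ceph-disk prepare <arguments for the affected disk>
```

The lock/unlock of the host(s) still has to follow a successful run, as described in the workaround.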
Changed in starlingx:
importance: Undecided → High
Changed in starlingx:
status: In Progress → Fix Released
tags: added: stx.5.0
Fix proposed: https://github.com/starlingx-staging/stx-ceph/pull/40