Failed to reinstall controller on AIO-DX system
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Ovidiu Poncea |
Bug Description
Brief Description
-----------------
Controller-0 enters a failed state after a host-reinstall is performed on AIO-DX. Controller-1 is likely affected as well. The failure appears to be related to Ceph journals.
Severity
--------
Major
Steps to Reproduce
------------------
Install AIO-DX system
Swact to controller-1
Lock controller-0
Issue system host-reinstall controller-0
Unlock controller-0
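The steps above could be scripted roughly as follows. This is a minimal sketch, assuming the StarlingX system CLI is on the path with platform credentials already loaded, and assuming host-swact, host-lock and host-unlock are the subcommands behind the steps (only host-reinstall is named in this report). The run helper and the DRY_RUN variable are hypothetical; by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the reproduction steps. DRY_RUN and run() are illustrative
# helpers, not part of StarlingX. With DRY_RUN=1 (the default) each command
# is only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Swact away from controller-0 so that controller-1 becomes active.
    run system host-swact controller-0
    run system host-lock controller-0
    run system host-reinstall controller-0
    # In a real run, wait for the reinstall to finish before unlocking.
    run system host-unlock controller-0
}
```

Calling reproduce prints the four commands in order; setting DRY_RUN=0 before the call would execute them, in which case waits between the steps would have to be added by hand.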
Expected Behavior
------------------
Controller-0 unlocks and becomes available
Actual Behavior
----------------
Controller-0 enters a failed state
Reproducibility
---------------
100% on AIO-DX
System Configuration
-------
Seen on AIO-DX systems. Not seen on 2+2 systems. Have not tested on dedicated storage systems.
Branch/Pull Time/Commit
-------
cengn 20200107T000000Z
Last Pass
---------
Unknown
Timestamp/Logs
--------------
Controller-0 Puppet
2020-01-
Controller-1 sysinv
2020-01-16 04:28:23.981 1131440 WARNING ceph_client ... [{u'outb': u'{"checks"
Test Activity
-------------
Developer Testing
Workaround
----------
Lock the node before reinstalling, identify all Ceph OSD disks, wipe each journal by hand on the node you want to reinstall, then reinstall.
e.g.
controller-0# dd if=/dev/zero of=/dev/sdb2 bs=1M
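The manual wipe could be wrapped in a small helper so that the target is checked before anything is overwritten. This is a sketch only: wipe_journal and DRY_RUN are hypothetical names, and /dev/sdb2 is just the example partition from the command above. The actual journal partition has to be identified on each node.

```shell
#!/bin/sh
# Hypothetical helper around the dd workaround. It prints the command by
# default; set DRY_RUN=0 to actually overwrite the partition (destructive).
DRY_RUN="${DRY_RUN:-1}"

wipe_journal() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: wipe_journal <journal-partition>" >&2
        return 2
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ dd if=/dev/zero of=$dev bs=1M"
        return 0
    fi
    # Refuse anything that is not a block device, so that a mistyped path
    # cannot fill a filesystem with zeros.
    if [ ! -b "$dev" ]; then
        echo "error: $dev is not a block device" >&2
        return 1
    fi
    # Without a count, dd writes until the partition is full and then
    # typically reports "No space left on device" with a non-zero status.
    dd if=/dev/zero of="$dev" bs=1M
}
```

Running wipe_journal /dev/sdb2 in the default dry-run mode only echoes the dd command, which gives a chance to confirm the device before repeating the call with DRY_RUN=0 on the locked node.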
Changed in starlingx:
assignee: Ovidiu Poncea (ovidiu.poncea) → Stefan Dinescu (stefandinescu)
Changed in starlingx:
assignee: Stefan Dinescu (stefandinescu) → Ovidiu Poncea (ovidiu.poncea)
stx.4.0 / medium priority - workaround exists