Ceph monitor process kill failed to result in host degrade or ceph health warn

Bug #1807748 reported by Maria Yousaf
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Wei Zhou
Milestone: (none listed)

Bug Description

Brief Description
-----------------
Killing the ceph monitor process did not result in a degrade condition or a ceph health warning.

Severity
--------
Major

Steps to Reproduce
------------------
1. Remove the ceph monitor and kill the ceph monitor process. This leaves two ceph monitors in the quorum.
2. No degrade condition is reported and ceph health remains okay with 2 members up in the quorum.

[2018-12-09 13:57:27,619] 389 DEBUG MainThread ssh.expect :: Output:
    cluster f6f37d71-0759-44ec-8264-e6cb1abebc29
     health HEALTH_OK
     monmap e2: 2 mons at {controller-1=192.168.205.103:6789/0,storage-0=192.168.205.108:6789/0}
            election epoch 24, quorum 0,1 controller-1,storage-0
     osdmap e181: 4 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v8390: 1728 pgs, 6 pools, 1786 MB data, 238 objects
            3707 MB used, 1762 GB / 1766 GB avail
                1728 active+clean
controller-1:~$
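
For test automation, the status output above could be checked programmatically rather than by eye. The following is a minimal Python sketch, not part of the original test suite: the function name `summarize_status` and the regular expressions are assumptions based only on the output format pasted in this report, and other ceph releases may print status differently.

```python
import re

# Status text copied from this report (trimmed to the lines that are parsed).
STATUS = """\
    cluster f6f37d71-0759-44ec-8264-e6cb1abebc29
     health HEALTH_OK
     monmap e2: 2 mons at {controller-1=192.168.205.103:6789/0,storage-0=192.168.205.108:6789/0}
            election epoch 24, quorum 0,1 controller-1,storage-0
     osdmap e181: 4 osds: 3 up, 3 in
"""


def summarize_status(text):
    """Extract health, monitor membership and OSD counts from status text.

    Hypothetical helper: assumes the plain-text layout shown in this report.
    """
    health = re.search(r"health\s+(HEALTH_\w+)", text).group(1)

    # Monitors listed in the monmap, e.g. "controller-1=192.168.205.103:6789/0".
    mons = re.search(r"monmap e\d+: (\d+) mons at \{([^}]*)\}", text)
    monmap_names = [entry.split("=")[0] for entry in mons.group(2).split(",")]

    # Monitors currently in quorum, e.g. "quorum 0,1 controller-1,storage-0".
    quorum = re.search(r"quorum\s+[\d,]+\s+(\S+)", text).group(1).split(",")

    osds = re.search(r"(\d+) osds: (\d+) up, (\d+) in", text)

    return {
        "health": health,
        "monmap_mons": monmap_names,
        "quorum_mons": quorum,
        # Monitors that are in the monmap but not in quorum.
        "missing_from_quorum": sorted(set(monmap_names) - set(quorum)),
        "osds_total": int(osds.group(1)),
        "osds_up": int(osds.group(2)),
        "osds_in": int(osds.group(3)),
    }


summary = summarize_status(STATUS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Run against the output pasted in this report, the sketch shows that the monmap lists only two monitors, both of which are in quorum, and that 3 of 4 OSDs are up while health is still HEALTH_OK.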

Need confirmation whether this is new behaviour resulting from the move to upstream ceph, in which case the test case needs to be modified accordingly, or whether this is a software issue.

Expected Behavior
------------------
The host whose monitor process was killed goes into degrade and ceph reports a health warning.

Actual Behavior
----------------
All nodes remained available and ceph reported HEALTH_OK.

Reproducibility
---------------
Seems reproducible.

System Configuration
--------------------
Storage

Branch/Pull Time/Commit
-----------------------
master as of 2018-12-07_20-18-00

Timestamp/Logs
--------------
2018-12-09 13:57:27,619

Ghada Khalil (gkhalil) wrote :

Assigning to Ovidiu to comment on whether this is a bug or expected behavior given the recent ceph changes for AIO.

Changed in starlingx:
assignee: nobody → Ovidiu Poncea (ovidiu.poncea)
Ghada Khalil (gkhalil) wrote :

As per Wei's investigation, this is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1807444

Changed in starlingx:
assignee: Ovidiu Poncea (ovidiu.poncea) → Wei Zhou (wzhou007)
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.2019.03 stx.config
Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Frank Miller (sensfan22) wrote :
Changed in starlingx:
status: Triaged → Fix Released
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05