AIO - Simplex alarm "Loss of replication"

Bug #1846287 reported by Cristopher Lemus
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Stefan Dinescu
Milestone: (none listed)

Bug Description

Brief Description
-----------------
An AIO simplex configuration reports the alarm "Loss of replication in replication group group-0: peer host down". A simplex system has no peer, so the alarm cannot be valid.

Severity
--------
Minor: System is stable.

Steps to Reproduce
------------------
Provision a Simplex system following the documentation. The system logs the alarm, and the alarm does not clear even when all other indicators are OK.

Expected Behavior
------------------
A Simplex system has no peer, so it should not raise this alarm.

Actual Behavior
----------------
The Simplex system raises the alarm and keeps it active.

Reproducibility
---------------
100%

System Configuration
--------------------
Simplex system.

Branch/Pull Time/Commit
-----------------------
BUILD_ID="20190930T230000Z"

Last Pass
---------
The issue was discovered last week and discussed at today's (Oct 1st, 2019) test meeting.

Timestamp/Logs
--------------
Full collect attached.
http://paste.openstack.org/show/780649/

Test Activity
-------------
Sanity

Cristopher Lemus (cjlemusc) wrote :
Ghada Khalil (gkhalil) wrote :

Marking as stx.3.0 gating. Agreed: a simplex system should not raise an alarm that refers to a peer node.

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Stefan Dinescu (stefandinescu)
tags: added: stx.3.0 stx.storage
Stefan Dinescu (stefandinescu) wrote :

This bug was introduced by https://review.opendev.org/#/c/681246/

Before the above change, the code hid the issue by clearing the alarm as soon as it was raised. Still, the code shouldn't raise the alarm in the first place, and I am working on a solution.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (master)

Fix proposed to branch: master
Review: https://review.opendev.org/687282

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to utilities (master)

Reviewed: https://review.opendev.org/687282
Committed: https://git.openstack.org/cgit/starlingx/utilities/commit/?id=d838e22ac64c40eade0e2fd7805621150c40a782
Submitter: Zuul
Branch: master

commit d838e22ac64c40eade0e2fd7805621150c40a782
Author: Stefan Dinescu <email address hidden>
Date: Tue Oct 8 15:18:29 2019 +0300

    Don't raise missing ceph peer alarm on simplex

    A missing ceph peer alarm was raised on an AIO-SX node.
    Since an AIO-SX node only has one node, such an alarm makes
    no sense on this kind of setup and it should not be raised.

    Note: also added some extra empty lines for improved code
          readability.

    Change-Id: Ie6debccad4667e840859eba3d5e369e8932e782a
    Closes-bug: 1846287
    Signed-off-by: Stefan Dinescu <email address hidden>
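
The guard described in the commit message above can be illustrated with a minimal sketch. This is not the merged change from review 687282: the function name, the mode constants, and the `peer_reachable` flag are all hypothetical and chosen only to show the idea that a node without a peer should never reach the "raise alarm" path.

```python
# Illustrative sketch only; NOT the code merged for this bug.
# Every name below is hypothetical and used for explanation.

SYSTEM_MODE_SIMPLEX = "simplex"  # assumed label for an AIO-SX system
SYSTEM_MODE_DUPLEX = "duplex"    # assumed label for a two-node system


def should_raise_peer_alarm(system_mode, peer_reachable):
    """Decide whether a 'loss of replication' alarm applies.

    On a simplex system there is no peer host, so a missing peer
    is the normal state and must not be reported as a fault.
    """
    if system_mode == SYSTEM_MODE_SIMPLEX:
        return False
    # On systems that do have a peer, alarm only when it is unreachable.
    return not peer_reachable
```

Under these assumptions, a simplex node never raises the alarm regardless of the peer check result, while a duplex node raises it only when its peer is down. Checking the mode before raising, rather than clearing the alarm afterwards, avoids the situation described earlier in this report where the alarm was raised and then immediately cleared.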

Changed in starlingx:
status: In Progress → Fix Released