400.002: Controller-services group has no members; 2 controllers online

Bug #2059979 reported by Brendan McShane
This bug affects 1 person
Affects    Status   Importance  Assigned to       Milestone
StarlingX  Invalid  Low         Luiz Felipe Kina

Bug Description

After installing StarlingX AIO duplex, an alarm stating "400.002 | Service group controller-services has no active members available; expected 1 active member" appears. There are two controllers in the cluster that are currently online. I suspect that this may be caused by the `controller-services` group having the state `go-active` instead of the expected state `active`.

Thanks in advance!

Tags: stx.ha
Ghada Khalil (gkhalil)
tags: added: stx.ha
Eliud Kyale (ekyale) wrote :

Brendan, can you collect logs on the system and attach them to this bug?

See: https://docs.starlingx.io/fault-mgmt/kubernetes/troubleshooting-log-collection.html

Changed in starlingx:
assignee: nobody → Eliud Kyale (ekyale)
Brendan McShane (bman46) wrote :

Sure, I'll attach the logs. I did some experimenting and found that this occurs when ceph-rook is used instead of ceph. It appears to be caused by the `rook-mon-exit` service being stuck in the `initial` state. I should also note that the helm chart does not appear under `system application-list`. When I reinstalled without rook, the deployment completed successfully.

Brendan McShane (bman46) wrote :

Just tried this again with stx V9 and the `controller-services` group is also stuck in the `go-active` state. Similarly, the Rook helm chart was not uploaded and the `rook-mon-exit` service is in the `initial` state.
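The checks mentioned in the comments so far can be gathered into one short script. This is only a sketch: `run_if_present` is a hypothetical helper added so the script does nothing harmful on a machine without the StarlingX tools, and it assumes platform credentials are already loaded (typically via `source /etc/platform/openrc`) and that `sm-dump` is run with root privileges.

```shell
#!/bin/sh
# run_if_present CMD [ARGS...]: run CMD when it is on PATH, otherwise report
# that it was skipped (hypothetical helper, not part of StarlingX).
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# Assumption: credentials are loaded and the shell has root privileges.
run_if_present fm alarm-list            # look for alarm 400.002
run_if_present sm-dump                  # states of service groups and services
run_if_present system application-list  # check whether the Rook chart was uploaded
```

In the situation reported here, the output would be expected to show `controller-services` in `go-active` and `rook-mon-exit` in `initial`.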

Brendan McShane (bman46) wrote :

After downloading the rook-mon-exit.sh script from https://opendev.org/starlingx/rook-ceph/src/branch/r/stx.9.0/stx-rook-ceph/files, moving it to `/etc/init.d/rook-mon-exit`, and running `sudo sm-restart rook-mon-exit`, I was able to get the alarm cleared and controller-services into the active state.
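The workaround above can be sketched as a small script. It assumes the script has already been downloaded to a local file from the directory linked above. `install_init_script` is a hypothetical helper name, and marking the file executable with `chmod 755` is an assumption not stated in the comment.

```shell
#!/bin/sh
# install_init_script SRC DEST: copy SRC to DEST and mark it executable.
# Hypothetical helper; the chmod step is an assumption, not from the report.
install_init_script() {
    src="$1"
    dest="$2"
    if [ ! -f "$src" ]; then
        echo "source script not found: $src" >&2
        return 1
    fi
    cp "$src" "$dest" && chmod 755 "$dest"
}

# Usage on the controller, as root (left commented out so that sourcing this
# file changes nothing; the restart command is quoted from the comment above):
#   install_init_script ./rook-mon-exit.sh /etc/init.d/rook-mon-exit
#   sm-restart rook-mon-exit
```

Keeping the copy in a function with explicit paths makes it possible to try the step against a scratch directory before touching `/etc/init.d`.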

It appears rook-ceph is not installed by default? https://opendev.org/starlingx/rook-ceph/src/branch/r/stx.9.0/debian_iso_image.inc Any reason it appears to be disabled?

Ghada Khalil (gkhalil) wrote :

@Renata, can you please answer the questions related to ceph rook? Thanks

Changed in starlingx:
assignee: Eliud Kyale (ekyale) → Renata Merino Rodrigues Barbosa (rrodrig1)
Changed in starlingx:
assignee: Renata Merino Rodrigues Barbosa (rrodrig1) → Luiz Felipe Kina (leiskeki)
Luiz Felipe Kina (leiskeki) wrote :

Rook-ceph is not available or functional on stx 9.0, which is why it is not installed by default. It will be available and functional starting with stx 10.

Changed in starlingx:
importance: Undecided → Low
Changed in starlingx:
status: New → Invalid