client relations are never processed when configuration change (monitor-count, expected-osd-count) causes cluster to become ready

Bug #1732491 reported by Edward Hope-Morley on 2017-11-15
This bug affects 2 people
Affects: OpenStack ceph-mon charm
Importance: Medium
Assigned to: Trent Lloyd

Bug Description

If I deploy 1 unit of ceph-mon with monitor-count=3, it fails to bootstrap, as expected. If I then set monitor-count=1, I would expect it to bootstrap, but instead it does nothing and I have to remove and re-add the ceph client relations to get it to happen.

James Page (james-page) on 2017-11-29
Changed in charm-ceph-mon:
status: New → Triaged
James Page (james-page) on 2017-12-01
Changed in charm-ceph-mon:
milestone: 17.11 → 18.02
Ryan Beisner (1chb1n) on 2018-03-09
Changed in charm-ceph-mon:
milestone: 18.02 → 18.05
David Ames (thedac) on 2018-06-11
Changed in charm-ceph-mon:
milestone: 18.05 → 18.08
Edward Hope-Morley (hopem) wrote :

I have now found some additional issues with this logic. If I start by deploying 3 units with monitor-count=1 (which I know is wrong), all 3 units bootstrap independently, i.e. each does so as soon as it has what it considers to be sufficient hosts, i.e. 1, a.k.a. itself. The problem is that if I then set monitor-count=3 it remains wedged, and the only way to fix it is to delete 2 units and start again. The same is also true if I have N mon units with monitor-count=N and I scale out my ceph-mon application before updating monitor-count. I think the charm should be able to manage these scenarios somehow.

I suspect that the best way to manage some of these issues would be to only allow the juju leader to bootstrap, as there can be only one ;-)

James Page (james-page) on 2018-09-12
Changed in charm-ceph-mon:
milestone: 18.08 → 18.11
David Ames (thedac) on 2018-11-20
Changed in charm-ceph-mon:
milestone: 18.11 → 19.04
Trent Lloyd (lathiat) wrote :

The problem here is that both the client_relation_joined and client_relation_changed hooks (which provide ceph keys and process broker requests to create pools etc.) check ready_for_service() before processing those requests.

If the charm is not ready for service, the hooks do nothing. However, charm hooks are not re-run just because we weren't ready to process them the first time.
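The guard described above can be illustrated with a small toy model. This is only a sketch in plain Python, not the charm's code: the hook and check names are borrowed from this thread, but their bodies, the `config` dict, `mon_units` and the relation id `client:1` are all assumptions made up for illustration.

```python
# Toy model of the failure mode. Function bodies and data are hypothetical;
# only the names ready_for_service / client_relation_changed / config_changed
# come from the discussion above.
config = {"monitor-count": 3}
mon_units = ["ceph-mon/0"]
processed = []  # client relations that have actually been served


def ready_for_service():
    # Stand-in for the charm's readiness check.
    return len(mon_units) >= config["monitor-count"]


def client_relation_changed(relation_id):
    # Mirrors the guard: when not ready, the request is silently dropped.
    if not ready_for_service():
        return
    processed.append(relation_id)


def config_changed(updates):
    # Applies the new config but never revisits skipped client relations.
    config.update(updates)


client_relation_changed("client:1")   # fires once, while not ready
config_changed({"monitor-count": 1})  # the cluster is now ready
```

After these two calls the readiness check passes, yet `processed` is still empty, because nothing causes the client hook to fire again.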

The charm needs either to queue a list of such relations to process later, or to iterate over all such relations in some way when it later decides it is ready for service.
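The first option (queueing skipped relations) might look roughly like the following. This is a hedged sketch, not a proposal for the actual patch: `ToyCharm`, `pending`, `processed` and `drain_pending` are hypothetical names, and the readiness check is reduced to a unit count.

```python
class ToyCharm:
    """Hypothetical stand-in for a charm; not the ceph-mon implementation."""

    def __init__(self, monitor_count, mon_units):
        self.config = {"monitor-count": monitor_count}
        self.mon_units = mon_units  # number of mon units seen so far
        self.pending = []           # relation ids skipped while not ready
        self.processed = []         # relation ids that were served

    def ready_for_service(self):
        return self.mon_units >= self.config["monitor-count"]

    def client_relation_changed(self, relation_id):
        if not self.ready_for_service():
            # Remember the request instead of silently dropping it.
            if relation_id not in self.pending:
                self.pending.append(relation_id)
            return
        if relation_id not in self.processed:
            self.processed.append(relation_id)

    def drain_pending(self):
        # Intended to be called from any hook that could change readiness.
        if not self.ready_for_service():
            return
        while self.pending:
            self.client_relation_changed(self.pending.pop(0))

    def config_changed(self, updates):
        self.config.update(updates)
        self.drain_pending()


charm = ToyCharm(monitor_count=3, mon_units=1)
charm.client_relation_changed("client:1")   # queued, not served
charm.config_changed({"monitor-count": 1})  # now ready: queue is drained
```

Note that each hook runs as a separate process, so in a real charm the queue would have to live in some persistent store; this sketch keeps it in memory and does not attempt that.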

This seems to be a common charm design problem, and since the solution is not 100% obvious it probably needs some documentation written for charm authors describing how to handle the situation cleanly.

The ceph-mon charm does have a function that re-processes these hook requests: notify_client(). However, this function is not called from config_changed. At present it appears to be called only from osd_relation (presumably to check whether expected-osd-count has now been met), upgrade_charm and mon_relation.

As a result, in the originally described situation (changing to monitor-count=1) the client relations are not re-triggered. The same applies when expected-osd-count is updated, which is how I hit this issue: expected-osd-count had not been met, so the relations were skipped, and even after the config is updated they are not reprocessed.
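One way to picture the missing step is a config_changed that re-runs notify_client() once the readiness check passes. The sketch below is an assumption-laden toy, not the charm's real code: the function names come from this thread, but every body, the `state` dict and the relation ids are invented for illustration, and the real notify_client() may do more than this.

```python
# Toy sketch: re-process client relations when a config change makes the
# cluster ready. All data and function bodies here are hypothetical.
config = {"monitor-count": 3, "expected-osd-count": 0}
state = {"mon_units": 1, "osd_units": 0}
client_relations = ["client:1", "client:2"]  # made-up relation ids
served = set()


def ready_for_service():
    return (state["mon_units"] >= config["monitor-count"]
            and state["osd_units"] >= config["expected-osd-count"])


def client_relation_joined(relation_id):
    if not ready_for_service():
        return
    served.add(relation_id)


def notify_client():
    # Iterate over every client relation and re-run its handler.
    for relation_id in client_relations:
        client_relation_joined(relation_id)


def config_changed(updates):
    config.update(updates)
    # The step this bug says is missing: revisit clients once ready.
    if ready_for_service():
        notify_client()


for rid in client_relations:
    client_relation_joined(rid)       # all skipped: not ready yet
config_changed({"monitor-count": 1})  # now ready, clients are re-processed
```

Because `served` is a set, re-running the handler for a relation that was already served is harmless here; a real handler would need to be similarly safe to repeat.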

Trent Lloyd (lathiat) wrote :

I also filed a bug that if expected-osd-count is not met (or the ready_for_service check otherwise fails), there is no juju status information to convey that to the admin:
https://bugs.launchpad.net/charm-ceph-mon/+bug/1807652

Trent Lloyd (lathiat) on 2018-12-10
summary: - failed bootstrap due to incorrect monitor-count is unresolvable
+ client relations are never processed when configuration change (monitor-count, expected-osd-count) causes cluster to become ready
Trent Lloyd (lathiat) wrote :
Changed in charm-ceph-mon:
assignee: nobody → Trent Lloyd (lathiat)
status: Triaged → In Progress
David Ames (thedac) on 2019-04-17
Changed in charm-ceph-mon:
milestone: 19.04 → 19.07
David Ames (thedac) on 2019-08-12
Changed in charm-ceph-mon:
milestone: 19.07 → 19.10