Comment 4 for bug 1732491

Trent Lloyd (lathiat) wrote : Re: failed bootstrap due to incorrect monitor-count is unresolvable

The problem here is that both the client_relation_joined and client_relation_changed hooks (which provide ceph keys and process broker requests to create pools, etc.) check "if ready_for_service()" before processing those requests.

If the charm is not ready for service, the hooks do nothing. However, charm hooks are not re-run just because the charm wasn't ready to process them the first time.

The charm needs either to queue such relations for later processing, or to iterate over all such relations when it later decides it is ready for service.
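As a rough illustration of the first option (queueing), here is a toy model in plain Python. It does not use charmhelpers or any real charm API: the class DeferringCharm, its "ready" flag (standing in for ready_for_service()) and mark_ready() are all hypothetical names invented for this sketch. A real charm would also have to persist the queue between hook invocations, since each hook runs as a separate process.

```python
from collections import deque


class DeferringCharm:
    """Toy model of a charm that defers client requests until it is ready."""

    def __init__(self):
        self.ready = False      # stand-in for ready_for_service()
        self.pending = deque()  # relation ids skipped while not ready
        self.processed = []     # relation ids that were actually handled

    def client_relation_changed(self, relation_id):
        if not self.ready:
            # Remember the relation instead of silently dropping the request.
            if relation_id not in self.pending:
                self.pending.append(relation_id)
            return
        self._process(relation_id)

    def mark_ready(self):
        # Called once the charm decides it is ready: drain the queue in order.
        self.ready = True
        while self.pending:
            self._process(self.pending.popleft())

    def _process(self, relation_id):
        # Placeholder for handing out keys / handling broker requests.
        self.processed.append(relation_id)


charm = DeferringCharm()
charm.client_relation_changed("client:1")  # skipped, but queued
charm.client_relation_changed("client:2")  # skipped, but queued
charm.mark_ready()                         # both are handled now
print(charm.processed)
```

The point of the sketch is only that a skipped request leaves a trace somewhere, so that becoming ready later can act on it.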

This seems to be a common charm design problem, and since the right approach is not 100% obvious, it probably needs some documentation for charm authors describing how to handle this situation cleanly.

The ceph-mon charm does indeed have a function to re-process the hook requests: notify_client(). However, this function is not re-run in the case of config_changed. Presently it appears to be called only from osd_relation (presumably to check if expected-monitor-count is now exceeded), upgrade_charm and mon_relation.

As a result, in the originally described situation (changing monitor-count=1) the client relations are not re-triggered. The same is true if expected-osd-count is updated, which is how I hit this same issue: expected-osd-count was not reached, so the relations were skipped, and even if you then update the config they are not reprocessed.
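To make the missing step concrete, here is a small simulation of the second option (iterating over all client relations), triggered from config_changed. This is not the ceph-mon charm's code: the functions below reuse the names ready_for_service, notify_client and config_changed only as simplified stand-ins, their signatures are invented, and the readiness check (monitors and OSDs compared against the two config values) is an assumption made for the example.

```python
def ready_for_service(config, state):
    # Assumed readiness rule for this sketch: enough monitors and enough OSDs.
    return (state["monitors"] >= config["monitor-count"]
            and state["osds"] >= config["expected-osd-count"])


def notify_client(relations, handled):
    # Walk every client relation and process the ones not handled yet.
    for relation_id in relations:
        if relation_id not in handled:
            handled.append(relation_id)


def config_changed(config, state, relations, handled):
    # A config change can flip readiness, so re-check and re-process here.
    if ready_for_service(config, state):
        notify_client(relations, handled)


config = {"monitor-count": 3, "expected-osd-count": 3}
state = {"monitors": 1, "osds": 3}
relations = ["client:1", "client:2"]
handled = []

config_changed(config, state, relations, handled)
before = list(handled)       # still empty: only 1 of 3 monitors present

config["monitor-count"] = 1  # the operator lowers the requirement
config_changed(config, state, relations, handled)
print(before, handled)
```

In this model, lowering monitor-count is enough to get the previously skipped client relations processed, which is the behaviour that is missing when config_changed does not trigger the re-processing.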