hook error when units present on rbd-mirror relation before bootstrap is complete

Bug #1819852 reported by Frode Nordahl
Affects             Status        Importance  Assigned to    Milestone
Ceph Monitor Charm  Fix Released  High        Frode Nordahl

Bug Description

2019-03-13 08:49:05 DEBUG install Created symlink /etc/systemd/system/ceph-create-keys.service → /dev/null.
2019-03-13 08:49:09 INFO juju-log still waiting for leader to setup keys
2019-03-13 08:49:13 DEBUG juju-log Hardening function 'config_changed'
2019-03-13 08:49:14 DEBUG juju-log No hardening applied to 'config_changed'
2019-03-13 08:49:14 INFO juju-log Ceph is not bootstrapped, skipping upgrade checks.
2019-03-13 08:49:14 INFO juju-log Monitor hosts are ['10.219.3.224:6789']
2019-03-13 08:49:14 DEBUG juju-log Updating sysctl_file: /etc/sysctl.d/50-ceph-charm.conf values: {'kernel.pid_max': 2097152, 'vm.max_map_count': 524288, 'kernel.threads-max': 2097152}
2019-03-13 08:49:14 DEBUG config-changed sysctl: permission denied on key 'kernel.pid_max'
2019-03-13 08:49:14 DEBUG config-changed sysctl: permission denied on key 'vm.max_map_count'
2019-03-13 08:49:14 DEBUG config-changed sysctl: permission denied on key 'kernel.threads-max'
2019-03-13 08:49:15 INFO juju-log still waiting for leader to setup keys
2019-03-13 08:49:28 INFO juju-log Making dir /var/lib/charm/ceph-mon-b ceph:ceph 555
2019-03-13 08:49:31 DEBUG juju-log Writing file /var/lib/charm/ceph-mon-b/ceph.conf root:root 644
2019-03-13 08:49:31 DEBUG leader-settings-changed update-alternatives: using /var/lib/charm/ceph-mon-b/ceph.conf to provide /etc/ceph/ceph.conf (ceph.conf) in auto mode
2019-03-13 08:49:31 INFO juju-log Not enough mons (1), punting.
2019-03-13 08:49:36 DEBUG juju-log rbd-mirror:20: mon cluster is not in quorum
2019-03-13 08:49:42 DEBUG juju-log rbd-mirror:20: mon cluster is not in quorum
2019-03-13 08:49:49 DEBUG juju-log rbd-mirror:21: mon cluster is not in quorum
2019-03-13 08:49:58 DEBUG juju-log rbd-mirror:20: mon cluster is not in quorum
2019-03-13 08:50:08 DEBUG juju-log rbd-mirror:21: mon cluster is not in quorum
2019-03-13 08:50:17 DEBUG juju-log rbd-mirror:21: mon cluster is not in quorum
2019-03-13 08:50:35 INFO juju-log mon:1: Making dir /var/lib/charm/ceph-mon-b ceph:ceph 555
2019-03-13 08:50:39 DEBUG mon-relation-changed unable to get monitor info from DNS SRV with service name: ceph-mon
2019-03-13 08:50:39 DEBUG mon-relation-changed no monitors specified to connect to.
2019-03-13 08:50:39 DEBUG mon-relation-changed 2019-03-13 08:50:39.323 7f358c7cb700 -1 failed for service _ceph-mon._tcp
2019-03-13 08:50:39 DEBUG mon-relation-changed 2019-03-13 08:50:39.323 7f358c7cb700 -1 monclient: get_monmap_and_config cannot identify monitors to contact
2019-03-13 08:50:39 DEBUG mon-relation-changed [errno 2] error connecting to the cluster
2019-03-13 08:50:39 DEBUG mon-relation-changed Traceback (most recent call last):
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/mon-relation-changed", line 947, in <module>
2019-03-13 08:50:39 DEBUG mon-relation-changed hooks.execute(sys.argv)
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/charmhelpers/core/hookenv.py", line 909, in execute
2019-03-13 08:50:39 DEBUG mon-relation-changed self._hooks[hook_name]()
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/mon-relation-changed", line 408, in mon_relation
2019-03-13 08:50:39 DEBUG mon-relation-changed emit_cephconf()
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/mon-relation-changed", line 211, in emit_cephconf
2019-03-13 08:50:39 DEBUG mon-relation-changed render('ceph.conf', charm_ceph_conf, get_ceph_context(), perms=0o644)
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/mon-relation-changed", line 190, in get_ceph_context
2019-03-13 08:50:39 DEBUG mon-relation-changed rbd_features = get_rbd_features()
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/utils.py", line 219, in get_rbd_features
2019-03-13 08:50:39 DEBUG mon-relation-changed return add_rbd_mirror_features(get_default_rbd_features())
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/var/lib/juju/agents/unit-ceph-mon-b-0/charm/hooks/utils.py", line 190, in get_default_rbd_features
2019-03-13 08:50:39 DEBUG mon-relation-changed universal_newlines=True)
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/usr/lib/python3.6/subprocess.py", line 336, in check_output
2019-03-13 08:50:39 DEBUG mon-relation-changed **kwargs).stdout
2019-03-13 08:50:39 DEBUG mon-relation-changed File "/usr/lib/python3.6/subprocess.py", line 418, in run
2019-03-13 08:50:39 DEBUG mon-relation-changed output=stdout, stderr=stderr)
2019-03-13 08:50:39 DEBUG mon-relation-changed subprocess.CalledProcessError: Command '['ceph', '-c', '/dev/null', '--show-config']' returned non-zero exit status 1.
2019-03-13 08:50:39 ERROR juju.worker.uniter.operation runhook.go:132 hook "mon-relation-changed" failed: exit status 1

Frode Nordahl (fnordahl)
Changed in charm-ceph-mon:
assignee: nobody → Frode Nordahl (fnordahl)
status: New → Triaged
importance: Undecided → High
status: Triaged → In Progress
Frode Nordahl (fnordahl) wrote :

It appears the ``ceph`` command line tool requires a connection to a cluster even when it will not use it.

Alternatives:
1) Install python3-rados and get the default like this:
    import rados

    r = rados.Rados(conffile='/dev/null')
    r.conf_get('rbd_default_features') # = '61'

2) Hard code '61' in the charm, and add a unit test that validates the value against the latest version of python3-rados so we keep up to date.

Frode Nordahl (fnordahl) wrote :

3) Guard call to get_rbd_features() with if ceph.is_bootstrapped()

Chris MacNaughton (chris.macnaughton) wrote :

We just ripped out a bunch of code that uses librados in favor of using the command line :-/

Frode Nordahl (fnordahl) wrote :

Alternative 3 does not appear to be robust enough.

I guess we could catch the exception in get_rbd_features() and return None. There will be enough hook executions after the bootstrap is complete to have the correct value propagated.
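
A minimal sketch of that idea, not the code that was merged. It assumes ``--show-config`` prints ``name = value`` lines, and the ``run`` parameter is a hypothetical hook added here only so the function can be exercised without a cluster:

```python
import subprocess


def get_default_rbd_features(run=subprocess.check_output):
    """Return the default RBD features as an int, or None if unavailable.

    ``run`` is a hypothetical injection point (not part of the charm);
    it defaults to subprocess.check_output.
    """
    try:
        output = run(['ceph', '-c', '/dev/null', '--show-config'],
                     universal_newlines=True)
    except (subprocess.CalledProcessError, OSError):
        # Cluster not reachable yet (or tool missing): let a later hook
        # execution pick up the real value.
        return None
    # Assumption: the output consists of "name = value" lines.
    for line in output.splitlines():
        name, sep, value = line.partition('=')
        if sep and name.strip() == 'rbd_default_features':
            return int(value.strip())
    return None
```

Callers would then need to tolerate ``None``, for example by leaving the feature setting out of the rendered ceph.conf until a later hook supplies the value.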

James Page (james-page) wrote :

ceph-conf -D

Frode Nordahl (fnordahl) wrote :

Parsing the output from ``ceph-conf -c /dev/null -D`` will do the trick.
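
Roughly, and only as a sketch: the code below assumes the dump is plain ``name = value`` lines, which should be checked against the installed Ceph release. ``parse_config_dump`` and the ``run`` parameter are hypothetical names used for illustration, not the merged change:

```python
import subprocess


def parse_config_dump(text):
    """Parse "name = value" lines into a dict (assumed output format)."""
    config = {}
    for line in text.splitlines():
        name, sep, value = line.partition('=')
        if sep:
            config[name.strip()] = value.strip()
    return config


def get_default_rbd_features(run=subprocess.check_output):
    """Read the compiled-in default without contacting a cluster.

    ``run`` is a hypothetical injection point so the function can be
    tested without ceph-conf installed.
    """
    output = run(['ceph-conf', '-c', '/dev/null', '-D'],
                 universal_newlines=True)
    return int(parse_config_dump(output)['rbd_default_features'])
```

Because ``ceph-conf`` only reads configuration, this should work before the monitors reach quorum, which is the situation in the traceback above.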

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-mon (master)

Fix proposed to branch: master
Review: https://review.openstack.org/643000

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-mon (master)

Reviewed: https://review.openstack.org/643000
Committed: https://git.openstack.org/cgit/openstack/charm-ceph-mon/commit/?id=f3acd81a314d4c2b491a44dd3df37661910e6c17
Submitter: Zuul
Branch: master

commit f3acd81a314d4c2b491a44dd3df37661910e6c17
Author: Frode Nordahl <email address hidden>
Date: Wed Mar 13 11:26:19 2019 +0100

    Use ``ceph-conf`` to retrieve default values

    The ``ceph`` command expects connection to a running cluster even
    if it does not use it.

    Change-Id: Ied3edf63706e2d48d2ea09056bc6d6508e9e3e0f
    Closes-Bug: #1819852

Changed in charm-ceph-mon:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-ceph-mon:
milestone: none → 19.04
David Ames (thedac)
Changed in charm-ceph-mon:
status: Fix Committed → Fix Released