ceph-mon rbd mirror relation intermittently fails to get quota

Bug #2021967 reported by Peter Sabaini
This bug affects 2 people
Affects: Ceph Monitor Charm
Status: In Progress
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The traceback below shows up in CI when testing the RBD mirror charm. It is intermittent rather than systematic: I've hit it on roughly half of the test runs.

I think this might be a race between setting up auth and creating the RBD mirror relation.

2023-05-26 16:14:24 ERROR unit.ceph-mon/1.juju-log server.go:316 mon:0: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 308, in <module>
    main(CephMonCharm)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/ops/main.py", line 441, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 132, in on_mon_relation
    if hooks.mon_relation():
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 520, in mon_relation
    notify_relations()
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 597, in notify_relations
    notify_osds()
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 618, in notify_osds
    osd_relation(relid=relid, unit=unit)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 868, in osd_relation
    notify_rbd_mirrors()
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 630, in notify_rbd_mirrors
    rbd_mirror_relation(relid=relid, unit=unit, recurse=False)
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/src/ceph_hooks.py", line 1014, in rbd_mirror_relation
    'pools': json.dumps(ceph.list_pools_detail(), sort_keys=True),
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/charms_ceph/utils.py", line 3166, in list_pools_detail
    'quota': get_pool_quota(pool),
  File "/var/lib/juju/agents/unit-ceph-mon-1/charm/venv/charms_ceph/utils.py", line 3103, in get_pool_quota
    output = subprocess.check_output(
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ceph', '--id', 'admin', 'osd', 'pool', 'get-quota', '2023-05-26T16:14:18.657+0000 7fe7f65fe700 -1 monclient: get_auth_request but no auth handler is set up']' returned non-zero exit status 2.
2023-05-26 16:14:24 ERROR juju.worker.uniter.operation runhook.go:153 hook "mon-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1

https://openstack-ci-reports.ubuntu.com/artifacts/e92/884510/2/check/focal-yoga/e9215db/

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-mon (master)
Changed in charm-ceph-mon:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-mon (master)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-mon/+/884986
Committed: https://opendev.org/openstack/charm-ceph-mon/commit/af2323c4578ff7f4b352acfe350279eddd1ce0aa
Submitter: "Zuul (22348)"
Branch: master

commit af2323c4578ff7f4b352acfe350279eddd1ce0aa
Author: Peter Sabaini <email address hidden>
Date: Wed May 31 13:07:11 2023 +0200

    rbd mirror relation: be persistent in getting pool info

    Auth for getting pool details can fail initially if we set up a rbd
    mirror relation at cloud bootstrap. Add some retry to give it another
    chance

    Change-Id: I2f5ac561120b1abe52ea0621bb472bc78495fa97
    Partial-Bug: #2021967
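
For illustration only, here is a minimal sketch of the kind of retry the commit message describes. It is not the merged change: the helper name `check_output_with_retry`, the attempt count, and the delay are assumptions made for this example, and the `runner` and `sleep` parameters exist only so the sketch can be exercised without a Ceph cluster.

```python
import subprocess
import time


def check_output_with_retry(cmd, attempts=3, delay=1.0,
                            runner=subprocess.check_output,
                            sleep=time.sleep):
    """Run cmd and return its output, retrying on a non-zero exit.

    Hypothetical helper: attempts and delay are illustrative values,
    not the ones used in the charm.
    """
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return runner(cmd)
        except subprocess.CalledProcessError as exc:
            # Auth may not be ready yet; remember the error and try again.
            last_exc = exc
            if attempt < attempts:
                sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_exc
```

Under these assumptions, the failing call from the traceback could be wrapped as `check_output_with_retry(['ceph', '--id', 'admin', 'osd', 'pool', 'get-quota', pool_name])`, where `pool_name` is a placeholder for the pool being queried. A bounded retry still lets the hook fail if auth never comes up, so a persistent problem is not hidden.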

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-ceph-mon (stable/quincy.2)

Fix proposed to branch: stable/quincy.2
Review: https://review.opendev.org/c/openstack/charm-ceph-mon/+/885918

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-mon (stable/quincy.2)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-mon/+/885918
Committed: https://opendev.org/openstack/charm-ceph-mon/commit/1b6701a440ceb76f41fd9856c76d3c76886dc399
Submitter: "Zuul (22348)"
Branch: stable/quincy.2

commit 1b6701a440ceb76f41fd9856c76d3c76886dc399
Author: Peter Sabaini <email address hidden>
Date: Wed May 31 13:07:11 2023 +0200

    rbd mirror relation: be persistent in getting pool info

    Auth for getting pool details can fail initially if we set up a rbd
    mirror relation at cloud bootstrap. Add some retry to give it another
    chance

    Change-Id: I2f5ac561120b1abe52ea0621bb472bc78495fa97
    Partial-Bug: #2021967
    (cherry picked from commit af2323c4578ff7f4b352acfe350279eddd1ce0aa)
