microceph pool setting fails

Bug #2067247 reported by Marian Gasparovic
Affects: OpenStack Snap
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

2023.2/candidate (rev 509)
While testing this candidate we hit this issue in all our test runs.

When running `sunbeam cluster join`, the microceph charm goes into an error state:

microceph/0* active idle 0 10.246.164.174
microceph/1 error idle 1 10.246.164.175 hook failed: "peers-relation-changed"
microceph/2 error idle 2 10.246.165.13 hook failed: "ceph-relation-joined"
microceph/3 error idle 3 10.246.165.86 hook failed: "peers-relation-changed"
microceph/4 error idle 4 10.246.165.14 hook failed: "peers-relation-changed"
microceph/5 active executing 5 10.246.167.160
microceph/6 error idle 6 10.246.164.113 hook failed: "peers-relation-changed"
microceph/7 error idle 7 10.246.164.114 hook failed: "peers-relation-changed"
microceph/8 blocked idle 8 10.246.165.85 (workload) Error in charm (see logs): Command '['microceph', 'cluster', 'join', 'eyJuYW1lIjoibWljcm9jZXBoLzgiLCJzZWNy...

Units fail because of:
subprocess.CalledProcessError: Command '['sudo', 'microceph', 'pool', 'set-rf', '--size', '1', '']' returned non-zero exit status 1

charm - microceph reef/candidate rev 47
snap - microceph 18.2.0+snap450240f5dd 975 reef/stable
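The failing command ends with an empty string where a pool list should be (`['sudo', 'microceph', 'pool', 'set-rf', '--size', '1', '']`). A minimal sketch of a defensive wrapper is shown below; `set_pool_size` mirrors the call seen in the traceback, but the empty-string guard is a hypothetical illustration, not the charm's actual code:

```python
import subprocess


def set_pool_size(pools: str, size: str) -> None:
    """Set the replication factor for the given pools via microceph.

    `pools` is a comma-separated pool list. An empty string reproduces the
    reported failure, so validate before shelling out (hypothetical guard).
    """
    if not pools.strip():
        raise ValueError(
            "refusing to run 'microceph pool set-rf' with an empty pool list"
        )
    cmd = ["sudo", "microceph", "pool", "set-rf", "--size", size, pools]
    subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=180)
```

With a guard like this the unit would fail with a clear message (or the caller could defer) instead of passing `''` through to the CLI.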

One of failed test runs - https://oil-jenkins.canonical.com/artifacts/66675a8b-8d56-477f-9e4a-5731c88135c2/index.html

Tags: cdo-qa
Peter Sabaini (peter-sabaini) wrote:

Some more context from https://oil-jenkins.canonical.com/artifacts/66675a8b-8d56-477f-9e4a-5731c88135c2/generated/generated/sunbeam/juju_debug_log.txt

It looks like we might be calling `ceph config set` prematurely, before ceph.conf is fully populated:

unit-microceph-1: 12:15:42 ERROR unit.microceph/1.juju-log peers:1: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-microceph-1/charm/./src/charm.py", line 343, in <module>
    main(MicroCephCharm)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/main.py", line 544, in main
    manager.run()
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/main.py", line 520, in run
    self._emit()
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/main.py", line 509, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/main.py", line 143, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/relation_handlers.py", line 261, in on_changed
    self._rel_changed_nonldr(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/relation_handlers.py", line 254, in _rel_changed_nonldr
    self.on.node_added.emit(**event_args)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 350, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 849, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-microceph-1/charm/venv/ops/framework.py", line 939, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/relation_handlers.py", line 325, in _on_node_added
    self.callback_f(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/./src/charm.py", line 113, in configure_charm
    self.configure_ceph(event)
  File "/var/lib/juju/agents/unit-microceph-1/charm/./src/charm.py", line 339, in configure_ceph
    raise e
  File "/var/lib/juju/agents/unit-microceph-1/charm/./src/charm.py", line 327, in configure_ceph
    microceph.set_pool_size("", str(default_rf))
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/microceph.py", line 251, in set_pool_size
    _run_cmd(cmd)
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/microceph.py", line 44, in _run_cmd
    raise e
  File "/var/lib/juju/agents/unit-microceph-1/charm/src/microceph.py", line 39, in _run_cmd
    process = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=180)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sudo', 'microceph', 'pool', 'set-rf', '--size', '1', '']' returned non-ze...
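If the root cause is configuring Ceph before ceph.conf is fully populated, one option is to gate the configuration step on a readiness check and defer the hook otherwise. The sketch below is standalone and hedged: the conf path and the `mon host` heuristic are assumptions for illustration, not the charm's actual logic:

```python
from pathlib import Path

# Assumed location of the snap's rendered config; adjust for the real deployment.
CEPH_CONF = Path("/var/snap/microceph/current/conf/ceph.conf")


def ceph_conf_ready(path: Path = CEPH_CONF) -> bool:
    """Heuristic readiness check: ceph.conf exists and names a mon host.

    A charm hook could call this before configure_ceph() and defer the
    event (event.defer() in the ops framework) while it returns False.
    """
    if not path.exists():
        return False
    return "mon host" in path.read_text()
```

A hook handler would then do something like `if not ceph_conf_ready(): event.defer(); return` before touching pool settings.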


Peter Sabaini (peter-sabaini) wrote:

Ticket: CEPH-736

James Page (james-page)
Changed in snap-openstack:
status: New → Triaged
importance: Undecided → Critical
Hemanth Nakkina (hemanth-n) wrote:
Changed in snap-openstack:
status: Triaged → Fix Released