Comment 0 for bug 1952282

Bas de Bruijne (basdbruijne) wrote :

The run fails on a juju wait timeout because the ceph-dashboard units never settle: one stays blocked and two fail the "dashboard-relation-changed" hook:

------------------------------------------------
ceph-mon/0* waiting executing 0/lxd/1 10.246.65.28 Monitor bootstrapped but waiting for number of OSDs to reach expected-osd-count (6)
  ceph-dashboard/0* blocked idle 10.246.65.28 Dashboard is not enabled
  logrotated/15 active idle 10.246.65.28 Unit is ready.
ceph-mon/1 active executing 2/lxd/1 10.246.65.55 Unit is ready and clustered
  ceph-dashboard/1 error idle 10.246.65.55 hook failed: "dashboard-relation-changed"
  logrotated/20 active idle 10.246.65.55 Unit is ready.
ceph-mon/2 active executing 4/lxd/1 10.246.65.52 Unit is ready and clustered
  ceph-dashboard/2 error idle 10.246.65.52 hook failed: "dashboard-relation-changed"
  logrotated/22 active idle 10.246.65.52 Unit is ready.
------------------------------------------------

Log from ceph-dashboard/2:
------------------------------------------------
2021-11-24 17:26:00 INFO unit.ceph-dashboard/2.juju-log server.go:327 dashboard:74: Requesting a CA certificate. Common name: juju-af480f-4-lxd-1.prodymcprodface.solutionsqa, SANS: ['10.246.65.20', 'juju-af480f-4-lxd-1']
2021-11-24 17:26:01 ERROR unit.ceph-dashboard/2.juju-log server.go:327 dashboard:74: Command failed: b"Error ENOTSUP: Module 'dashboard' is not enabled (required by command 'dashboard debug'): use `ceph mgr module enable dashboard` to enable it\n"
Traceback (most recent call last):
  File "./src/charm.py", line 378, in _run_cmd
    output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ceph', 'dashboard', 'debug', 'disable']' returned non-zero exit status 95.
------------------------------------------------

Ten seconds later, the hook fails on the same unit:
------------------------------------------------
2021-11-24 17:26:11 ERROR unit.ceph-dashboard/2.juju-log server.go:327 dashboard:74: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 597, in <module>
    main(CephDashboardCharm)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/main.py", line 406, in main
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/main.py", line 140, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 278, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 722, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 767, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/src/interface_dashboard.py", line 50, in on_changed
    self.on.mon_ready.emit()
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 278, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 722, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/ops/framework.py", line 767, in _reemit
    custom_handler(event)
  File "./src/charm.py", line 427, in _configure_dashboard
    self._configure_tls()
  File "./src/charm.py", line 539, in _configure_tls
    ceph_utils.dashboard_set_ssl_certificate(
  File "/var/lib/juju/agents/unit-ceph-dashboard-2/charm/venv/charms_ceph/utils.py", line 3527, in _dashboard_set_ssl_artifact
    subprocess.check_call(cmd)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ceph', 'dashboard', 'set-ssl-certificate', 'juju-af480f-4-lxd-1', '-i', PosixPath('/etc/ceph/ceph-dashboard.crt')]' returned non-zero exit status 95.
2021-11-24 17:26:11 ERROR juju.worker.uniter.operation runhook.go:146 hook "dashboard-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1
------------------------------------------------
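
Both tracebacks end in exit status 95, and the first one carries the "Error ENOTSUP: Module 'dashboard' is not enabled" message, so the charm appears to be running dashboard commands before the mgr module is enabled. As a rough sketch of how such a failure could be detected and tolerated instead of crashing the hook (the names `module_not_enabled`, `run_dashboard_cmd` and the `runner` parameter are hypothetical and not taken from the charm; treating 95 as ENOTSUP assumes Linux):

```python
import subprocess

# Exit status seen in the logs above; 95 is ENOTSUP/EOPNOTSUPP on Linux.
# Assumption: the ceph CLI reports "module not enabled" with this status.
ENOTSUP_EXIT = 95


def module_not_enabled(exc: subprocess.CalledProcessError) -> bool:
    """Return True if the failure looks like 'mgr module is not enabled'."""
    output = exc.output or b""
    if isinstance(output, str):
        output = output.encode()
    return exc.returncode == ENOTSUP_EXIT and b"is not enabled" in output


def run_dashboard_cmd(cmd, runner=subprocess.check_output):
    """Run cmd and return its output.

    Returns None when the command failed only because the dashboard
    module is not enabled yet, so the caller can retry later.
    Any other failure is re-raised unchanged.
    """
    try:
        return runner(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as exc:
        if module_not_enabled(exc):
            return None
        raise
```

A caller could treat a None result as "not ready yet" and defer the event rather than let the hook error out. Whether deferring is the right fix, or whether the module should be enabled earlier, is not something these logs settle.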

Test runs where this happened (3 today):
https://solutions.qa.canonical.com/testruns/testRun/6a264856-6a03-4995-b531-fa5f936ba7ad
https://solutions.qa.canonical.com/testruns/testRun/eaf26a04-7428-41e4-8fb7-93941b0112eb
https://solutions.qa.canonical.com/testruns/testRun/abf5ac6c-3b5a-405b-a565-719123d8703d

Artifacts, in the same order:
https://oil-jenkins.canonical.com/artifacts/6a264856-6a03-4995-b531-fa5f936ba7ad/index.html
https://oil-jenkins.canonical.com/artifacts/eaf26a04-7428-41e4-8fb7-93941b0112eb/index.html
https://oil-jenkins.canonical.com/artifacts/abf5ac6c-3b5a-405b-a565-719123d8703d/index.html

We also had a run with the same configuration where this did not happen and ceph-dashboard deployed without errors:
https://solutions.qa.canonical.com/testruns/testRun/276098a9-923d-4d92-86e0-3d177b13e6b9
with artifacts: https://oil-jenkins.canonical.com/artifacts/276098a9-923d-4d92-86e0-3d177b13e6b9/index.html