service will not start when using ceph pacific

Bug #1931745 reported by Jeff Hillman
This bug affects 6 people
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Ceph Monitor Charm | Invalid | Undecided | Unassigned | |
| Prometheus Ceph Exporter Charm | Won't Fix | Medium | Unassigned | |

Bug Description

Ceph Pacific 16.2.0 (focal-wallaby)

When deploying prometheus-ceph-exporter, the pce service will not start. From syslog:

```
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: Stopped Service for snap application prometheus-ceph-exporter.ceph-exporter.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: Started Service for snap application prometheus-ceph-exporter.ceph-exporter.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 prometheus-ceph-exporter.ceph-exporter[28111]: * Running /snap/prometheus-ceph-exporter/20/bin/ceph_exporter with args: -ceph.user prometheus-ceph-exporter
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 prometheus-ceph-exporter.ceph-exporter[28135]: 2021/06/11 17:36:18 cannot connect to ceph cluster: rados: Operation not supported
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Main process exited, code=exited, status=1/FAILURE
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Failed with result 'exit-code'.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Scheduled restart job, restart counter is at 2.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: Stopped Service for snap application prometheus-ceph-exporter.ceph-exporter.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: Started Service for snap application prometheus-ceph-exporter.ceph-exporter.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 prometheus-ceph-exporter.ceph-exporter[28151]: * Running /snap/prometheus-ceph-exporter/20/bin/ceph_exporter with args: -ceph.user prometheus-ceph-exporter
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 prometheus-ceph-exporter.ceph-exporter[28175]: 2021/06/11 17:36:18 cannot connect to ceph cluster: rados: Operation not supported
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Main process exited, code=exited, status=1/FAILURE
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Failed with result 'exit-code'.
Jun 11 17:36:18 juju-b11cc4-35-lxd-3 systemd[1]: snap.prometheus-ceph-exporter.ceph-exporter.service: Scheduled restart job, restart counter is at 3.
```
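
The restart loop in the excerpt above can be summarised with a short script. This is a minimal Python sketch, not part of the charm or the exporter; the function name `summarize_exporter_log` and both regular expressions are assumptions written against the quoted log lines only.

```python
import re

# Patterns written against the syslog excerpt above; they are assumptions,
# not a format documented by systemd or by the exporter.
ERROR_RE = re.compile(r"ceph-exporter\[\d+\]: \S+ \S+ (cannot .+)$")
RESTART_RE = re.compile(r"restart counter is at (\d+)\.")


def summarize_exporter_log(lines):
    """Return (distinct exporter errors, highest restart counter seen)."""
    errors = []
    max_restarts = 0
    for line in lines:
        match = ERROR_RE.search(line)
        if match and match.group(1) not in errors:
            errors.append(match.group(1))
        match = RESTART_RE.search(line)
        if match:
            max_restarts = max(max_restarts, int(match.group(1)))
    return errors, max_restarts
```

Fed the lines quoted above, it would report a single distinct error (`cannot connect to ceph cluster: rados: Operation not supported`) repeated across restarts, which points at the connection to the cluster rather than at systemd.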

From the charm unit logs:

```
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 Traceback (most recent call last):
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/charm/hooks/ceph-relation-joined", line 22, in <module>
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 main()
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 bus.dispatch(restricted=restricted_mode)
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 _invoke(other_handlers)
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 handler.invoke()
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 self._action(*args)
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 File "/var/lib/juju/agents/unit-prometheus-ceph-exporter-0/charm/reactive/prometheus_ceph_exporter.py", line 144, in configure_exporter
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 raise ServiceError("Service didn't start: {}".format(SVC_NAME))
2021-06-11 20:19:28 WARNING unit.prometheus-ceph-exporter/0.ceph-relation-joined logger.go:60 reactive.prometheus_ceph_exporter.ServiceError: Service didn't start: snap.prometheus-ceph-exporter.ceph-exporter
2021-06-11 20:19:28 ERROR juju.worker.uniter.operation runhook.go:139 hook "ceph-relation-joined" (via explicit, bespoke hook script) failed: exit status 1
2021-06-11 20:19:28 INFO juju.worker.uniter resolver.go:144 awaiting error resolution for "relation-joined" hook
2021-06-11 20:21:25 INFO juju.worker.uniter resolver.go:144 awaiting error resolution for "relation-joined" hook
2021-06-11 20:24:28 INFO juju.worker.uniter resolver.go:144 awaiting error resolution for "relation-joined" hook
```

There is no radosgw in this environment, FYI.

When manually starting the service with `sudo /snap/prometheus-ceph-exporter/20/bin/ceph_exporter -ceph.user prometheus-ceph-exporter`, this message appears:

```
/snap/prometheus-ceph-exporter/20/bin/ceph_exporter: error while loading shared libraries: librados.so.2: cannot open shared object file: No such file or directory
```

After installing the librados2 package and re-running the ceph-exporter command, this message is returned:

```
2021/06/11 20:46:06 cannot read ceph config file: rados: No such file or directory
```

Tags: cpe-onsite


Jeff Hillman (jhillman) wrote :

subscribed field-high

Xav Paice (xavpaice) wrote :

Note: the upstream project does not yet support Octopus - see https://github.com/digitalocean/ceph_exporter/issues/182.

The Ceph project itself now has Prometheus metrics: https://docs.ceph.com/en/latest/mgr/prometheus/
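
For anyone checking what the built-in module exposes, here is a minimal Python sketch that parses Prometheus text-format output into a dictionary. It is not taken from Ceph or from any charm; the function name `parse_metrics` is hypothetical, and the parser assumes sample lines carry no trailing timestamp. How the text is fetched (the module's HTTP endpoint and port) is left out; see the linked documentation for your release.

```python
def parse_metrics(text):
    """Parse Prometheus text-format lines into {series: float value}.

    Simplification (assumption): sample lines have no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last whitespace-separated field; everything
        # before it is the metric name plus any label set.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples
```

Splitting from the right keeps label values that contain spaces intact, since only the final field is treated as the value.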

TODO:
* Update the Ceph charms to enable them for an exporter (design required), and have them export dashboards
* Update the prometheus-ceph-exporter charm to prevent it from attempting to run if Ceph is Octopus+
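
The second TODO item amounts to a version gate. A minimal Python sketch of that check follows; the names `ceph_major` and `exporter_should_run` are hypothetical, and how the charm would learn the cluster's Ceph version is not covered in this report, so the version string is simply passed in.

```python
# Ceph major version numbers and their release names.
CEPH_RELEASES = {14: "nautilus", 15: "octopus", 16: "pacific"}

# Threshold taken from the TODO above (an assumption about charm policy):
# Octopus (15) and later are treated as unsupported by the old exporter.
FIRST_UNSUPPORTED_MAJOR = 15


def ceph_major(version):
    """Extract the major number from a version string such as '16.2.0'."""
    return int(version.split(".", 1)[0])


def exporter_should_run(version):
    """Return True only for Ceph releases older than Octopus."""
    return ceph_major(version) < FIRST_UNSUPPORTED_MAJOR
```

With the version from this report, `exporter_should_run("16.2.0")` is false, so a charm using a gate like this could set a blocked status instead of failing the hook.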

Xav Paice (xavpaice) wrote :

Added the ceph-mon charm, as the requirement for Prometheus metrics export is not something we can achieve with prometheus-ceph-exporter for newer Ceph versions.

Nobuto Murata (nobuto) wrote :

> * Update the Ceph charms to enable them for an exporter (design required), and have them export dashboards

Enabling the embedded exporter and adding relations with prometheus is already possible with the ceph-mon charm. Dashboards are not automatically set up though:
https://bugs.launchpad.net/charm-ceph-mon/+bug/1912557

Edin S (exsdev)
Changed in charm-prometheus-ceph-exporter:
importance: Undecided → Medium
Billy Olsen (billy-olsen) wrote :

Marking the task on charm-ceph-mon as invalid, as it does not apply to the ceph-mon charm. This is an issue with the ceph exporter, which does not support Pacific.

Changed in charm-ceph-mon:
status: New → Invalid
James Troup (elmo)
Changed in charm-prometheus-ceph-exporter:
status: New → Incomplete
status: Incomplete → In Progress
Shunde Zhang (shunde-zhang) wrote :

It should work if you switch the snap to the edge channel.

juju config prometheus-ceph-exporter snap_channel=edge

The edge channel has a newer version which works with nautilus+.

$ snap info prometheus-ceph-exporter
......
channels:
  latest/stable: 2.0.0 2018-07-11 (20) 8MB -
  latest/candidate: 3.0.0-nautilus 2021-07-09 (22) 11MB -
  latest/beta: 2.0.0 2018-07-11 (20) 8MB -
  latest/edge: 3.0.0-nautilus 2021-07-09 (22) 11MB -
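
To pick a channel programmatically, the listing above can be parsed. This is a minimal Python sketch under the assumption that each channel line looks like the ones shown here (`name: version date (rev) size notes`); `parse_channels` and `channels_at_least` are hypothetical helper names, not part of snapd or the charm.

```python
def parse_channels(snap_info_text):
    """Map channel name to version from the 'channels:' section of `snap info`."""
    channels = {}
    in_channels = False
    for line in snap_info_text.splitlines():
        if line.strip() == "channels:":
            in_channels = True
            continue
        if in_channels:
            if not line.startswith(" "):
                break  # left the indented channels section
            name, _, rest = line.strip().partition(":")
            fields = rest.split()
            if fields:
                channels[name] = fields[0]
    return channels


def channels_at_least(channels, major):
    """Return channel names whose version starts with a number >= major."""
    selected = []
    for name, version in channels.items():
        first = version.split(".", 1)[0]
        if first.isdigit() and int(first) >= major:
            selected.append(name)
    return selected
```

Applied to the output above with `major=3`, this selects `latest/candidate` and `latest/edge`, the two channels carrying 3.0.0-nautilus.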

Mathew Clarke (matclarke) wrote :

I'm still seeing a ton of "Sep 9 06:33:34 juju-78f91c-5-lxd-21 prometheus-ceph-exporter.ceph-exporter[373]: 2022/09/09 06:33:34 failed sending PG command {"format":"json","pgid":"20.1ca","prefix":"query"}: rados: Permission denied" errors which are filling up our graylog/elasticsearch.

I can see that version 4.0 has now been released (https://github.com/digitalocean/ceph_exporter/issues/182) and that it now officially supports Nautilus, Octopus, and Pacific.

Eric Chen (eric-chen) wrote :

This charm (ceph-exporter) is no longer being actively maintained. Please consider using the new Canonical Observability Stack instead. (https://charmhub.io/topics/canonical-observability-stack)

Changed in charm-prometheus-ceph-exporter:
status: In Progress → Won't Fix