ceph-dashboard and k8s dashboards have conflicting telegraf hostname requirements

Bug #2022385 reported by Chris Johnston
This bug affects 1 person
Affects                         Status   Importance  Assigned to  Milestone
Ceph Dashboard Charm            Triaged  Medium      Unassigned
Kubernetes Control Plane Charm  Triaged  Medium      Unassigned

Bug Description

Opening this bug against ceph-dashboard and kubernetes-control-plane to start the discussions.

When deploying Ceph and Kubernetes in a (hyper)converged architecture, the Grafana dashboards cannot work for both Ceph and Kubernetes at the same time.

The new ceph-dashboard charm requires the telegraf `hostname` option to be set to `{host}` [1]. The dashboards provided by the kubernetes-control-plane charm, however, filter on the `host` label with expressions such as [2]:

"expr":"cpu_usage_idle{cpu=\"cpu-total\",host=~\".*kubernetes-master.*\"}",
"expr":"cpu_usage_idle{cpu=\"cpu-total\",host=~\".*kubernetes-worker.*\"}",

At a bare minimum, the ceph-dashboard docs should be updated to explain that changing the default hostname config in telegraf may break other dashboards. A better solution would be for dashboards not to depend on a config option to display data.
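
To illustrate the conflict, here is a minimal Python sketch of how the kubernetes-master matcher from [2] behaves against two kinds of `host` label values. The label values are hypothetical: one stands in for a unit-derived name, the other for a machine hostname such as `{host}` might produce. PromQL `=~` matchers are fully anchored, which `re.fullmatch` approximates here.

```python
import re

# Host matcher taken from the kubernetes-control-plane dashboard expression [2].
K8S_MASTER_MATCHER = r".*kubernetes-master.*"


def matches(host_label: str, pattern: str = K8S_MASTER_MATCHER) -> bool:
    """Return True if the host label satisfies the dashboard's matcher.

    PromQL regex matchers are fully anchored, so fullmatch is used.
    """
    return re.fullmatch(pattern, host_label) is not None


# Hypothetical label values, for illustration only.
unit_style_host = "kubernetes-master-0"       # name derived from the Juju unit
machine_style_host = "node-01.example.com"    # name derived from the machine

print(matches(unit_style_host))
print(matches(machine_style_host))
```

With a machine-derived label the matcher finds nothing, so the Kubernetes panels would come up empty even though telegraf is still reporting metrics.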

[1] https://opendev.org/openstack/charm-ceph-dashboard/src/commit/d24c6fe6f261a8c56ad259d00c8ab5eb56e0f4da/README.md#L129
[2] https://github.com/charmed-kubernetes/charm-kubernetes-control-plane/blob/c82221e61f510cd5a638e301b3e10a3f4bdd0695/templates/grafana/autoload/kubernetes.json#L895

Peter Sabaini (peter-sabaini) wrote :

Confirmed for ceph-dashboard. Agreed, we should at least add a warning around setting the telegraf hostname option.

Changed in charm-ceph-dashboard:
status: New → Triaged
importance: Undecided → Medium
George Kraft (cynerva) wrote :

It looks like kubernetes-control-plane is related to telegraf via a `juju-info` relation which won't really include the info we need to generate the dashboard correctly. I don't see any other relation endpoints for telegraf that would be more appropriate.

So to fix this, I think we would either need to:
1. Add a `telegraf-hostname` config option to kubernetes-control-plane, or
2. Develop a `telegraf:kubernetes` endpoint to use instead of juju-info so we can get the hostname config from telegraf.
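
As a rough illustration of option 1, the sketch below shows one way a charm could build the dashboard expression from a configured hostname template instead of hard-coding the application name. Everything here is hypothetical: the function names, the placeholder handling, and the sample unit and machine names are assumptions for illustration, not existing charm code. It also assumes the labels contain no characters that need regex escaping.

```python
def render_hostname(template: str, unit: str, host: str) -> str:
    """Fill a telegraf-style hostname template for one unit (hypothetical helper)."""
    return template.replace("{unit}", unit).replace("{host}", host)


def host_regex(labels: list[str]) -> str:
    """Build an alternation matching exactly the given host labels.

    Assumes labels are plain hostnames that need no regex escaping.
    """
    return "|".join(sorted(labels))


def cpu_idle_expr(regex: str) -> str:
    """Return the dashboard expression with a configurable host matcher."""
    return f'cpu_usage_idle{{cpu="cpu-total",host=~"{regex}"}}'


# Hypothetical value of the proposed `telegraf-hostname` option and sample units.
template = "{host}"
units = [("kubernetes-control-plane-0", "node-01"),
         ("kubernetes-control-plane-1", "node-02")]

labels = [render_hostname(template, unit, host) for unit, host in units]
print(cpu_idle_expr(host_regex(labels)))
```

The point of the sketch is only that the matcher becomes a function of the hostname template, so the dashboard would follow whatever value telegraf is configured with.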

Realistically, I suspect this is a Won't Fix as we look toward the future of observability via the COS stack based on the grafana-agent charm[1] for exporting metrics. I'll leave this issue open for now until we have a better idea what's coming.

[1]: https://charmhub.io/grafana-agent

Changed in charm-kubernetes-master:
importance: Undecided → Medium
status: New → Triaged