need mechanism to report error when action-pruner is dead

Bug #2009879 reported by Tianqi Xiao
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Canonical Juju | Triaged | Wishlist | Unassigned | |

Bug Description

We have a model under an HA controller (version 2.9.38.1). The action-pruner of this model is not running on any of the controllers. The engine report shows the following:

```
        redacted_model_uuid:
          report:
            manifolds:
              action-pruner:
                error: ...
                inputs:
                ...
                state: stopped

```
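For illustration, a report of this shape can be scanned for stopped workers once it has been parsed into nested dictionaries. This is only a sketch: it assumes the report is already loaded (for example with a YAML parser) as a mapping from model UUID to the structure shown above, and the function name `stopped_manifolds` and the worker `some-other-worker` are hypothetical, not part of Juju.

```python
from typing import Any, Dict, List, Tuple


def stopped_manifolds(report: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (model_uuid, manifold_name, error) for each manifold whose state is 'stopped'.

    Assumes `report` maps model UUIDs to {"report": {"manifolds": {...}}},
    mirroring the excerpt above; this layout is an assumption, not a documented schema.
    """
    found = []
    for model_uuid, model in report.items():
        manifolds = (model or {}).get("report", {}).get("manifolds", {})
        for name, info in manifolds.items():
            if (info or {}).get("state") == "stopped":
                found.append((model_uuid, name, str(info.get("error", ""))))
    return found


# Example data shaped like the excerpt; "some-other-worker" is a made-up name.
example = {
    "redacted_model_uuid": {
        "report": {
            "manifolds": {
                "action-pruner": {"error": "...", "inputs": [], "state": "stopped"},
                "some-other-worker": {"inputs": [], "state": "started"},
            }
        }
    }
}

print(stopped_manifolds(example))
```

With the example data, only `action-pruner` is reported, together with its (elided) error string.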

We were expecting to see an error reported by the `/health` endpoint, but there was none.

```
$ curl -k https://subnet.{controller_ips}:17070/health
running
running
running
```

The engine status should be taken into account when evaluating a controller's health, and the `/health` endpoint should report this kind of failure.
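To make the request concrete, here is a rough sketch of the kind of combined check we have in mind. It is not how Juju evaluates health today; the function `overall_health`, its return strings, and the flat name-to-state mapping are all assumptions made for illustration.

```python
from typing import Dict


def overall_health(health_body: str, manifold_states: Dict[str, str]) -> str:
    """Combine the /health response body with per-manifold engine states.

    `health_body` is the text returned by /health ("running" in the output above).
    `manifold_states` is a hypothetical mapping of manifold name to its engine state.
    """
    # If the API server itself is not reporting "running", that is the primary failure.
    if health_body.strip() != "running":
        return "api-unavailable"
    # Otherwise, surface any workers the engine reports as stopped.
    stopped = sorted(name for name, state in manifold_states.items() if state == "stopped")
    if stopped:
        return "degraded: " + ", ".join(stopped)
    return "running"


print(overall_health("running\n", {"action-pruner": "stopped"}))
```

In the situation described in this bug, such a check would return a degraded status naming `action-pruner` instead of a plain `running`.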

Ian Booth (wallyworld) wrote :

The /health endpoint is quite simple and reflects a binary status: can the controller serve API requests? It can, so it comes back healthy. An error would indicate that the controller itself is totally broken, which is not the case.

Tools like the engine report can be accessed via the introspection API to provide extra detail on the state of various components. Development work is currently underway on a "controller" charm which will provide various endpoints to expose this type of information.

Joseph Phillips (manadart) wrote :

Version 2.9.38.1 would tend to indicate a locally built agent.

If that's the case, we can't really be sure of the hash that it was built from...

Juan M. Tirado (tiradojm) wrote :

Based on @wallyworld's comments, I will set this as invalid.

Changed in juju:
status: New → Invalid
Ian Booth (wallyworld) wrote :

I don't think the request is invalid per se - the ask is to provide better visibility to controller internals that may have stopped working etc. I'll adjust the bug title and mark as Wishlist.

Changed in juju:
importance: Undecided → Wishlist
status: Invalid → Triaged
summary: - /health endpoint doesn't report error when action-pruner is dead
+ need mechanism to report error when action-pruner is dead