Set blocked status for deferred restarts to indicate operator intervention required

Bug #1934406 reported by Garrett Neugent
This bug affects 1 person
Affects                               Status  Importance  Assigned to  Milestone
OpenStack Neutron Gateway Charm       New     Undecided   Unassigned
OpenStack Neutron Open vSwitch Charm  New     Undecided   Unassigned
OpenStack RabbitMQ Server Charm       New     Undecided   Unassigned
charm-ovn-central                     New     Undecided   Unassigned
charm-ovn-chassis                     New     Undecided   Unassigned
charm-ovn-dedicated-chassis           New     Undecided   Unassigned

Bug Description

Currently, when a restart is deferred, the unit still shows up as active and idle in juju status. However, deferred restarts require operator intervention to resolve, so the charm hook should set the unit to a blocked status. This applies to all charms with enable-auto-restarts, not just ovn-chassis.

Thanks!
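
To make the request concrete, here is a rough sketch of the status decision being asked for. It is not taken from any of the affected charms: DeferredEvent and workload_status are hypothetical names, and the messages are made up for illustration. In a real charm the returned pair would presumably be handed to whatever the charm already uses to set workload status.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DeferredEvent:
    """Hypothetical record of one deferred restart (not a real charm API)."""
    service: str
    reason: str


def workload_status(deferred: List[DeferredEvent]) -> Tuple[str, str]:
    """Return a (state, message) pair for the unit's workload status.

    With no deferred restarts the unit stays active; otherwise it is
    reported as blocked so the operator can see that action is needed.
    """
    if not deferred:
        return ("active", "Unit is ready")
    # De-duplicate and sort service names so the message is stable.
    services = ", ".join(sorted({event.service for event in deferred}))
    return ("blocked", f"Deferred restarts pending for: {services}")


if __name__ == "__main__":
    events = [DeferredEvent("ovs-vswitchd", "package upgrade")]
    state, message = workload_status(events)
    print(state, message)
```

Once the operator has run the pending restarts, the list would be empty and the unit would go back to active.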

Tags: bseng-147
Andrea Ieri (aieri) wrote :

On top of setting status to blocked or notifying the operator via juju status in some other way, it would probably be useful to also set up nrpe alerting. After all, you eventually do want to apply any pending restart or hook so it would seem appropriate to treat a deferred restart as a type of "error" condition.

Andrea Ieri (aieri)
tags: added: bseng-147
Felipe Reyes (freyes) wrote : Re: [Bug 1934406] Re: Set blocked status for deferred restarts to indicate operator intervention required

On Mon, 2022-05-30 at 23:14 +0000, Andrea Ieri wrote:
> On top of setting status to blocked or notifying the operator via juju
> status in some other way, it would probably be useful to also set up
> nrpe alerting. After all, you eventually do want to apply any pending
> restart or hook so it would seem appropriate to treat a deferred restart
> as a type of "error" condition.
>
This is one of those scenarios where a nagios check is too strict (OK or not),
so probably the check should encapsulate some logic around the number of deferred
restarts and the age of those events to transition from OK -> Warning ->
Critical.
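
A minimal sketch of the kind of logic described above, assuming the check is given the age in seconds of each deferred restart. The function name check_deferred_restarts and all threshold defaults are hypothetical and only illustrate the OK -> Warning -> Critical escalation; the return values are the standard Nagios plugin exit codes.

```python
from typing import List

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_deferred_restarts(
    ages_seconds: List[float],
    warn_count: int = 3,              # assumed threshold
    crit_count: int = 10,             # assumed threshold
    warn_age: float = 24 * 3600,      # assumed: 1 day
    crit_age: float = 7 * 24 * 3600,  # assumed: 7 days
) -> int:
    """Map the number and age of deferred restarts to a Nagios state."""
    if not ages_seconds:
        return OK
    count = len(ages_seconds)
    oldest = max(ages_seconds)
    # Escalate on whichever limit is hit first: too many events or too old.
    if count >= crit_count or oldest >= crit_age:
        return CRITICAL
    if count >= warn_count or oldest >= warn_age:
        return WARNING
    # A few recent deferred restarts are tolerated.
    return OK


if __name__ == "__main__":
    # One restart deferred an hour ago, one deferred two days ago.
    print(check_deferred_restarts([3600, 2 * 24 * 3600]))
```

With these assumed defaults, a single recently deferred restart stays OK, and the check only escalates as events pile up or get old.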
