Comment 7 for bug 1952126

OpenStack Infra (hudson-openstack) wrote : Fix merged to monitoring (master)

Reviewed: https://review.opendev.org/c/starlingx/monitoring/+/819137
Committed: https://opendev.org/starlingx/monitoring/commit/a9f84e13b13b0f20f1511d0503bbcf2df9f0fced
Submitter: "Zuul (22348)"
Branch: master

commit a9f84e13b13b0f20f1511d0503bbcf2df9f0fced
Author: Bin Qian <email address hidden>
Date: Thu Nov 18 15:05:51 2021 -0500

    Add new collectd plugin to monitor a service status

    When the openldap service status returns 160, raise a major alarm
    because the service is approaching its FD limit. When 161 is
    returned, raise a critical alarm because the limit has been reached.

    SM will degrade the node when FD usage reaches the limit.
    Ref SM changes:
    https://review.opendev.org/c/starlingx/ha/+/819130

    TC passed:
    Alarm is raised when the FD limit is reached, or when usage is above
    95% (approaching).
    Alarm is cleared when FD usage is below the 95% threshold.
    Upgrade test: new alarm raised on controller-1 (N+1).
    Alarm is cleared when collectd restarts or the node reboots (the
    alarm will be re-raised if the alarming situation is detected again).
    SM detects the 161 status code and degrades the node with a service
    degraded alarm.
    Alarm is raised after fm comes back up after being unavailable.
    Alarm is cleared after fm comes back up after being unavailable.

    Closes-bug: 1952126
    Depends-on: https://review.opendev.org/c/starlingx/fault/+/819132

    Change-Id: I78bb6ed6f24570d68f62818e1242286d638fd835
    Signed-off-by: Bin Qian <email address hidden>
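
The status-code handling described in the commit message can be sketched roughly as follows. This is only an illustration, not the plugin's actual code. The codes 160 and 161 and the 95% threshold come from the commit message. The function names, the severity strings, the use of 0 for a healthy status, and the choice to treat exactly 95% as "approaching" are assumptions made for the example.

```python
from typing import Optional

# Status codes taken from the commit message.
STATUS_FD_APPROACHING = 160    # service is approaching its FD limit
STATUS_FD_LIMIT_REACHED = 161  # service has reached its FD limit

# Assumption: 0 stands in for "no FD problem"; the real healthy code may differ.
STATUS_OK = 0


def severity_for_status(status_code: int) -> Optional[str]:
    """Map a service status code to an alarm severity.

    Returns None when no FD alarm should be active (hypothetical
    convention: the caller would clear any existing alarm).
    """
    if status_code == STATUS_FD_LIMIT_REACHED:
        return "critical"
    if status_code == STATUS_FD_APPROACHING:
        return "major"
    return None


def status_for_fd_usage(open_fds: int, fd_limit: int,
                        threshold_percent: int = 95) -> int:
    """Classify FD usage into one of the status codes above.

    Integer arithmetic avoids floating-point rounding at the threshold.
    Assumption: usage at exactly the threshold counts as "approaching".
    """
    if fd_limit <= 0:
        raise ValueError("fd_limit must be positive")
    if open_fds >= fd_limit:
        return STATUS_FD_LIMIT_REACHED
    if open_fds * 100 >= fd_limit * threshold_percent:
        return STATUS_FD_APPROACHING
    return STATUS_OK


if __name__ == "__main__":
    for open_fds in (500, 980, 1024):
        code = status_for_fd_usage(open_fds, 1024)
        print(open_fds, code, severity_for_status(code))
```

In this sketch the severity depends only on the most recent status code, so an alarm raised at 160 or 161 would be cleared as soon as usage drops back below the threshold, which matches the clear-on-recovery test case listed above.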